{"id":13823754,"url":"https://github.com/k1LoW/utsusemi","last_synced_at":"2025-07-08T18:30:57.952Z","repository":{"id":66825456,"uuid":"92459273","full_name":"k1LoW/utsusemi","owner":"k1LoW","description":"A tool to generate a static website by crawling the original site.","archived":false,"fork":false,"pushed_at":"2017-11-28T02:21:19.000Z","size":195,"stargazers_count":30,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-17T06:55:00.852Z","etag":null,"topics":["api","aws","aws-lambda","crawler","s3-website","serverless","serverless-framework"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/k1LoW.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-05-26T01:27:25.000Z","updated_at":"2022-08-19T18:09:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"98b26a93-2662-471d-86c0-55c680976481","html_url":"https://github.com/k1LoW/utsusemi","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/k1LoW/utsusemi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k1LoW%2Futsusemi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k1LoW%2Futsusemi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k1LoW%2Futsusemi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k1LoW%2Futsusemi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/k1LoW","download_url":"https://codeload.github.com/k1LoW/utsuse
mi/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/k1LoW%2Futsusemi/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264323868,"owners_count":23590755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","aws","aws-lambda","crawler","s3-website","serverless","serverless-framework"],"created_at":"2024-08-04T09:00:43.133Z","updated_at":"2025-07-08T18:30:57.640Z","avatar_url":"https://github.com/k1LoW.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# utsusemi [![Build Status](https://travis-ci.org/k1LoW/utsusemi.svg?branch=master)](https://travis-ci.org/k1LoW/utsusemi)\n\n![logo](logo.png)\n\nutsusemi = \"[空蝉](http://ffxiclopedia.wikia.com/wiki/Utsusemi)\"\n\nA tool to generate a static website by crawling the original site.\n\n## Using framework\n\n- Serverless Framework :zap:\n\n## How to deploy\n\n### :octocat: STEP 1. Clone\n\n```console\n$ git clone https://github.com/k1LoW/utsusemi.git\n$ cd utsusemi\n$ npm install\n```\n\n### :pencil: STEP 2. Set environment variables OR Edit config.yml\n\nSet environment variables.\n\nOR\n\nCopy [`config.example.yml`](config.example.yml) to `config.yml` and edit it.\n\nThe environment variables and `config.yml` settings are documented [here](docs/env.md) :book:.\n\n### :rocket: STEP 3. 
Deploy to AWS\n\n```console\n$ AWS_PROFILE=XXxxXXX npm run deploy\n```\n\nThen get the endpoint URLs and `UtsusemiWebsiteURL`.\n\n#### :bomb: Destroy utsusemi\n\nRun the following command.\n\n```console\n$ AWS_PROFILE=XXxxXXX npm run destroy\n```\n\n## Usage\n\n### Start crawling `/in?path={startPath}\u0026depth={crawlDepth}`\n\nStart crawling `targetHost`. Quote the URL so the shell does not interpret `\u0026`.\n\n```console\n$ curl 'https://xxxxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/v0/in?path=/\u0026depth=3'\n```\n\nThen access `UtsusemiWebsiteURL`.\n\n#### `force` option\n\nDisable the cache.\n\n```console\n$ curl 'https://xxxxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/v0/in?path=/\u0026depth=3\u0026force=1'\n```\n\n### Purge crawling queue `/purge`\n\nCancel crawling.\n\n```console\n$ curl https://xxxxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/v0/purge\n```\n\n### Delete object of utsusemi content `/delete?prefix={objectPrefix}`\n\nDelete the S3 objects under the given prefix.\n\n```console\n$ curl 'https://xxxxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/v0/delete?prefix=/'\n```\n\n### Show crawling queue status `/status`\n\n```console\n$ curl https://xxxxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/v0/status\n```\n\n### Set N crawling actions `POST /nin`\n\nStart crawling `targetHost` with N crawling actions.\n\n```console\n$ curl -X POST -H \"Content-Type: application/json\" -d @nin-sample.json https://xxxxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/v0/nin\n```\n\n## Architecture\n\n![Architecture](architecture.png)\n\n### Crawling rule\n\n- HTML -\u003e `depth = depth - 1`\n- CSS -\u003e Requests for resources referenced in the CSS do not consume `depth`.\n- Other contents -\u003e End ( `depth = 0` )\n- 403, 404, 410 -\u003e Delete the S3 object\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fk1LoW%2Futsusemi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fk1LoW%2Futsusemi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fk1LoW%2Futsusemi/lists"}