{"id":18552476,"url":"https://github.com/andrejewski/slinky","last_synced_at":"2025-04-09T22:31:55.928Z","repository":{"id":19342863,"uuid":"22581926","full_name":"andrejewski/slinky","owner":"andrejewski","description":" web crawler just for links","archived":false,"fork":false,"pushed_at":"2014-08-07T19:43:58.000Z","size":144,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-05T00:02:40.344Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"isc","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andrejewski.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-08-03T19:25:35.000Z","updated_at":"2019-03-28T16:39:31.000Z","dependencies_parsed_at":"2022-08-28T03:21:09.910Z","dependency_job_id":null,"html_url":"https://github.com/andrejewski/slinky","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrejewski%2Fslinky","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrejewski%2Fslinky/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrejewski%2Fslinky/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrejewski%2Fslinky/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andrejewski","download_url":"https://codeload.github.com/andrejewski/slinky/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248123755,"o
wners_count":21051526,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T21:14:20.284Z","updated_at":"2025-04-09T22:31:55.550Z","avatar_url":"https://github.com/andrejewski.png","language":"JavaScript","readme":"Slinky\n======\n\nSlinky is a web crawler, but just for the links between webpages. Slinky is intended to be used to visualize the routes and structure behind a website by collecting hyperlinks.\n\nIf you decide to print out the source code and drop it down a flight of stairs, you may not be disappointed either.\n\n## Installation\n\n```bash\nnpm install slinky\n```\n\n## Usage\n\nSlinky is straightforward to use. 
Give Slinky a URL and it will index the webpages in that domain.\n\n```javascript\nvar slinky = require('slinky');\nslinky.index('http://example.com', function(error, links) {\n\tif(error) throw error;\n\tArray.isArray(links); // true\n\tconsole.dir(links); \n\t/*\n\t\t[\n\t\t\t\"http://example.com/\", \n\t\t\t\"http://example.com/about.html\",\n\t\t\t...\n\t\t]\n\t*/\n});\n```\n\n## Slinky Class\n\nSlinky is a class that accepts optional configuration options.\n\n```javascript\nvar Slinky = require('slinky').Slinky;\n\nnew Slinky({ // `new` is optional\n\t// default options\n\tlimit: 100,\t\t// limit the number of links returned\n\tdepth: 3,\t\t// limit recursion of the index \n\trestrict: true,\t// limit indexing to the domain of the url\n\tconcurrency: 5\t// how many async.queue workers to use\n});\n```\n\n### Slinky#index()\n- `#index(\n\turl String,\n\tdone Callback(error Error, links Array[String]))`\n- `#index(\n\turl String,\n\teach Callback(link String), \n\tdone Callback(error Error, links Array[String]))`\n\nThe `each` callback will receive each scraped link as they are processed. This is a method of streaming the links instead of waiting for the `done` callback.\n\nThe `#index()` is the only method that actually does anything. The other methods of the Slinky class are exposed purely for customization of Slinky. \n\nWhile the source is there to be read, some overridable methods to note are `#scrapeLinks()` if anchor tags are not what you are targeting and `#validResponse()` if webpages do not have to be HTML. Again, everything is configurable.\n\n## Contributing\n\nContributions are incredibly welcome as long as they are standardly applicable and pass the tests (or break bad ones). 
Tests are written in Mocha and assertions are done with the Node.js core `assert` module.\n\n```bash\n# running tests\nnpm run test\nnpm run test-spec # spec reporter\n```\n\nFollow me on [Twitter](https://twitter.com/compooter) for updates or just for the lolz and please check out my other [repositories](https://github.com/andrejewski) if I have earned it. I thank you for reading.\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrejewski%2Fslinky","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrejewski%2Fslinky","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrejewski%2Fslinky/lists"}
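As a footnote to the `#index()` section above, the two call signatures (with and without the `each` callback) can be illustrated with a self-contained sketch. Note that `mockIndex` below is a hypothetical stand-in written for this example, not part of Slinky; it only mimics the `(url[, each], done)` call shape to show how `each` streams links before `done` fires.

```javascript
// mockIndex is a hypothetical stand-in for slinky.index(), used only to
// illustrate the (url[, each], done) call shape described above.
function mockIndex(url, each, done) {
	// With two arguments, the second one is the `done` callback.
	if (done === undefined) {
		done = each;
		each = null;
	}
	// A real crawl would fetch pages; here we fake two discovered links.
	var links = [url + '/', url + '/about.html'];
	if (each) {
		// Stream each link to the caller as it is "found".
		links.forEach(function (link) {
			each(link);
		});
	}
	// Finally, hand the full list to the `done` callback.
	done(null, links);
}

var streamed = [];
mockIndex('http://example.com', function (link) {
	streamed.push(link); // fires once per link, before `done`
}, function (error, links) {
	if (error) throw error;
	console.log(streamed); // ["http://example.com/", "http://example.com/about.html"]
	console.log(streamed.length === links.length); // true
});
```

The point of the pattern is that a caller who only wants the final list passes a single callback, while a caller who wants incremental results adds `each` in the middle position.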