https://github.com/codesandbox/site-image-crawler-with-node
- Host: GitHub
- URL: https://github.com/codesandbox/site-image-crawler-with-node
- Owner: codesandbox
- Created: 2022-11-15T18:57:34.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-07-11T17:57:04.000Z (almost 3 years ago)
- Last Synced: 2025-04-05T17:46:52.112Z (about 1 year ago)
- Language: TypeScript
- Size: 27.3 KB
- Stars: 5
- Watchers: 4
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
# Site-Image-crawler
This is a web crawler built using Cheerio and node-fetch.
## What is a web crawler?
A web crawler is a program or automated script that browses the World Wide Web in a methodical, automated manner.
## What does this particular web crawler do?
It goes through a site, identifies all the link paths, and collects the images on each linked page. Right now these links are logged to the console and the files are created in your filesystem. You can extend this application to save them to a downloadable folder.
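The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code. The names `collectImages`, `PageLoader`, `PageContent`, and `maxPages` are hypothetical, and fetching and HTML parsing (done with node-fetch and Cheerio in this project) are abstracted behind an injected `loadPage` function so the sketch has no dependencies. It also assumes the crawl stays on the start URL's origin, which the README does not state.

```typescript
// Hypothetical shape of what a page loader returns; not taken from the repository.
interface PageContent {
  links: string[];  // raw href values found on the page
  images: string[]; // raw src values found on the page
}

// Hypothetical loader type: in the real project this role would be
// filled by node-fetch (download) plus Cheerio (parse).
type PageLoader = (url: string) => Promise<PageContent>;

// Resolve a possibly relative reference against the page it was found on.
// Returns null for values the URL parser rejects.
function resolveUrl(ref: string, base: string): string | null {
  try {
    const resolved = new URL(ref, base);
    resolved.hash = ""; // treat /about and /about#team as the same page
    return resolved.href;
  } catch {
    return null;
  }
}

// Breadth-first crawl, assumed here to be limited to the start URL's origin.
// Returns a map from each visited page URL to the absolute image URLs on it.
async function collectImages(
  startUrl: string,
  loadPage: PageLoader,
  maxPages = 50 // assumed safety cap, not part of the original project
): Promise<Map<string, string[]>> {
  const startParsed = new URL(startUrl); // throws on an invalid start URL
  startParsed.hash = "";
  const origin = startParsed.origin;

  const queue: string[] = [startParsed.href];
  const seen = new Set<string>(queue);
  const imagesByPage = new Map<string, string[]>();

  while (queue.length > 0 && imagesByPage.size < maxPages) {
    const pageUrl = queue.shift()!;
    const { links, images } = await loadPage(pageUrl);

    // Record the images of this page as absolute URLs.
    imagesByPage.set(
      pageUrl,
      images
        .map((src) => resolveUrl(src, pageUrl))
        .filter((u): u is string => u !== null)
    );

    // Queue same-origin links that have not been seen yet.
    for (const href of links) {
      const next = resolveUrl(href, pageUrl);
      if (next !== null && new URL(next).origin === origin && !seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return imagesByPage;
}
```

The `seen` set keeps pages that link to each other from being visited twice, and resolving every `href` and `src` against the page it came from lets relative paths such as `img/team.jpg` work. Writing the images to disk, as the project does, would be a separate step applied to the returned map.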
## Usage
Add the URL of the site you want to crawl for images to the `crawl` call that follows the function definitions:
```
crawl({
  url: "", // Add the site URL here
});
```
## Resources
- Add your [configuration](https://codesandbox.io/docs/projects/learn/setting-up/tasks) to optimize it for [CodeSandbox](https://codesandbox.io/p/dashboard).
- [CodeSandbox Projects — Docs](https://codesandbox.io/docs/projects)
- [CodeSandbox — Discord](https://discord.gg/Ggarp3pX5H)