Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/chrisvilches/extensible-image-scraper
async-await express javascript nodejs ramda react redux scraper scraping
- Host: GitHub
- URL: https://github.com/chrisvilches/extensible-image-scraper
- Owner: ChrisVilches
- Created: 2018-06-25T14:38:15.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-07-30T03:22:06.000Z (over 6 years ago)
- Last Synced: 2024-11-22T07:26:50.071Z (2 months ago)
- Topics: async-await, express, javascript, nodejs, ramda, react, redux, scraper, scraping
- Language: JavaScript
- Size: 241 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Extensible Image Scraper
## How to run
Run Minio in a Docker container:
```bash
docker run -p 9000:9000 --name minio1 -v /mnt/data:/data -v /mnt/config:/root/.minio minio/minio server /data
```

No need to download anything beforehand: this command pulls the image automatically and then starts Minio on port 9000.

Install the dependencies for each app (backend and frontend):
```bash
npm install
```

Start the backend app (by default on `localhost:3000`):
```bash
# These keys are printed on the console when you run a minio Docker container
export minioAccessKey=15NTOWZG8EKEDSW7J6UW
export minioSecretKey=FEC9qsLIYhKDT7Shns7NY8uImRVBsMqNJO5tBdVr
npm start
```

Start the frontend (`localhost:8080`):
```
npm start
```

## Examples
### Example 1
```
localhost:3000/?url=https://twitter.com/aamir_khan
```

↓↓
```json
{
  "appleTouchIcon": "https://abs.twimg.com/icons/apple-touch-icon-192x192.png",
  "allImages": [
    "https://pbs.twimg.com/profile_banners/88856792/1479290034/1500x500",
    "https://pbs.twimg.com/profile_images/798826399725297664/4awXtggx_400x400.jpg",
    "https://pbs.twimg.com/profile_images/798826399725297664/4awXtggx_normal.jpg",
    "https://pbs.twimg.com/profile_images/798826399725297664/4awXtggx_bigger.jpg",
    "https://pbs.twimg.com/media/DfJBh8VX4Ac5Zak.jpg",
    "https://pbs.twimg.com/media/Deb6oI3W4AALui2.jpg",
    "https://pbs.twimg.com/media/Deb6zzcW4AAuEZ3.jpg",
    "https://pbs.twimg.com/media/Deb60rlXUAAorjE.jpg",
    "https://pbs.twimg.com/media/Deb61amX0AIzrvv.jpg",
    "https://abs.twimg.com/emoji/v2/72x72/2764.png",
    "https://pbs.twimg.com/media/DcG4Sd4X0AEEBzE.jpg",
    "https://pbs.twimg.com/media/Db9Qa7cXUAES2Wd.jpg"
  ],
  "twitterIcon": "https://pbs.twimg.com/profile_images/798826399725297664/4awXtggx_400x400.jpg",
  "favicon": "https://abs.twimg.com/favicons/favicon.ico"
}
```

### Example 2 (persist into minio)
When you add the `save` query parameter with the value `true`, the scraper picks the image it thinks
best represents the website (using some algorithm I've probably not even implemented yet),
saves that image into Minio, and returns its key to the client.

```
localhost:3000/?url=https://twitter.com/aamir_khan&save=true
```

↓↓
```json
{
  "imageUrl": "http://localhost:3000/img/2018624829333.jpg",
  "etag": "3758470345ff768aaaa27ea5892039c3"
}
```

The image will be available at the returned URL, `http://localhost:3000/img/2018624829333.jpg`.
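
Both examples pass the target page in the `url` query parameter. When calling the backend from code rather than from the address bar, it is safer to percent-encode that value so that any query string on the target page is not mistaken for a parameter of the scraper. The sketch below shows one way to build such a request URL. `buildScrapeUrl` is a hypothetical helper, not part of this repo, and it assumes the backend runs at the default `http://localhost:3000/` and decodes query parameters in the usual way.

```javascript
// Hypothetical helper (not part of this repo): builds the request URL for the
// backend, percent-encoding the target page URL via URLSearchParams.
function buildScrapeUrl(pageUrl, { save = false, base = 'http://localhost:3000/' } = {}) {
  const endpoint = new URL(base);
  endpoint.searchParams.set('url', pageUrl);
  if (save) {
    // Matches the `save=true` parameter from Example 2.
    endpoint.searchParams.set('save', 'true');
  }
  return endpoint.toString();
}

console.log(buildScrapeUrl('https://twitter.com/aamir_khan'));
console.log(buildScrapeUrl('https://twitter.com/aamir_khan', { save: true }));
```

The resulting string can be passed to `fetch` or any other HTTP client, and the JSON body then has the shape shown in the examples above.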
## To do
* Convert the selected image to `.jpg` when it is an `.ico` file.
* Move hardcoded values into config files.
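
As a starting point for the configuration item, here is a minimal sketch of a loader that gathers settings in one place. It is not the repo's actual code: `loadConfig`, the returned field names, and the `PORT`, `minioHost` and `minioPort` variables are all assumptions. Only `minioAccessKey`, `minioSecretKey` and the default ports 3000 and 9000 come from this README.

```javascript
// Hypothetical config loader; field names and the PORT, minioHost and
// minioPort variables are assumptions, not part of this repo.
function loadConfig(env = process.env) {
  // These two variables are the ones exported in "How to run".
  const required = ['minioAccessKey', 'minioSecretKey'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    port: Number(env.PORT || 3000), // backend default from this README
    minio: {
      host: env.minioHost || 'localhost',
      port: Number(env.minioPort || 9000), // port published by the docker command
      accessKey: env.minioAccessKey,
      secretKey: env.minioSecretKey,
    },
  };
}

// Usage with placeholder credentials instead of real keys.
const config = loadConfig({ minioAccessKey: 'example-access', minioSecretKey: 'example-secret' });
console.log(config.minio.port);
```

Failing fast on missing keys makes a misconfigured backend stop at startup rather than on the first attempt to save an image.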