Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yoannchb-pro/google-img-scrap
Scrape images from Google Images with custom pre-filled Google dork options
- Host: GitHub
- URL: https://github.com/yoannchb-pro/google-img-scrap
- Owner: yoannchb-pro
- License: mit
- Created: 2021-12-28T00:55:03.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2024-01-31T23:03:12.000Z (about 1 year ago)
- Last Synced: 2024-10-24T22:11:07.422Z (3 months ago)
- Topics: dork, google, google-dorks, google-image-search, google-images, image, inverse-search, options, query, scraper
- Language: TypeScript
- Homepage:
- Size: 1.15 MB
- Stars: 18
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# Google-img-scrap
Scrape images from Google Images with custom pre-filled dorking options.
## Update
- See the [changelog](CHANGELOG.md)
## Found a bug?
- Please report it in the GitHub issues, don't be afraid :)
## Installation
```
npm i google-img-scrap
```

## Import
```js
const {
  GOOGLE_IMG_SCRAP,
  GOOGLE_IMG_INVERSE_ENGINE_URL,
  GOOGLE_IMG_INVERSE_ENGINE_UPLOAD,
  GOOGLE_QUERY,
} = require("google-img-scrap");

// OR
import {
  GOOGLE_IMG_SCRAP,
  GOOGLE_IMG_INVERSE_ENGINE_URL,
  GOOGLE_IMG_INVERSE_ENGINE_UPLOAD,
  GOOGLE_QUERY,
} from "google-img-scrap";
```

## Options definition
- "search" `string` what you want to search for
- "proxy" `AxiosProxyConfig` configure a proxy using the axios proxy format
- "excludeWords" `string[]` exclude some words from the search
- "domains" `string[]` only return images from these domains
- "excludeDomains" `string[]` exclude images from these domains
- "safeSearch" `boolean` enable or disable safe search (for example, for NSFW content)
- "custom" `string` add extra query parameters
- "urlMatch" `string[][]` keep an image only when its URL matches the given strings (example: "cdn") | `example below`
- "filterByTitles" `string[][]` filter images by title | `example below`
- "query" `GoogleQuery` set a query (can be [TYPE, DATE, COLOR, SIZE, LICENCE, EXTENSION]); use GOOGLE_QUERY items | `example below`
- "limit" `number` maximum number of results to return

## Result
```js
{
  url: 'https://images.google.com/search?tbm=isch&tbs=&q=cats',
  search: "cats",
  result: [
    {
      id: 'K6Qd9XWnQFQCoM',
      title: 'Domestic cat',
      url: 'https://i.natgeofe.com/n/548467d8-c5f1-4551-9f58-6817a8d2c45e/NationalGeographic_2572187_2x1.jpg',
      originalUrl: 'https://www.nationalgeographic.com/animals/mammals/facts/domestic-cat',
      height: 1536,
      width: 3072
    },
    {
      id: 'HkevFQZ5DYu7oM',
      title: 'Cat - Wikipedia',
      url: 'https://upload.wikimedia.org/wikipedia/commons/1/15/Cat_August_2010-4.jpg',
      originalUrl: 'https://en.wikipedia.org/wiki/Cat',
      height: 2226,
      width: 3640
    },
    ...
  ]
}
```

## How to use?
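
Each call resolves to an object shaped like the one in the Result section, so post-processing is ordinary array work. The sketch below is not part of the package: `pickLargeImages` is a hypothetical helper, and `sampleResponse` is a hand-written stand-in that mirrors the documented fields so the snippet runs without any network call.

```js
// Hand-written stand-in that mirrors the documented result shape (no network call).
const sampleResponse = {
  url: "https://images.google.com/search?tbm=isch&tbs=&q=cats",
  search: "cats",
  result: [
    {
      id: "K6Qd9XWnQFQCoM",
      title: "Domestic cat",
      url: "https://i.natgeofe.com/n/548467d8-c5f1-4551-9f58-6817a8d2c45e/NationalGeographic_2572187_2x1.jpg",
      originalUrl: "https://www.nationalgeographic.com/animals/mammals/facts/domestic-cat",
      height: 1536,
      width: 3072,
    },
    {
      id: "HkevFQZ5DYu7oM",
      title: "Cat - Wikipedia",
      url: "https://upload.wikimedia.org/wikipedia/commons/1/15/Cat_August_2010-4.jpg",
      originalUrl: "https://en.wikipedia.org/wiki/Cat",
      height: 2226,
      width: 3640,
    },
  ],
};

// Hypothetical helper: keep images at least `minWidth` pixels wide, widest first.
function pickLargeImages(response, minWidth) {
  return response.result
    .filter((image) => image.width >= minWidth)
    .sort((a, b) => b.width - a.width)
    .map(({ title, url, width, height }) => ({ title, url, width, height }));
}

// Only the Wikipedia image is at least 3500 pixels wide.
console.log(pickLargeImages(sampleResponse, 3500));
```

In real use, the value returned by `GOOGLE_IMG_SCRAP` would take the place of `sampleResponse`.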
## Simple example
Search for cat images
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
});

console.log(test);
```

## Reverse search engine
The second parameter takes the same options as GOOGLE_IMG_SCRAP, with `search` omitted.
### With a URL (cost: 2 requests)
```js
const test = await GOOGLE_IMG_INVERSE_ENGINE_URL(
  "https://upload.wikimedia.org/wikipedia/commons/1/15/Cat_August_2010-4.jpg",
  { limit: 5 }
);

console.log(test);
```

### With a local image (cost: 3 requests)
```js
const fs = require("fs"); // needed to read the local image

const imageBuffer = fs.readFileSync("demonSlayer.png");
const test = await GOOGLE_IMG_INVERSE_ENGINE_UPLOAD(imageBuffer, {
  limit: 5,
});

console.log(test);
```

## Custom query
All query options are optional, and their keys must be uppercase. You can combine as many as you want.
All possible query options are listed in the Google query section below.
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  query: {
    TYPE: GOOGLE_QUERY.TYPE.CLIPART,
    LICENCE: GOOGLE_QUERY.LICENCE.COMMERCIAL_AND_OTHER,
    EXTENSION: GOOGLE_QUERY.EXTENSION.JPG,
  },
});

console.log(test);
```

## Limit result size
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  limit: 5,
});

console.log(test);
```

## Proxy
See the axios documentation to set up the proxy.
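
Proxy settings are often supplied as a single URL string rather than as separate fields. The sketch below shows one way to convert such a string into the `protocol` / `host` / `port` / `auth` object shape that axios accepts for its `proxy` option. `proxyFromUrl` is a hypothetical helper, not something this package exports, and the fallback ports (80 and 443) are an assumption for URLs that omit a port.

```js
// Hypothetical helper: turn a proxy URL string into an axios-style proxy object.
function proxyFromUrl(proxyUrl) {
  const parsed = new URL(proxyUrl);
  const protocol = parsed.protocol.replace(":", ""); // "http:" -> "http"
  const defaultPort = protocol === "https" ? 443 : 80; // assumed fallback
  const config = {
    protocol,
    host: parsed.hostname,
    port: parsed.port ? Number(parsed.port) : defaultPort,
  };
  // Credentials are optional; only add `auth` when the URL carries a username.
  if (parsed.username) {
    config.auth = {
      username: decodeURIComponent(parsed.username),
      password: decodeURIComponent(parsed.password),
    };
  }
  return config;
}

console.log(proxyFromUrl("http://user:secret@example.com:8080"));
```

The returned object could then be passed as the `proxy` option.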
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  proxy: {
    protocol: "https",
    host: "example.com",
    port: 8080,
  },
});

console.log(test);
```

## Domains
Only scrape images from specific domains
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  domains: ["alamy.com", "istockphoto.com", "vecteezy.com"],
});

console.log(test);
```

## Exclude domains
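
For intuition, domain and word filters like these can be written by hand with standard Google search operators: `site:` restricts results to a domain, and a leading `-` excludes a term. The sketch below illustrates that general technique only. `buildDorkQuery` is a hypothetical helper, and the query string the package actually sends to Google may well differ.

```js
// Hypothetical helper, NOT the library's implementation: compose a dork-style
// query string from a search term plus include/exclude filters.
function buildDorkQuery({ search, domains = [], excludeDomains = [], excludeWords = [] }) {
  const parts = [search];
  if (domains.length > 0) {
    // Any one of the listed domains may match.
    parts.push(domains.map((domain) => `site:${domain}`).join(" OR "));
  }
  for (const domain of excludeDomains) parts.push(`-site:${domain}`);
  for (const word of excludeWords) parts.push(`-${word}`);
  return parts.join(" ");
}

const query = buildDorkQuery({
  search: "cats",
  domains: ["alamy.com", "vecteezy.com"],
  excludeWords: ["black"],
});

console.log(query); // cats site:alamy.com OR site:vecteezy.com -black
```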
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  excludeDomains: ["istockphoto.com", "alamy.com"],
});

console.log(test);
```

## Exclude words
If you don't like black cats and white cats
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  excludeWords: ["black", "white"], // if you don't like black cats and white cats
});

console.log(test);
```

## Safe search (no NSFW)
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  safeSearch: false,
});

console.log(test);
```

## Custom query params
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  custom: "name=content&name2=content2",
});

console.log(test);
```

## How do urlMatch and filterByTitles work?
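
Both options take an array of groups: terms inside a group are combined with AND, and the groups themselves are combined with OR, as the comments in the example below indicate. A minimal local sketch of that logic follows. `matchesGroups` is a hypothetical helper, and the case-insensitive substring comparison is an assumption, since the library's exact matching rules are not described here.

```js
// Hypothetical helper: true when at least one group has ALL of its terms in `text`.
function matchesGroups(text, groups) {
  const haystack = text.toLowerCase();
  return groups.some((group) =>
    group.every((term) => haystack.includes(term.toLowerCase()))
  );
}

// Reads as "(draw and white) or (albino and white)".
const titleGroups = [
  ["draw", "white"],
  ["albino", "white"],
];

console.log(matchesGroups("White albino cat", titleGroups)); // true
console.log(matchesGroups("How to draw a cat", titleGroups)); // false
```

The second title fails because it satisfies only half of the first group and none of the second.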
```js
const test = await GOOGLE_IMG_SCRAP({
  search: "cats",
  // will build something like this: "(draw and white) or (albino and white)"
  filterByTitles: [
    ["draw", "white"],
    ["albino", "white"],
  ],
  // will build something like this: "(cdn and wikipedia) or (cdn and istockphoto)"
  urlMatch: [
    ["cdn", "wikipedia"],
    ["cdn", "istockphoto"],
  ],
});

console.log(test);
```

## Google query
```js
{
  SIZE: {
    LARGE,
    MEDIUM,
    ICON
  },
  COLOR: {
    BLACK_AND_WHITE,
    TRANSPARENT,
    RED,
    BLUE,
    PURPLE,
    ORANGE,
    YELLOW,
    GREEN,
    TEAL,
    PINK,
    WHITE,
    GRAY,
    BLACK,
    BROWN
  },
  TYPE: {
    CLIPART,
    DRAW,
    GIF
  },
  EXTENSION: {
    JPG,
    GIF,
    BMP,
    PNG,
    SVG,
    WEBP,
    ICO,
    RAW
  },
  DATE: {
    DAY,
    WEEK,
    MONTH,
    YEAR
  },
  LICENCE: {
    CREATIVE_COMMONS,
    COMMERCIAL_AND_OTHER
  }
}
```
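
Because query keys must be uppercase and must come from the categories listed above, a small pre-flight check can catch typos such as `licence` or `COLOUR` before any request is made. This is a sketch rather than part of the package: `QUERY_CATEGORIES` is copied by hand from the listing above, `findUnknownQueryKeys` is a hypothetical helper, and the values in the sample object are placeholders.

```js
// Category names copied by hand from the GOOGLE_QUERY listing.
const QUERY_CATEGORIES = ["SIZE", "COLOR", "TYPE", "EXTENSION", "DATE", "LICENCE"];

// Hypothetical helper: return the keys of `query` that are not known categories.
function findUnknownQueryKeys(query) {
  return Object.keys(query).filter((key) => !QUERY_CATEGORIES.includes(key));
}

// Placeholder values; only the keys are being checked here.
const unknown = findUnknownQueryKeys({
  TYPE: "placeholder",
  licence: "placeholder", // wrong case
  COLOUR: "placeholder", // wrong spelling
});

console.log(unknown); // [ 'licence', 'COLOUR' ]
```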