https://github.com/project-polymorph/web_downloader
Scripts to search and download from websites
- Host: GitHub
- URL: https://github.com/project-polymorph/web_downloader
- Owner: project-polymorph
- License: mit
- Created: 2024-11-11T00:48:35.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-04T09:18:12.000Z (over 1 year ago)
- Last Synced: 2025-03-24T17:05:48.929Z (over 1 year ago)
- Language: Python
- Size: 23.6 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# downloader
This is part of the Chinese transgender digital archive project.
Scripts and results for searching and downloading webpages.
## Search
- puppeteer: search for webpages using Puppeteer.
- serper: search for webpages using the Serper API.
- googlecustom: search for webpages using the Google Custom Search JSON API.
- google: search for webpages using the google Python library.
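As an illustration of the googlecustom approach, here is a minimal sketch of one request to the Google Custom Search JSON API using only the standard library. The function names (`build_search_url`, `extract_links`, `search`) are hypothetical and are not taken from this repository; the endpoint and the `key`, `cx`, `q`, and `start` parameters come from the public API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public endpoint of the Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, start=1):
    """Build the request URL for one page of results (hypothetical helper)."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_links(response):
    """Return the result URLs from a decoded response; empty if there are no items."""
    return [item["link"] for item in response.get("items", []) if "link" in item]


def search(query, api_key, engine_id, start=1):
    """Send one request and return the result links. Requires network access."""
    with urlopen(build_search_url(query, api_key, engine_id, start)) as resp:
        return extract_links(json.load(resp))
```

Keeping URL construction and response parsing separate from the network call makes both easy to check without an API key.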
Run ./gen_links to summarize all links into a yml file.
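The summarizing step could look roughly like the sketch below. It is not the actual gen_links script, whose input and output layout are not described here; `merge_links` and `to_yaml_list` are hypothetical helpers, and the flat list-of-URLs layout is an assumption.

```python
import json


def merge_links(*link_lists):
    """Merge lists of URLs, dropping blanks and duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for links in link_lists:
        for link in links:
            link = link.strip()
            if link and link not in seen:
                seen.add(link)
                merged.append(link)
    return merged


def to_yaml_list(links):
    """Render URLs as a YAML sequence of double-quoted strings.

    JSON string literals are valid YAML double-quoted scalars, so json.dumps
    handles the escaping without requiring a YAML library.
    """
    if not links:
        return "[]\n"  # explicit empty sequence
    return "".join(f"- {json.dumps(link)}\n" for link in links)
```

The resulting text can be written to a .yml file with an ordinary `open(path, "w", encoding="utf-8")`.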
## Download
See download.
Currently supports webpages and PDFs.
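One way a downloader might tell the two supported types apart is by the response's Content-Type header, falling back to the PDF magic bytes. This is a sketch under that assumption, not the code in this repository; `classify_content` and `download` are hypothetical names.

```python
from urllib.request import Request, urlopen


def classify_content(content_type, first_bytes=b""):
    """Return "pdf", "webpage", or "unsupported" for a response."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # PDF files start with the marker "%PDF-", which catches mislabeled responses.
    if media_type == "application/pdf" or first_bytes.startswith(b"%PDF-"):
        return "pdf"
    if media_type in ("text/html", "application/xhtml+xml"):
        return "webpage"
    return "unsupported"


def download(url, timeout=30):
    """Fetch a URL and return (kind, body). Requires network access."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as resp:
        body = resp.read()
        kind = classify_content(resp.headers.get("Content-Type", ""), body[:5])
    return kind, body
```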
## LICENSE
MIT