https://github.com/justanotherarchivist/qwarc
A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/justanotherarchivist/qwarc
- Owner: JustAnotherArchivist
- License: gpl-3.0
- Created: 2019-04-22T22:59:53.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2021-05-09T16:34:18.000Z (about 5 years ago)
- Last Synced: 2025-03-19T01:44:44.959Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 67.4 KB
- Stars: 27
- Watchers: 2
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# qwarc
qwarc is a framework for rapidly archiving a large number of URLs with little overhead. This is achieved primarily by using many parallel connections (including across multiple processes) and not employing any HTML parsing or other processing.
***Use qwarc responsibly. It can easily overwhelm web servers.***
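The idea of many parallel connections with no response processing can be illustrated with a minimal sketch. This is not qwarc's API: `fetch_stub`, `archive_all`, and `run_demo` are hypothetical names, the fetcher is a stand-in that performs no network I/O, and the example stays within a single process, whereas qwarc can also spread work across processes.

```python
import asyncio


async def fetch_stub(url):
    # Hypothetical stand-in for an HTTP request; no network I/O happens here.
    await asyncio.sleep(0.01)
    return url, 200


async def archive_all(urls, concurrency=10, fetch=fetch_stub):
    # Cap the number of in-flight requests so the target server is not flooded.
    semaphore = asyncio.Semaphore(concurrency)

    async def worker(url):
        async with semaphore:
            # The result is returned as-is; nothing is parsed or followed.
            return await fetch(url)

    # gather() runs the workers concurrently and preserves input order.
    return await asyncio.gather(*(worker(u) for u in urls))


def run_demo():
    # Hypothetical URL list, used only for illustration.
    urls = [f"https://example.org/item/{i}" for i in range(25)]
    return asyncio.run(archive_all(urls, concurrency=5))


if __name__ == "__main__":
    for url, status in run_demo():
        print(status, url)
```

The `concurrency` limit is the knob that matters for responsible use: a lower value trades throughput for less load on the server being archived.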
## License
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.