https://github.com/webrecorder/pywb
Core Python Web Archiving Toolkit for replay and recording of web archives
- Host: GitHub
- URL: https://github.com/webrecorder/pywb
- Owner: webrecorder
- License: gpl-3.0
- Created: 2013-12-09T03:30:31.000Z (about 11 years ago)
- Default Branch: main
- Last Pushed: 2024-08-19T14:03:14.000Z (4 months ago)
- Last Synced: 2024-10-29T15:03:57.519Z (about 1 month ago)
- Topics: python, pywb, wayback, web-archives, web-archiving
- Language: JavaScript
- Homepage: https://pypi.python.org/pypi/pywb
- Size: 32.8 MB
- Stars: 1,377
- Watchers: 61
- Forks: 216
- Open Issues: 166
Metadata Files:
- Readme: README.rst
- Changelog: CHANGES.rst
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-digital-preservation - pywb - Core Python Web Archiving Toolkit for replay and recording of web archives (Replay tools / Crawlers)
- awesome-network-stuff - **605** stars
- awesome-starred - webrecorder/pywb - Core Python Web Archiving Toolkit for replay and recording of web archives (python)
README

Webrecorder pywb 2.8
====================

.. image:: https://raw.githubusercontent.com/webrecorder/pywb/main/pywb/static/pywb-logo.png

.. image:: https://github.com/webrecorder/pywb/workflows/CI/badge.svg
   :target: https://github.com/webrecorder/pywb/actions
.. image:: https://codecov.io/gh/webrecorder/pywb/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/webrecorder/pywb

Web Archiving Tools for All
---------------------------

`View the full pywb documentation `_

**pywb** is a Python 3 web archiving toolkit for replaying web archives large and small as accurately as possible.
The toolkit now also includes new features for creating high-fidelity web archives.

This toolset forms the foundation of the Webrecorder project, but also provides a generic web archiving toolkit
that is used by other web archives, including the traditional "Wayback Machine" functionality.

New Features
^^^^^^^^^^^^

The 2.x release included a major overhaul of pywb and introduced many new features, including the following:

* Dynamic multi-collection configuration system with no-restart updates.
* New recording capability to create new web archives from the live web or other archives.
* Componentized architecture with standalone Warcserver, Recorder and Rewriter components.
* Support for Memento API aggregation and fallback chains for querying multiple remote and local archival sources.
* HTTP/S Proxy Mode with customizable certificate authority for proxy mode recording and replay.
* Flexible rewriting system with pluggable rewriters for different content-types.
* Standalone, modular `client-side rewriting system (wombat.js) `_ to handle most modern web sites.
* Improved 'calendar' query UI with incremental loading, grouping results by year and month, and updated replay banner.
* Extensible UI customizations system for modifying all aspects of the UI.
* Robust access control system for blocking or excluding URLs, by prefix or by exact match.
* New in 2.6: Access Control embargo and http-header control access settings.
* New in 2.6: Support for localization and multi-language deployment.
* New in 2.7: New banner/calendar UI written in `Vue `_, with interactive timeline and easier theming of colors and logo via ``config.yaml``.
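
To make the idea of blocking or excluding URLs "by prefix or by exact match" more concrete, here is a minimal, self-contained sketch. It is not pywb's actual rule format or code: the ``Rule`` class, the ``check_access`` function, the access names, and the precedence policy (exact match wins, otherwise the longest matching prefix) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical access rule; not pywb's actual rule format."""
    url: str     # full URL (exact rules) or URL prefix (prefix rules)
    match: str   # "exact" or "prefix"
    access: str  # illustrative values: "allow", "block", "exclude"


def check_access(url: str, rules: List[Rule], default: str = "allow") -> str:
    """Return the access setting that applies to ``url``.

    Assumed policy: an exact match wins over any prefix match, and
    among prefix matches the longest prefix wins.
    """
    best_prefix = None
    for rule in rules:
        if rule.match == "exact" and rule.url == url:
            return rule.access
        if rule.match == "prefix" and url.startswith(rule.url):
            if best_prefix is None or len(rule.url) > len(best_prefix.url):
                best_prefix = rule
    return best_prefix.access if best_prefix else default


# Example rules: block a whole directory, but allow one page inside it.
rules = [
    Rule("http://example.com/private/", "prefix", "block"),
    Rule("http://example.com/private/ok.html", "exact", "allow"),
]
```

With these rules, any URL under ``http://example.com/private/`` is blocked except ``ok.html``, and URLs that match no rule fall back to the default. The access control guide in the pywb documentation describes how rules are really defined and evaluated.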

Please see the `full documentation `_ for more detailed info on all these features.

Installation for Deployment
---------------------------

To install pywb for usage, you can use:

``pip install pywb``

Note: depending on your Python installation, you may have to use ``pip3`` instead of ``pip``.
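
If it is unclear which Python environment ``pip`` installed into, a quick check from that interpreter can help. The sketch below uses only the standard library (``importlib.metadata``); the helper name ``installed_version`` is an illustrative choice, not part of pywb.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "pywb") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("pywb is not installed in this Python environment")
    else:
        print(f"pywb {version} is installed")
```

Run it with the same interpreter that ``pip`` (or ``pip3``) belongs to, for example ``python3 check_pywb.py``, where the file name is arbitrary.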

Installation from local copy
----------------------------

``git clone https://github.com/webrecorder/pywb``

To install from a locally cloned copy, install with ``pip install -e .`` or ``python setup.py install``.

To run tests, we recommend installing tox with ``pip install tox tox-current-env`` and then running ``tox --current-env`` to test in your current Python environment.

To build the docs locally, run: ``cd docs; make html``. (The docs will be built in ``./_build/html/index.html``)

Running
-------

After installation, you can run ``pywb`` or ``wayback``.

Consult the local or `online docs `_ for the latest usage and configuration details.
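
Once the server is running, archived pages are addressed by collection, capture time, and original URL. The sketch below builds such a replay URL. It rests on assumptions to verify against the docs for your version: that the server listens on ``http://localhost:8080``, that replay URLs follow the ``/<collection>/<timestamp>/<url>`` pattern with a 14-digit ``YYYYMMDDhhmmss`` timestamp, and that a collection named ``my-web-archive`` exists (a placeholder name). The helper functions are illustrative and not part of pywb.

```python
from datetime import datetime, timezone


def to_timestamp(dt: datetime) -> str:
    """Format a datetime as a 14-digit YYYYMMDDhhmmss string."""
    return dt.strftime("%Y%m%d%H%M%S")


def replay_url(collection: str, url: str, dt: datetime,
               base: str = "http://localhost:8080") -> str:
    """Build a replay URL (assumed layout: /<collection>/<timestamp>/<url>)."""
    return f"{base}/{collection}/{to_timestamp(dt)}/{url}"


if __name__ == "__main__":
    # Placeholder collection name and capture time, for illustration only.
    capture_time = datetime(2014, 1, 27, 17, 12, 38, tzinfo=timezone.utc)
    print(replay_url("my-web-archive", "http://example.com/", capture_time))
```

Opening the printed URL in a browser should show the capture closest to the requested time, provided the collection contains one for that page.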

Documentation
-------------

The pywb documentation is extensive. Here are links to a few key guides:

* `Getting Started Guide `_
* `Embargo and Access Control Guide `_
* `Localization and Multi-Language Guide `_
* `Deployment Guide `_
* `OpenWayback Transition Guide `_

Contributions & Bug Reports
---------------------------

Users are encouraged to fork and contribute to this project to keep improving web archiving tools. Please consult the `contributing guide `_ for information on how to contribute to pywb.