https://github.com/claromes/waybacktweets
Retrieves archived tweets from Wayback Machine in HTML, CSV, and JSON
internet-archive osint osint-tools socmint twitter wayback-machine wayback-tweets x
- Host: GitHub
- URL: https://github.com/claromes/waybacktweets
- Owner: claromes
- License: gpl-3.0
- Created: 2023-05-11T04:02:02.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-02T22:46:02.000Z (2 months ago)
- Last Synced: 2025-04-03T11:15:31.378Z (about 1 month ago)
- Topics: internet-archive, osint, osint-tools, socmint, twitter, wayback-machine, wayback-tweets, x
- Language: Python
- Homepage: https://claromes.github.io/waybacktweets/
- Size: 6.63 MB
- Stars: 100
- Watchers: 4
- Forks: 28
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE.md
- Citation: CITATION.cff
# Wayback Tweets
[PyPI](https://pypi.org/project/waybacktweets) | [DOI](https://doi.org/10.5281/zenodo.12528447) | [Streamlit app](https://waybacktweets.streamlit.app) | [Colab notebook](https://colab.research.google.com/drive/1tnaM3rMWpoSHBZ4P_6iHFPjraWRQ3OGe?usp=sharing)
Retrieves CDX data for archived tweets from the Wayback Machine, parses it (see [Field Options](https://claromes.github.io/waybacktweets/field_options.html)), and saves the results in three formats: HTML (for easy viewing of the tweets through iframe tags), CSV, and JSON.
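
To make "CDX data" and "parsing" more concrete, here is a minimal, offline sketch of the general idea. It is not the package's implementation: `build_cdx_url` and `rows_to_records` are hypothetical helper names, the `twitter.com/<username>/status/*` query pattern is an assumption about what gets queried, and the sample row is made up. The endpoint, the `url`/`from`/`to`/`limit`/`output` parameters, and the header row follow the Wayback CDX Server API.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(username, timestamp_from=None, timestamp_to=None, limit=None):
    """Build a CDX query URL for a user's status pages (hypothetical helper)."""
    # Assumption: archived tweets are matched with a wildcard on /status/*.
    params = {"url": f"twitter.com/{username}/status/*", "output": "json"}
    if timestamp_from:
        params["from"] = timestamp_from
    if timestamp_to:
        params["to"] = timestamp_to
    if limit:
        params["limit"] = limit
    # Keep "/" and "*" unescaped so the URL stays readable.
    return f"{CDX_ENDPOINT}?{urlencode(params, safe='/*')}"


def rows_to_records(rows):
    """Convert CDX JSON output (header row + data rows) into dicts (hypothetical helper)."""
    if not rows:
        return []
    header, *data = rows
    records = []
    for row in data:
        cdx = dict(zip(header, row))
        records.append({
            "archived_timestamp": cdx["timestamp"],
            "original_tweet_url": cdx["original"],
            # Wayback Machine snapshot URLs have the form /web/<timestamp>/<original URL>.
            "archived_tweet_url": f"https://web.archive.org/web/{cdx['timestamp']}/{cdx['original']}",
            "archived_statuscode": cdx["statuscode"],
        })
    return records


# Made-up sample in the shape of the CDX server's JSON output.
sample_rows = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,twitter)/jack/status/20", "20150101000000",
     "https://twitter.com/jack/status/20", "text/html", "200", "EXAMPLEDIGEST", "1234"],
]

print(build_cdx_url("jack", "20150101", "20191231", 250))
for record in rows_to_records(sample_rows):
    print(record["archived_tweet_url"])
```

The record keys mirror four of the field options used in the Python module example in this README; the package itself handles many more cases than this sketch does.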
## Installation
```shell
pip install waybacktweets
```

## Quickstart
### Using Wayback Tweets as a standalone command line tool
Usage: `waybacktweets [OPTIONS] USERNAME`. For example:
```shell
waybacktweets --from 20150101 --to 20191231 --limit 250 jack
```

### Using Wayback Tweets as a Web App
[Open the application](https://waybacktweets.streamlit.app), a prototype written in Python with the Streamlit framework and hosted on Streamlit Cloud.
### Using Wayback Tweets as a Python Module
```python
from waybacktweets import WaybackTweets, TweetsParser, TweetsExporter

USERNAME = "jack"

# Request the archived tweets' CDX data for the user
api = WaybackTweets(USERNAME)
archived_tweets = api.get()

if archived_tweets:
    # Fields to keep in the parsed output (see Field Options in the docs)
    field_options = [
        "archived_timestamp",
        "original_tweet_url",
        "archived_tweet_url",
        "archived_statuscode",
    ]

    # Parse the CDX data into the selected fields
    parser = TweetsParser(archived_tweets, USERNAME, field_options)
    parsed_tweets = parser.parse()

    # Save the parsed tweets as a CSV file
    exporter = TweetsExporter(parsed_tweets, USERNAME, field_options)
    exporter.save_to_csv()
```

## Documentation
- [Wayback Tweets documentation](https://claromes.github.io/waybacktweets)
- [Wayback CDX Server API (Beta) documentation](https://archive.org/developers/wayback-cdx-server.html)

## Acknowledgements
- Tristan Lee (Bellingcat's Data Scientist) for the idea of the application.
- Jessica Smith (Snowflake's Community Growth Specialist) and Streamlit/Snowflake team for the additional server resources on Streamlit Cloud.
- OSINT Community for recommending the application.

> [!NOTE]
> If the Streamlit application is down, please check the [Streamlit Cloud Status](https://www.streamlitstatus.com/).