Projects in Awesome Lists by pushshift
A curated list of projects in awesome lists by pushshift.
https://github.com/pushshift/reddit_sse_stream
A Server Side Event stream to deliver Reddit comments and submissions in near real-time to a client.
bigdata flask reddit server-side-events sse stream
Last synced: 30 Jun 2025
https://github.com/pushshift/zreader
Read compressed NDJSON .zst files easily
Last synced: 02 May 2025
https://github.com/pushshift/rinzler
A high-performance indexing and search system for managing big data
Last synced: 07 Jul 2025
https://github.com/pushshift/parallel-ndjson-reader
Parallel NDJSON Reader for Python
json multiprocessing ndjson newline parallel parallel-processing python
Last synced: 02 May 2025
https://github.com/pushshift/reddit-bot-detector
Script to extract highly probable bots for further analysis
Last synced: 10 Sep 2025
https://github.com/pushshift/imdb_to_json
Fetch movie data from IMDb and output it in JSON format.
imdb imdb-dataset imdb-webscrapping json python3
Last synced: 02 May 2025
https://github.com/pushshift/trump_tweets
Code example and data for all available Trump Tweets
Last synced: 02 May 2025
https://github.com/pushshift/python-zstandard-compression-test
Python script to test the zstandard module
compression compression-dictionaries decompression zst zstandard
Last synced: 02 May 2025
https://github.com/pushshift/token_manager
Code to handle multiple Twitter user access tokens when making requests
Last synced: 02 May 2025
https://github.com/pushshift/ndjson_processor
High-speed multiprocessing NDJSON processor
Last synced: 02 May 2025
https://github.com/pushshift/gab_mastodon
Ingest scripts and Elasticsearch Mapping for Gab's new Mastodon Site
Last synced: 02 May 2025
https://github.com/pushshift/big-data-scripts
Miscellaneous Python and Perl scripts for working with big data files of newline-delimited JSON objects
Last synced: 13 Jul 2025
https://github.com/pushshift/us_election_data
Code to grab election data from CNN's election data API
Last synced: 01 Mar 2025
https://github.com/pushshift/docker-reddit-sphinxsearch
Docker container for sphinxsearch, used to index Reddit comments for search
Last synced: 17 Oct 2025
https://github.com/pushshift/tweet-id_components
Go program to extract tweet id components
Last synced: 01 Mar 2025
https://github.com/pushshift/extract_json_from_html
A script that makes it much easier to extract a JSON object from HTML (e.g. getting TikTok data)
Last synced: 10 Sep 2025
https://github.com/pushshift/realdonaldtrump
Archive of tweets from the @realdonaldtrump Twitter account
Last synced: 02 Feb 2026
https://github.com/pushshift/officer_dot_com
Example code to start parsing data from the website officer.com
Last synced: 01 Mar 2025
https://github.com/pushshift/ap_story_fetcher
Associated Press Story Fetcher
Last synced: 01 Mar 2025
https://github.com/pushshift/archiver
Server that handles POST requests containing raw data and archives that data permanently
Last synced: 01 Mar 2025
https://github.com/pushshift/parse_wiki_tables
Simple example of parsing data from Wikipedia tables using selectolax
Last synced: 01 Mar 2025
https://github.com/pushshift/browser_extension_parser
Parser module for Facebook observations returned from the browser extension
Last synced: 01 Mar 2025
https://github.com/pushshift/binary_search
Example of a binary search implementation using real data (Reddit author info)
binarysearch python3 reddit search
Last synced: 01 Mar 2025
https://github.com/pushshift/meetup
Code for ingesting meetup.com streams (comments, photos, etc.)
Last synced: 08 Sep 2025
https://github.com/pushshift/matplotlib_samples
Example Code for Matplotlib
Last synced: 23 Nov 2025
https://github.com/pushshift/json-flatten
This code shows how to flatten nested keys, which can help convert a nested JSON object into CSV, etc.
Last synced: 12 Jun 2025
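
To illustrate the idea behind the json-flatten entry above, here is a minimal sketch of key flattening in Python. It is not taken from the repository: the function name `flatten`, the `.` separator, and the sample record are all assumptions chosen for illustration. Nested dictionary keys and list indexes are joined into a single key, so each record becomes one flat row that a CSV writer can handle.

```python
def flatten(obj, parent_key="", sep="."):
    """Flatten nested dicts and lists into a single-level dict.

    Hypothetical helper for illustration; not the repository's code.
    Nested keys (and list indexes) are joined with `sep`.
    Empty dicts and lists contribute no keys.
    """
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        # Leaf value: store it under the accumulated key.
        items[parent_key] = obj
    return items


# Made-up sample record, used only to show the shape of the output.
record = {
    "author": {"name": "alice", "flair": {"text": "mod"}},
    "scores": [1, 2],
}
flat = flatten(record)
# flat now has the keys: author.name, author.flair.text, scores.0, scores.1
```

The flat keys can serve as CSV column names, for example by passing them as `fieldnames` to `csv.DictWriter` from the standard library. Records with different nesting produce different key sets, so a real pipeline would need to collect the union of keys before writing the header.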