Projects in Awesome Lists by TextCorpusLabs
A curated list of projects in awesome lists by TextCorpusLabs.
https://github.com/textcorpuslabs/wikimedia
Walkthrough for converting Wikimedia into a text corpus
Last synced: 22 Mar 2025
https://github.com/textcorpuslabs/building-blocks
Building blocks for text pre-processing
Last synced: 22 Mar 2025
https://github.com/textcorpuslabs/vlngramcounter
NGram counter for large datasets
Last synced: 22 Mar 2025
https://github.com/textcorpuslabs/getting-started
Getting started at Text Corpus Labs
Last synced: 22 Mar 2025
https://github.com/textcorpuslabs/njgovnews
Web scraping of the New Jersey news feeds
Last synced: 22 Mar 2025
https://github.com/textcorpuslabs/covid19
Walkthrough for converting Kaggle's COVID-19 Open Research Dataset Challenge into a text corpus
Last synced: 10 May 2026
https://github.com/textcorpuslabs/congressional-votes
Walkthrough for converting congressional roll call votes into a text corpus
Topics: congress-votes, python3, text-corpus, us-congress
Last synced: 16 May 2026
https://github.com/textcorpuslabs/oas
Walkthrough for converting the PMC OAS Dataset into a text corpus
Last synced: 23 Feb 2026