An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by TextCorpusLabs

A curated list of projects in awesome lists by TextCorpusLabs.

https://github.com/textcorpuslabs/wikimedia

Walkthrough for converting WikiMedia into a text corpus

python3 text-corpus wikimedia

Last synced: 22 Mar 2025

https://github.com/textcorpuslabs/building-blocks

Building blocks for text pre-processing

python3 text-processing

Last synced: 22 Mar 2025

https://github.com/textcorpuslabs/vlngramcounter

NGram counter for large datasets

ngrams python

Last synced: 22 Mar 2025

https://github.com/textcorpuslabs/getting-started

Getting started at Text Corpus Labs

Last synced: 22 Mar 2025

https://github.com/textcorpuslabs/edgar

Create a corpus from EDGAR data

corpus edgar-scraper python3

Last synced: 16 May 2026

https://github.com/textcorpuslabs/njgovnews

Web scraping of the New Jersey news feeds

newsfeed python3 text-corpus

Last synced: 22 Mar 2025

https://github.com/textcorpuslabs/covid19

Walkthrough for converting Kaggle's COVID-19 Open Research Dataset Challenge into a text corpus

covid-19 python3 text-corpus

Last synced: 10 May 2026

https://github.com/textcorpuslabs/congressional-votes

Walkthrough for converting congressional roll call votes into a text corpus

congress-votes python3 text-corpus us-congress

Last synced: 16 May 2026

https://github.com/textcorpuslabs/oas

Walkthrough for converting the PMC OAS dataset into a text corpus

oas python3 text-corpus

Last synced: 23 Feb 2026