Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Projects in Awesome Lists tagged with wikipedia-corpus

A curated list of projects in awesome lists tagged with wikipedia-corpus.

https://github.com/GermanT5/wikipedia2corpus

Wikipedia text corpus for self-supervised NLP model training

corpus german-nlp machine-learning nlp somajo wikipedia wikipedia-corpus

Last synced: 31 Oct 2024

https://github.com/macbre/mediawiki-dump

Python package for working with MediaWiki XML content dumps

fandom mediawiki-dump python python3-library wikia wikipedia wikipedia-corpus wikipedia-dump xml-dump

Last synced: 02 Nov 2024

https://github.com/ayushidalmia/wikipedia-search-engine

A search engine built on the 2013 Wikipedia data dump (43 GB). Search results are returned in real time.

information-retrieval python search-engine wikipedia-corpus

Last synced: 09 Nov 2024

https://github.com/macbre/faroese-corpus

Some Faroese language statistics taken from the fo.wikipedia.org content dump

corpus-linguistics faroe faroese faroese-language linguistic-analysis linguistics python3-script wikipedia-corpus wikipedia-dump

Last synced: 09 Nov 2024