https://github.com/amitness/ml-datasets

Machine Learning datasets for Nepal
# ml-datasets
Curated list of Machine Learning datasets from Nepalese Researchers.

## Audio
- [Devanagari Numbers (०-९) Spoken Audio](https://drive.google.com/drive/folders/15g57Qa1TQa4Ix6-MiC6v1wieouqp0XAl)
- [Nepali ASR training data set](http://www.openslr.org/54): Contains ~157K utterances
- Nepali Text to Speech: [Dataset 1](https://github.com/meamit/nepali-text-to-speech/tree/master/speechdb), [Dataset 2](https://github.com/anuragregmi/speak_nepali/tree/master/sounds), [Dataset 3](https://github.com/hcoebct069/nepali-asr/tree/master/recordings)
- [Devanagari Characters Speech](https://github.com/tsumansapkota/Devanagari_Characters_Speech)

## Disaster
- [Earthquake Building Damage Levels](https://www.drivendata.org/competitions/57/nepal-earthquake/page/136/)

## Finance
- [Nepal Rastra Bank Forex Rate API](https://www.nrb.org.np/exportForexJSON.php?YY=2019&MM=08&DD=01&YY1=2019&MM1=08&DD1=02)
- [Nepali Stock Market Dataset (2012 - 2020)](https://www.kaggle.com/sagyamthapa/nepali-stock-market-form-2012-to-2020-till-march#2019-01-01.csv)
- [Kaggle: Nepal Stock Exchange Data till 2019](https://www.kaggle.com/qramkrishna/nepal-stock-exchange-data)
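
The Nepal Rastra Bank link above encodes a date range in its query string. A minimal sketch of building such a URL for other dates follows. It assumes that `YY`/`MM`/`DD` mark the start date and `YY1`/`MM1`/`DD1` the end date, which is inferred only from the example link, and it does not check whether the endpoint still responds. The helper name `forex_url` is hypothetical.

```python
from datetime import date
from urllib.parse import urlencode

# Base of the link listed above; availability of the endpoint is not verified here.
BASE_URL = "https://www.nrb.org.np/exportForexJSON.php"


def forex_url(start: date, end: date) -> str:
    """Build a query URL in the same shape as the listed example link.

    Assumption: YY/MM/DD are the start date and YY1/MM1/DD1 the end date.
    """
    params = {
        "YY": f"{start.year:04d}",
        "MM": f"{start.month:02d}",
        "DD": f"{start.day:02d}",
        "YY1": f"{end.year:04d}",
        "MM1": f"{end.month:02d}",
        "DD1": f"{end.day:02d}",
    }
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Reproduces the query string of the example link in the list.
    print(forex_url(date(2019, 8, 1), date(2019, 8, 2)))
```

The resulting string can be passed to any HTTP client. The response format is not documented in this list, so inspect it before parsing.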

## Geography
- [Metadata from OpenStreetMap](https://github.com/sharad461/nepal-openstreetmap-extract)
- [Nepal travel distance between cities (km)](https://data.world/hdx/d1d0c217-8c6b-4747-ab1e-1069e2ff3e6b)
- [Pokhara weather data from 2009 to 2023](https://www.kaggle.com/datasets/gauravneupane/pokhara-weather-data-from-2009-to-2023)

## Health
- [Health Diseases in Nepali](https://github.com/sanjaalcorps/NepaliDataClassifiers/blob/master/HealthClassifiers.txt)

## Real Time Sensor Data
- Air Pollution: [EPA Air Pollution Data](https://github.com/hbvj99/EPAAirPollution), [Nepal Government Air Pollution Data](https://github.com/hbvj99/NPGovAirPollution), [Dristhi Air Pollution Data](https://github.com/hbvj99/DristhiAirPollution)
- [River Level Data](http://www.hydrology.gov.np)
- [Daily Vegetable/Fruit Price Information](http://kalimatimarket.gov.np/daily-price-information)
- [Location of Mahanagar Yatayat in Realtime](https://github.com/theonlyNischal/Track-Mahanagar-Yatayat)
- Tribhuvan International Airport: [Realtime Flight Arrival List](http://tiairport.com.np/flight_details), [Realtime Flight Departure List](http://tiairport.com.np/flight_details_2)

## Image
- [Corn Leaf Infection Dataset](https://www.kaggle.com/qramkrishna/corn-leaf-infection-dataset)
- [Voting Ballot Paper Dataset](https://github.com/rajshreeee/image_classification_for_voting_system_using_cnn)
- Nepalese currency: [Nepali Currency Notes](https://drive.google.com/file/d/1pDF0hx6pvgx4DJTCHL4EeDdCT4wlfnGW/view?usp=sharing), [Cash Dataset](https://drive.google.com/drive/folders/1GxITXrk13ehKMEMEbpi8mRsFSr4LUR55), [Images of 10, 50 & 100 rupee notes](https://github.com/mmanishh/nrscurrencyrecognizer/tree/master/data/train)
- [Faces of Famous People from Nepal](https://www.thefamouspeople.com/nepal.php)
- [DHCD dataset](https://github.com/Prasanna1991/DHCD_Dataset): A dataset of Devanagari (Nepali) handwritten characters
- [License Plate Recognition (LPR) dataset](https://github.com/Prasanna1991/LPR): Nepali motorbike license plate dataset
- [Nepali Characters Dataset](https://github.com/InspiringLab/NCD)
- [Nepali Fonts OCR Dataset](https://github.com/BasantaChaulagain/Nepscan/tree/master/resources)
- [Nepali Handwritten Digits](https://github.com/kcnishan/Nepali_handwritten_digits_recognition/tree/master/dataset)
- [Nepali Portraits](https://www.kaggle.com/sumansid/nepali-portraits-dataset)
- [Vehicles Dataset](https://github.com/sdevkota007/vehicles-nepal-dataset): 4800 images of two-wheeler and four-wheeler vehicles from Nepal

## Text
- [16NepaliNews Corpus](https://github.com/sndsabin/Nepali-News-Classifier): 14,364 Nepali language news documents
- [A Large Scale Nepali Text Corpus](https://ieee-dataport.org/open-access/large-scale-nepali-text-corpus)
- [65K Nepali Sentences](https://github.com/sanjaalcorps/NepaliDataSets/blob/master/raw_sentences_np_65k.csv)
- [350K Nepali Sentences](https://github.com/Team-Naya/nlp-doko)
- [39K Nepali Wikipedia Articles](https://www.kaggle.com/disisbig/nepali-wikipedia-articles)
- [nepali-brihat-sabdakosh-json](https://github.com/bikashpadhikari/nepali-brihat-sabdakosh-json): A structured JSON dump of all 122,000 words of the Nepali Brihat Sabdakosh
- [1000 Sports News](https://github.com/Aryal007/nepali_text_generation/blob/master/data/sports_news_nepali_1000.txt)
- [Nepali Translation Parallel Corpus](https://drive.google.com/file/d/1UThfJKJFvDgTu263DNbz-WPNLqoARZ_0/view)
- [Nepali English Machine Translation Corpus](https://github.com/facebookresearch/flores)
- [Nepali Abstractive Summarization Corpus: 286k article-title pairs from news](https://drive.google.com/file/d/1L56k0zonMk6XpelKAXPm45wCmt-9pS3x/view)
- [Nepal Earthquake Tweets](https://crisisnlp.qcri.org/lrec2016/content/2015_nepal_eq.html)
- [Nepali Chat Corpus](https://github.com/itsmeashutosh43/create-a-Open-Source-Nepali-Chat-corpus-)
- [Nagarik News Corpus](https://github.com/ashmitbhattarai/Nepali-Language-Modeling-Using-LSTM/tree/master/Nepali_Corpus/Nagarik)
- [Setopati News Corpus](https://github.com/ashmitbhattarai/Nepali-Language-Modeling-Using-LSTM/tree/master/Nepali_Corpus/SetoPati)
- [Nepali News in English Corpus](https://github.com/sharad461/english-corpus-nepal)
- [Nepali News Dataset](https://github.com/kamalacharya2044/NepaliNewsDataset)
- [Laxmi Prasad Devkota Poems](https://github.com/devkotasawal1/Poem-Generator/blob/master/lspd.txt): Collection of poems by Laxmi Prasad Devkota, containing 119,161 characters.
- [Nepali Names](https://github.com/datafiction/oya-nepali-nlp/blob/master/data/names/Nepali.txt)
- [Dummy Nepali People Information](https://github.com/bibhuticoder/dummydata/blob/master/data.csv)
- [Nepali News Classification Dataset](https://drive.google.com/drive/folders/1Vm0UJ3FfWP-3guSan3FZsOV4q7rYuJIG)
- [Nepali Ngram](https://github.com/virtualanup/nepalingram)
- [Nepali Stopwords](https://github.com/sanjaalcorps/NepaliStopWords/blob/master/NepaliStopWords.txt)
- [Nepali Wikipedia Articles Dataset](https://drive.google.com/open?id=1Yh8BlJ5bydbvZaOQEmRPlTEDZjIIoAYN)
- [Nepali Word List](https://github.com/tesseract-ocr/langdata/blob/master/nep/nep.wordlist)
- [Nepali transliteration](https://github.com/AchillesKarki/NepaliLipi)
- [Nepali Textbooks](https://ecommons.cornell.edu/handle/1813/24179): Collection of school textbooks from Nepal assembled by Professor of Anthropology Kathryn March over the last 30 years.
- [Nepali Textbooks from grade 1 to 12](http://lib.moecdc.gov.np/catalog/opac_css/index.php?lvl=cmspage&pageid=6&id_rubrique=105)
- [Nepali Word2Vec](https://github.com/rabindralamsal/Word2Vec-Embeddings-for-Nepali-Language)
- [Nepali Spelling Correction Dataset](https://github.com/tnagorra/nspell/tree/master/data)
- [Nepali Contemporary Dictionary](http://ltk.org.np/nepalisabdakos/dict/np_dictionary_db.sql.gz)
- [80,00,000+ Nepali Wordlist](https://github.com/prabinzz/nepali-wordlist)
- [English to Nepali dictionary](https://github.com/nirooj56/Nepdict/blob/master/database/data.csv)
- [Nepali Movies on IMDB](https://github.com/NISH1001/nepalimdb/blob/master/data/nepali-movies.json)
- [SentiWordNet](https://github.com/wannamit/nep-SentiWord-py)
- [Misspelling Correction Dictionary](https://github.com/sarojdhakal/Bhasha)
- [Nepali Lemmatizer](https://github.com/dpakpdl/NepaliLemmatizer)
- [CC100](http://data.statmt.org/cc-100/ne.txt.xz)
- [LINCE](https://ritual.uh.edu/lince/): Nepali-English code-switching dataset
- [Wordlists in Selected Languages of Nepal](https://github.com/lexibank/halenepal)
- [Language Resources for Nepal](https://language-resources-nepal.github.io/)
- [Nepali National Corpus](https://www.sketchengine.eu/nepali-national-corpus/)
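
Some of the corpora above ship as compressed text. The CC100 link, for example, points to an `.xz` file. The sketch below shows one way to peek at such a file without decompressing it to disk first. It assumes the file has already been downloaded locally and is UTF-8 plain text. The helper name `head_lines` and the local filename are hypothetical.

```python
import lzma
from itertools import islice


def head_lines(path, n=5):
    """Return the first n non-empty lines of an xz-compressed UTF-8 text file.

    The file is streamed, so a large corpus is never fully loaded into memory.
    """
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        stripped = (line.strip() for line in handle)
        non_empty = (line for line in stripped if line)
        return list(islice(non_empty, n))
```

For instance, `head_lines("ne.txt.xz", 3)` would return the first three non-empty lines of a local copy of the CC100 file, assuming it was saved under that name.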