Projects in Awesome Lists tagged with llm-datasets
A curated list of projects in awesome lists tagged with llm-datasets.
https://github.com/dsdanielpark/open-llm-datasets
Repository for organizing datasets and papers used in Open LLM.
datasets large-language-models llm llm-datasets llm-training natural-language-processing
Last synced: 04 Mar 2025
https://github.com/altunenes/rustysozluk
Efficiently fetch eksisozluk.com entries and run sentiment analysis on them (Turkish only) using Rust
duyguanalizi eksi-sozluk eksisozluk llm-datasets llm-training reqwest rust rust-lang rust-scraping scraper sentiment-analysis turkish webscraping
Last synced: 17 Jan 2025
https://github.com/neuralwork/audio2chat
Convert multi-speaker audio files to structured chat data for LLMs
chat llm llm-datasets speaker-diarization transcription whisper
Last synced: 04 Mar 2025
https://github.com/definetlynotai/llm_data
Python source code from a number of well-known repositories, collected in this repo as pure localdocs for training code AI
c code-examples cpp cuda data data-dum jupyter-notebook llm llm-code llm-datasets programming-data programming-data-sets python3
Last synced: 26 Jan 2025
https://github.com/bot08/aiua-20k
dataset generated huggingface-datasets llm-datasets ukrainian-language
Last synced: 16 May 2025