An open API service indexing awesome lists of open source software.

"datasets" Awesome Lists

awesome-json-datasets

A curated list of awesome JSON datasets that don't require authentication.

awesome awesome-list data dataset datasets json json-dataset list

3,486 stars
384 forks
300 projects

Last updated: 15 Sep 2025

Awesome-Code-LLM

[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.

ai awesome datasets llm nlp papers software-engineering survey tmlr

2,908 stars
190 forks
2,416 projects

Last updated: 24 Sep 2025

LLM4Rec-Awesome-Papers

A list of awesome papers and resources on recommender systems with large language models (LLMs).

awesome datasets large-language-models llm4rec recommender-system survey

2,085 stars
151 forks
124 projects

Last updated: 14 Oct 2025

Awesome-Cybersecurity-Datasets

A curated list of amazingly awesome Cybersecurity datasets

attack cybersecurity data dataframe datasets deep deeplearning events ids ips

1,773 stars
305 forks
54 projects

Last updated: 30 Aug 2025

awesome-yolo-object-detection

🚀🚀🚀 A collection of some awesome public YOLO object detection series projects and the related object detection datasets.

cuda datasets deepseek few-shot-object-detection gui llama llm mllm object-detection object-detection-datasets

1,599 stars
219 forks
1,704 projects

Last updated: 30 Sep 2025

awesome-transit

Community list of transit APIs, apps, datasets, research, and software 🚌🌟🚆🌟🚂

awesome awesome-list bus datasets gtfs gtfs-analysis gtfs-converters gtfs-feed gtfs-files gtfs-libraries

1,575 stars
218 forks
412 projects

Last updated: 28 Sep 2025

awesome-robotics

A curated list of awesome links and software libraries that are useful for robots.

awesome awesome-list datasets deep-learning image-segmentation list lists machine-learning optimization optimization-algorithms

1,235 stars
168 forks
72 projects

Last updated: 24 Oct 2025

awesome-instruction-dataset

A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca)

awsome-lists datasets gpt-3 gpt-4 instruction-following instruction-tuning language-model llama

1,132 stars
57 forks
89 projects

Last updated: 17 Sep 2025

awesome-public-real-time-datasets

A list of publicly available datasets with real-time data maintained by the team at bytewax.io

awesome-list data data-science data-visualization datasets real-time streaming

985 stars
71 forks
69 projects

Last updated: 24 Sep 2025

awesome-dataset-tools

🔧 A curated list of awesome dataset tools

annotation-tool annotations awsome awsome-list datasets machine-learning

911 stars
129 forks
87 projects

Last updated: 12 Sep 2025

awesome-autonomous-vehicle

A list of autonomous driving resources (Chinese edition)

autonomous-vehicles awesome-list car-driving computer-vision datasets intelligent-vehicle self-driving-car tutorial

870 stars
221 forks
181 projects

Last updated: 18 Sep 2025

awesome-llm-and-aigc

🚀🚀🚀 A collection of some awesome public projects about Large Language Models (LLM), Vision Language Models (VLM), Vision Language Action (VLA), and AI Generated Content (AIGC), plus the related datasets and applications.

ai4s ai4science aigc awesome-list cuda datasets deepseek gpt langchain llama

766 stars
64 forks
1,402 projects

Last updated: 01 Oct 2025

awesome-instruction-datasets

A collection of prompt and instruction datasets (awesome-prompt-datasets, awesome-instruction-dataset) of all kinds, for training ChatLLM models such as ChatGPT.

chatgpt datasets instruction llama llm prompts self-instruct

703 stars
37 forks
132 projects

Last updated: 13 Oct 2025

awesome-mobile-robotics

Useful links of different content related to AI, Computer Vision, and Robotics.

autonomous-robots autonomous-systems awesome-list books companies datasets jobs lab labs localization

651 stars
103 forks
541 projects

Last updated: 09 Oct 2025

awesome-holistic-3d

A list of papers and resources (data, code, etc.) for holistic 3D reconstruction in computer vision

3d-reconstruction awesome computer-vision datasets deep-learning machine-learning

639 stars
93 forks
129 projects

Last updated: 28 Sep 2025

awesome-segmentation-saliency-dataset

A collection of some datasets for segmentation / saliency detection. PRs welcome 😄

dataset datasets deep-learning deeplearning machine-learning machinelearning saliency-detection

590 stars
97 forks
338 projects

Last updated: 24 Sep 2025

awesome-instruction-learning

Papers and Datasets on Instruction Tuning and Following. ✨✨✨

awesome-list datasets in-context-learning instruction instruction-learning instruction-tuning large-language-models paper-list pretrained-language-model prompt

499 stars
24 forks
177 projects

Last updated: 21 Aug 2025

Awesome-Domain-Generalization

Awesome things about domain generalization, including papers, code, etc.

awesome awesome-list datasets deep-learning domain-generalization libraries papers

483 stars
49 forks
343 projects

Last updated: 13 Oct 2025

machine-learning-resources

A curated list of awesome machine learning frameworks, libraries, courses, books and many more.

awesome-list conference data-analysis data-science datasets handbook machine-learning natural-language-processing nlp-machine-learning paper

430 stars
128 forks
56 projects

Last updated: 06 Oct 2025

Text-Summarization-Repo

A repository that organizes the main research topics, must-read papers, and available models and data in the field of text summarization, together with recommended resources.

awesome curated datasets nlp paper summary text-summarization

346 stars
49 forks
113 projects

Last updated: 14 Aug 2025

awesome-forests

🌳 A curated list of ground-truth forest datasets for the machine learning and forestry community.

biodiversity carbon climate-change datasets deep-learning ecosystems forestry machine-learning

339 stars
41 forks
64 projects

Last updated: 24 Oct 2025

Awesome-3D-LiDAR-Datasets

This repository is a collection of public 3D LiDAR datasets

awesome-lists datasets lidar

332 stars
30 forks
59 projects

Last updated: 12 Sep 2025

awesome-nlp-polish

A curated list of resources dedicated to Natural Language Processing (NLP) in Polish. Models, tools, datasets.

datasets nlp nlp-machine-learning polish-language

305 stars
36 forks
44 projects

Last updated: 16 Oct 2025

awesome-synthetic-datasets

awesome synthetic (text) datasets

ai awesome-list datasets llms synthetic-data synthetic-dataset-generation

297 stars
12 forks
31 projects

Last updated: 19 Sep 2025

awesome-colour

Curated list of awesome colour science resources 😎

awesome awesome-list color color-science color-space color-spaces colorspace colorspaces colour colour-science

293 stars
23 forks
86 projects

Last updated: 16 Sep 2025

awesome-rgbd-datasets

This repository contains information for the paper "A Survey on RGB-D Datasets" and is a collaborative initiative to update the datasets list faster.

awesome awesome-list datasets depth depth-estimation lidar rgb-d survey

268 stars
16 forks
232 projects

Last updated: 07 Oct 2025

Graph-Neural-Networks-With-Heterophily

This repository contains resources on graph neural networks (GNNs) that consider heterophily.

awesome datasets graph-data graph-neural-networks heterophily homophily

263 stars
22 forks
334 projects

Last updated: 09 Oct 2025

Awesome-Earth-Artificial-Intelligence

A curated list of Earth Science's Artificial Intelligence (AI) tutorials, notebooks, software, datasets, courses, books, video lectures and papers. Contributions most welcome.

air-quality awesome-list biosphere datasets deep-learning dust earth-science earthquakes geosphere glacier

228 stars
58 forks
99 projects

Last updated: 05 Sep 2025

awesome-taxonomy

A curated resource for taxonomy research

datasets hypernymy-detection taxonomy-construction taxonomy-learning

215 stars
28 forks
172 projects

Last updated: 26 Aug 2025

awesome-lidar-place-recognition

A curated list of Place Recognition methods, datasets, and various algorithms for LiDAR

awesome awesome-list datasets lidar place-recognition point-cloud robotics slam

190 stars
6 forks
75 projects

Last updated: 03 Oct 2025

Awesome-Deepfakes

A list of datasets, tools, papers and code related to Deepfakes.

awesome datasets deepfakes image paper-with-code paperlist tools video

170 stars
9 forks
72 projects

Last updated: 06 Oct 2025

awesome-dynamic-graphs

A collection of resources on dynamic/streaming/temporal/evolving graph processing systems, databases, data structures, datasets, and related academic and industrial work

awesome awesome-list awesome-lists datasets dynamic-graph-processing dynamic-graphs evolving-graphs graph graph-analytics graph-databases

141 stars
16 forks
71 projects

Last updated: 16 Sep 2025

awesome-Iran-datasets

Iranian/Persian Datasets.

awesome data-science datasets machine-learning persian persiandataset

128 stars
12 forks
82 projects

Last updated: 22 Sep 2025

awesome-legal-data

A collection of datasets and other resources for legal text processing.

datasets legal legal-tech nlp

124 stars
19 forks
133 projects

Last updated: 22 Sep 2025

awesome-object-detection-datasets

A collection of some awesome public object detection and recognition datasets.

aerial-imagery autonomous-driving awesome-list chatgpt coco dataset datasets infrared large-language-models llm

117 stars
9 forks
143 projects

Last updated: 23 Sep 2025

awesome-scene-text-detection

Tracking the latest progress in Scene Text Detection and Recognition: must-read papers, well organized with code and datasets

charmve dataset datasets detection irregular-text-recognition level-annotation ocr recognition scene-text-detection scene-text-recognition

90 stars
17 forks
45 projects

Last updated: 28 Jul 2025

Data-Science-and-Machine-Learning-Resources

List of Data Science and Machine Learning Resources that I frequently use

algorithms awesome awesome-list blog blogs collections datascience datasets deep-learning ebooks

72 stars
21 forks
264 projects

Last updated: 09 Sep 2025

awesome-data-chile

Curated list of public datasets about Chile.

awesome awesome-list chile data datasets opendata

69 stars
3 forks
44 projects

Last updated: 02 Sep 2025

awesome-data-analysis

🚀📊 400+ curated resources for data analysis and data science: Python, SQL, ML, Visualization, Dashboards, Cheatsheets, Roadmaps, and Interview Prep. Perfect for beginners and pros!

awesome big-data cheatsheets data-analysis data-analytics data-science datasets eda jupyter learning-resources

66 stars
5 forks
753 projects

Last updated: 11 Sep 2025

awesome-datasets

A comprehensive list of annotated training datasets classified by use case.

annotation awesome-data-science awesome-datasets awesome-public-datasets corpora data dataset datasets document-processing entity-extraction

35 stars
6 forks
87 projects

Last updated: 04 Oct 2025

awesome-swedish-nlp

A curated list of resources for natural language processing (NLP) in Swedish

awesome-list corpora corpus dataset datasets natural-language-generation natural-language-processing nlp resource-list swedish

25 stars
2 forks
68 projects

Last updated: 22 Jul 2025

awesome-marine-hacking

Awesome Resources for Ocean Hacking

awesome awesome-list dataset datasets hackathon ocean ocean-hacking oceanography

16 stars
4 forks
35 projects

Last updated: 15 Sep 2025

awesome-malware-benign-datasets

🪲 A curated list of Malware and Benign datasets for security researchers

awesome-list datasets machine-learning malware-analysis malware-researchers security

12 stars
1 fork
32 projects

Last updated: 23 Sep 2025

awesome-nba-data

A curated list of awesome NBA Data and resources.

awesome-list data datasets nba nba-data nba-stats

11 stars
1 fork
45 projects

Last updated: 05 Oct 2025

awesome-italian-public-datasets

A selection of interesting open datasets from the Italian Public Administration and civic data use cases

civic-hacking civic-tech data-science datasets government-data opendata

10 stars
3 forks
43 projects

Last updated: 04 Sep 2025

awesome-ai-for-gui-agents

Awesome resources about AI for GUI Agents.

ai awesome awesome-list datasets gui models papers

9 stars
0 forks
30 projects

Last updated: 11 Aug 2025

awesome-turkish-vlm

A curated list of models, datasets and other useful resources for Turkish Vision-Language Models (VLM).

awesome awesome-list computer-vision datasets deep-learning fine-tuning multimodal nlp pretrained-models turkish

3 stars
0 forks
37 projects

Last updated: 29 Jul 2025

awesome-transit

copy of https://github.com/CUTR-at-USF/awesome-transit

awesome awesome-list bus datasets gtfs gtfs-analysis gtfs-converters gtfs-feed gtfs-files gtfs-libraries

0 stars
0 forks
296 projects

Last updated: 25 Jan 2022
