An open API service indexing awesome lists of open source software.

"datasets" Awesome Lists

awesome-json-datasets

A curated list of awesome JSON datasets that don't require authentication.

awesome awesome-list data dataset datasets json json-dataset list

3,405 stars
383 forks
300 projects

Last updated: 10 Apr 2025

Awesome-Code-LLM

[TMLR] A curated list of language modeling research for code and related datasets.

ai awesome datasets llm nlp papers software-engineering survey tmlr

1,607 stars
106 forks
2,033 projects

Last updated: 30 Oct 2024

Awesome-Cybersecurity-Datasets

A curated list of amazingly awesome Cybersecurity datasets

attack cybersecurity data dataframe datasets deep deeplearning events ids ips

1,572 stars
285 forks
54 projects

Last updated: 12 Apr 2025

awesome-transit

Community list of transit APIs, apps, datasets, research, and software :bus::star2::train::star2::steam_locomotive:

awesome awesome-list bus datasets gtfs gtfs-analysis gtfs-converters gtfs-feed gtfs-files gtfs-libraries

1,485 stars
215 forks
403 projects

Last updated: 16 Apr 2025

awesome-yolo-object-detection

🚀🚀🚀 A collection of some awesome public YOLO object detection series projects and the related object detection datasets.

cuda datasets deepseek few-shot-object-detection gui llama llm mllm object-detection object-detection-datasets

1,448 stars
202 forks
1,701 projects

Last updated: 12 Apr 2025

awesome-robotics

A curated list of awesome links and software libraries that are useful for robots.

awesome awesome-list datasets deep-learning image-segmentation list lists machine-learning optimization optimization-algorithms

1,082 stars
159 forks
72 projects

Last updated: 09 Apr 2025

awesome-instruction-dataset

A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca)

awsome-lists datasets gpt-3 gpt-4 instruction-following instruction-tuning language-model llama

1,077 stars
59 forks
89 projects

Last updated: 28 Oct 2024

awesome-autonomous-vehicle

A Chinese-language list of resources on autonomous driving.

autonomous-vehicles awesome-list car-driving computer-vision datasets intelligent-vehicle self-driving-car tutorial

848 stars
219 forks
181 projects

Last updated: 11 Apr 2025

LLM4Rec-Awesome-Papers

A list of awesome papers and resources on recommender systems built on large language models (LLMs).

awesome datasets large-language-models llm4rec recommender-system survey

841 stars
72 forks
124 projects

Last updated: 29 May 2024

awesome-dataset-tools

🔧 A curated list of awesome dataset tools

annotation-tool annotations awsome awsome-list datasets machine-learning

808 stars
120 forks
87 projects

Last updated: 20 May 2024

awesome-public-real-time-datasets

A list of publicly available datasets with real-time data maintained by the team at bytewax.io

awesome-list data data-science data-visualization datasets real-time streaming

745 stars
35 forks
69 projects

Last updated: 01 Apr 2025

awesome-llm-and-aigc

🚀🚀🚀 A collection of some awesome public projects about Large Language Models (LLM), Vision Language Models (VLM), Vision Language Action (VLA), AI Generated Content (AIGC), and the related datasets and applications.

ai4s ai4science aigc awesome-list chatgpt cuda datasets deepseek gpt langchain

657 stars
59 forks
1,376 projects

Last updated: 14 Apr 2025

awesome-instruction-datasets

A collection of prompt and instruction datasets (awesome-prompt-datasets, awesome-instruction-dataset) of all kinds, for training ChatLLM models such as ChatGPT.

chatgpt datasets instruction llama llm prompts self-instruct

652 stars
33 forks
132 projects

Last updated: 18 Apr 2025

awesome-holistic-3d

A list of papers and resources (data, code, etc.) for holistic 3D reconstruction in computer vision

3d-reconstruction awesome computer-vision datasets deep-learning machine-learning

607 stars
89 forks
129 projects

Last updated: 29 Apr 2024

awesome-mobile-robotics

Useful links of different content related to AI, Computer Vision, and Robotics.

autonomous-robots autonomous-systems awesome-list books companies datasets jobs lab labs localization

597 stars
98 forks
541 projects

Last updated: 12 Apr 2025

awesome-segmentation-saliency-dataset

A collection of datasets for segmentation / saliency detection. PRs are welcome :smile:

dataset datasets deep-learning deeplearning machine-learning machinelearning saliency-detection

566 stars
96 forks
326 projects

Last updated: 08 Apr 2025

machine-learning-resources

A curated list of awesome machine learning frameworks, libraries, courses, books and many more.

awesome-list conference data-analysis data-science datasets handbook machine-learning natural-language-processing nlp-machine-learning paper

425 stars
124 forks
56 projects

Last updated: 16 Apr 2025

awesome-instruction-learning

Papers and Datasets on Instruction Tuning and Following. ✨✨✨

awesome-list datasets in-context-learning instruction instruction-learning instruction-tuning large-language-models paper-list pretrained-language-model prompt

408 stars
21 forks
177 projects

Last updated: 21 May 2024

Text-Summarization-Repo

A repository that organizes the major research topics in text summarization, must-read papers, and available models and data, together with recommended resources.

awesome curated datasets nlp paper summary text-summarization

342 stars
48 forks
113 projects

Last updated: 05 Apr 2025

awesome-forests

🌳 A curated list of ground-truth forest datasets for the machine learning and forestry community.

biodiversity carbon climate-change datasets deep-learning ecosystems forestry machine-learning

309 stars
39 forks
63 projects

Last updated: 24 Mar 2025

awesome-nlp-polish

A curated list of resources dedicated to Natural Language Processing (NLP) in Polish. Models, tools, datasets.

datasets nlp nlp-machine-learning polish-language

298 stars
34 forks
44 projects

Last updated: 04 Apr 2025

Awesome-Domain-Generalization

Awesome things about domain generalization, including papers, code, etc.

awesome awesome-list datasets deep-learning domain-generalization libraries papers

297 stars
36 forks
343 projects

Last updated: 23 May 2024

awesome-colour

Curated list of awesome colour science resources 😎

awesome awesome-list color color-science color-space color-spaces colorspace colorspaces colour colour-science

280 stars
22 forks
85 projects

Last updated: 02 Apr 2025

Awesome-3D-LiDAR-Datasets

This repository is a collection of public 3D LiDAR datasets

awesome-lists datasets lidar

265 stars
23 forks
55 projects

Last updated: 31 Mar 2025

Graph-Neural-Networks-With-Heterophily

This repository contains resources on graph neural networks (GNNs) that account for heterophily.

awesome datasets graph-data graph-neural-networks heterophily homophily

248 stars
21 forks
334 projects

Last updated: 02 Apr 2025

awesome-rgbd-datasets

This repository contains information for the paper "A Survey on RGB-D Datasets" and is a collaborative initiative to update the datasets list faster.

awesome awesome-list datasets depth depth-estimation lidar rgb-d survey

224 stars
13 forks
232 projects

Last updated: 26 Mar 2025

Awesome-Earth-Artificial-Intelligence

A curated list of Earth Science's Artificial Intelligence (AI) tutorials, notebooks, software, datasets, courses, books, video lectures and papers. Contributions most welcome.

air-quality awesome-list biosphere datasets deep-learning dust earth-science earthquakes geosphere glacier

217 stars
56 forks
99 projects

Last updated: 15 Mar 2025

awesome-taxonomy

A curated resource for taxonomy research

datasets hypernymy-detection taxonomy-construction taxonomy-learning

198 stars
30 forks
172 projects

Last updated: 29 May 2024

awesome-lidar-place-recognition

A curated list of Place Recognition methods, datasets, and various algorithms for LiDAR

awesome awesome-list datasets lidar place-recognition point-cloud robotics slam

159 stars
5 forks
75 projects

Last updated: 30 Mar 2025

awesome-dynamic-graphs

A collection of resources on dynamic/streaming/temporal/evolving graph processing systems, databases, data structures, datasets, and related academic and industrial work

awesome awesome-list awesome-lists datasets dynamic-graph-processing dynamic-graphs evolving-graphs graph graph-analytics graph-databases

137 stars
17 forks
71 projects

Last updated: 04 Mar 2025

awesome-object-detection-datasets

A collection of some awesome public object detection and recognition datasets.

aerial-imagery autonomous-driving awesome-list chatgpt coco dataset datasets infrared large-language-models llm

94 stars
9 forks
143 projects

Last updated: 15 Apr 2025

awesome-scene-text-detection

Tracking the latest progress in Scene Text Detection and Recognition: must-read papers, well organized with code and datasets

charmve dataset datasets detection irregular-text-recognition level-annotation ocr recognition scene-text-detection scene-text-recognition

86 stars
17 forks
45 projects

Last updated: 04 Apr 2025

Data-Science-and-Machine-Learning-Resources

List of Data Science and Machine Learning Resources that I frequently use

algorithms awesome awesome-list blog blogs collections datascience datasets deep-learning ebooks

65 stars
22 forks
250 projects

Last updated: 11 Apr 2025

Awesome-Deepfakes

A list of datasets, tools, papers and code related to Deepfakes.

awesome datasets deepfakes image paper-with-code paperlist tools video

64 stars
2 forks
72 projects

Last updated: 13 May 2024

awesome-data-chile

Curated list of public datasets about Chile.

awesome awesome-list chile data datasets opendata

61 stars
3 forks
44 projects

Last updated: 16 Mar 2025

awesome-Iran-datasets

Iranian/Persian datasets.

awesome data-science datasets machine-learning persian persiandataset

45 stars
4 forks
80 projects

Last updated: 29 May 2024

awesome-datasets

A comprehensive list of annotated training datasets classified by use case.

annotation awesome-data-science awesome-datasets awesome-public-datasets corpora data dataset datasets document-processing entity-extraction

33 stars
6 forks
87 projects

Last updated: 29 Mar 2025

awesome-swedish-nlp

A curated list of resources for natural language processing (NLP) in Swedish

awesome-list corpora corpus dataset datasets natural-language-generation natural-language-processing nlp resource-list swedish

24 stars
2 forks
68 projects

Last updated: 28 Jan 2025

awesome-marine-hacking

Awesome Resources for Ocean Hacking

awesome awesome-list dataset datasets hackathon ocean ocean-hacking oceanography

15 stars
3 forks
35 projects

Last updated: 20 Dec 2024

awesome-italian-public-datasets

A selection of interesting open datasets from the Italian Public Administration and civic data use cases

civic-hacking civic-tech data-science datasets government-data opendata

9 stars
3 forks
43 projects

Last updated: 12 Mar 2022

awesome-nba-data

A curated list of awesome NBA Data and resources.

awesome-list data datasets nba nba-data nba-stats

9 stars
1 fork
37 projects

Last updated: 25 Sep 2024

awesome-ai-for-gui-agents

Awesome resources about AI for GUI Agents.

ai awesome awesome-list datasets gui models papers

6 stars
0 forks
30 projects

Last updated: 11 Mar 2025

awesome-malware-benign-datasets

🪲 A list of malware and benign datasets for malware research

awesome-list datasets malware-analysis malware-researchers security

5 stars
0 forks
32 projects

Last updated: 04 Apr 2025

awesome-transit

copy of https://github.com/CUTR-at-USF/awesome-transit

awesome awesome-list bus datasets gtfs gtfs-analysis gtfs-converters gtfs-feed gtfs-files gtfs-libraries

0 stars
0 forks
296 projects

Last updated: 25 Jan 2022
