Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with datasets
A curated list of projects in awesome lists tagged with datasets.
https://github.com/humansignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
annotation annotation-tool annotations boundingbox computer-vision data-labeling dataset datasets deep-learning image-annotation image-classification image-labeling image-labelling-tool label-studio labeling labeling-tool mlops semantic-segmentation text-annotation yolo
Last synced: 16 Dec 2024
https://github.com/huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
computer-vision datasets deep-learning hacktoberfest machine-learning natural-language-processing nlp numpy pandas pytorch speech tensorflow
Last synced: 16 Dec 2024
https://github.com/HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
annotation annotation-tool annotations boundingbox computer-vision data-labeling dataset datasets deep-learning image-annotation image-classification image-labeling image-labelling-tool label-studio labeling labeling-tool mlops semantic-segmentation text-annotation yolo
Last synced: 30 Oct 2024
https://github.com/heartexlabs/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
annotation annotation-tool annotations boundingbox computer-vision data-labeling dataset datasets deep-learning image-annotation image-classification image-labeling image-labelling-tool label-studio labeling labeling-tool mlops semantic-segmentation text-annotation yolo
Last synced: 08 Nov 2024
https://github.com/HumanSignal/label-studio?fbclid=IwAR30j2OmVMcB-TenAczkNwwUsObi8JAOpTNxGFzrmMrJ2pd4-gg_S0D3S78
Label Studio is a multi-type data labeling and annotation tool with standardized output format
annotation annotation-tool annotations boundingbox computer-vision data-labeling dataset datasets deep-learning image-annotation image-classification image-labeling image-labelling-tool label-studio labeling labeling-tool mlops semantic-segmentation text-annotation yolo
Last synced: 11 Nov 2024
https://github.com/tonybeltramelli/pix2code
pix2code: Generating Code from a Graphical User Interface Screenshot
datasets deep-learning deep-neural-networks front-end-development graphical-user-interface
Last synced: 16 Dec 2024
https://github.com/cleanlab/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
active-learning annotation data-centric-ai data-cleaning data-curation data-labeling data-profiling data-quality data-science data-validation dataops dataquality datasets exploratory-data-analysis labeling llms noisy-labels out-of-distribution-detection outlier-detection weak-supervision
Last synced: 17 Dec 2024
https://github.com/akfamily/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock
Last synced: 16 Dec 2024
https://github.com/simonw/datasette
An open source multi-tool for exploring and publishing data
asgi automatic-api csv datasets datasette datasette-io docker json python sql sqlite
Last synced: 16 Dec 2024
https://github.com/doccano/doccano
Open source annotation tool for machine learning practitioners.
annotation-tool data-labeling dataset datasets machine-learning natural-language-processing nuxt nuxtjs python text-annotation vue vuejs
Last synced: 16 Dec 2024
https://github.com/satellite-image-deep-learning/techniques
Techniques for deep learning with satellite & aerial imagery
convolutional-neural-networks dataset datasets deep-learning deep-neural-networks earth-observation image-classification keras machine-learning object-detection python pytorch remote-sensing satellite-data satellite-imagery satellite-images sentinel tensorflow
Last synced: 16 Dec 2024
https://github.com/activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
ai computer-vision cv data-science datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops multi-modal python pytorch tensorflow vector-database vector-search
Last synced: 21 Dec 2024
https://github.com/activeloopai/Hub
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
ai computer-vision cv data-science datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops multi-modal python pytorch tensorflow vector-database vector-search
Last synced: 08 Dec 2024
https://github.com/imanneo/fl_chart
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, and Radar Chart.
barchart chart charts datasets fl-chart flutter flutter-widget graph hacktoberfest linechart piechart radar-chart radar-graphs scatter-chart scatter-plot
Last synced: 16 Dec 2024
https://github.com/imaNNeo/fl_chart
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, and Radar Chart.
barchart chart charts datasets fl-chart flutter flutter-widget graph hacktoberfest linechart piechart radar-chart radar-graphs scatter-chart scatter-plot
Last synced: 30 Oct 2024
https://github.com/liuruoze/easypr
(CGCSTCD'2017) An easy, flexible, and accurate plate recognition project for Chinese licenses in unconstrained situations. CGCSTCD = China Graduate Contest on Smart-city Technology and Creative Design
artificial-intelligence artificial-neural-networks chinese-characters computer-vision datasets machine-learning opencv opencv3 plate-recognition supervised-learning support-vector-machines unconstrained-situation
Last synced: 17 Dec 2024
https://github.com/liuruoze/EasyPR
(CGCSTCD'2017) An easy, flexible, and accurate plate recognition project for Chinese licenses in unconstrained situations. CGCSTCD = China Graduate Contest on Smart-city Technology and Creative Design
artificial-intelligence artificial-neural-networks chinese-characters computer-vision datasets machine-learning opencv opencv3 plate-recognition supervised-learning support-vector-machines unconstrained-situation
Last synced: 26 Oct 2024
https://github.com/tensorflow/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
data dataset datasets jax machine-learning numpy tensorflow
Last synced: 17 Dec 2024
https://github.com/cluebenchmark/cluedatasetsearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
chinese corpus datasets knowledge-graph machine-reading-comprehension machine-translation match ner nlp qa sentiment-analysis text-classification text-similarity text-summarization
Last synced: 20 Dec 2024
https://github.com/CLUEbenchmark/CLUEDatasetSearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
chinese corpus datasets knowledge-graph machine-reading-comprehension machine-translation match ner nlp qa sentiment-analysis text-classification text-similarity text-summarization
Last synced: 31 Oct 2024
https://github.com/arize-ai/phoenix
AI Observability & Evaluation
ai-monitoring ai-observability ai-roi aiengineering datasets hacktoberfest llm-eval llmops ml-observability mlops model-observability
Last synced: 16 Dec 2024
https://github.com/Arize-ai/phoenix
AI Observability & Evaluation
ai-monitoring ai-observability ai-roi aiengineering datasets hacktoberfest llm-eval llmops ml-observability mlops model-observability
Last synced: 30 Oct 2024
https://github.com/roapi/roapi
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
analytics arrow blob-storage cloud-native columnar datafusion datasets delta-lake graphql in-memory-database parquet query query-frontends rest-api rust s3 sql static-datasets
Last synced: 30 Oct 2024
https://github.com/opencsgs/csghub
CSGHub is an open-source large model platform, much like an on-premise version of Hugging Face. You can easily manage models and datasets, deploy model applications, and set up model fine-tuning or inference jobs through a user interface. CSGHub also provides a Python SDK that is fully compatible with the hf SDK. Join us in building a safer and more open platform ⭐️
ai datasets huggingface llm management-system models platform
Last synced: 07 Nov 2024
https://github.com/microsoft/torchgeo
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
computer-vision datasets deep-learning earth-observation geospatial models pytorch remote-sensing satellite-imagery torchvision transforms
Last synced: 17 Dec 2024
https://github.com/justinzm/gopup
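Data interface: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; "Qianlima" (fast-growing) and unicorn companies; Xinwen Lianbo transcripts; film and TV box-office data; university lists; epidemic data…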
covid19-data data data-analysis data-science datasets economic-data gopup index-data python
Last synced: 19 Dec 2024
https://github.com/zhulf0804/3D-PointCloud
Papers and Datasets about Point Cloud.
autonomous-driving classification completion datasets detection generation monocular papers point-cloud registration segmentation
Last synced: 26 Oct 2024
https://github.com/github/CodeSearchNet
Datasets, tools, and benchmarks for representation learning of code.
bert cnn data data-science datasets deep-learning machine-learning machine-learning-on-source-code ml natural-language-processing neural-networks nlp nlp-machine-learning open-data programming-language-theory python representation-learning rnn self-attention tensorflow
Last synced: 24 Oct 2024
https://github.com/github/codesearchnet
Datasets, tools, and benchmarks for representation learning of code.
bert cnn data data-science datasets deep-learning machine-learning machine-learning-on-source-code ml natural-language-processing neural-networks nlp nlp-machine-learning open-data programming-language-theory python representation-learning rnn self-attention tensorflow
Last synced: 26 Sep 2024
https://github.com/freedomintelligence/medical_nlp
Medical NLP competitions, datasets, large models, and papers
collection datasets list medical models nlp
Last synced: 03 Dec 2024
https://github.com/colour-science/colour
Colour Science for Python
color color-science color-space color-spaces colorspace colorspaces colour colour-science colour-space colour-spaces colourspace colourspaces data dataset datasets python spectral-data spectral-dataset spectral-datasets
Last synced: 17 Dec 2024
https://github.com/jsbroks/coco-annotator
✏️ Web-based image segmentation tool for object detection, localization, and keypoints
annotate-images coco coco-annotator coco-format computer-vision datasets deep-learning detection image-annotation image-labeling image-segmentation label machine-learning
Last synced: 09 Dec 2024
https://github.com/prabhuomkar/pytorch-cpp
C++ Implementation of PyTorch Tutorials for Everyone
artificial-intelligence autograd colab convolutional-neural-network cplusplus datasets generative-adversarial-network interactive-tutorials language-model libtorch machine-learning neural-network pytorch recurrent-neural-network scriptmodule-files tensors torch tutorial
Last synced: 20 Dec 2024
https://github.com/snap-stanford/ogb
Benchmark datasets, data loaders, and evaluators for graph machine learning
datasets deep-learning graph-machine-learning graph-neural-networks
Last synced: 18 Dec 2024
https://github.com/isl-org/open3d-ml
An extension of Open3D to address 3D Machine Learning tasks
3d-object-detection 3d-perception datasets lidar object-detection pretrained-models pytorch rgbd semantic-segmentation tensorflow visualization
Last synced: 19 Dec 2024
https://github.com/isl-org/Open3D-ML
An extension of Open3D to address 3D Machine Learning tasks
3d-object-detection 3d-perception datasets lidar object-detection pretrained-models pytorch rgbd semantic-segmentation tensorflow visualization
Last synced: 28 Oct 2024
https://github.com/diffgram/diffgram
The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
annotation annotation-tool annotations data data-analytics data-annotation data-science datasets datastore deep-learning image-annotation kubernetes labeling machine-learning training-data video-annotation
Last synced: 25 Oct 2024
https://github.com/logpai/loghub
A large collection of system log datasets for AI-driven log analytics [ISSRE'23]
anomaly-detection datasets log-analysis log-intelligence log-parsing logs unstructured-logs
Last synced: 04 Dec 2024
https://github.com/chineseglue/chineseglue
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard
albert bert chinese-corpus datasets glue language-understanding nlp pre-trained-model
Last synced: 21 Dec 2024
https://github.com/ChineseGLUE/ChineseGLUE
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard
albert bert chinese-corpus datasets glue language-understanding nlp pre-trained-model
Last synced: 06 Nov 2024
https://github.com/juliadata/dataframes.jl
In-memory tabular data in Julia
data data-frame dataframes datasets hacktoberfest julia tabular-data
Last synced: 17 Dec 2024
https://github.com/JuliaData/DataFrames.jl
In-memory tabular data in Julia
data data-frame dataframes datasets hacktoberfest julia tabular-data
Last synced: 07 Nov 2024
https://github.com/jim-schwoebel/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
audio-dataset audio-datasets data dataset datasets noise voice voice-activity-detection voice-assistant voice-chat voice-commands voice-computing voice-control voice-conversion voice-dataset voice-datasets voice-recognition voice-synthesis
Last synced: 03 Dec 2024
https://github.com/juand-r/entity-recognition-datasets
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
annotations corpora datasets entity-extraction entity-recognition named-entity-recognition natural-language-processing ner nlp nlp-resources
Last synced: 19 Dec 2024
https://github.com/eosphoros-ai/db-gpt-hub
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
database datasets fine-tuning gpt hacktoberfest llm nl2sql sql text-to-sql text2sql
Last synced: 19 Dec 2024
https://github.com/luqmaan/awesome-transit
Community list of transit APIs, apps, datasets, research, and software 🚌🌟🚋🌟🚂
awesome awesome-list bus datasets gtfs gtfs-analysis gtfs-converters gtfs-feed gtfs-files gtfs-libraries gtfs-realtime gtfs-utils gtfs-validator list realtime-data tools transit transit-agencies transit-data transit-map
Last synced: 04 Nov 2024
https://github.com/eosphoros-ai/DB-GPT-Hub
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
database datasets fine-tuning gpt llm nl2sql sql text-to-sql text2sql
Last synced: 24 Oct 2024
https://github.com/explosion/projects
🪐 End-to-end NLP workflows from prototype to production
annotations datasets natural-language-processing nlp prodigy spacy
Last synced: 19 Dec 2024
https://github.com/pku-alignment/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
ai-safety alpaca beaver datasets deepspeed gpt large-language-models llama llm llms reinforcement-learning reinforcement-learning-from-human-feedback rlhf safe-reinforcement-learning safe-reinforcement-learning-from-human-feedback safe-rlhf safety transformer transformers vicuna
Last synced: 19 Dec 2024
https://github.com/PKU-Alignment/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
ai-safety alpaca beaver datasets deepspeed gpt large-language-models llama llm llms reinforcement-learning reinforcement-learning-from-human-feedback rlhf safe-reinforcement-learning safe-reinforcement-learning-from-human-feedback safe-rlhf safety transformer transformers vicuna
Last synced: 16 Nov 2024
https://github.com/PolyAI-LDN/conversational-datasets
Large datasets for conversational AI
conversational-ai datasets machine-learning
Last synced: 11 Nov 2024
https://github.com/RUC-NLPIR/FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
benchmark datasets large-language-models retrieval-augmented-generation
Last synced: 11 Sep 2024
https://github.com/midas-research/audino
Open source audio annotation tool for humans
annotation-tool audio-annotation audio-processing datasets machine-learning python speech-processing
Last synced: 20 Dec 2024
https://github.com/caserec/Datasets-for-Recommender-Systems
A repository of high-quality, topic-centric public data sources for Recommender Systems (RS)
data-science database datasets public-data recommender-systems
Last synced: 28 Nov 2024
https://github.com/dmitryryumin/iccv-2023-papers
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
3d-graphics 3d-reconstruction biometrics computer-vision datasets deep-learning explainable-ai face-recognition gesture-recognition iccv iccv2023 image-processing image-synthesis multimodal-learning pattern-recognition photogrammetry pose-estimation robotics transfer-learning video-synthesis
Last synced: 20 Dec 2024
https://github.com/iamaziz/PyDataset
Instant access to many datasets in Python.
Last synced: 27 Nov 2024
https://github.com/iamaziz/pydataset
Instant access to many datasets in Python.
Last synced: 21 Dec 2024
https://github.com/mims-harvard/TDC
Therapeutics Commons: Artificial Intelligence Foundation for Therapeutic Science
artificial-intelligence benchmarks bioinformatics biology biomedicine biotech cheminformatics chemistry datasets deep-learning drug-discovery machine-learning medicine precision-medicine therapeutics
Last synced: 01 Nov 2024
https://github.com/CLUEbenchmark/CLUECorpus2020
Large-scale Pre-training Corpus for Chinese, 100G (Chinese pre-training corpus)
albert bert chinese chinese-corpus corpus datasets nlp pretrain roberta
Last synced: 16 Nov 2024
https://github.com/JizhiziLi/GFM
[IJCV 2022] Bridging Composite and Real: Towards End-to-end Deep Image Matting
animal-matting composition datasets image-matting matting segmentation
Last synced: 26 Oct 2024
https://github.com/cluebenchmark/cluecorpus2020
Large-scale Pre-training Corpus for Chinese, 100G (Chinese pre-training corpus)
albert bert chinese chinese-corpus corpus datasets nlp pretrain roberta
Last synced: 09 Nov 2024
https://github.com/zjunlp/prompt4reasoningpapers
[ACL 2023] Reasoning with Language Model Prompting: A Survey
arithmetic-reasoning artificial-intelligence awsome-list chain-of-thought chatgpt commonsense-reasoning datasets gpt-3 language-models large-language-models llm logical-reasoning natural-language-processing nlp paper-list prompt prompt-engineering reasoning survey symbolic-reasoning
Last synced: 09 Nov 2024
https://github.com/zjunlp/Prompt4ReasoningPapers
[ACL 2023] Reasoning with Language Model Prompting: A Survey
arithmetic-reasoning artificial-intelligence awsome-list chain-of-thought chatgpt commonsense-reasoning datasets gpt-3 language-models large-language-models llm logical-reasoning natural-language-processing nlp paper-list prompt prompt-engineering reasoning survey symbolic-reasoning
Last synced: 24 Oct 2024
https://github.com/ipeagit/geobr
Easy access to official spatial data sets of Brazil in R and Python
brazil datasets geopackage geopandas python r rstats sf shapefile spatial-data
Last synced: 18 Dec 2024
https://github.com/ipeaGIT/geobr
Easy access to official spatial data sets of Brazil in R and Python
brazil datasets geopackage geopandas python r rstats sf shapefile spatial-data
Last synced: 25 Oct 2024
https://github.com/huggingface/dataset-viewer
Backend that powers the dataset viewer on Hugging Face dataset pages through a public API.
api-rest data datasets huggingface machine-learning nlp
Last synced: 29 Nov 2024
https://github.com/saltudelft/ml4se
A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering
ai4code ai4se code datasets deep-learning llm4code machine-learning ml4code ml4se papers research software-engineering theses tools tudelft
Last synced: 05 Nov 2024
https://github.com/opencsgs/csghub-server
csghub-server is the backend server for CSGHub, which helps users manage datasets and models, and run Model Inference, Finetune, and Application Spaces.
ai datasets golang huggingface llm models platform
Last synced: 21 Dec 2024
https://github.com/scale3-labs/langtrace
Langtrace 🔍 is an open-source, OpenTelemetry-based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations, and metrics for popular LLMs, LLM frameworks, vectorDBs, and more. Integrate using TypeScript or Python. 🚀💻📊
ai datasets evaluations gpt langchain llm llm-framework llmops observability open-source open-telemetry openai prompt-engineering tracing
Last synced: 20 Dec 2024
https://github.com/st-tech/zr-obp
Open Bandit Pipeline: a python library for bandit algorithms and off-policy evaluation
contextual-bandits datasets multi-armed-bandits off-policy-evaluation research
Last synced: 11 Nov 2024
https://github.com/jozu-ai/kitops
An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
ai code datasets devops devops-tools gguf hacktoberfest kubernetes kubernetes-deployment ml mlops mlops-tools model-interpretability model-serving models opensource platform-engineering pytorch sklearn tensorflow
Last synced: 21 Dec 2024
https://github.com/mahmoudnafifi/Exposure_Correction
Project page of the paper "Learning Multi-Scale Photo Exposure Correction" (CVPR 2021).
coarse-to-fine color-correction computational-photography cvpr cvpr2021 dataset datasets deep-learning deeplearning exposure-correction image-enhancement low-light-enhance low-light-image multi-scale overexposure-correction underexposure-correction
Last synced: 08 Nov 2024
https://github.com/satellite-image-deep-learning/datasets
Datasets for deep learning with satellite & aerial imagery
datasets earth-observation remote-sensing satellite-data satellite-imagery sentinel
Last synced: 06 Nov 2024
https://github.com/Synerise/cleora
Cleora AI is a general-purpose open-source model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data. Created by Synerise.com team.
ai cleora-embeddings datasets deepwalk embeddings entity graphs hypergraphs inductive-entity-embeddings machine-learning ml pytorch-biggraph synerise
Last synced: 14 Dec 2024
https://github.com/BaseModelAI/cleora
Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
ai cleora-embeddings datasets deepwalk embeddings entity graphs hypergraphs inductive-entity-embeddings machine-learning ml pytorch-biggraph synerise
Last synced: 13 Nov 2024
https://github.com/openvinotoolkit/datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
coco computer-vision dataset datasets deep-learning format-converter imagenet neural-networks openvino-toolkit pascal-voc yolo
Last synced: 13 Nov 2024
https://github.com/juliadata/dataframesmeta.jl
Metaprogramming tools for DataFrames
data data-frame dataframes dataframesmeta datasets hacktoberfest julia tabular-data
Last synced: 20 Dec 2024
https://github.com/JuliaData/DataFramesMeta.jl
Metaprogramming tools for DataFrames
data data-frame dataframes dataframesmeta datasets hacktoberfest julia tabular-data
Last synced: 27 Oct 2024
https://github.com/cluebenchmark/pclue
pCLUE: 1000000+ multi-task prompt learning dataset
chinese clue datasets multi-task-learning prompt-learning promptclue zero-shot-learning
Last synced: 15 Dec 2024
https://github.com/EagleW/PaperRobot
Code for PaperRobot: Incremental Draft Generation of Scientific Ideas
attention-mechanism datasets end-to-end-learning generation memory-networks natural-language-generation nlp paper-generation pytorch text-generation
Last synced: 18 Nov 2024
https://github.com/CLUEbenchmark/pCLUE
pCLUE: 1000000+ multi-task prompt learning dataset
chinese clue datasets multi-task-learning prompt-learning promptclue zero-shot-learning
Last synced: 27 Oct 2024
https://github.com/Yuan-ManX/ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
aigc artificial-intelligence audio audio-effect audio-generation datasets deep-learning machine-learning music-generation
Last synced: 27 Oct 2024
https://github.com/chaoswork/sft_datasets
A collection of open-source SFT datasets, updated on an ongoing basis
chinese-dataset datasets large-language-models llms supervised-finetuning
Last synced: 09 Nov 2024
https://github.com/CelebV-HQ/CelebV-HQ
[ECCV 2022] CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
datasets gans generative-model
Last synced: 10 Nov 2024
https://github.com/dmitryryumin/cvpr-2023-24-papers
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!
action-recognition autonomous-driving biometrics computer-vision cvpr cvpr2023 cvpr2024 datasets deep-learning face-recognition gesture-recognition image-synthesis medical-image-processing multi-modal-learning pattern-recognition scene-analysis segmentation self-supervised-learning shape-analysis video-synthesis
Last synced: 15 Dec 2024
https://cambridgeuniversitypress.github.io/FirstCourseNetworkScience/
Tutorials, datasets, and other material associated with textbook "A First Course in Network Science" by Menczer, Fortunato & Davis
datasets indiana-university network-science networkx python social-network textbook tutorials
Last synced: 11 Nov 2024
https://github.com/CambridgeUniversityPress/FirstCourseNetworkScience
Tutorials, datasets, and other material associated with textbook "A First Course in Network Science" by Menczer, Fortunato & Davis
datasets indiana-university network-science networkx python social-network textbook tutorials
Last synced: 26 Sep 2024
https://github.com/MOLAorg/mola
A Modular Optimization framework for Localization and mApping (MOLA)
computer-vision cxx cxx17 datasets graph-slam lidar lidar-point-cloud localization mobile-robots slam toolkit visual-slam
Last synced: 13 Nov 2024
https://github.com/Koziev/NLP_Datasets
My NLP datasets for Russian language
Last synced: 13 Nov 2024
https://github.com/cleardusk/meglass
An eyeglass face dataset collected and cleaned for face recognition evaluation, CCBR 2018.
3dface dataset datasets face-recognition
Last synced: 16 Dec 2024
https://github.com/chakki-works/chakin
Simple downloader for pre-trained word vectors
datasets machine-learning natural-language-processing word-embeddings word-vectors
Last synced: 21 Dec 2024
https://github.com/jovianhq/opendatasets
A Python library for downloading datasets from Kaggle, Google Drive, and other online sources.
data-science datasets machine-learning python
Last synced: 21 Dec 2024
https://github.com/langwatch/langwatch
🤖 Build AI applications with confidence ✅ DSPy Visualizer ✅ Understand how your users are using your LLM-app ✅ Get a full picture of the quality performance of your LLM-app ✅ Collaborate with your stakeholders in ONE platform ✅ Iterate towards the most valuable & reliable LLM-app.
ai analytics datasets evaluation gpt llm observability openai prompt-engineering
Last synced: 04 Dec 2024
https://github.com/src-d/datasets
source{d} datasets ("big code") for source code analysis and machine learning on source code
dataset datasets git github machine-learning mlosc
Last synced: 15 Dec 2024
https://github.com/arjunmann73/Data-Analytics-Projects
🔎 Data analysis with real-world data sets using Python 🔍
classification datasets machine-learning python regression
Last synced: 07 Nov 2024
https://github.com/jumpingrivers/datasauRus
R Package 📦 Containing the Datasaurus Dozen datasets 📊
anscombesquartet datasaurus datasaurus-dozen datasets r r-package rstats summary-statistics
Last synced: 25 Oct 2024
https://github.com/jumpingrivers/datasaurus
R Package 📦 Containing the Datasaurus Dozen datasets 📊
anscombesquartet datasaurus datasaurus-dozen datasets r r-package rstats summary-statistics
Last synced: 16 Dec 2024
https://github.com/weecology/retriever
Quickly download, clean up, and install public datasets into a database management system
data data-retrieval data-science dataset datasets hacktobefest python
Last synced: 04 Nov 2024
https://github.com/waico/SKAB
SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.
algorithms-evaluation anomaly-detection benchmark changepoint-detection collective-anomalies dataset datasets leaderboard outlier-detection skab skoltech
Last synced: 05 Nov 2024
https://github.com/waico/SkAB
SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.
algorithms-evaluation anomaly-detection benchmark changepoint-detection collective-anomalies dataset datasets leaderboard outlier-detection skab skoltech
Last synced: 26 Oct 2024