data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-01-24 00:07:28 UTC
https://github.com/itachi-uchiha581/auto-data
Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs).
ai data finetuning-large-language-models finetuning-llms generative-ai llm llm-training python python3
Last synced: 20 Sep 2025
https://github.com/debruine/faux
R functions for simulating factorial datasets
Last synced: 28 Aug 2025
https://github.com/ERDDAP/erddap
ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).
data environmental erddap noaa scientific server
Last synced: 08 May 2025
https://github.com/purarue/google_takeout_parser
A library/CLI tool to parse data out of your Google Takeout (History, Activity, YouTube, Locations, etc.)
backup data export google google-location-history google-takeout
Last synced: 14 Jun 2025
https://github.com/1n3/powerexfil
A collection of data exfiltration scripts for Red Team assessments.
data exfil exfiltration hacking powershell redteam redteaming script scripts tool tools
Last synced: 08 Aug 2025
https://github.com/ecsim/pem-dataset1
Proton Exchange Membrane (PEM) Fuel Cell Dataset
activation-procedure chemistry data data-science dataset electrochemistry energy fuel-cell impedance mea nafion open-science open-source pem physics polarization power proton-exchange-membrane science science-research
Last synced: 15 May 2025
https://github.com/zbrookle/dataframe_sql
A Python package that parses SQL and interprets it as methods that act upon existing pandas (or other types of) DataFrames that have been declared and registered
data dataframes pandas python sql
Last synced: 20 Aug 2025
https://github.com/ralyodio/humanparser
Parse a human name string into salutation, first name, middle name, last name, suffix.
data es6 javascript parsing scraping
Last synced: 13 Aug 2025
https://github.com/jason89521/daxus
Daxus is a server state management library for React that provides full control over data, leading to a better user experience.
cache data dedupe hook react revalidate server-state-management user-experience
Last synced: 23 Jun 2025
https://github.com/saschagobel/legislatoR
Interface to the Comparative Legislators Database
data dataset legislators parliament political-science politicians politics r wikipedia
Last synced: 13 Jul 2025
https://github.com/geostatsguy/geodatasets
Synthetic datasets for geoscience (geo)statistical modeling
Last synced: 26 Oct 2025
https://github.com/bukalapak/ktpextractor
A service that takes a KTP image as input and extracts the data on the KTP as output. Part of an open source project by the Data Scientists of Bukalapak.
Last synced: 01 Aug 2025
https://github.com/blueapron/kafka-connect-protobuf-converter
Protobuf converter plugin for Kafka Connect
data data-platform jar kafka kafka-connect protobuf protobuf-converter protocol-buffers
Last synced: 03 May 2025
https://github.com/vr-25/migrator
A backup solution and data migration utility for Android
android appdata backup data factoryreset magisk migation migrate titaniumbackup
Last synced: 08 Jul 2025
https://github.com/gabrieldim/advanced-programming
Generic programming, generic classes, maps, sets, abstract data types and so on.
abstarct class data data-type data-types generic generic-programming generics interface interfaces map set
Last synced: 10 Jul 2025
https://github.com/neurosnap/cofx
A Node.js and JavaScript library that helps developers describe side-effects as data in a declarative, flexible API.
asynchronous cofx data javascript node promise side-effects yield
Last synced: 14 Apr 2025
https://github.com/airbytehq/airbyte-agent-connectors
🐙 Drop-in tools that give AI agents reliable, permission-aware access to external systems.
ai ai-agents airbyte anthropic connectors data enterprise gemini integrations langchain llm mcp open-source openai pydantic-ai rag
Last synced: 23 Jan 2026
https://github.com/andreiduca/use-async-resource
A custom React hook for simple data fetching with React Suspense
async cache custom-hook data data-fetching fetch hooks react react-cache react-hook react-hooks react-suspense reactjs suspense
Last synced: 12 Jan 2026
https://github.com/manifoldfinance/mev-corpus
MEV Data Corpus
blockchain corpus data ethereum flashbots mev miner-extracted-value
Last synced: 21 Jan 2026
https://github.com/dagshub/client
DagsHub client libraries
ai data data-science data-streaming dvc hacktoberfest hacktoberfest2023 keras machine-learning machinelearning mlops python pytorch tensorflow
Last synced: 16 May 2025
https://github.com/anthonybudd/S4
S4 is 100% S3 compatible storage, accessed through Tor and distributed using IPFS.
data docker ipfs object-storage s3 s4 storage
Last synced: 07 Apr 2025
https://github.com/anchore/vunnel
Tool for collecting vulnerability data from various sources (used to build the grype database)
data grype hacktoberfest vulnerability
Last synced: 08 Jan 2026
https://github.com/synthesized-io/fairlens
Identify bias and measure fairness of your data
bias data data-analysis data-science fairness ml pandas python statistics
Last synced: 24 Jun 2025
https://github.com/mattphillips/jest-each
A parameterised testing library for Jest. https://www.npmjs.com/package/jest-each 🏃
data each jest parameterised test
Last synced: 13 Apr 2025
https://github.com/jobehi/isthistechdead
The place where your favourite framework will be resting
Last synced: 19 Jun 2025
https://github.com/adzz/data_schema
Declarative schemas for data transformations.
data data-parsing elixir functional-programming types validation
Last synced: 20 Jul 2025
https://github.com/nucs/cryptocurrency-ticks-data
590 days of trade ticks on BTC/ETH/LTC/NEO to USDT
Last synced: 17 Aug 2025
https://github.com/mhanberg/schematic
📐 schematic
data elixir specification validation
Last synced: 12 Apr 2025
https://github.com/googlecloudplatform/dlp-dataflow-deidentification
Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP
beam bigquery data dataflow dlp pii tokenization
Last synced: 11 Apr 2025
https://github.com/yuxqiu/modern-poetry
The most comprehensive database of modern Chinese poetry and foreign poetry 最全的中国近现代诗以及外国诗数据库
data json poems poetry translation
Last synced: 16 Jan 2026
https://github.com/contextdata/vectoretl
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
cohere data datapipeline etl etl-framework etl-pipeline openai pinecone python qdrant qdrant-vector-database unstructured vector-database weaviate
Last synced: 09 Apr 2025
https://github.com/phyphox/phyphox-arduino
The phyphox BLE library to connect Arduino projects with the phyphox app to display data on the phone or use the phone's sensors on the Arduino
arduino ble bluetooth bluetooth-low-energy data phyphox sensors
Last synced: 16 Jan 2026
https://github.com/tmcw/simpleopendata
simple guidelines for publishing open data in useful formats
copleft copyright data formats government licensing open
Last synced: 12 Nov 2025
https://github.com/empower-ai/sql-agent
AI agent that helps you do data analytics with natural language.
analytics bigquery chatgpt chatgpt-bot data data-analytics data-science mysql postgresql slack slack-bot slackbot
Last synced: 11 Apr 2025
https://github.com/aws-solutions/automated-data-analytics-on-aws
The Automated Data Analytics on AWS solution provides an end-to-end data platform for ingesting, transforming, managing and querying datasets. This helps analysts and business users manage and gain insights from data without deep technical experience using Amazon Web Services (AWS).
Last synced: 17 Apr 2025
https://github.com/josephrp/datatonic
🌟 DataTonic: a data-capable, AGI-style agent builder of agents that creates swarms, runs commands, and securely processes and creates datasets, databases, visualisations, and analyses.
agent-builder agi autogen azure chroma data data-science data-visualization database memgpt semantic-kernel semantic-memory taskweaver
Last synced: 11 Oct 2025
https://github.com/Azure/azure-data-labs-modules
A list of Terraform modules to build your Azure Data IaC templates.
analytics azure data github github-actions labs terraform terraform-modules
Last synced: 06 May 2025
https://github.com/joaocarmo/react-smart-data-table
A smart data table component for React meant to be configuration free
data data-table data-visualization plug-and-play react
Last synced: 13 Apr 2025
https://github.com/stanfordnlp/edu-convokit
Edu-ConvoKit: An Open-Source Framework for Education Conversation Data
data data-analysis data-science education language natural-language-processing
Last synced: 15 Apr 2025
https://github.com/JujuAdams/SNAP
Data format converters for GameMaker LTS 2022
array data gamemaker gamemaker-studio-2 gms2 ini json messagepack struct xml
Last synced: 01 Apr 2025
https://github.com/volorf/paster
Paste text data from the clipboard directly into Sketch text layers [Sketch plugin]
clipboard data plugin sketch sketch-plugin text
Last synced: 21 Mar 2025
https://github.com/richienb/ros-data-waster
The easiest way to waste your data.
Last synced: 19 Jun 2025
https://github.com/slowkow/tftargets
🎯 Human transcription factor target genes from 6 databases in convenient R format.
bioinformatics data rstats transcription-factors
Last synced: 14 Apr 2025
https://github.com/Baukebrenninkmeijer/table-evaluator
Evaluate real and synthetic datasets against each other
data data-evaluation evaluation generation synthetic synthetic-data table-evaluator
Last synced: 02 May 2025
https://github.com/packtworkshops/the-data-visualization-workshop
A New, Interactive Approach to Learning Data Visualization
bokeh data data-visualization data-wrangling figures geoplotlib matplotlib numpy pandas plots python seaborn
Last synced: 15 Apr 2025
https://github.com/ropensci/opentripplanner
An R package to set up and use OpenTripPlanner (OTP) as a local or remote multimodal trip planner.
data isochrones java opentripplanner otp public-transport r routing transport transportation-planning
Last synced: 08 Oct 2025
https://github.com/open-discourse/open-discourse
Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).
bundestag corpus data hacktoberfest
Last synced: 14 Mar 2025
https://github.com/drivy/checker_jobs
Regression testing for data
data regression-testing ruby sidekiq
Last synced: 06 Apr 2025
https://github.com/hebilicious/vue-query-nuxt
A lightweight, 0 config Nuxt Module for Vue Query.
data data-fetching fetch nuxt react-query tanstack tanstack-query vue vue-query
Last synced: 04 Apr 2025
https://github.com/knime/knime-python
KNIME Python Integration
arrow data flatbuffer integration knime python science workflow
Last synced: 21 Jan 2026
https://github.com/torkleyy/nitric
[ABANDONED] General-purpose data processing library. Mirror of https://gitlab.com/nitric/nitric
data ecs entity-component processing
Last synced: 20 Aug 2025
https://github.com/fityannugroho/idn-area-map
The map of Indonesia's administrative areas 🇮🇩🌏
data hacktoberfest idn-area indonesia island map nextjs tailwindcss wilayah
Last synced: 07 Apr 2025
https://github.com/mahmoudparsian/pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
algorithms big-data data data-abstractions data-science dataframe distributed-computing graphframes mapreduce monoid nosql partitioning pyspark pyspark-algorithms python rdd spark transformations
Last synced: 07 Apr 2025
https://github.com/jbzoo/data
Extended implementation of ArrayObject: a useful collection for any config in your system (write, read, store, change, validate, convert to other formats, etc.).
arrayobject config converts data filters ini jbzoo php yml
Last synced: 05 Apr 2025
https://victorcouste.github.io/data-tools/
Data Tools Subjective List
awesome awesome-list data data-architecture data-tools datatools list modern open-source opensource tools
Last synced: 10 May 2025
https://github.com/opensource-observer/oss-directory
A curated directory of open source software (OSS) projects and their associated artifacts
data github open-source public-goods research
Last synced: 08 Oct 2025
https://github.com/uwdata/flechette
Fast, lightweight access to Apache Arrow data.
Last synced: 04 Apr 2025
https://github.com/wildflowai/platform
Model natural ecosystems 🌎🪸🐳
ai biodiversity conservation data ocean restoration
Last synced: 25 Nov 2025
https://github.com/f1lt3r/bitcoin-scraper
💲 bitcoin chart history scraper
bitcoin bitcoin-scraper chart data graph javascript json logarithmic moore scrapes-bitcoin-data
Last synced: 16 Mar 2025
https://github.com/ngxs-labs/data
NGXS Persistence API
data entity ngxs ngxs-persistence-api
Last synced: 24 Apr 2025
https://github.com/spine-tools/Spine-Toolbox
Spine Toolbox is an open source Python package to manage data, scenarios and workflows for modelling and simulation. You can have your local workflow, but work as a team through version control and SQL databases.
anaconda data energy miniconda python simulation-model spine-toolbox workflow
Last synced: 07 May 2025
https://github.com/visgl/deck.gl-data
Data for the data visualization library deck.gl examples (https://uber.github.io/deck.gl/#/)
data data-science data-visualization uber
Last synced: 12 Jun 2025
https://github.com/ashvin27/react-datatable
React-datatable is a component that lets you create a multifunctional table from a single component, like jQuery DataTables. It's fully customizable and easy to integrate into any React component. Bootstrap compatible.
data datatables datatables-plugin react react-data-table react-datagrid react-datatable react-table table
Last synced: 13 May 2025
https://github.com/smappnyu/youtube-data-api
A Python client for collecting and parsing public data from the YouTube Data API
api api-wrapper data python python-client research research-tool youtube youtube-api-v3 youtube-search
Last synced: 28 Oct 2025
https://github.com/turbot/steampipe-postgres-fdw
The Steampipe foreign data wrapper (FDW) is a zero-ETL product that provides Postgres foreign tables which translate queries into API calls to cloud services and APIs. It's bundled with Steampipe and also available as a set of standalone extensions for use in your own Postgres database.
aws azure data devsecops gcp golang hacktoberfest kubernetes postgres postgresql postgresql-fdw security sql steampipe steampipe-engine
Last synced: 07 May 2025
https://github.com/apple/dnikit
A Python toolkit for analyzing machine learning models and datasets.
ai bias compression data data-duplication fairness fairness-ml introspection machine-learning ml python
Last synced: 19 Oct 2025
https://github.com/queryverse/iterabletables.jl
Implementations of the TableTraits.jl interface for various packages
Last synced: 12 Apr 2025
https://github.com/purarue/hpi
Human Programming Interface - a way to unify, access and interact with all of my personal data [my modules]
data gdpr history lifelogging personal-api quantified-self
Last synced: 04 Jul 2025
https://github.com/opennem/opennem
Australian energy market data platform
aemo climate data energy nem nemweb openelectricity opennem superpower wem
Last synced: 12 Apr 2025
https://github.com/textileio/textile-facebook
[DEPRECATED] Simple parsing tool to get your data out of a Facebook export
data exporters photography privacy
Last synced: 05 Jan 2026
https://github.com/tirthajyoti/synthetic-data-gen
Various methods for generating synthetic data for data science and ML
classification data data-science machine-learning python regression symbolic-computation time-series
Last synced: 30 Apr 2025
https://github.com/mainakrepositor/datasets
A bunch of some 200 datasets. You can call it mini-kaggle :)
csv data data-science database datasets image-files mini-kaggle ml nlp-machine-learning tsv
Last synced: 01 Mar 2025
https://github.com/trailheadapps/coral-cloud
Sample application that showcases Data Cloud, Agents and Prompts.
agents ai cloud data prompt salesforce
Last synced: 05 Apr 2025
https://github.com/dylanhogg/awesome-crypto
A list of awesome crypto and blockchain projects
awesome awesome-list bitcoin blockchain crypto cryptocurrency data data-analysis ethereum github
Last synced: 30 Dec 2025
https://github.com/piquette/qtrn
A CLI tool to streamline financial markets data analysis 🔧
cli data data-science finance go golang options quotes scraper stock stock-analysis stock-market
Last synced: 15 May 2025
https://github.com/melroy89/metacritic_api
PHP Metacritic API - Mirror from my GitLab
api crawler data metacritic parser php scores scraper webscraping
Last synced: 13 May 2025
https://github.com/visivo-io/visivo
✨ Build dashboards with end-to-end version control. 🔋 CLI w/ batteries included, no infra required. Develop on your laptop for instant results, deploy changes safely (with automated checks), and keep every report trustworthy for stakeholders, analysts and agents 🤖
analytics bi bi-analytics bi-as-code business-intelligence data data-analysis data-visualization duckdb plotlyjs pydantic python reactjs sql
Last synced: 16 Oct 2025
https://github.com/cipherstash/jseql
Encrypt and protect data using industry standard algorithms, field level encryption, a unique data key per record, bulk encryption operations, and decryption level identity verification.
data data-security encryption javascript postgres postgresql security typescript
Last synced: 09 Apr 2025
https://github.com/countries/countries-data-json
ISO 3166 country information in JSON format to be included in other projects.
countries currency data iso-3166-1 iso-3166-2 iso-4217 json
Last synced: 18 Jan 2026
https://github.com/capitalone/dataCompareR
dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
compare-data data data-analysis data-science r
Last synced: 30 Jul 2025
https://github.com/kaustubhhiware/facebook-archive
Just some fun you can have with Facebook's archive data
data data-visualization facebook python
Last synced: 15 Apr 2025
https://github.com/rsheftel/raccoon
Python DataFrame with fast inserts and appends
Last synced: 03 Apr 2025
https://github.com/j535d165/datahugger
One downloader for many scientific data and code repositories! DOI 👐 Data
cli data datacite dataone dataverse dryad figshare github mendeley-data python rdm repository research research-data-management science scientific scientific-data utrecht-university zenodo
Last synced: 14 Jul 2025
https://github.com/geonetwork/geonetwork-ui
GeoNetwork UI is a suite of Applications made to provide a modern facade to your GeoNetwork 4 catalog. It also provides Web Components to embed various parts of your data catalog in third party websites.
angular data geonetwork gis ui webcomponents
Last synced: 10 Apr 2025
https://github.com/cxmeel/sift
Immutable data library for Luau.
conversion data dictionary immutability immutable luau roact roblox roblox-lua roblox-rojo robloxdev robloxlua rojo typescript utility wally
Last synced: 21 Jan 2026
https://github.com/nomihq/nomi
Nomi enables people to use computers more simply.
action ai automation data devtools llm multimodal privacy productivity realtime reinforcement-learning security taskrunner voice
Last synced: 06 Apr 2025