data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-23 00:07:41 UTC
https://github.com/yisaienkov/tinysets
The project aims to collect various datasets for tasks such as classification, clustering, object detection... The purpose of these datasets is to quickly check the performance of models and algorithms.
algorithms classification data data-science dataset datasets kaggle kaggle-dataset lego lego-minifigures lego-sets object-detection pypi python regression text-classification tinysets
Last synced: 14 Apr 2025
https://github.com/strmprivacy/cli
This is the STRM Privacy Command Line Interface, to define and manage your privacy streams, data schemas, event contracts and much more.
cli data data-pipeline data-privacy data-privacy-compliance data-processing privacy
Last synced: 23 Jun 2025
https://github.com/prakaa/mms-monthly-cli
Source code and CLI tool to query and download data from the Australian Energy Market Operator's Monthly Data Archive
aemo australia data energy national-electricity-market nem nemweb python
Last synced: 30 Oct 2025
https://github.com/khaouitiabdelhakim/etl-real-example
This repository contains a real example of an Extract, Transform, Load (ETL) process using SQL Server Management Studio (SSMS), SQL Server Integration Services (SSIS), and AdventureWorks2012 data. The objective is to load data into our LightAdventureDW data warehouse.
data database database-management sql sql-server ssis ssms warehouse
Last synced: 18 Mar 2025
https://github.com/bluegrams/periodic-table-data
Data of all chemical elements in the periodic table
chemistry csharp data dotnet elements periodic-table
Last synced: 18 Mar 2025
https://github.com/zh-plus/faker-openai
Generate fake data with OpenAI's GPT-3 API.
data fake fake-data fake-data-generator faker openai-api python test-data test-data-generator testing
Last synced: 18 Mar 2025
https://github.com/wmarquardt/cassandra-csv
A simple way to export Cassandra query results to CSV format
Last synced: 18 Mar 2025
https://github.com/philipperemy/japanese-street-addresses-scraper
Scraper for Japanese street addresses (住所).
data dataset-creation dataset-generation scraper scraper-engine
Last synced: 08 May 2025
https://github.com/plateformeio/plateforme
The Python framework for Data Applications
api app asgi async data db fastapi plateforme pydantic python restx services sqlalchemy
Last synced: 01 May 2025
https://github.com/heysupratim/android-app-categories
A JSON file with 19K Android package name entries and their Play Store categories. Useful for people building app-category-based things, e.g. Smart Launcher.
android crawled-data data json
Last synced: 26 Mar 2025
https://github.com/biglocalnews/bln-python-client
Python client for the biglocalnews.org API
api api-client data data-journalism graphql graphql-client journalism news python
Last synced: 03 Apr 2026
https://github.com/cmpadden/dagster-pipes-rust
Dagster pipes implementation in Rust
dagster data integrations orchestration rust
Last synced: 11 Oct 2025
https://github.com/abcnews/data-australian-political-donations
A data package about political donations in Australia.
Last synced: 27 Jan 2026
https://github.com/fdmorison/tiozin
Tiozin, your friendly ETL framework
data declarative etl framework pipeline
Last synced: 26 Apr 2026
https://github.com/buabaj/byte
data transmission over sound modulator-demodulator model
Last synced: 08 Oct 2025
https://github.com/jrdnbradford/readmdtable
R 📦 for reading markdown tables into tibbles
data data-analysis data-analytics data-extraction data-mining data-science markdown markdown-parser markdown-table r r-package r-programming
Last synced: 23 Oct 2025
https://github.com/datasets/genome-sequencing-costs
Costs associated with DNA sequencing since 2001
Last synced: 19 Oct 2025
https://github.com/synzen/Discord.Stats
Data visualization for Discord server activities
charts data discord statistics tracking visualization
Last synced: 12 Oct 2025
https://github.com/earthinversion/geospatial-data-visualization-using-pygmt
Example script to visualize topographic data, earthquake data, and tomographic data on a map
data geophysics pygmt python3 seismology visualization
Last synced: 10 Apr 2025
https://github.com/simonsfoundation/spectrum-drug-tracker
Python files and datasets underlying the Spectrum Drug Tracker.
autism data data-visualization python python3
Last synced: 27 Feb 2026
https://github.com/koddachad/dq_tester
A lightweight simple data quality testing tool.
data database dataengineering dataquality dataqualitycheck
Last synced: 08 Oct 2025
https://github.com/szczyglis-dev/ultimate-chain-parser
[PHP] Advanced, extendable, and configurable text data parsing and processing toolkit working in a chain-based flow. The concept of the application is based on processing in subsequent iterations using configurable data processing modules in a configured manner. Each element in the execution chain accesses the output of the previous element.
composer-library csv csv-parser data json-parser parsing plugin-architecture processing rearrange-array recordset regex regex-match regex-pattern repack repair-processes reparse text text-generation text-processing yaml-parser
Last synced: 08 Oct 2025
https://github.com/m-dadej/downloading-and-aggregating-stocks
Scripts for downloading WSE/GPW stock prices. Allows downloading historical prices for every stock into a single dataset.
data finance gpw historical-data stock
Last synced: 29 Apr 2025
https://github.com/mckraqs/dataride
Lightning-fast data platform setup toolkit for small projects and PoCs
data data-engineering python terraform
Last synced: 24 Oct 2025
https://github.com/paulgrammer/ug-locale
Uganda districts, sub-counties, counties, parishes and villages
data districts nodejs npm-package uganda
Last synced: 02 Mar 2026
https://github.com/filippobovo/betfair_data
Simple script to collect market data from Betfair.
betfair betfair-api collection data python
Last synced: 27 Feb 2026
https://github.com/dbt-labs/jaffle-shop-mesh-finance
A ✨ meshified ✨ open source sandbox project exploring dbt workflows via a fictional sandwich shop's data. This is a domain-focused node in the mesh focused on finance models, built on the jaffle-shop-mesh-platform project.
analytics analytics-engineering data data-engineering dbt dbt-cloud
Last synced: 05 Mar 2026
https://github.com/anthonykrivonos/nba-ml
🏀 Hardcoded ML classifiers from scratch to create predictive models on the outcomes of NBA games!
basketball classifiers data fromscratch hardcoded machine-learning ml nba python science sports
Last synced: 08 Oct 2025
https://github.com/andrewrporter/goiex
A Go interface for accessing IEX financial information
data fetch finance golang iex iex-api iextrading
Last synced: 28 Apr 2025
https://github.com/shreshthvashisht/imdb-movie-analysis
Advanced MS Excel
data data-analysis-excel pivot-tables visualisation
Last synced: 01 Mar 2026
https://github.com/frederickgeek8/lyql
📈 Free realtime stock data. Streamed straight from Yahoo.
data data-mining finance realtime stocks stream stream-api yahoo
Last synced: 05 Mar 2026
https://github.com/insightsoftwareconsortium/rirewebsite
Website sources for The Retrospective Image Registration Evaluation Project (RIRE)
data grand-challenge imaging open-access open-science registeration
Last synced: 12 Oct 2025
https://github.com/dotflow-io/dotflow
🎲 Business Logic Code in a flow!
data data-structures database dataflow dataflow-programming etl etl-framework etl-pipeline flow python python3 workflow workflow-engine
Last synced: 11 Apr 2026
https://github.com/matheusfelipeog/filometro
Get data on COVID-19 vaccination sites in São Paulo
coronavirus covid-19 data de-olho-na-fila filometro python sao-paulo vacina vacinasampa wrapper
Last synced: 07 Oct 2025
https://github.com/route1io/route1io-python-connectors
Connectors for interacting with popular APIs and services used in marketing analytics via clean and concise Python code.
analytics api api-connector data data-engineering marketing marketing-analytics python python3
Last synced: 13 Apr 2026
https://github.com/zq99/optionsview
This library downloads option chain data for a given symbol from Yahoo Finance in a trader-friendly format.
data options options-trading trading yahoo-finance
Last synced: 14 Jan 2026
https://github.com/giscience/osm-transform
Filter, enrich and prepare your OSM data for openrouteservice 🚙
cleanup data elevation enrichment filter graphs openrouteservice openstreetmap pbf routing
Last synced: 01 Apr 2026
https://github.com/gematik/spec-isip
FHIR resources for information technology systems in nursing care (ISiP – Informationstechnische Systeme in der pflegerischen Versorgung) are determined through the affirmative action process of the same name. Through ISiP, open and standardized interfaces are defined for the interoperable exchange of health data in care.
Last synced: 03 Mar 2026
https://github.com/iondv/report
IONDV. Framework: the Report module builds analytical reports.
analytics businessintelligence css data data-analysis data-visualization iondv iondv-module reporting
Last synced: 12 Mar 2026
https://github.com/kevinrecuerda/recshark
🦈 Provides some C# tools
actions data di dotnet expression-evaluator netcore testing tools
Last synced: 10 May 2026
https://github.com/siongui/7rsk9vjkm4p8z5xrdtqc
Pāli chanting resources and dhammatalk books
Last synced: 19 Jan 2026
https://github.com/cafali/pathscan
PathScan exports information about the contents of directories and hard drives. With a single click, you can create a complete list of all files and paths within a specific folder or across an entire hard drive.
backup command-line data data-analysis data-migration data-mining data-recovery directory folder-management folders forensics hard-drive keyword-extraction logging pathfinding recovery string-search tools utility windows
Last synced: 10 Oct 2025
https://github.com/caerbannogwhite/aargh
A library that helps you out of data nightmares in Go. 🧙‍♂️
csv data data-science data-wrangling dataframe go golang html json linq statistics stats xlsx xpt
Last synced: 14 Jan 2026
https://github.com/dathere/qsvpro.dathere.com
🌐 Promo website for qsv pro, a spreadsheet data wrangling desktop app. Includes download links for Windows, macOS, & Linux. Website built with Astro as a static site.
astro ckan csv data data-wrangling framer-motion javascript product qsv react saas tailwindcss website
Last synced: 28 Feb 2026
https://github.com/colour-science/colour-demosaicing-examples-datasets
Colour - Demosaicing - Examples Datasets
color color-science color-space color-spaces colorspace colorspaces colour colour-science colour-space colour-spaces colourspace colourspaces data dataset datasets de-mosaicing debayering demosaicing demosaicking raw
Last synced: 27 Feb 2026
https://github.com/reiniervlinschoten/castoredc_api
Python Wrapper for Castor EDC API
castor-edc castor-edc-api clinical-research clinical-trials data data-science python3 wrapper-api
Last synced: 15 Oct 2025
https://github.com/tiramizoo/simple_data_migrations
Data migrations
data hacktoberfest migration rails ruby
Last synced: 12 Oct 2025
https://github.com/adityashrm21/exploratory_data_analysis
A collection of exploratory data analysis techniques and resources
data data-analysis data-exploration data-science data-visualization dataset datasets eda exploratory-data-analysis insights kaggle
Last synced: 29 Apr 2025
https://github.com/koffisani/coding-data-togo
Data on the programming languages and development tools used or in demand in Togo
data python python3 scrapy scrapy-crawler
Last synced: 26 Mar 2025
https://github.com/freeipcc/freedatascrm
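Business registration data, phone-based customer acquisition, intelligent customer relationship management, data-driven marketing, automated sales leads, B2B marketing, customer insight analysis, precision marketing!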
ai bigdata bigdataanalytics data scrm
Last synced: 08 Feb 2026
https://github.com/jetsly/ddrx
A lightweight front-end framework based on rxjs. (Inspired by camel)
Last synced: 13 Oct 2025
https://github.com/gematik/spec-templateforsimplifierprojects
Template for creating gematik FHIR profiles
data fhir fsh miscellaneous template
Last synced: 25 Feb 2026
https://github.com/cmstatr/cmstatr
An R Package for Statistical Analysis of Composite Material Data
composite-material-data cran data materials-science r statistical-analysis statistics
Last synced: 22 Oct 2025
https://github.com/claudiucreanga/data-science
Data Science notebooks
competitions data kaggle science
Last synced: 14 Oct 2025
https://github.com/adhar-io/adhar
ADHAR - The Open Cloud-Native Foundation
adhar adhar-patform ai analytics architecture cloud-native data developerexperience devops enterprise gitops governance helm idp k8s kubernetes microservices rapid-development security
Last synced: 25 Feb 2026
https://github.com/spatialcurrent/go-math
Math functions that support varied types
Last synced: 29 Jan 2026
https://github.com/zeybek/node-matlab
NodeJS Package for MATLAB
algebra analytics data matlab matrix signal-processing
Last synced: 13 Mar 2026
https://github.com/franloza/running-races-insights
Web application created with Evidence and DuckDB to share stats about the running races in Cuenca.
data dataengineering duckdb elt evidence markdown netlify running sql visualization
Last synced: 23 Jun 2026
https://github.com/cgivre/drill-geoip-functions
GeoIP Functions for Apache Drill
apache-drill city country data data-analysis data-science drill geoip-functions ip-address ipv4
Last synced: 12 Apr 2025
https://github.com/apache/incubator-devlake-playground
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
dashboard-friendly data data-analysis data-engineering data-integration data-transfers devops domain-layer dora etl hacktoberfest integration jira open-source python user-friendly
Last synced: 19 Oct 2025
https://github.com/orfium/s3-parquetifier
This is a tool that takes a file from an S3 bucket and transforms it to Parquet format
Last synced: 12 Apr 2025
https://github.com/navchandar/file-convertor-utils
Set of custom Python Utilities to convert one file format into another. Filetypes supported: Excel, Images, PDF, GIF, MP4, XML, etc.
conversions convertor-utils data dataconversion excel file-conversion fileconversion fileformats image pdf python video xml
Last synced: 21 Sep 2025
https://github.com/unicef/magasin
Cloud native open-source end-to-end data / AI / ML platform
cloud dagster data data-pipelines data-science data-visualization helm-charts kubernetes magasin
Last synced: 21 Apr 2025
https://github.com/longnguyen010203/ecommerce-elt-pipeline
🌄📈📉 A Data Engineering Project 🌈 that implements an ELT data pipeline using Dagster, Docker, dbt, Polars, Snowflake, and PostgreSQL. Data from the Kaggle website 🔥
dagster data data-engineering dbt docker docker-compose dockerfile elt elt-pipeline extract kaggle load polars postgresql raw-data relational-databases snowflake transform
Last synced: 27 Feb 2026
https://github.com/bkuhlmann/lode
A monadic store of marshaled objects.
data objects persistence pstore storage transactions value
Last synced: 29 Jul 2025
https://github.com/r-js/mangos
🥭's is a monorepo collecting data wrangling and data validation utilities
counterculture data data-wrangling fold functional isomorphism javascript json lens optics schema traversal validation
Last synced: 22 Feb 2026
https://github.com/dsietz/pbd
Privacy by Design SDK
actix-web best-practices data data-privacy development-kit nfjs pbd pbd-sdk privacy privacy-by-design rust rust-lang sdk sdk-rust strategies
Last synced: 09 Apr 2025
https://github.com/jesusgraterol/binance-futures-dataset-builder
The dataset builder script extracts the most relevant market data straight from Binance's API and builds a series of datasets that can be used in data science and machine learning projects.
bitcoin blockchain blockchain-technology data datascience datascience-machinelearning dataset dataset-generation futures futures-long-short futures-market machine-learning
Last synced: 06 Mar 2026
https://github.com/sabyasachi-seal/summer-olympics-data-analysis
Analyzing the 2020 Summer Olympics dataset
analysis colab-notebook data data-science dataset jypyternotebook olympics
Last synced: 08 May 2025
https://github.com/amol-/datapyground
Easy-to-study Data Platform for fun and profit
compute-engine data data-engineering database python
Last synced: 28 Jul 2025
https://github.com/stefen-taime/kafka-pipeline
In the following post, we will learn how to build a data pipeline using a combination of open-source software (OSS), including Debezium, Apache Kafka, and Kafka Connect.
bash data docker elasticsearch etl-pipeline k kafka kafka-connect kafka-streams kafka-topic kibana ksqldb masking mongodb mysql pii pipeline postgresql
Last synced: 15 Apr 2025
https://github.com/nasa-pds/naif-pds4-bundler
Package to generate PDS4 SPICE Kernels Archives
archive data geometry geometry-processing navigation planetary-data planetary-science python spice
Last synced: 06 Jan 2026
https://github.com/effect-deprecated/morphic
Domain Modelling and Structural Derivation (port of morphic-ts)
data domain functional typeclasses
Last synced: 29 Jun 2025
https://github.com/spsanderson/healthyr.data
Data sets for the healthyR package.
data data-science data-sets healthcare healthcare-analysis healthcare-application healthcare-datasets r rstats
Last synced: 07 Apr 2025
https://github.com/spsanderson/steveondata
Repository mainly for R tips and tricks for my blog. I also include some VBA, SQL, C, and Linux usage.
ai blog c data data-science linux machinelearning-r ml ms-sql r sql time-series tipoftheday vba vba-excel
Last synced: 07 Apr 2025
https://github.com/mooxphp/data
[READ-ONLY] Static Language Data for Filament
countries currencies data filament languages laravel static timezones
Last synced: 20 Feb 2026
https://github.com/juliaearth/geoartifacts.jl
Artifacts (e.g., datasets) for Geospatial Data Science
Last synced: 10 Apr 2026
https://github.com/dprokop/querier
Simple declarative data layer for React apps
data declarative react typescript
Last synced: 23 Mar 2025
https://github.com/arindal1/striversdsasheet
Solutions of all the problems in Striver's A2Z DSA Sheet
cpp data datastructures datastructures-algorithms striver strivers-sde-sheet
Last synced: 04 Apr 2025
https://github.com/hitsz-ids/dbmasker
DBMasker is an open-source Java project for mainstream database systems that aims to provide a unified and secure access interface.
data data-security database mask sdk security
Last synced: 26 Apr 2025
https://github.com/samashi47/ml-toolkit-project
A general-purpose toolkit for data preprocessing, machine learning modeling, and visualization.
classification data data-preprocessing machine-learning python3 visualization
Last synced: 30 Jul 2025
https://github.com/bjascob/pythondataserve
A module for serving up Python data in a stand-alone process.
Last synced: 23 Apr 2025
https://github.com/flintsh/outlier-tools
A collection of free open-source tools to help you better understand your Outlier account, entirely handled in-browser.
Last synced: 27 Feb 2025
https://github.com/sjefvanleeuwen/rqlite-dotnet
A lightweight database HTTP API client for rqlite. rqlite is a lightweight, distributed relational database, which uses Raft for consensus and SQLite as its storage engine.
cluster data database distributed distributed-computing distributed-database distributed-systems dotnet raft rqlite
Last synced: 12 May 2025
https://github.com/mozahran/data-mapper
A data mapping tool that helps you map JSON with configuration files (JSON structure transformation). It also supports if conditions, casting, and mutators (custom or built-in functions).
data json mapper mappings mutator transformer
Last synced: 13 Jan 2026
https://github.com/pottekkat/bulldozer-prize-predictions
Predict the auction sale price for a piece of heavy equipment to create a "blue book" for bulldozers.
bluebook bulldozer data data-science jupyter-notebook kaggle-competition machine-learning
Last synced: 20 Jun 2026
https://github.com/Scetrov/FrontierSharp
C# / .NET API Clients for EVE Frontier: an API client for the static data exposed by CCP's HTTP API, plus an HTTP client tuned to the specific API design patterns implemented by CCP.
api data eve-frontier static-data
Last synced: 30 May 2026