data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-02-05 00:08:03 UTC
https://github.com/netcorestack/datatransform
Sql2Sql or Sql2MongoDb Transform Tool
data data-transform database migration mongodb mssql nosql sql sql2mongo sql2mongodb sql2sql tooling transform
Last synced: 11 Apr 2025
https://github.com/evilpegasus/real-estate-price-prediction
Predicting NYC real estate sale prices using neural networks (1st place Berkeley SAAS Kaggle Competition Fall 2020)
Last synced: 28 Oct 2025
https://github.com/stas00/reddit-to-threads
Convert arctic_shift Reddit data dumps into thread-view documents
Last synced: 20 Mar 2025
https://github.com/purarue/hpi-template
A cookiecutter template for creating a HPI repository
data lifelogging quantified-self
Last synced: 18 Mar 2025
https://github.com/mps9506/rattains
:droplet: Access EPA ATTAINS data in R :droplet:
data epa r r-package rstats water-quality
Last synced: 12 Apr 2025
https://github.com/c3n7ral051nt4g3ncy/justtrakem
Sports Tracker Large Scale Profile Checker
chromedriver data intelligence opensourceforgood osint osint-python osint-tool python3 selenium sports tracking
Last synced: 08 May 2025
https://github.com/pysat/pysatmodels
Interface for model analysis and model-data comparisons within the pysat ecosystem
comparison data dineof model pysat python sami2 tie-gcm validation
Last synced: 22 Jul 2025
https://github.com/pnnl/constrain
ConStrain is a data-driven knowledge-integrated framework that automatically verifies that building system controls function as intended.
bms building commissioning data hvac simulation verification
Last synced: 12 Apr 2025
https://github.com/tefra/xsdata-samples
Naive XML Bindings for python Samples
bindings codegen data naive-xml-bindings python python3 schema xml xsd xsdata xsdata-samples
Last synced: 22 Aug 2025
https://github.com/deiu/solidproxy
Proxy server with authentication (using WebID-TLS delegation)
data delegation linked proxy-server solid webid webid-tls-delegation
Last synced: 11 Jan 2026
https://github.com/instafluff/coronavirus
COVID-19 Coronavirus Data Tracker
2019-ncov coronavirus covid-19 data ncov ncov-2019 sars-cov-2 wuhan
Last synced: 29 Oct 2025
https://github.com/cloudposse/terraform-aws-dms
Terraform modules for provisioning and managing AWS DMS resources
data dms dynamodb migration mysql oracle postgres postgresql s3 sql sql-server
Last synced: 09 Sep 2025
https://github.com/malloydata/malloy-cli
A command-line interface for executing Malloy and SQL
data data-modeling transformation
Last synced: 13 Jul 2025
https://github.com/upsonic/server
Self-Driven Autonomous Python Libraries
data data-science gpt-4o library-management ml mlops python
Last synced: 22 Aug 2025
https://github.com/openscm/scmdata
Handling of Simple Climate Model data
climate data data-visualization
Last synced: 27 Jul 2025
https://github.com/seanbreckenridge/HPI-template
A cookiecutter template for creating a HPI repository
data lifelogging quantified-self
Last synced: 29 Jul 2025
https://github.com/effect-deprecated/monocle
Optics for your data (port of monocle-ts)
Last synced: 23 Apr 2025
https://github.com/polyfrost/datastorage
Static data storage for Polyfrost projects.
data data-storage hypixel hypixel-api minecraft minecraft-mod
Last synced: 25 Feb 2025
https://github.com/gagolews/clustering-data-v1
A framework for benchmarking clustering algorithms – Benchmark suite, version 1
benchmark benchmark-datasets clustering data dataset datasets machine-learning
Last synced: 19 Apr 2025
https://github.com/cmpadden/dagster.nvim
Control the Dagster orchestrator directly from Neovim
Last synced: 15 Mar 2025
https://github.com/node/circos
Mirror of Circos. "Circos is a software package for visualizing data and information. It visualizes data in a circular layout — this makes Circos ideal for exploring relationships between objects or positions."
Last synced: 19 Apr 2025
https://github.com/jgraving/deepposekit-data
Example datasets for DeepPoseKit
data deepposekit pose-estimation posture
Last synced: 24 Sep 2025
https://github.com/0x8b/datamatrix
Library that enables programs to write Data Matrix barcodes of the modern ECC200 variety.
cli code data data-matrix datamatrix ecc ecc200 elixir generator matrix
Last synced: 26 Sep 2025
https://github.com/vatshayan/music-songs-genre-dataset
Classification & Implementation of Machine Learning Algorithms on Music Dataset
artificial-intelligence-algorithms classification data datascience dataset machine machine-learning machine-learning-algorithms
Last synced: 04 Mar 2025
https://github.com/hyparam/hightable
A dynamic windowed scrolling table component for react
component data dataset grid javascript react scrolling table virtualized windowed
Last synced: 14 May 2025
https://github.com/p0dalirius/windowsbuilds
This repository contains the list of windows builds as parsable JSON files.
Last synced: 03 Sep 2025
https://github.com/jcwieme/data-scripts-star-wars
Useful repo for data visualization (for example), containing the scripts of the Star Wars movies as well as refined versions with only the dialogue.
characters count csv-files data data-visualization dialogues durations film json movies speaker star star-wars wars
Last synced: 14 Apr 2025
https://github.com/colinianking/sluice
Sluice is a program that reads input on stdin and outputs on stdout at a specified data rate.
Last synced: 25 Aug 2025
https://github.com/michaelwitting/metabolomics2018
Scripts & Data for XCMS Workshop, Metabolomics 2018 in Seattle
analysis data metabolomics xcms
Last synced: 06 Mar 2025
https://github.com/ghodsizadeh/household-survey-data-iran
Household Survey Data Iran, but in sqlite3 (to work with the data easily and everywhere)
Last synced: 04 Oct 2025
https://github.com/smac-group/imudata
:battery: This package is meant to serve as a data collection tool for IMU data. This data can be used as a means to assess and test methods designed to analyse IMU error signals (i.e. long and complex autocorrelated signals). An example method used for this kind of data is implemented in the GMWM R package which can also model the latent models that often characterize this data.
accelerometer data gyroscope imu mems-imu-dataset r
Last synced: 22 Apr 2025
https://github.com/hyunjoonbok/R-projects
Portfolio in R
anomaly-detection data data-analysis data-science data-visualization database deep-learning h2o keras lightgbm marketbasketanalysis regression-models tensorflow time-series xgboost
Last synced: 30 Jul 2025
https://github.com/ikstream/dalec
Dalec is a project that aims to provide a privacy-preserving data collection method. It utilizes DNS for client/server separation while transmitting data encrypted.
collection data data-collection dns exfiltration shell
Last synced: 11 Aug 2025
https://github.com/HowTheyVote/data
Weekly updated data on roll-call-votes in the European Parliament, as collected by https://howtheyvote.eu/.
data ep eu european parliament plenary votes
Last synced: 12 Aug 2025
https://github.com/marcoradocchia/microxdg
An XDG Base Directory Specification Rust library that aims to be conservative on memory allocation and overall memory footprint.
appdir basedir cache config data directories directory executable file home path runtime rust rust-lang state xdg xdg-basedir xdg-compliance xdg-user-dirs
Last synced: 23 Mar 2025
https://github.com/anandchowdhary/everything
⏳ Everything Everywhere All at Once
Last synced: 30 Jul 2025
https://github.com/half0wl/datagovsg_api
Unofficial Python API wrapper for public APIs at developers.data.gov.sg
api-wrapper data government-data python singapore
Last synced: 13 Aug 2025
https://github.com/pravj/github-dynamics
Source code for "Information Dynamics on the GitHub Network"
data open-source visualization
Last synced: 30 Jul 2025
https://github.com/anna-geller/dataflow-ops-aws-eks
Project demonstrating how to automate Prefect 2.0 deployments to AWS EKS
automation aws cicd data data-engineering data-engineering-infrastructure data-engineering-pipeline data-products dataflow dataflow-ops eks eks-cluster eksctl karpenter kubernetes kubernetes-deployment kubernetes-setup python serverless
Last synced: 24 Mar 2025
https://github.com/simongravelle/publication-data
Data and scripts from recent publications
bash data gromacs interface lammps mlpi molecular-dynamics nmr open-data open-research outreach salt scientific-paper soft-matter water
Last synced: 31 Aug 2025
https://github.com/tyson-swetnam/emsi
ecosystem moisture stress index
data google-earth-engine javascript python rmarkdown
Last synced: 20 Sep 2025
https://github.com/recodehive/recode-website
recodehive helps you learn and master data skills, and encourages you to code on open source.
data data-science dataengineering opensource python sql tutorials website
Last synced: 27 Jul 2025
https://github.com/linwin-cloud/linwin-db-server
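In the vast ocean of modern big data, computers are deeply bound to information and data, and it is database software that carries these hundreds of millions of records. Linwin Data Server is a high-performance database developed in China and written in Java. It supports Chinese domestic operating systems and Linux, as well as multi-user operation. It uses a NoSQL structure and its own "mys" database operation language, which is simpler, more convenient, and more efficient. All create, read, update, and delete operations on user data happen in memory, while reads and writes to disk are handled by dedicated threads so that they do not get in the way.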
data data-science database hashmap http java javascript key-value linux programming-language python server typescript webserver website
Last synced: 17 Jun 2025
https://github.com/josechirif/2018-house-price-estimation---melbourne-australia
The project proposes to calculate the price of a Melbourne house according to its characteristics.
Last synced: 14 Apr 2025
https://github.com/mstgnz/data-structures
Data Structures With Go
data data-structures go golang
Last synced: 22 Sep 2025
https://github.com/carpentries-incubator/foundational-computer-skills
Foundational Computer Skills
configuration configuration-management data download english environment-configuration lesson pre-alpha project-management setup software
Last synced: 02 Sep 2025
https://github.com/dp6/raft-suite-hub
The Hub is the solution responsible for centralizing data consolidation in BigQuery, the tool chosen to serve as the data warehouse for raft-suite.
bigquery data data-quality google-cloud google-cloud-functions hacktoberfest
Last synced: 28 Jun 2025
https://github.com/marcusschiesser/vectorstores
Vectorstores is a framework for using vector databases in your AI applications
ai ai-agents data database embeddings vector vector-database
Last synced: 13 Jan 2026
https://github.com/aminkhani/dsa
Data Structures and Algorithms Tutorial
algorithm algorithms computer-science data data-structures data-structures-and-algorithms datastructures tutorial
Last synced: 02 Jan 2026
https://github.com/data-engineering-community/data-engineering-meetup-in-a-box
A collection of guides, resources, and support for DE meetup organizers.
data data-analysis data-engineering data-mining data-structures database meetups
Last synced: 02 Aug 2025
https://github.com/capturr/jsonld-extract
A damn simple tool to extract JSON-LD metadata from a webpage using a jQuery-like API (jQuery, Cheerio, CashDom ...).
cashdom cheerio crawler crawling data extract extractor javascript jquery json jsonld metadata nodejs parser scraper scraping spider typescript
Last synced: 24 Mar 2025
https://github.com/mquezada/uchile-cc5206
Introduction to Data Mining course [DCC UChile]
association-rules chile classification clustering course data eda jupyter mining python r science uchile
Last synced: 15 Mar 2025
https://github.com/jldbc/auctionhouse
See how much advertisers are paying for your attention https://chrome.google.com/webstore/detail/auctionhouse/hmjofiljabjmompfgllkpkbkfbpbpkcp
adtech chrome-extension data prebid
Last synced: 26 Oct 2025
https://github.com/eneko/data-repository
Data files used mainly for testing
data json testing-tools word-list words
Last synced: 29 Sep 2025
https://github.com/darothen/experiment
Organizing numerical model experiment output
Last synced: 03 Oct 2025
https://github.com/mlr-org/mlr3oml
Connect mlr3 with OpenML
data data-science datasets machine-learning mlr3 openml r r-package
Last synced: 09 Aug 2025
https://github.com/lvlyke/ngxs-synchronizers
Easily keep your app's local state synchronized with your backend, databases and more! ngxs-synchronizers simplifies synchronizing your NGXS-based application state with external data sources.
async asynchronous backend data external ngx ngxs remote rxjs state store synchronization synchronize synchronizers
Last synced: 22 Aug 2025
https://github.com/bluebrain/data-validation-framework
Simple framework to create data validation workflows.
data data-analysis python validation validation-tool
Last synced: 14 May 2025
https://github.com/mabel-dev/mabel
😊 mabel is a platform for authoring data processing systems.
data pipelines processing python
Last synced: 27 Aug 2025
https://github.com/pazzo83/noaadata.jl
Wrapper of the NOAA Climate Data API in Julia
Last synced: 06 Apr 2025
https://github.com/eurobios-mews-labs/acrocord
This package provides some useful tools to interact with a PostgreSQL server using pandas DataFrames
data data-science database pandas-dataframe postgresql psycopg2 python python3 sqlalchemy table-factory
Last synced: 15 Apr 2025
https://github.com/jakarto3d/jakarnotator
The Jakarnotator is an annotation tool to create your own database for instance segmentation problems.
annotations computer-vision data database deep-learning detectron instance-segmentation mscoco training-data
Last synced: 15 May 2025
https://github.com/jujuadams/extendingjson
Human-writeable JSON-like data formats for GameMaker
data gamemaker gamemaker-s gms2 json yaml
Last synced: 15 Apr 2025
https://github.com/lVlyke/ngxs-synchronizers
Easily keep your app's local state synchronized with your backend, databases and more! ngxs-synchronizers simplifies synchronizing your NGXS-based application state with external data sources.
async asynchronous backend data external ngx ngxs remote rxjs state store synchronization synchronize synchronizers
Last synced: 24 Apr 2025
https://github.com/chalk-ai/chalk-go
Go client for Chalk
data data-pipeline feature-engineering
Last synced: 03 Feb 2026
https://github.com/fearlesssolutions/engineering-practice-domains
A mono-repo for the Engineering Practice Domains of Development, Data, Infrastructure, Testing, and Platforms
data data-engineering data-science database-design devops drupal end-to-end-testing engineering infrastructure machine-learning salesforce security testing web-development
Last synced: 26 Oct 2025
https://github.com/anna-geller/prefect-getting-started
Get started with Prefect by scheduling your Prefect flows with GitHub Actions
analytics-engineering automation cicd data data-engineering data-engineering-infrastructure data-engineering-pipeline data-pipeline data-science dataflow dataflow-ops github-actions orchestration pipeline prefect python scheduling serverless
Last synced: 13 Jun 2025
https://github.com/cxmeel/dump-parser
Parses data from the Roblox API dump
api conversion data luau roblox roblox-api roblox-api-wrapper roblox-lua robloxdev robloxlua rojo utility wally
Last synced: 21 Jan 2026
https://github.com/forcedotcom/comdagen
COMmerce DAta GENerator will build a Commerce Cloud site import file tailored to your specification
Last synced: 19 Jun 2025
https://github.com/tushar2704/best-ever-streamlit-applications
101 Super Streamlit Applications: This interactive web application collection serves as a showcase of my data science and machine learning projects. With a passion for data-driven insights and a knack for creating engaging data applications, I am excited to present this portfolio as a demonstration of my skills and expertise.
data datascience machinelearning python streamlit streamlit-tushar2704 tushar2704
Last synced: 10 Oct 2025
https://github.com/simeononsecurity/track-helium-mobile-wifi
A collection of scripts and tools that tracks the availability of helium mobile wifi networks in the wild from the Wigle Dataset and Helium API. Updates every 24 hours.
automation bigdata carrieroffload data data-analysis dataset dataset-generation helium openroaming passpoint python
Last synced: 23 Apr 2025
https://github.com/greenelab/scihub-browser-data
Data for the Sci-Hub Stats Browser
data journals piracy sci-hub supplement webapp
Last synced: 09 Oct 2025
https://github.com/mftnakrsu/water-quality-eda-prediction
data datascience deep-learning machine-learning
Last synced: 12 Jul 2025
https://github.com/flother/thecounted
Copy of the data from The Counted, a Guardian project to count people killed by US law enforcement agencies in 2015 and 2016
data guardian police police-killings
Last synced: 09 Apr 2025
https://github.com/enspirit/monolens
Declarative data transformations as data
data data-engineering homoiconic json yaml
Last synced: 12 Apr 2025
https://github.com/prismadic/tractor-beam
high-efficiency text & file scraper with smart tracking, client/server networking for building language model datasets fast
botnet cluster data file-downloader llm llm-finetuning llm-training mass-downloader scraping
Last synced: 07 May 2025
https://github.com/mackysoft/unidata
[PREVIEW] Data Management for Unity. It is also useful for implementing Achievements, Quests, etc.
achievements c-sharp csharp data database input-ouput quest unity unity-editor
Last synced: 09 Apr 2025
https://github.com/naupio/pical
(Work in progress) pita is a general distributed computation system written in Erlang and based on the DAG model. This project is inspired by Douban's DPark and Apache Spark.
big-data bigdata dag data distributed distributed-computing distributed-systems erlang erlang-otp flink spark
Last synced: 19 Oct 2025
https://github.com/shrayasr/globalairports.net
The .Net client for http://www.partow.net/miscellaneous/airportdatabase/
airports cross-platform data dotnet dotnet-standard wrapper
Last synced: 14 Jan 2026
https://github.com/grzegorzme/data-toolz
simple python library for handling data-io tasks
data data-wrangling filesystem python python3 s3 tooling
Last synced: 22 Jan 2026
https://github.com/thoughtspot/visual-embed-sdk
Customizable analytics components for your app, powered by ThoughtSpot's AI
Last synced: 18 Jan 2026
https://github.com/the-pew-inc/the-pew
ThePew is an advanced system of records that enables enterprises to detect trends and patterns from questions to drive marketing and business decisions toward their goals.
data data-science docker javascript machine-learning postgresql rails ruby
Last synced: 06 Oct 2025
https://github.com/cjdoris/chevrons.jl
Your friendly >> chevron >> based syntax for piping data through multiple transformations.
data data-science data-transformation julia julia-lang julia-language macros piping repl
Last synced: 16 Oct 2025
https://github.com/0x9ef/go-wiper
Safely wiping your secure data in Golang
data gutmann-method safely secure utility wiping
Last synced: 30 Apr 2025
https://github.com/bhimrazy/litdata-with-minio
Use LitData with MinIO
data docker docker-compose litdata minio streaming
Last synced: 11 Sep 2025
https://github.com/zehracakir/verimadenciliginotlarim
My notes and my own studies in the Data Mining course in the computer engineering department of Süleyman Demirel University
classifying clustering data data-mining data-science linear-regression machine-learning pandas python
Last synced: 18 Jun 2025
https://github.com/Codeblin/ObjectPreference
Fast and easy Shared Preferences managing with object mapping annotations for simple or complex class structures
android code-generation data dataclasses easy-to-use localstorage mapping-annotations sharedpreferences sharedpreferences-easy sharedpreferences-helper sharedpreferences-manager
Last synced: 12 Apr 2025
https://github.com/geus-glaciology-and-climate/mass_balance
Greenland ice sheet mass balance from 1840 through next week
data grass-gis greenland org-mode publication python research science
Last synced: 12 Apr 2025
https://github.com/brightway-lca/bw_processing
Tools to create structured arrays in a common format
bw3 data life-cycle-assessment python
Last synced: 05 May 2025