data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-22 00:07:43 UTC
https://github.com/senrok/yadal
Yet Another Data Access Layer: access S3 and POSIX storage in the same way. Deeply inspired by Databend's OpenDAL
cloud-native data go minio s3 storage
Last synced: 12 Jan 2026
https://github.com/hodur-org/hodur-lacinia-schema
Hodur is a domain modeling approach and collection of libraries for Clojure. By using Hodur you can define your domain model as data, parse and validate it, and then either consume your model via an API or use one of the many plugins to help you achieve mechanical results faster and in a purely functional manner.
clojure data graphql lacinia modeling schema
Last synced: 12 Dec 2025
https://github.com/peterdavehello/docker-azcopy
🐳 Tiny Dockerized AzCopy (Azure Storage data transfer utility) inside Alpine Linux 🐧 (~10MB)
azcopy azure cli container copy data docker docker-image hacktoberfest storage sync
Last synced: 18 Mar 2025
https://github.com/Canner/vulcan-sql-examples
Curated VulcanSQL showcases
analytics api-builder bigquery data data-lake data-warehouse database duckdb examples postgresql reporting restful-api sql vulcan-sql vulcansql
Last synced: 11 Apr 2025
https://github.com/gsurma/twitter_data_parser
Python scripts that download metadata and tweets for given users.
data machine-learning parser python python2 twitter twitter-api
Last synced: 20 Jul 2025
https://github.com/rririanto/redash-query-cheatsheets-mongodb
Redash (redash.io) query cheatsheet for MongoDB user metrics
data data-analysis data-visualization mongodb query-cheatsheet-redash redash redashio visualize-data
Last synced: 11 Apr 2025
https://github.com/rezapace/komputasi-big-data
This repository contains materials and practical exercises for learning Python in the context of Big Data Computation. The focus is on analyzing and processing large datasets using various tools and techniques.
ai big data data-science git-reza gunadarma gundar komputasi-big-data
Last synced: 28 Sep 2025
https://github.com/atviriduomenys/spinta
Spinta is a framework to describe, extract and publish data (a DEP Framework).
Last synced: 26 Feb 2026
https://github.com/mikestefanello/batcher
Type-safe, automatic, asynchronous batch processing.
batch batch-processing concurrency data goroutines
Last synced: 15 Feb 2026
https://github.com/rhaldkhein/mithril-data
A rich data model library for the Mithril JavaScript framework
collection data database javascript lodash mithril model schema state stream
Last synced: 21 Feb 2026
https://github.com/hodur-org/hodur-spec-schema
Hodur is a domain modeling approach and collection of libraries for Clojure. By using Hodur you can define your domain model as data, parse and validate it, and then either consume your model via an API or use one of the many plugins to help you achieve mechanical results faster and in a purely functional manner.
clojure data modeling schema spec types validation
Last synced: 12 Dec 2025
https://github.com/ahmetfurkandemir/trendyol-smartphone-price-prediction
Trendyol Smartphone Price Prediction
aws aws-ec2 data datascience flask flask-api linear-regression machine-learning python scikit-learn trendyol
Last synced: 14 Oct 2025
https://github.com/dimitryzub/hotels-scraper-js
Scrape Airbnb, Booking, Hotels.com from a single JavaScript module. ❗No longer maintained.
airbnb booking data datascraping hotels hotels-api playwright puppeteer puppeteer-extra webscraping
Last synced: 07 Sep 2025
https://github.com/journeyapps-labs/sector
Sector is both a standalone & self-hostable data-browser for the JourneyApps platform and a set of core modules that are intended to be integrated in the currently proprietary OXIDE IDE.
browser data data-visualisation editor extensions modules ui
Last synced: 06 Mar 2026
https://github.com/fvaleye/metadata-guardian
Provides an easy way, with Python, to protect your data sources by searching their metadata. 🛡️
data dataengineering dataset datastructures metadata metadata-driven metadata-extraction metadata-information metadata-management metadata-parser pii-detection
Last synced: 14 Feb 2026
https://github.com/brightway-lca/brightway2-data
Tools for the management of inventory databases and impact assessment methods. Part of the Brightway LCA framework.
brightway data life-cycle-assessment python
Last synced: 19 Oct 2025
https://github.com/marioruiz/string_pattern
Generate strings supplying a simple pattern. Perfect to be used in test data factories. Validate if a text fulfills a specific pattern. Also you can use regular expressions (Regexp) to generate strings: `/[a-z0-9]{2,5}\w+/.gen`. Generate words in English or Spanish.
data error-detection factories generation pattern random regex-pattern regexp regular-expressions ruby ruby-gem string test
Last synced: 05 Mar 2026
https://github.com/etienneschalk/scientific-data-viewer
VSCode Extension to explore the metadata of scientific data files
cfgrib climate data data-science geospatial geotiff grib hdf5 netcdf rasterio scientific sentinel visualization xarray zarr
Last synced: 06 Mar 2026
https://github.com/marykdb/maryk
Maryk is a Kotlin Multiplatform library which helps you to store, query and send data in a structured way over multiple platforms. The data store stores any value with a version, so it is possible to request only the changed data or live listen for updates.
data database graph json kotlin kotlin-multiplatform rocksdb serialization versioned yaml
Last synced: 06 Mar 2026
https://github.com/nevillelyh/scio-koans
A collection of Scio exercises inspired by Ruby Koans and many others.
Last synced: 22 Aug 2025
https://github.com/nxr-deen/student-records
This repository contains a C program that manages student data in a binary file, allowing for input and retrieval of records.
binaryfiles c data management records students
Last synced: 06 Aug 2025
https://github.com/w2sv/koala
A poor man's version of a pandas DataFrame for Dart.
dart dartlang data dataframe datamanagement datamanipulation flutter pandas-dataframe
Last synced: 20 Apr 2026
https://github.com/mark-hoffmann/fastteradata
Tools for faster and optimized interaction with Teradata and large datasets.
data fastexport fastteradata python teradata
Last synced: 26 Jan 2026
https://github.com/vkcom/vkdata-sketchplugin
Sketch plugin for using data from your account at vk.com
data sketch sketch-plugin sketchapp vk vkontakte
Last synced: 27 Sep 2025
https://github.com/glamboyosa/mey
A React package that exports hooks for handling the request lifecycle.
data data-fetching fetch hooks react react-native
Last synced: 12 May 2025
https://github.com/magoo-magoo/keyrier-json
SQL queries on JSON & CSV
data desktop html json keyrier-json react sql web webapp
Last synced: 09 Mar 2026
https://github.com/uladz-zubrycki/Reseed
Initialize and clean an integration test database in a convenient, reliable, and fast way.
csharp data data-seeding database dotnet integration-testing mssql mssql-database ndbunit2 netcore seed tests
Last synced: 06 Aug 2025
https://github.com/juliahealth/icd_gems.jl
ICD_GEMs.jl is a Julia package that translates ICD-9 codes to ICD-10 and vice versa via the General Equivalence Mappings (GEMs) of the International Classification of Diseases (ICD).
cdc clinical-data clinical-research data death-certificates epidemiology health-data icd-10 icd-10-cm icd-9 icd-codes julia julia-language julia-package mortality-data public-health who
Last synced: 22 Apr 2025
https://github.com/danilofreire/prisonbrief
An R package that returns tidy data from the World Prison Brief website.
data prison rstats world-prison-brief
Last synced: 09 Apr 2025
https://github.com/randomfractals/duckdb-sql-tools
DuckDB SQL Tools add DuckDB support to VSCode, and provide database schema and SQL query interfaces for the popular SQLTools extension, SQL query editor, language server, and data processing tools.
data data-tools duckdb sql sql-tools sqltools sqltools-driver viewer vscode
Last synced: 22 Mar 2025
https://github.com/kettanaito/react-data-preview
Fancy interactive preview of your JavaScript data.
data data-preview javascript preview react react-data-preview
Last synced: 06 May 2025
https://github.com/hoangsonww/latticedb-nextgen-dbms
🗂️ A next-gen relational database with mergeable CRDT tables, time-travel queries, vector search, and differential privacy built-in. Written in C++17 with a SQL engine, WAL storage, and a modern web Studio.
cmake cplusplus data database databases db dbms dbms-project docker file relational-database relational-databases sql sql-parser
Last synced: 13 Sep 2025
https://github.com/itsjafer/tv-show-recommendations
Machine learning pipeline trained offline that, given a TV Show, recommends 10 similar TV Shows using cosine similarities based on a variety of features
data engine learning machine python recommendation science tv-shows
Last synced: 09 Jul 2025
https://github.com/lcsb-biocore/distributeddata.jl
Simple distributed data manipulation and processing routines in Julia
Last synced: 22 Apr 2025
https://github.com/retailmenotsandbox/dart
Self-service data workflow management
Last synced: 10 Apr 2025
https://github.com/tarantool/sdvg
Synthetic Data Values Generator
csv-generator data data-generation data-generator generation generator http-generator parquet-generator random-data random-data-generation synthetic-data synthetic-data-generation synthetic-dataset-generation test-data test-data-generator
Last synced: 12 Jan 2026
https://github.com/d-wasserman/shared-row
This is an open data specification for describing the right-of-way (ROW) for street centerline networks. It is intended to establish a common set of attributes (schema) to describe how space is allocated along a street's right-of-way from sidewalk edge to sidewalk edge.
data right-of-way row schema sharedstreets specification standard streets
Last synced: 05 May 2025
https://github.com/minhaskamal/alphabetrecognizer
Simple Optical Character Recognizer (english-ocr-image-to-text-recognition-sample-training-alphabet-photo-data-database-dataset)
alphabet-recognizer data database english image-processing java machine-learning ocr sample template-matching text-recognition training-data writing
Last synced: 11 Apr 2025
https://github.com/ECCC-MSC/msc-animet
MSC AniMet is a simple tool enabling users to interact with MSC Open Data weather data and create custom weather animations for any area in the world. The resulting animations can be downloaded and shared with a permalink.
animation canada data visualization weather wms
Last synced: 20 Jul 2025
https://github.com/chase-manning/eth-twitter-accounts
Data dump of 15,451 Twitter accounts and their Ethereum Address
Last synced: 12 Apr 2025
https://github.com/zq99/pgn2data
A library that converts a chess pgn file into a tabulated CSV data set.
chess chess-analysis csv data dataset fen library pgn
Last synced: 17 Jan 2026
https://github.com/uber-archive/vis-academy
A set of tutorials on how our frameworks make effective data visualization applications.
data data-visualization data-viz datavis deck deck-gl luma luma-gl react react-gl react-map react-map-gl tutorial uber
Last synced: 04 May 2025
https://github.com/definetlynotai/logicytics
A powerful tool designed to harvest and collect a wide range of Windows system data for forensics.
beginner-project data forensics hacking-tool hacking-toolkit help help-wanted logistics maintained ml os osint osint-python osint-tool python python3 scan solo-project windows
Last synced: 18 Jul 2025
https://github.com/uladz-zubrycki/reseed
Initialize and clean an integration test database in a convenient, reliable, and fast way.
csharp data data-seeding database dotnet integration-testing mssql mssql-database ndbunit2 netcore seed tests
Last synced: 30 Oct 2025
https://github.com/bbva/mercury-dataschema
Utility package that, given a pandas DataFrame, uses the DataSchema class to auto-infer feature types and automatically calculate different statistics depending on those types.
analytics data data-cleaning data-processing data-science feature-engineering
Last synced: 21 Jun 2025
https://github.com/glotzerlab/signac-dashboard
Rapidly visualize signac projects through a customizable dashboard interface.
analysis dashboard data flask signac visualization
Last synced: 28 Feb 2026
https://github.com/jrbourbeau/openbrewerydb-python
Python wrapper for the Open Brewery DB API
Last synced: 11 Aug 2025
https://github.com/nceas/metajam
Bringing data and metadata togetheR
data data-analysis metadata r repositories
Last synced: 14 Feb 2026
https://github.com/tushar2704/everyday_python
Welcome to Everyday Python Sheets – your go-to resource for everyday Python cheat sheets, pro tips, interview questions, Python one-liners, and Python data structures. Whether you're a beginner looking to learn Python or an experienced developer seeking quick reference materials, this Streamlit application has got you covered.
artificial-intelligence cheatsheet data data-analysis data-science data-structures data-visualization database protips python streamlit streamlit-tushar2704 tushar2704
Last synced: 09 May 2026
https://github.com/modern-fortran/weather-buoys
Processing weather buoy data in parallel
Last synced: 19 Jan 2026
https://github.com/gear5sh/gear5
A high-performance alternative to Airbyte, Singer, and Meltano
airbyte data data-collection data-engineering data-engineering-pipeline data-ingestion elt etl etl-framework g5 gear5 golang meltano singer singer-io singer-tap
Last synced: 14 Jan 2026
https://github.com/ngohungphuc/data-science-and-analytics
My Data Science and Analytics learning journey
Last synced: 07 Oct 2025
https://github.com/ytree-project/ytree
A Python package for analyzing tree data and especially merger trees.
analysis astronomy astrophysics data merger-trees python simulations trees
Last synced: 21 Oct 2025
https://github.com/vapor-community/bits
A bite sized library for dealing with bytes.
binary bit bits byte bytes comprehension data manipulation swift
Last synced: 17 Mar 2026
https://github.com/quarkiverse/quarkus-jpastreamer
Express Hibernate/JPA Queries as Java Streams
data database db hibernate java-stream jax-rs jpa jpa-streamer jpastreamer quarkus-extension queries query spring stream
Last synced: 14 Jul 2025
https://github.com/thombashi/dataproperty
A Python library for extracting properties from data.
data property python python-library
Last synced: 26 Apr 2025
https://github.com/seanbreckenridge/HPI_API
An automatic JSON API for HPI
data json-api personal-api quantified-self
Last synced: 03 Jul 2025
https://github.com/seabbs/gettbinr
An R package for accessing and summarising the World Health Organisation Tuberculosis data.
binder-ready data eda package r rstats shiny tb-data tb-incidence-rates tuberculosis who world-health-organization
Last synced: 06 Sep 2025
https://github.com/stoney95/pypely
From local functions to cloud deployed pipelines
data data-centric functional-programming mlops pipe pipeline readability testability
Last synced: 14 Jan 2026
https://github.com/parvvaresh/satellite_data
This repository provides Python code for converting satellite data into a format suitable for deep learning models. It supports various deep learning architectures, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory networks (LSTMs).
data data-preprocessing data-reporting numpy pandas python
Last synced: 22 Apr 2025
https://github.com/vojay-dev/sc2-data-pipeline
StarCraft 2 Data Pipeline with Airflow, DuckDB and Streamlit
airflow data data-engineering data-science duckdb starcraft2 streamlit
Last synced: 20 Sep 2025
https://github.com/umbracle/geth-data-layer
Go library to access the blockchain state of Go-ethereum
Last synced: 22 Apr 2025
https://github.com/jhildenbiddle/class-change
A micro-library for manipulating CSS class names, triggering change events using HTML data attributes, and creating declarative class-related event listeners
attributes change class classlist css data event event-listener html listener polyfill ponyfill
Last synced: 17 Aug 2025
https://github.com/marco-roy/DDO
A dbt package to perform DataOps & administrative CI/CD on your data warehouse.
data dataops datawarehouse datawarehouseautomation dbt snowflake
Last synced: 05 May 2025
https://github.com/vaheqelyan/react-keyview
React components that display lists, tables, and grids without scrolling; use the keyboard keys to navigate through the data.
components data grid keyboard keys list navigation react table tabular-data
Last synced: 22 Apr 2025
https://jhildenbiddle.github.io/class-change/
A micro-library for manipulating CSS class names, triggering change events using HTML data attributes, and creating declarative class-related event listeners
attributes change class classlist css data event event-listener html listener polyfill ponyfill
Last synced: 11 May 2025
https://github.com/stdlib-js/datasets-cmudict
The Carnegie Mellon Pronouncing Dictionary (CMUdict).
data dataset datasets dictionary en english javascript language nlp node node-js nodejs pronounciation speech spelling stdlib words
Last synced: 24 Aug 2025
https://github.com/chifisource/oddframes.jl
The unique data management platform for Julia
data data-science julia machine-learning
Last synced: 13 Aug 2025
https://github.com/purarue/HPI_API
An automatic JSON API for HPI
data json-api personal-api quantified-self
Last synced: 15 Mar 2025
https://github.com/ishantk/amazon29nov2019
Amazon Chennai Training on Java and OOPS
algorithms amazon data java structures
Last synced: 01 May 2025
https://github.com/uwdata/divi
Automatically interact with SVG charts.
data interaction svg visualization
Last synced: 23 Aug 2025
https://github.com/kenanatmaca/kemptyview
Configurable TableView & CollectionView Empty Page
collectionview data empty-view ios swift tableview xcode
Last synced: 01 May 2025
https://github.com/camposvinicius/aws-etl
This is an ETL application on AWS that uses general open sales and customer data, available at https://github.com/camposvinicius/data/blob/main/AdventureWorks.zip. The zipped file contains several CSV files, to which we apply transformations.
airflow argocd athena aws catalog data data-engineer database emr emr-cluster etl glue kubernetes pipeline postgres pyspark rds spark
Last synced: 30 Jul 2025
https://github.com/abrarnitk/algorithmica
Implementation of DS and Algorithms in Rust
algorithms data mathematics search searching sorting
Last synced: 28 Jul 2025
https://github.com/srslyyyy/custom-data
Custom data system based on Lua tables for MTA:SA.
data lua multitheftauto system tables
Last synced: 13 Mar 2025
https://github.com/williamragstad/lavastore.js
LavaStore is a flexible and scalable local database for the web
data database firestore lavastore local local-database localstorage localstorage-wrapper localstore persist persistent-storage save storage store web
Last synced: 21 Mar 2025
https://github.com/darribas/satellite_led_liverpool
Data and code for the paper "Remote Sensing-Based Measurement of Living Environment Deprivation - Improving Classical Approaches with Machine Learning", by Dani Arribas-Bel, Jorge Patiño and Juanca Duque
data machine-learning paper remote-sensing reproducibility socio-economic-indicators
Last synced: 10 Apr 2025
https://github.com/harrystaley/open-source-data-science-degree-python
A fully curated, open-source Data Science curriculum focused on Python. Includes top-tier university courses (MIT, Stanford, Princeton) covering essential topics in computer science, data analysis, machine learning, and statistics — everything you need to build a solid foundation in Data Science, 100% free.
data data-science dataanalysis datasci ds open open-source py python python3 science source statistics
Last synced: 13 Apr 2025
https://github.com/datisthq/dpkit
dpkit is a fast data management framework built on top of the Data Package standard and Polars DataFrames
ckan csv data database dataframe datapackage excel fair json ods polars quality tableschema typescript validation zenodo
Last synced: 08 Mar 2026
https://github.com/meroxa/turbine-go
Turbine Library for Go
data go golang stream-processing streaming-data
Last synced: 17 Jan 2026