data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-01-24 00:07:28 UTC
https://github.com/tomhanika/conexp-clj
A General-Purpose Tool for Formal Concept Analysis
clojure closure-systems conceptual-knowledge data data-analysis data-science formal-concept-analysis lattice order order-theory
Last synced: 30 Dec 2025
https://github.com/expressapp/construct
Library for dealing with data structures
data elixir elixir-construct elixir-lang types validation
Last synced: 11 Dec 2025
https://github.com/moscarde/junior_zone
Junior job openings updated daily. Telegram and online spreadsheet
Last synced: 14 Apr 2025
https://github.com/bpbond/srdb
Global soil respiration database
carbon-cycle data global-database science soil soil-respiration
Last synced: 07 Apr 2025
https://github.com/kaiso/relmongo
Java relationship-enabled domain model persistence framework for MongoDB
annotations association bi-directional data database dbref document framework java lazy mapping mongodb onetomany onetoone reference relation relational relationship spring spring-data-mongodb
Last synced: 14 Jan 2026
https://github.com/argmaxml/conjugate_prior
Implementation of the conjugate prior table for Bayesian Statistics
bayesian-statistics conjugation data data-science likelihood probabilistic-programming probability statistical-models statistics
Last synced: 11 Oct 2025
https://github.com/iesahin/xvc
A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)
command-line-tool data data-engineering data-pipelines data-science devops machine-learning machine-learning-engineering mlops rust
Last synced: 28 Jun 2025
https://github.com/ahmed-mohamed-sn/olliePy
OlliePy is a Python package that helps data scientists explore their data and evaluate and analyse their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information and OlliePy will generate the rest.
ai analytics charts dashboard data data-analytics data-science data-scientist eda error-analysis exploratory-data-analysis machine-learning python visualization
Last synced: 08 May 2025
https://github.com/distributedsystemsgroup/zoe
Zoe: Container Analytics as a Service -- mirror of https://gitlab.eurecom.fr/zoe/main/
analytics containers data jupyter python spark
Last synced: 23 Oct 2025
https://github.com/tobilg/aws-iam-data
This repository contains the full dataset of AWS IAM data (services, actions, resource types and conditions keys). It's updated on a daily basis at 4AM UTC.
Last synced: 07 Apr 2025
https://github.com/olist/work-at-olist-data
Apply for a job at Olist's Data Team: https://olist.gupy.io/
analytics data dataengineering datascience dataset julia machinelearning pandas python r sql
Last synced: 25 Jun 2025
https://github.com/digitalghost-dev/poke-cli
A hybrid CLI/TUI tool written in Go for viewing Pokémon data from the terminal!
charm charmbracelet cli data go pokemon terminal terminal-based tui
Last synced: 23 Jan 2026
https://github.com/Mithileysh/Email-Datasets
Email Datasets can be found here
data datasets email emaildata enron enron-dataset enron-email-dataset enron-emails
Last synced: 28 Apr 2025
https://github.com/dig-eds-cat/digeds_cat
This research seeks to examine best practice in the field of digital editions by collating relevant evidence in a detailed catalogue of extant digital projects.
catalogue data digital-edition digital-humanities library-database open-access open-data open-source
Last synced: 08 Apr 2025
https://github.com/jkanev/treetime
TreeTime is a data organisation, management and analysis tool. A tree is a hierarchical structure that arranges information in units and sub-units. TreeTime uses linked trees (one data item can be part of different distinct trees) to store and organise any general purpose data.
ancestry data data-organisation data-organizer hierarchical-data information-management information-management-system information-manager linked-trees ontology project-management time-management to-do-list tree tree-editor tree-structure
Last synced: 14 Dec 2025
https://github.com/jamesleesaunders/d3-ez
D3-EZ Easy Reusable Charts
chart d3 data dataviz graph svg visualization
Last synced: 16 Mar 2025
https://github.com/paulknysh/sym
A Mathematica package for generating symbolic models from data
data generation mathematica model symbolic
Last synced: 08 Jul 2025
https://github.com/streamingfast/streamingfast
The dfuse Blockchain Data Platform
blockchain data eosio ethereum platform
Last synced: 30 Apr 2025
https://github.com/intake/akimbo
For when your data won't fit in your dataframe
awkward-array cudf data dataframe pandas polars python
Last synced: 26 Aug 2025
https://github.com/3c7/common-osint-model
Converting data from services like Censys and Shodan to a common data model
analysis censys data infrastructure model osint shodan
Last synced: 30 Dec 2025
https://github.com/chrisvwn/Rnightlights
R package to extract data from satellite nightlights.
data dmsp-ols extraction nightlights noaa package r satellite snpp-viirs
Last synced: 13 Jul 2025
https://github.com/raphaelmansuy/digital_palace
My Digital Palace - A Personal Journal for Reflection - A place to store all my thoughts
Last synced: 16 Mar 2025
https://github.com/ropensci/rdataretriever
R interface to the Data Retriever
data data-science database datasets r r-package rstats science
Last synced: 22 Oct 2025
https://github.com/jmboehm/douglass.jl
Stata-like toolkit for data wrangling on Julia DataFrames
data data-frames economics julia stata tabular-data
Last synced: 31 Oct 2025
https://github.com/atolcd/pentaho-gis-plugins
🗺 GIS plugins for Pentaho Data Integration
data dxf etl geojson gpx java mif-mid pentaho-data-integration shp spatialite svg
Last synced: 23 Jan 2026
https://github.com/utrechtuniversity/yoda
A system for reliable, long-term storage and archiving of large amounts of research data during all stages of a study.
ansible automated-deployment data irods research utrecht-university yoda
Last synced: 07 Apr 2025
https://github.com/s-pro/node-elizabeth
data dummy fake fake-data javascript mock mocking nodejs
Last synced: 10 Jun 2025
https://github.com/cdnjs/cf-stats
📈 Monthly usage statistics from Cloudflare for the cdnjs.cloudflare.com domain - The #1 free and open source CDN built to make life easier for developers.
cdnjs cloudflare data data-analysis statistics stats usage usage-data usage-reports
Last synced: 06 Jul 2025
https://github.com/quickbirdeng/datakit
A Swift library to easily read and write binary formatted data using a modern, declarative interface.
binary-data ble bluetooth bluetooth-le bluetooth-low-energy data declarative declarative-programming decoding dsl encoding network resultbuilder swift swift5
Last synced: 23 Jun 2025
https://github.com/itsvinayak/weather-app
Weather app using different Python-based frameworks
backend beginner beginner-friendly beginner-project bootstrap css data django flask html py python python-framework tkinter twinkle weather web
Last synced: 09 Apr 2025
https://github.com/bluewave-labs/maskwise
Maskwise detects, redacts, masks, and anonymizes sensitive data across text, images, and structured data in training datasets for LLM systems. Powered by Microsoft Presidio
data data-anonymization data-redaction data-scanning gdpr-compliance hipaa-compliance pii-anonymization pii-detection sensitive-data-masking
Last synced: 20 Jan 2026
https://github.com/wjakethompson/taylor
A comprehensive resource for data on Taylor Swift songs, and ggplot2 helper functions
color-palettes data genius-lyrics ggplot2-themes lyrics r spotify spotify-api taylor-swift
Last synced: 06 Apr 2025
https://github.com/scrapinghub/arche
Analyze scraped data
data data-analysis data-visualization jupyter pandas python3 scrapy
Last synced: 02 Aug 2025
https://github.com/doktormike/dammmdatagen
Marketing Mix Modeling Data Generator
benchmark data data-generator marketing-mix-modeling
Last synced: 29 Jul 2025
https://github.com/kuda-io/kuda
Kubernetes-native data delivery platform
cloud-native data golang hdfs kubernetes storage
Last synced: 14 Jan 2026
https://github.com/paloaltonetworks/pan-cortex-data-lake-python
Python idiomatic SDK for Cortex™ Data Lake.
api applicationframework cortex data datalake directory directory-sync directory-sync-service event event-service logging logging-service paloalto paloaltonetworks pan pancloud panw python rest-api sdk
Last synced: 05 May 2025
https://github.com/optixal/cryptoinscriber
📈 A live cryptocurrency historical trade data blotter. Download live historical trade data from any cryptoexchange, be it for machine learning, backtesting/visualizing trading strategies or for Quantopian/Zipline.
backtest bot cryptocurrency data downloader exchange feeds historical historical-data learning live machine poll strategy trade transactions
Last synced: 21 Mar 2025
https://github.com/jqnpm/jqnpm
A package manager built for the command-line JSON processor jq.
command-line-tool data data-processing jq json package-manager
Last synced: 21 Jul 2025
https://github.com/Optixal/CryptoInscriber
📈 A live cryptocurrency historical trade data blotter. Download live historical trade data from any cryptoexchange, be it for machine learning, backtesting/visualizing trading strategies or for Quantopian/Zipline.
backtest bot cryptocurrency data downloader exchange feeds historical historical-data learning live machine poll strategy trade transactions
Last synced: 22 Mar 2025
https://github.com/cgsecurity/testdisk_documentation
Documentation for TestDisk & PhotoRec
data photorec recovery testdisk
Last synced: 03 Sep 2025
https://github.com/gaelforget/climatemodels.jl
Julia interface to climate models + tracked workflow framework
atmosphere climate cmip data data-science earth-observation ecco git interface ipcc julia mitgcm modeling ocean parameters workflow
Last synced: 18 Jan 2026
https://github.com/koushikphy/interactive_data_editor
Software for interactively editing data in a graphical manner
3d-plot computational-chemistry data data-analysis data-fitting data-manipulation data-visualization dataset electron-app electronjs fitting graph graphical griddata plotting regression-analysis smoothing snap surface-plot
Last synced: 20 Aug 2025
https://github.com/Koushikphy/Interactive_Data_Editor
Software for interactively editing data in a graphical manner
3d-plot computational-chemistry data data-analysis data-fitting data-manipulation data-visualization dataset electron-app electronjs fitting graph graphical griddata plotting regression-analysis smoothing snap surface-plot
Last synced: 18 Jul 2025
https://github.com/asem000/pytreeclass
Visualize, create, and operate on pytrees in the most intuitive way possible.
data dataclasses deep-learning jax machine-learning pipelines pytorch pytree tensorflow
Last synced: 07 Apr 2025
https://github.com/timokoerber/laravel-json-seeder
Create and use JSON files to seed your database in your Laravel applications
data database json laravel seed seeder seeder-table seeders
Last synced: 19 Oct 2025
https://github.com/gaelforget/ClimateModels.jl
Julia interface to climate models + tracked workflow framework
atmosphere climate cmip data data-science earth-observation ecco git interface ipcc julia mitgcm modeling ocean parameters workflow
Last synced: 20 Jul 2025
https://github.com/yzfly/mcp-excel-server
The Excel MCP Server is a powerful tool that enables natural language interaction with Excel files through the Model Context Protocol (MCP). It provides a comprehensive set of capabilities for reading, analyzing, visualizing, and writing Excel data.
claude claude-mcp data excel mcp mcp-excel-server
Last synced: 06 Jul 2025
https://github.com/DoktorMike/dammmdatagen
Marketing Mix Modeling Data Generator
benchmark data data-generator marketing-mix-modeling
Last synced: 06 May 2025
https://github.com/articdive/articdata
Collection of data extracted from Minecraft.
data data-extraction data-mining java json mc minecraft minecraft-data minecraft-server minecraft-servers registry
Last synced: 17 May 2025
https://github.com/dosm-malaysia/opendosm-front
data next-js next-js-website open-data tailwindcss
Last synced: 05 Sep 2025
https://github.com/maxgfr/binance-historical
Get historical klines from binance api
binance binance-api binance-historical binance-historical-candle-data binance-history cryptocurrency data historical-data history javascript kline node nodejs trading typescript
Last synced: 25 Jan 2026
https://github.com/daveebbelaar/df-data-science-template
This template is provided by Datalumina and based on the Cookiecutter Data Science template
Last synced: 06 Sep 2025
https://github.com/greenelab/pubtator
Retrieve and process PubTator annotations
data nlp pubmed pubtator snorkel text-mining tool
Last synced: 05 May 2025
https://github.com/benthosdev/benthos-captain
A Kubernetes Operator to orchestrate Benthos pipelines
benthos data data-engineering gitops go golang helm kubernetes kustomize pipelines stream-processing
Last synced: 22 Jan 2026
https://github.com/albar965/navdatareader
Navdatareader is a command-line tool that uses the atools fs/bgl and fs/writer to store a full flight simulator scenery database in a relational database like SQLite or MySQL.
compiler data flight fsx map navigation prepar3d simulator x-plane
Last synced: 02 May 2025
https://github.com/anna-geller/prefect-streaming
Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0
automation aws data data-engineering data-pipeline data-science data-warehouse dataflow dataflow-ops ecs-fargate engineering event-driven mlops modern-data-stack orchestration prefect python real-time serverless streaming
Last synced: 24 Mar 2025
https://github.com/j535d165/cbsodata
Unofficial Statistics Netherlands (CBS) open data API client for Python
census-api census-data data national-statistics netherlands open-data python-library
Last synced: 05 Apr 2025
https://github.com/cdcgov/cdc-open-viz
CDC OpenViz is a library of React packages for data visualization.
data react visualization visualization-library
Last synced: 04 Apr 2025
https://github.com/nasdaq/hackathons
Nasdaq's realtime streaming stock market data for hackathons.
data hackathon market market-data nasdaq real-time realtime stock-market streaming
Last synced: 18 Oct 2025
https://github.com/adieuadieu/japan-train-data
🇯🇵 🚂 A circular object of train data for Japan including translations & station geocoding and a tool to generate it.
data eki japan nihon train translations
Last synced: 18 Mar 2025
https://github.com/guocaoyi/meituan-spider
Meituan™ crawler practice project (regions, POIs, shops, products)
china-city data learning meituan meituan-pois poi puppeteer reptile reptile-nodejs
Last synced: 17 Aug 2025
https://github.com/ethicnology/ophois
Creates street graph from OpenStreetMap
data graph network openstreetmap osm street
Last synced: 11 Oct 2025
https://github.com/vida-nyu/data-polygamy
Data Polygamy is a topology-based framework that allows users to query for statistically significant relationships between spatio-temporal data sets.
Last synced: 10 Apr 2025
https://github.com/rOpenSpain/climaemet
R Climate AEMET Tools
aemet climate cran data forecast-api r r-package ropenspain rstats science spain weather-api
Last synced: 20 Jul 2025
https://github.com/fedora-infra/datagrepper
HTTP API for datanommer and the fedmsg bus
data data-analysis data-science fedora fedora-project postgres postgresql python
Last synced: 12 May 2025
https://github.com/parafoxia/analytix
A simple yet powerful SDK for the YouTube Analytics API.
analytical-information api-wrapper arrow data excel google pandas polars python service utility youtube youtube-api
Last synced: 06 Apr 2025
https://github.com/xefi/faker-php-symfony
Symfony integration of the xefi\faker-php package
data fake faker php symfony symfony-bundle
Last synced: 18 Mar 2025
https://github.com/datakitchen/dataops-observability
DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.
data data-engineering data-observability data-science dataops pipleine-monitoring
Last synced: 09 Apr 2025
https://github.com/itext/itext-pdfocr-dotnet
pdfOCR is an iText 7 add-on to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving
archival character data diacritic extractable glyphs hindi image iso-compliant ligatures mandarin ocr optical pdf portuguese recognition scan searchable spanish tesseract
Last synced: 08 Jan 2026
https://github.com/airbytehq/write-for-the-community
Contribute and collaborate on educational content for the Airbyte Community.
airbyte articles data open-source showcases tutorial videos
Last synced: 24 Feb 2025
https://github.com/the-alchemists-of-arland/gray-matter-rs
A tool for easily extracting front matter out of a string. It is a fast Rust implementation of gray-matter. It parses YAML, JSON, and TOML, and supports custom parsers. Use it and let me know by giving it a star!
data front-matter front-matter-parsers frontmatter gray-matter gray-matter-rs gray-matter-rust markdown matter parse rust rust-crate yaml
Last synced: 10 Apr 2025
https://github.com/jpmorganchase/py-avro-schema
Generate Apache Avro schemas for Python types including standard library data-classes and Pydantic data models.
avro data dataclasses deserialization generate jpmorganchase kafka messaging pydantic python schema serialization types
Last synced: 28 Jun 2025
https://github.com/jobovy/apogee
Tools for dealing with APOGEE data
astronomy astrophysics data data-analysis python spectroscopy
Last synced: 02 Oct 2025
https://github.com/paezha/idealista18
Open data product with real estate listings from Idealista. The datasets are for three major cities in Spain and the year 2018. https://doi.org/10.1177/23998083241242844
data open-data-products packages r real-estate spain spatial
Last synced: 29 Jun 2025
https://github.com/PatrickCuba-zz/thedatamustflow
Visio stencils and artefacts related to data vault guru
data data-vault stencil vault visio
Last synced: 20 Jul 2025
https://github.com/tombarr/open-source-words
Visualization of the most frequent words used in open source projects
d3 data data-visualization javascript python
Last synced: 13 Apr 2025
https://github.com/Articdive/ArticData
Collection of data extracted from Minecraft.
data data-extraction data-mining java json mc minecraft minecraft-data minecraft-server minecraft-servers registry
Last synced: 08 May 2025
https://github.com/alir3z4/django-databrowse
Databrowse is a Django application that lets you browse your data.
Last synced: 11 Apr 2025
https://github.com/0015/python-data-sampling-app
Data Sampling App from Serial to CSV file
accelerometer arduino csv data esp32 gyroscope pysimplegui python sampling serial-communication serialport
Last synced: 26 Apr 2025
https://github.com/rmax/databrewer
The missing datasets manager. Like Homebrew but for datasets. CLI tool for searching and discovering datasets!
command-line data datasets discovery python
Last synced: 20 Mar 2025
https://github.com/maicius/universityrecruitment-ssurvey
Using rigorous data to answer the question: "What kinds of companies recruit at what kinds of universities?"
analysis beautifulsoup crawler data redis university
Last synced: 28 Apr 2025
https://github.com/ipeagit/flightsbr
R Package to Download Flight and Airport Data from Brazil
aviation-data brazil data r rstats rstats-package
Last synced: 02 May 2025