data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-01-21 00:07:59 UTC
https://github.com/Gmousse/dataframe-js
A JavaScript library providing a new data structure for data scientists and developers
data data-frame dataframe datascience datastructures functional groupby javascript manipulation matrix sql sql-syntax
Last synced: 15 Mar 2025
https://github.com/rjurney/Agile_Data_Code_2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
agile-data agile-data-science airflow amazon-ec2 amazon-web-services analytics apache-kafka apache-spark data data-science data-syndrome kafka machine-learning machine-learning-algorithms predictive-analytics python python-3 python3 spark vagrant
Last synced: 19 Jul 2025
https://github.com/vmware/versatile-data-kit
One framework to develop, deploy and operate data workflows with Python and SQL.
analytics data data-engineer data-engineering data-engineering-pipeline data-lineage data-pipelines data-science data-structures data-warehouse database dataops elt etl pipeline python snowflake sql trino warehouse
Last synced: 15 May 2025
https://github.com/geopython/pygeoapi
pygeoapi is a Python server implementation of the OGC API suite of standards. The project emerged as part of the next generation OGC API efforts in 2018 and provides the capability for organizations to deploy a RESTful OGC API endpoint using OpenAPI, GeoJSON, and HTML. pygeoapi is open source and released under an MIT license.
api data geospatial ogc ogc-api osgeo pygeoapi
Last synced: 09 Apr 2025
https://github.com/priyank-purohit/PostGUI
A React web application to query and share any PostgreSQL database.
admin bioinformatics dashboard data data-sharing database database-as-a-service database-gui genomics gui material-design material-ui postgres postgresql postgrest query-builder react react-admin reactjs typescript
Last synced: 20 Jul 2025
https://github.com/elementary-data/dbt-data-reliability
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
analytics analytics-engineering data data-lineage data-observability data-pipeline-monitoring data-pipelines data-reliability dbt dbt-artifacts dbt-packages dbt-tests
Last synced: 16 May 2025
https://github.com/Mohitkr95/Best-Data-Science-Resources
This repository contains hand-picked, free Data Science resources to equip you with industry-driven skills and an interview preparation kit.
ai artificial-intelligence artificial-intelligence-algorithms aws computer-vision data data-structures datascience deep-learning git github jupyter-notebook machine-learning mongodb natural-language-processing neural-network python sql statistics
Last synced: 07 May 2025
https://github.com/vincent-pradeilles/keypathkit
KeyPathKit is a library that provides the standard functions to manipulate data along with a call-syntax that relies on typed keypaths to make the call sites as short and clean as possible.
Last synced: 06 Apr 2025
https://github.com/ydataai/ydata-quality
Data Quality assessment with one line of code
data machine-learning pandas python quality-assessment
Last synced: 30 Apr 2025
https://github.com/vincent-pradeilles/KeyPathKit
KeyPathKit is a library that provides the standard functions to manipulate data along with a call-syntax that relies on typed keypaths to make the call sites as short and clean as possible.
Last synced: 06 Aug 2025
https://github.com/DataScienceUB/introduction-datascience-python-book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
analytics data data-science datascience machine-learning python sentiment-analysis
Last synced: 19 Jul 2025
https://github.com/randomfractals/geo-data-viewer
Geo Data Analytics tool for VSCode IDE with kepler.gl support to generate and view maps 🗺️ without any Python 🐍, IPyWidgets ⚙️, pandas 🐼, Jupyter notebooks 📚, or ReactJS ⚛️ app code.
data data-analytics geo keplergl map spatial tool viewer vscode
Last synced: 21 Mar 2025
https://github.com/dry-rb/dry-struct
Typed struct and value objects
constraints data data-modeling dry-rb ruby struct type-safety types value-object
Last synced: 10 Apr 2025
https://github.com/reubano/meza
A Python toolkit for processing tabular data
csv data excel featured functional-programming library pandas tabular-data xlsx xml
Last synced: 14 May 2025
https://github.com/rebecca-vickery/data-science-learning-resources
A comprehensive list of free resources for learning data science
artificial-intelligence data data-science machine-learning python
Last synced: 26 Apr 2025
https://github.com/zhaoyachao/zdh_web
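A big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more.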
bigdata collection data data-collection datapipeline datax-web etl pipline scheduler spark sparketl
Last synced: 04 Apr 2025
https://github.com/kunalj101/Data-Science-Hacks
Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter notebooks, pandas, and more.
computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks
Last synced: 05 May 2025
https://github.com/kunalj101/data-science-hacks
Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter notebooks, pandas, and more.
computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks
Last synced: 28 Oct 2025
https://github.com/cosmicmind/samples
Sample projects using Material, Graph, and Algorithm.
algorithm cosmicmind data data-structure database graph material material-design projects swift swift-3 us
Last synced: 29 Mar 2025
https://github.com/aiguofer/gspread-pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
data data-analytics data-engineering data-science dataframes google google-sheets google-spreadsheets gspread pandas python sheets
Last synced: 15 May 2025
https://github.com/markpflug/sylvan
A collection of .NET libraries, including the fastest general-purpose CSV parser for .NET.
Last synced: 14 May 2025
https://github.com/ethyca/fides
The Privacy Engineering & Compliance Framework
data data-privacy data-privacy-compliance developer-tools gdpr hacktoberfest privacy-as-code python
Last synced: 18 Dec 2025
https://github.com/opengvlab/scalecua
ScaleCUA is a set of open-source computer use agents that can operate in cross-platform environments (Windows, macOS, Ubuntu, Android).
computer-use-agents data gui-agents models online-evaluation-suite scalecua
Last synced: 28 Oct 2025
https://github.com/datalayer/jupyter-ui
🪐 ⚛️ React.js components 💯% compatible with 🪐 Jupyter https://jupyter-ui-storybook.datalayer.tech
data data-product data-science data-visualisation datalayer ipywidgets jupyter jupyterlab lumino notebook reactjs ui
Last synced: 15 May 2025
https://github.com/ZacharyHampton/HomeHarvest
Python package for scraping real estate property data
data finance mls properties proptech real-estate realtor redfin redfin-scraper scraper scraping webscraping zillow zillow-scraper
Last synced: 26 Oct 2025
https://github.com/quickbirdeng/surveykit
Android library to create beautiful surveys (aligned with ResearchKit on iOS)
android android-library clinical-trials data forms ios-researchkit-surveys kotlin kotlin-android medical poll research study-conduct survey
Last synced: 23 Jun 2025
https://github.com/earthobservations/wetterdienst
Open weather data for humans.
canada data deutscher-wetterdienst dwd eccc germany historical-data hydrology meteorology open-data open-source radar time-series uk united-states weather weather-api weather-forecast weather-station weatherservice
Last synced: 14 May 2025
https://github.com/kenkundert/nestedtext
Human readable and writable data interchange format
config configuration configuration-files data json json-alternative nested-text nestedtext serialization toml toml-alternative yaml yaml-alternative yaml-configuration
Last synced: 15 May 2025
https://github.com/thebowja/genshin-db
npm package with searching functions for Genshin Impact data of all in-game languages. Data parsed/organized directly from GenshinData repo.
data genshin genshin-impact genshinimpact javascript json
Last synced: 14 May 2025
https://github.com/jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
data data-cleaning encryption-decryption featurization generation machine-learning python3 security server transcription visualization voice voice-activity-detection voice-assistant voice-computing voice-control voice-recognition voice-recording wake-word-detection
Last synced: 06 Apr 2025
https://github.com/tinybirdco/web-analytics-starter-kit
Tinybird Web Analytics template
analytics data real-time realtime starter-kit tinybird tracker tracking web-analytics webanalytics
Last synced: 15 May 2025
https://github.com/microsoft/ghcrawler
Crawl GitHub APIs and store the discovered orgs, repos, commits, ...
crawler data github github-api github-webhooks ospo
Last synced: 27 Sep 2025
https://github.com/xorbit01/webpalm
🕸️ Crawl in the web network
crawler crawling data data-science datamining go golang hack mining osint redteam spider tool
Last synced: 15 Dec 2025
https://github.com/khive-ai/lionagi
AGI SDK
agents ai automation data llm machine-learning workflow
Last synced: 13 Oct 2025
https://github.com/ka8725/migration_data
Migrate data along with schema migrations in Rails and keep them up to date.
data data-migration migrations rails
Last synced: 16 May 2025
https://github.com/rpbouman/huey
Light-weight, browser-based ROLAP pivot tables on top of DuckDB-WASM
data duckdb excel pivot-tables rolap small-data sql
Last synced: 24 Mar 2025
https://github.com/XORbit01/webpalm
🕸️ Crawl in the web network
crawler crawling data data-science datamining go golang hack mining osint redteam spider tool
Last synced: 14 Apr 2025
https://github.com/SheetJS/j
❌ Multi-format spreadsheet CLI (now merged into http://github.com/sheetjs/js-xlsx)
cli csv data dbf dif excel javascript markdown ods prn spreadsheet sylk xls xlsx
Last synced: 22 Jul 2025
https://github.com/princeton-nlp/less
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
data data-selection influence instruction-tuning llama llm mistral
Last synced: 05 Apr 2025
https://github.com/intenthq/anon
A UNIX Command To Anonymise Data
anonymity anonymization cli csv data go golang
Last synced: 26 Mar 2025
https://github.com/compas-dev/compas
Main library of the COMPAS framework and CAD integrations for Rhino/GH and Blender.
aec blender3d data datastructures geometry grasshopper3d rhino3d rpc
Last synced: 24 Oct 2025
https://github.com/openlists/ElectrophysiologyData
A list of openly available datasets in (mostly human) electrophysiology.
data ecog eeg electrophysiological-data electrophysiology lfp meg open-data open-science research
Last synced: 09 May 2025
https://github.com/mtahiraslan/data-analyst-roadmap
A roadmap, based on my own experience, for becoming a data analyst from zero: which technologies and programming languages are worth knowing, which soft skills you need, and how to start a professional career in this field.
blogs businessintelligence courses data dataanalysis dataanalyst excel interview mtahiraslan powerbi programming python resources resume roadmap softskills sql statistics tableau tutorials
Last synced: 14 Mar 2025
https://github.com/alibaba/feathub
FeatHub - A stream-batch unified feature store for real-time machine learning
apache-flink data data-engineering data-quality data-science feature-engineering feature-store machine-learning mlops streaming
Last synced: 14 Oct 2025
https://github.com/panel-extensions/panel-graphic-walker
A project providing a Graphic Walker Pane for use with HoloViz Panel.
business-intelligence data data-analysis data-app data-exploration data-mining data-visualization eda holoviz-panel low-code notebook pivot-table python tableau tableau-alternative vega vega-lite visualization
Last synced: 20 Oct 2025
https://github.com/monosidev/monosi
Open source data observability platform
data metrics monitoring monitoring-tool observability observability-data python react
Last synced: 14 Jan 2026
https://github.com/splitgraph/sgr
sgr (command line client for Splitgraph) and the splitgraph Python library
data data-version-control developer-tools postgres postgresql python sql
Last synced: 13 Apr 2025
https://github.com/weecology/retriever
Quickly download, clean up, and install public datasets into a database management system
data data-retrieval data-science dataset datasets hacktobefest python
Last synced: 21 Oct 2025
https://github.com/micromata/http-fake-backend
Build a fake backend by providing the content of JSON files or JavaScript objects through configurable routes.
api backend data fake fake-data http http-server json json-files mock mocking mocking-server mocks node nodejs rest rest-api restful restful-api
Last synced: 24 Mar 2025
https://github.com/sn3fru/datascience_course
Data Science course in Portuguese
artificial-intelligence brasil curso dados data data-analysis data-science data-science-learning dataset deep-learning machine-learning python
Last synced: 28 Apr 2025
https://github.com/tokern/piicatcher
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub
aws-athena aws-glue aws-redshift catalog data data-catalog database phi pii python snowflake
Last synced: 12 Apr 2025
https://github.com/ag-grid/ag-charts
AG Charts is a fully-featured and highly customizable JavaScript charting library. The professional choice for developers building enterprise applications
angular angular-charts angular-component angular-graphs charts data data-component graph graphs react react-charts react-graphs reactjs vue vuejs
Last synced: 14 May 2025
https://github.com/holtzy/react-graph-gallery
A set of graph examples showing how to make react and d3.js work together
Last synced: 24 Mar 2025
https://github.com/juliadata/tables.jl
An interface for tables in Julia
data hacktoberfest interfaces julia tables
Last synced: 14 May 2025
https://github.com/infuseai/artivc
A version control system to manage large files.
data machinelearning storage version-control
Last synced: 01 Aug 2025
https://github.com/ropensci/charlatan
Create fake data in R
data dataset fake-data faker peer-reviewed r r-package rstats
Last synced: 16 Jan 2026
https://github.com/yamafaktory/hypergraph
Hypergraph is a data structure library for creating directed hypergraphs in which a hyperedge can join any number of vertices.
data data-science data-structure data-structures hypergraph hypergraphs rust rust-lang rustlang
Last synced: 15 May 2025
https://github.com/flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to learn and get started with, and highly extensible.
automation data data-science extensible flyte flyte-tasks hacktoberfest mlops pypi python sdk spark workflows
Last synced: 12 Jan 2026
https://github.com/noobaa/noobaa-core
High-performance S3 gateway to any backend (file, s3-compatible, multi-clouds, caching, replication...)
data hybrid mirroring multi-cloud nodejs noobaa s3 storage tiering
Last synced: 16 May 2025
https://github.com/openml/openml-python
OpenML's Python API for a World of Data and More 💫
benchmarking data datascience machine-learning meta-learning openml python tabular-data
Last synced: 10 Apr 2025
https://github.com/ZumoLabs/zpy
Synthetic data for computer vision. An open source toolkit using Blender and Python.
ai blender blender-addon computer-vision data deep-learning ml python synthetic synthetic-data
Last synced: 11 May 2025
https://github.com/dodiku/AudioOwl
Fast and simple music and audio analysis using RNN in Python 🕵️‍♀️ 🥁
analysis audio audio-features beat-detection beats data feature-extraction machine-learning mir ml music music-information-retrieval pip pypi python rnn sample-rate tempo
Last synced: 14 Apr 2025
https://github.com/felipenoris/xlsx.jl
Excel file reader and writer for the Julia language.
data excel julia julia-language
Last synced: 15 May 2025
https://github.com/felipenoris/XLSX.jl
Excel file reader and writer for the Julia language.
data excel julia julia-language
Last synced: 10 May 2025
https://github.com/InfuseAI/ArtiVC
A version control system to manage large files.
data machinelearning storage version-control
Last synced: 18 Apr 2025
https://github.com/hodur-org/hodur-engine
Hodur is a domain modeling approach and collection of libraries for Clojure. By using Hodur you can define your domain model as data, parse and validate it, and then either consume your model via an API or use one of the many plugins to help you achieve mechanical results faster and in a purely functional manner.
Last synced: 12 Dec 2025
https://github.com/wll8/mockm
A framework based on Express for handling the problems front-end developers run into around APIs. It can quickly generate APIs, create data, and deploy pages; it works out of the box and is easy to migrate.
api data dummy fake json mock mocking prototyping rest restfull sandbox server test testing
Last synced: 15 May 2025
https://github.com/ropensci/taxize
A taxonomic toolbelt for R
api api-wrapper biodiversity biology darwincore data nomenclature r r-package rstats taxize taxonomy
Last synced: 15 May 2025
https://github.com/stocknear/backend
Backend of stocknear - Open Source Stock Analysis
data data-science fastapi fastify finance javascript machine-learning nodejs pocketbase python redis
Last synced: 16 May 2025
https://github.com/nshiab/simple-data-analysis
Easy-to-use and high-performance JavaScript library for data analysis. Works with tabular and geospatial data.
analysis bun data data-analysis data-science duckdb geospatial javascript node node-js nodejs spatial spatial-analysis sql typescript
Last synced: 17 Jan 2026
https://github.com/the-osint-toolbox/data-acquisition-osint
You can find links to data acquisition websites.
breach-check breach-compilation breached breaches combolist data datasets dehash directory dmp files hash hashing leaks password pastebin pastes public stealer-logs stealers
Last synced: 11 Mar 2025
https://github.com/The-Osint-Toolbox/Data-Acquisition-OSINT
You can find links to data acquisition websites.
breach-check breach-compilation breached breaches combolist data datasets dehash directory dmp files hash hashing leaks password pastebin pastes public stealer-logs stealers
Last synced: 24 Mar 2025
https://github.com/shahinrostami/plotapi
Engaging visualisations, made easy.
data data-science data-visualization plotting python visualization
Last synced: 22 Jan 2026
https://github.com/solnic/drops
🛠️ Tools for working with data effectively - data contracts using types, schemas, domain validation rules, type-safe casting, and more.
data elixir elixir-lang elixir-library json schema validation
Last synced: 15 May 2025
https://github.com/adamhl8/filterql
A tiny query language for filtering structured data
data filter filtering filterql language query query-language typescript
Last synced: 04 Oct 2025
https://github.com/argoproj-labs/old-argo-dataflow
Dataflow is a Kubernetes-native platform for executing large parallel data-processing pipelines.
data jetstream kafka kubernetes pipeline
Last synced: 16 May 2025
https://github.com/spamscanner/spamscanner
Spam Scanner is a Node.js anti-spam, email filtering, and phishing prevention tool and service. Built for @ladjs, @forwardemail, @cabinjs, @breejs, and @lassjs.
anti-spam api data enron javascript node rspam rspamd scanner service set spam spam-classification spam-classifier spam-detection spam-filter spam-filtering spam-prevention spam-protection spamassassin
Last synced: 30 Sep 2025
https://github.com/Bears-R-Us/arkouda
Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale 🐻
chapel data data-analysis data-science distributed-computing eda hpc python
Last synced: 08 Jul 2025
https://github.com/ignoreintuition/jSchema
A simple, easy to use data modeling framework for JavaScript
data data-modeling data-structures data-visualization dataset drop filter group groupby hacktoberfest javascript javascript-library multiple-datasets orderby sort
Last synced: 05 Apr 2025
https://github.com/markpflug/sylvan.data.excel
The fastest .NET library for reading Excel data files.
data dotnet dotnet-core excel xls xlsb xlsx
Last synced: 15 May 2025
https://github.com/empower-ai/dsensei
AI-powered key driver analysis tool that pinpoints root cause behind metrics fluctuation in one minute.
analytics business-analytics business-intelligence data data-analytics data-insights data-science
Last synced: 01 Aug 2025
https://github.com/pluginpal/strapi-plugin-config-sync
♻️ CLI & GUI for continuous migration of config data across environments
config config-sync data dtap dump environments migrate strapi strapi-plugin sync
Last synced: 16 May 2025
https://github.com/bears-r-us/arkouda
Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale 🐻
chapel data data-analysis data-science distributed-computing eda hpc python
Last synced: 06 Apr 2025
https://github.com/oliver006/elasticsearch-test-data
Generate and upload test data to Elasticsearch for performance and load testing
data elasticsearch python test-data tornado
Last synced: 05 Apr 2025
https://github.com/houbb/nlp-hanzi-similar
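A hanzi similarity tool: computes the similarity of Chinese characters using an algorithm for visually similar characters. It can be used for correcting handwritten Chinese character recognition, text obfuscation, and so on.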
chinese data han nlp ocr word-correction
Last synced: 07 Apr 2025
https://github.com/ebonnal/streamable
Pythonic Stream-like manipulation of iterables
asyncio collections data data-engineering decorator-pattern etl etl-pipeline fluent-interface immutability iterable iterator iterator-pattern lazy-evaluation method-chaining python python3 reverse-etl streams threads visitor-pattern
Last synced: 15 May 2025