data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-23 00:07:41 UTC
https://github.com/divithraju/divith-raju-openmetadata
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
automation bigdata bigdataanalytics data data-structures dataengineering datascience hacktoberfest2022 metadata metadata-extraction
Last synced: 20 Feb 2026
https://github.com/rrighart/rrighart.github.io
A webpage about data science, programming, statistics and related topics
analyses data data-mining programming statistics
Last synced: 20 Jan 2026
https://github.com/erwan-simon/aws-data-platform-framework
A unified framework to industrialize data ingestion, transformation and pipeline execution on AWS using Terraform, from infrastructure provisioning to runtime execution, designed as a reusable and standalone data platform.
aws data data-framework datalake docker iceberg python spark step-functions terraform terraform-module
Last synced: 23 May 2026
https://github.com/randomfractals/unfolded-map-renderer
Unfolded Map 🗺️ Notebook 📓 Renderer for VSCode.
cell data flat-data geo geo-location geo-spatial map notebook output renderer unfolded view vscode
Last synced: 21 Mar 2025
https://github.com/imranhsayed/programming-in-c
Programming in C
array c c-programming circular-linked-list cprogramming data data-structures-and-algorithms file-handling linked-list pointers
Last synced: 28 Jan 2026
https://github.com/espoirmur/balobi_nini
An end-to-end data science project, where I used Tweepy and Airflow to collect tweets related to the DRC and topic modeling techniques to discover which topics Congolese are talking about on Twitter.
Last synced: 24 Aug 2025
https://github.com/hmeleiro/alquilermad
Housing rent map of the Comunidad de Madrid
data data-science data-visualization datascience housing-location-visualization rent renting
Last synced: 13 Sep 2025
https://github.com/defano/chicago-oasis
A visualization of Chicago business accessibility by neighborhood or census tract.
census chicago data data-science javascript neighborhood
Last synced: 11 Mar 2026
https://github.com/crazywolf132/jungla
🌲🌲🌲 Your new favourite data manipulator
backend data data-manipulation easy-to-use frontend fullstack help-wanted interpreter language library microservices mobile nodejs parser programming-language
Last synced: 05 Apr 2025
https://github.com/thejeshgn/thejeshgn
data data-visualization datameet india opendata public-interest
Last synced: 15 Jan 2026
https://github.com/techbureau/zaifdata
📘 Data Reader for Zaif Exchange
bitcoin blockchain cryptocurrency data exchange nem token trading xem zaif
Last synced: 19 Apr 2026
https://github.com/abuzar-alvi/employee-data-to-info-card-generator-with-python
A Python project I made to improve my Python skills.
card data data-generator employee python
Last synced: 03 Feb 2026
https://github.com/leeper/mcode
Functions to merge and recode across multiple variables
data data-transformation r recode recoding
Last synced: 16 May 2025
https://github.com/andrew-johnson-4/misspeller
Take correctly spelled words and return common spelling mistakes
common-mistakes data language natural nlp processing rust
Last synced: 30 Apr 2025
https://github.com/p32929/use-megamind
A simple react hook for managing asynchronous function calls with ease on the client side
async asynchronous-tasks axios client-side-javascript data data-fetching easy fetch generics hooks javascript npm painless promise query react rest simple small typescript
Last synced: 23 Jan 2026
https://github.com/gabrielu3/ori-inverted-list
Inverted list made for a college course named Organization and Retrieval of Information
c data data-structures index inverted-index list
Last synced: 24 Feb 2026
https://github.com/uk-ipop/open-data-pipeline
A pipeline for processing, enhancing, and sharing open datasets.
actions automation data python
Last synced: 25 May 2026
https://github.com/ivangrigorov/neutrino-search-engine
A Java search engine for both HTML and document-type files
data data-analysis data-knowledge information-extraction information-retrieval java-language search-engine
Last synced: 31 Mar 2025
https://github.com/deepwaterpaladin/statscanpy
Basic package for querying & downloading StatsCan data by table name.
Last synced: 16 Jan 2026
https://github.com/dongminlee94/data-visualization-tutorial
A repository for data visualization tutorial
data data-science data-visualization matp matplotlib pca plotly python seaborn t-sne tutorial umap visualization
Last synced: 29 Apr 2026
https://github.com/anandchowdhary/health
🫀 @AnandChowdhary's body measurements
csv data fitness github-actions health
Last synced: 29 Apr 2026
https://github.com/mongoexpuser/synthetic-drilling-data-app-for-sqlite-ml
Generate synthetic drilling data that can be used for testing machine learning (ML) models.
classification data drilling events inference machine-learning ml-models node-sqlite3 nodejs prediction python sqlite3 training
Last synced: 08 Apr 2026
https://github.com/jongirard/unique_names_generator
A Unique Names Generator built in Elixir
data data-generator elixir elixir-lang fake-data name-generator phoenix seed
Last synced: 21 Oct 2025
https://github.com/thyringer/cast
CLI tool for reading strings or complex data sets from CSV files to output them in other text formats.
csv-converter data data-preprocessing python python3 sql-builder
Last synced: 02 Feb 2026
https://github.com/joisino/twinpaper
Code for "Twin Papers: A Simple Framework of Causal Inference for Citations via Coupling" (CIKM 2022)
causal-inference data research science-of-science
Last synced: 21 Mar 2025
https://github.com/ngambip/diabetes_factors_2024
Exploring BMI Categories and Health Factors.
dashboards data datacleaning dax-languague powerbi sql sqlstudio tsql visualization
Last synced: 03 Mar 2026
https://github.com/sadcenter/messenger
Data messaging system between servers using popular messaging brokers
Last synced: 06 Aug 2025
https://github.com/coatless-rpkg/ucimlrepo
An unofficial R port of the Python package to download data off of the UCI ML repository
data data-science machine-learning rstats statistics uci-machine-learning uci-machine-learning-repository web-api
Last synced: 28 Jun 2025
https://github.com/blakedrumm/scvmm-scripts-and-sql
The Scripts provided here are compatible with System Center Virtual Machine Manager
collector data powershell scripts scvmm sql
Last synced: 11 May 2025
https://github.com/poncoe/passdatatoanotherfragment
Exercise: passing data to another Fragment
android android-app android-application android-studio data fragment fragments kotlin kotlin-android passing-parameters passingdataintent viewmodel
Last synced: 23 Jun 2026
https://github.com/emrecpp/datapacket-cpp
Send, recv, encrypt, decrypt, compress data as Packet and send it with socket for C++.
compress data deserialization deserialize deserializer encrypt packet recv send serialization serialize serializer socket storage
Last synced: 28 Mar 2025
https://github.com/writetome51/big-dataset-paginator
A TypeScript/JavaScript class for pagination in a real-world web app.
app data javascript pagination paginator typescript
Last synced: 17 May 2026
https://github.com/stdlib-js/datasets-anscombes-quartet
Anscombe's quartet.
anscombe anscombes-quartet data dataset datasets javascript node node-js nodejs quartet sample statistics stats stdlib
Last synced: 13 Oct 2025
https://github.com/14richa/patient-readmission-analysis
This project focuses on predictive modeling to foresee hospital readmissions of diabetic patients within 30 days post-discharge. By leveraging a dataset spanning a decade (1999-2008) and covering records from 130 US hospitals, the aim is to enhance healthcare management and patient outcomes.
analytics data jupyter-notebook numpy
Last synced: 29 Apr 2026
https://github.com/jesusgraterol/bitcoin-lightning-network-stats-dataset-builder
The dataset builder script extracts Bitcoin's Lightning Network statistics through Mempool.space's public API. The data is stored in a .csv file, facilitating its use in data science and machine learning projects.
bitcoin blockchain blockchain-technology data data-science dataset dataset-generation lightning-network machine-learning
Last synced: 16 May 2026
https://github.com/georgetdn/syscppcp
Store C++ class data in a file (persistence) and manipulate it programmatically or using Small SQL (included)
class data framework object persistence serialize sql windows
Last synced: 04 Apr 2025
https://github.com/eliot-akira/png-compressor
Compress and encode data as PNG image
Last synced: 17 Mar 2025
https://github.com/ismet55555/pdw-asym-2link
Clear and easy way of simulating a passive dynamic walker (PDW) model derived and executed using MATLAB.
data dynamics inverted-pendulum matlab numerical-simulations passive-dynamic-walker passive-dynamics ramp research robotics simulation slope walking-simulator
Last synced: 29 Apr 2026
https://github.com/max-tonny8/android_web3
This is a library for Android to call data from Node on Ethereum Chain or Solana Chain
android blockchain coroutines coroutines-android data eth-call ethereum kotlin ktx retrofit rpc smart-contracts solana web3 web3j
Last synced: 27 Mar 2025
https://github.com/Ekey/ER.DATA.Tool
Tool for extracting data archives from the mobile game Earth Revival (Project Arrival)
data earth-revival idx project-arrival
Last synced: 19 May 2026
https://github.com/yash22222/data-analysis-with-python
This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.
binning data data-acquisition data-analysis data-binning data-cleaning data-formatting data-integration data-normalization data-preprocessing data-science data-transformation data-wrangling dataframe description numpy pandas pandas-dataframe python python3
Last synced: 09 Apr 2026
https://github.com/dataopstix/modelt
Modelt(mow·delt) is a modern data integration solution that connects data to data for advanced analytics.
airbyte airflow airflow-docker data data-analysis data-visualization database dbt elt etl etl-automation metabase metadata modern modern-dev modernization
Last synced: 28 Mar 2025
https://github.com/stdlib-js/utils-named-typed-tuple
Named typed tuple.
array collection data data-structure data-structures javascript list named node node-js nodejs stdlib structure tuple typed typed-array util utilities utility utils
Last synced: 14 Apr 2025
https://github.com/hdk101/credentials-validator
A quick way to validate credentials on the server side
backend credentials data email frontend javascript login node npm npm-install password register server-side
Last synced: 21 Sep 2025
https://github.com/norton120/dfmock
Python Pandas DataFrame mock generator. Need mocked data in a DataFrame? This is what you need.
data mock pandas pandas-dataframe python python37
Last synced: 19 Jan 2026
https://github.com/pjmagee/starwars-data
A Star Wars Web app with Charts and entire Timeline events!
aspire blazor blazor-webassembly data database dataset docker dotnet json starwars starwars-data starwars-fandom
Last synced: 07 Mar 2026
https://github.com/bastianolea/sinim_info_municipal
Database of the National Municipal Information System (Sistema Nacional de Información Municipal, SINIM), including commune-level data on municipal finances, human resources, education, health, pensions, social organizations, and more.
chile comunas data estado laboral politica social tiempo
Last synced: 26 Oct 2025
https://github.com/bgmp/tesis-german-deuster
Statistical data for third-party work on a thesis
Last synced: 28 Mar 2025
https://github.com/dantesc03/uberpool-case-study
This project was designed to understand the statistical effects of longer wait times on Uber rides, particularly on the user and driver experience with the Uber Pool system.
analysis data excel jupyter jupyternotebooks learn python seaborn statistics t-tests uber visualization
Last synced: 16 Apr 2026
https://github.com/ium101/files-and-folders-lister-z
Files and Folders Lister Z is a utility for listing the contents of directories on your computer. It provides both a command-line and a graphical user interface (GUI) for easy use.
application application-code brasil brazil cmd command data database databases exe filemanagement filesystem linux lowcode macos python sh tool utility windows
Last synced: 09 Oct 2025
https://github.com/mskian/tamil-words
Collection of Tamil words with English meanings - API and SQL data.
api data javascript json json-api mysql pdo php sql tamil tamil-language tamil-sms tamilwords translate translator
Last synced: 14 Apr 2026
https://github.com/openpeeps/zxc-nim
Bindings to the ZXC compression library, an LZ77-based compressor optimized for high decompression speed
archive compression compressor data decompression game-assets lossless lossless-compression lz77 nim nim-bindings nim-package nim-wrapper openpeeps zxc
Last synced: 07 Jun 2026
https://github.com/bredalis/kpopnews
A place to see kpop news 📝
backend css data feedparser flask frameworks frontend html jinja2 kpop mongodb mongodb-atlas news newsletter os pages pymongo python requests web
Last synced: 12 Feb 2026
https://github.com/StudyResearchProjects/arrbuffstr
Creates Strings from ArrayBuffers and vice versa in Node.js and the browser
arraybuffer browser data node string transform
Last synced: 09 Oct 2025
https://github.com/mickeyshi-syd/actuarial-hackathon-2019
2019 Actuarial Hackathon
actuarial actuaries analytics data data-science hackathon
Last synced: 15 Jul 2025
https://github.com/camara94/data-visualization-with-python
Data visualization and some of the best practices when creating plots and visuals. The history and architecture of Matplotlib, and how to do basic plotting with Matplotlib. Generating different visualization tools using Matplotlib such as line plots, area plots, histograms, bar charts, box plots, and pie charts. Seaborn, another data visualization library in Python, and how to use it to create attractive statistical graphics. Folium, and how to use it to create maps and visualize geospatial data.
data data-science data-structures data-visualization python3
Last synced: 16 May 2026
https://github.com/bastianolea/siedu_indicadores_urbanos
Data from the System of Urban Development Indicators and Standards (Sistema de Indicadores y Estándares de Desarrollo Urbano, SIEDU), with commune-level data on topics such as transport, urban planning, basic services, quality of life, and more.
ambiental app chile ciudad comunas data estado social
Last synced: 19 Feb 2026
https://github.com/infinitode/pwlds
A public dataset of over 10 million passwords, with assigned strength levels.
ai classes classification cyber-security data dataset ml open-source password passwords synthetic-data
Last synced: 22 Feb 2026
https://github.com/slipke/eurlex-model-go
This project implements the EUR-Lex XML data model in Golang. For more information see README.md
data datamodel eur-lex eurlex webservice
Last synced: 09 Mar 2026
https://github.com/quetz-al/quetzal-client
Python client for the Quetzal API
client data data-science openapi-client openapi3 python quetzal
Last synced: 28 Jul 2025
https://github.com/mikeintoshsystems/hispmd
HIS Performance Monitoring Dashboard
api dashboard data dhis2 dhis2-api docker docker-compose hispmd mfr rest-api visualization web
Last synced: 08 Apr 2025
https://github.com/nikoshet/exploratory-data-analysis-using-r
Exploratory Data Analysis using R Course Project for M.Sc. 'Data Science and Machine Learning' in NTUA
data data-analysis data-science eda exploratory-data-analysis ggplot2 r
Last synced: 14 May 2026
https://github.com/bernard-ng/drc-news-corpus
DRC News Corpus: Towards a scalable and efficient system for Congolese news dataset curation
aggregator data news nlp politics
Last synced: 06 Sep 2025
https://github.com/mohasarc/treeviz
The best tree data-structures visualization tool
data structures visualization visualization-tools
Last synced: 25 Apr 2026
https://github.com/woo071002/parcel-management-system
A Parcel Delivery Management System streamlining deliveries with features for admin, users, and delivery personnel, including real-time tracking, delivery requests, and personalized dashboards.
cors csharp data dotenv html-css iconfont jkuat land-information-system mongodb python react-router-dom sass tech-expo xaml
Last synced: 08 Oct 2025
https://github.com/minightdev/paperclip
Paperclip is a powerful privacy-focused data breach search engine that empowers users to swiftly and securely investigate breaches using email addresses and phone numbers. Our robust search engine delivers real-time results while prioritizing the privacy and security of user queries.
beaches data database pwn pwned search-engine
Last synced: 22 Mar 2025
https://github.com/wpp-public/akqa-nz-tagmanager-connector
A simple JavaScript library to send events to a tag manager container
Last synced: 05 Apr 2025
https://github.com/divithraju/divith-raju-searchengine-wikipedia
Search engine optimization: a complete search engine experience built on top of a 75 GB Wikipedia corpus with subsecond latency for searches. Results contain wiki pages ordered by TF-IDF relevance based on the given search word(s). From optimized code to the K-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
algorithms data dataengineering inverted-index linux merge-sort nlp project project-repository python3 serchengine software-engineering ubuntu wikipedia
Last synced: 16 May 2026
https://github.com/wibosco/modelingformchanges-example
An example project to show how we can implement a model to simplify form validation
data swift unit-testing validator
Last synced: 16 Mar 2025
https://github.com/mark-summerfield/uxf
Uniform eXchange Format (uxf) is a plain-text, human-readable, optionally typed storage format that supports custom types. It may serve as a convenient alternative to csv, ini, json, sqlite, toml, xml, or yaml.
data ini json parser pretty-printer sqlite storage-engine toml xml yaml
Last synced: 08 Oct 2025
https://github.com/weisscharlesj/data_scicompforchem
Zipped data for SciCompforChem book for easy download
chemistry chemistry-education data data-visualization python
Last synced: 07 Nov 2025
https://github.com/memair/apps
App Store for Memair
apps appstore data data-science quantified-self
Last synced: 06 Apr 2026
https://github.com/stdlib-js/ndarray-base-char2dtype
Return the data type string associated with a provided single letter abbreviation.
abbr abbreviation array base c data dtype javascript multidimensional ndarray node node-js nodejs stdlib type types util utilities utility utils
Last synced: 12 Mar 2026
https://github.com/bastgau/snow-revoke-privileges
Script designed to simplify the management of permissions in your Snowflake databases.
data database dba dev-container python snowflake
Last synced: 20 Apr 2025
https://github.com/ciscorn/tinybufr
A Rust library for decoding BUFR meteorological observation data format
bufr data meteorology rust weather wmo
Last synced: 11 Jan 2026
https://github.com/rahulraikwar00/advault
Advault is an Aadhaar data vault generation tool
aadhaar data hacktoberfest uidai vault
Last synced: 05 Apr 2025
https://github.com/johntocci/nullaxe
Nullaxe is a powerful and user-friendly Python library designed for cleaning and preprocessing data. It works seamlessly with both pandas and polars DataFrames, making it a versatile tool for data scientists and developers.
data data-analysis data-science datacleaning pandas polars python
Last synced: 06 Apr 2026
https://github.com/ssiarhei115/customer-classification
Developing an ML model predicting a bank customer's inclination to open a deposit
big-data big-data-analytics data data-science data-visualization mashine-learning
Last synced: 09 Apr 2025
https://github.com/mmabiaa/data-structure-and-algorithms-java
Data structures and algorithms in Java
algorithms algorithms-and-data-structures data data-structure-and-algorithm data-structures data-structures-algorithms data-structures-and-algorithms datastructures dsa dsa-learning-series dsa-practice java
Last synced: 09 Apr 2026
https://github.com/fiedsch/datamanagement
Data management helpers (PHP-CLI)
csv-data data datamanagement helper php
Last synced: 05 Apr 2025
https://github.com/e-candeloro/data-analysis-code-snippets-for-pandas-and-sklearn
These notebooks are useful to learn how to load, understand, clean and classify data using Pandas and Sklearn with Python
analysis big-data classification data datascience datavisualization machine-learning notebook numpy pandas python sklearn
Last synced: 10 Apr 2026
https://github.com/meetyildiz/pandazip
Reduce RAM footprint of Pandas DataFrame without losing information.
compression data data-mining data-science machine-learning pandas utilities
Last synced: 20 Jan 2026
https://github.com/luminati-io/Pinterest-dataset-samples
Two sample datasets of over 1000 Pinterest profiles and posts, extracted using the Bright Data API, ideal for market research, influencer marketing, and product development.
data data-extraction data-mining database datasets pinterest pinterest-api structured-data web-scraping
Last synced: 09 Apr 2025
https://github.com/bkamapantula/india-pc-nfhs4
Parliamentary constituency factsheet for indicators of nutrition, health, and development in India using NFHS4 data.
data government health india nfhs nfhs4
Last synced: 19 Mar 2026
https://github.com/richardschoen/ileaccess
Save file packaging of IBM i libraries ILEASTIC, NOXDB and ILEUSION for convenient installation. ILEastic allows HTTP microservices to be created on IBM i. NOXDB makes SQL, JSON and XML easy for IBM i. ILEusion packages it all up to make IBM i data and program call services easier.
as400 call cl command data database db2 http https ibmi ile ileastic ileusion microservice noxdb program queue rpg sql xmlservice
Last synced: 24 Jul 2025
https://github.com/programmer-rd-ai/open-images-v6
Open-Images-V6
ai data dataset dl images ml object-detection open open-images programming python v6
Last synced: 03 Aug 2025
https://github.com/garciparedes/matlab-examples
Set of awesome Matlab Examples
data data-science examples garciparedes matlab statistics university-of-valladolid
Last synced: 05 Mar 2025
https://github.com/frefrik/covid19norge-data
🦠 COVID-19 Datasets for Norway
covid covid-19 covid19 covid19-data csv data datasets norge norway norwegian smittestopp vaccine
Last synced: 09 Apr 2026
https://github.com/pommes-public/pommesdata
A full-featured transparent data preparation routine from raw data to POMMES model inputs
data opensource power raw-data transparent
Last synced: 07 Oct 2025
https://github.com/evoluteur/web-scraper-sitemaps
Sitemaps for the Web Scraper Chrome extension.
chrome-extension data dataset scraper scraping scrapper scrapping scrapy-crawler sitemap web-scraper web-scraping
Last synced: 04 Jun 2026
https://github.com/chaitanyac22/hr_policy_query_resolution_with_retrieval_augmented_generation_rag
This repository contains an HR Policy Query Resolution system using Retrieval-Augmented Generation (RAG). It leverages a 4-bit quantized Mistral-7B-Instruct-v0.2 LLM and JP Morgan Chase’s publicly available Code of Conduct documents to generate accurate, contextually relevant responses for HR policy queries.
artificial-intelligence data hr large-language-models llm mistral-7b nlp pipeline prompt-engineering quantization rag retrieval-augmented-generation
Last synced: 12 Feb 2026