data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-29 00:07:49 UTC
https://github.com/williamwutq/bllist
Durable, crash-safe, checksummed block-based linked list allocators stored in a single file
data data-storage data-structure database file-based linkedlist
Last synced: 25 Jun 2026
https://github.com/aiwithqasim/competitive-programming
All the material I have worked through, and will work through in the future, to improve my programming skills and become a competitive programmer.
c-plus-plus code data java programming structured-data
Last synced: 20 May 2026
https://github.com/gematik/app-fhir-snapshots-package-generator
The repository contains a library and a console application to generate snapshots for StructureDefinitions in FHIR-packages.
Last synced: 05 Oct 2025
https://github.com/arif-miad/heart-attack-risk-prediction
This dataset explores key factors influencing heart attack risk, such as age, cholesterol, blood pressure, and lifestyle habits, using machine learning models.
classification data data-science matplotlib ml pandas-python seaborn visualization
Last synced: 18 Aug 2025
https://github.com/avto-dev/static-references-data
Data for static references
Last synced: 05 Oct 2025
https://github.com/freddy03h/immutable-data-structure
Normalize and Merge your application's data store using Immutable.JS objects
Last synced: 05 Oct 2025
https://github.com/DefinetlyNotAI/VulnScan_Data
Logicytics VulnScan Module's Training Data and old model archive
ai data logicytics ml models pytorch sensitive-files text-processing tfidf-text-analysis training-data
Last synced: 17 Aug 2025
https://github.com/dylanhogg/cloud-products
A package for getting cloud products and product descriptions from a cloud provider website.
aws cloud-products crawler data text-processing
Last synced: 05 Oct 2025
https://github.com/garcane/income-prediction-ml
This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.
data data-science machine-learning ml numpy pandas python random-forest scikit-learn
Last synced: 08 Apr 2026
https://github.com/zediculz/block
Block is a data structure/collection that applies blockchain principles to managing data.
Last synced: 05 Oct 2025
https://github.com/pradeep221b/turbofan_predictive_maintenance
An R project for predicting turbofan engine RUL using {targets} and {tidymodels}.
data data-science-portfolio machine-learning nasa preditive-maintaince r rstats targets-pipeline tidymodels
Last synced: 04 Oct 2025
https://github.com/rishabh-agarwal/datastructuremachineproblem
Data Structure MP - Clemson University (Language C)
273 alogrithms clemson data ece structure university
Last synced: 26 Oct 2025
https://github.com/stdlib-js/array-one-to-like
Generate a linearly spaced numeric array whose elements increment by 1 starting from one and having the same length and data type as a provided input array.
array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector
Last synced: 20 Feb 2026
https://github.com/nikoshet/rust-dms-cdc-operator
The rust-dms-cdc-operator is a Rust-based utility for comparing the state of a list of tables in an Amazon RDS database with data stored in Parquet files on Amazon S3, particularly useful for change data capture (CDC) scenarios.
aws cdc data dms parquet pgdatadiff polars postgres rds rust s3 validation
Last synced: 18 Jan 2026
https://github.com/pharo-ai/data-imputers
This project contains transformers for missing value imputation
ai data data-science imputer pharo pharo-smalltalk smalltalk
Last synced: 18 Jan 2026
https://github.com/scienxlab/datasets
Some small datasets for demos, courses, testing, etc.
data open-data sample-data teaching-resources
Last synced: 09 Oct 2025
https://github.com/davidteather/scrape-crossfit-gyms
Scrapes crossfit gym data
cross-fit crossfit data data-scraping python python-requests python3 scraping
Last synced: 13 Aug 2025
https://github.com/sathyasris27/data-analysis-on-adult-smoking-patterns-in-the-uk
The aim of this analysis is to understand the smoking patterns among adults in the UK.
data data-analysis data-visualization python3
Last synced: 09 May 2026
https://github.com/helosantosdesousa/analise-previsao-de-rotatividade-ml
Final project of the Data Girls 2025 Bootcamp, analyzing employee turnover using Machine Learning. Based on the IBM HR Analytics Attrition dataset, the project identifies the main risk factors and builds predictive models (SVC and Random Forest) with up to 89% accuracy to anticipate departures and support strategic HR decisions.
analise-de-dados analise-exploratoria bootcamp ciencia-de-dados colab-notebook dados data data-analysis data-science dataanalytics dataframe eda machine-learning machine-learning-algorithms pandas python random-forest svc
Last synced: 16 Apr 2026
https://github.com/alexandregazagnes/rica-analysis
This repository contains the code to download, analyse, and model the RICA dataset from the French ministry of agriculture.
analysis argiculture business data data-analysis data-analytics food python
Last synced: 29 Apr 2026
https://github.com/vikjam/ui-policy
Unemployment policy at the state level
data government government-data
Last synced: 13 Feb 2026
https://github.com/davorg/dmp
Data Munging with Perl
book data hacktoberfest munging perl
Last synced: 21 Jan 2026
https://github.com/isaac-lal/english-arabic-dictionary
This is a dictionary website that implements a search feature which allows input for a word in either English or Arabic and returns the alternative translation.
data db javascript react web-development
Last synced: 09 Apr 2026
https://github.com/jrmedd/emojinal
An experimental API for determining emoji sentiment, based on research from Institut "Jožef Stefan", Slovenia.
data emojis sentiment user-research ux
Last synced: 19 Jan 2026
https://github.com/tiaanduplessis/country-currency-data
Data about currencies of countries
countries currencies data symbols
Last synced: 08 Aug 2025
https://github.com/genert/metis
Asynchronous data sender library
analytics asynchronous data dependency-free typescript
Last synced: 27 Jan 2026
https://github.com/anobaka/insidecollector
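This software sits between Excel and a plain record-keeping tool: you can freely create all kinds of lists, link them together using various rules, and create custom views to help you better understand your data.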
collection data excel-like list list-manager table
Last synced: 19 Jan 2026
https://github.com/R-Mahesh45/HR---Resume-Text-Classification
Text Classification for Resumes: Conducted Exploratory Data Analysis (EDA) on a vast collection of resumes. Organized the data using Bag of Words (BoW) and TF-IDF techniques. Built and evaluated multiple models, with Logistic Regression delivering standout performance. Created Word Clouds and Histograms.
data datacleaning extract-transform-load feature-extraction nlp nltk-tokenizer text-mining text-processing
Last synced: 13 Oct 2025
https://github.com/tberey/social-stocks
A Graphical Data and Analysis Tool
data data-analysis data-science data-stream data-visualization database javascript mysql mysql-database node nodejs rest rest-api social-stocks stock-market stocks ticker-data tickers trends typescript
Last synced: 21 Jan 2026
https://github.com/connectaman/deepseek-ocr-multigpu-infer
Efficient multi-GPU OCR inference framework leveraging parallel processes for accelerated token throughput and faster batch processing. Designed for scalable, high-performance optical character recognition workloads using PyTorch. Supports dynamic GPU assignment, optimized resource utilization, and easy integration for large-scale image datasets.
agentic-extraction data deepseek document-parser extraction extractor gpu image-parser llm multigpu nvidia ocr parallel-computing parser pdf-parser vlm
Last synced: 22 Jan 2026
https://github.com/geocollections/turvas
Database of peat geology
data data-visualization database estonia geology mineral-resources peat
Last synced: 05 Feb 2026
https://github.com/lisakey/datacamp-data-analyst-python-sql-projects
Several projects completed during my Data Analyst 📊 training on the DataCamp platform with Python 🐍 and SQL 🗃️. Each project addresses real-world challenges using modern analytical tools and techniques.
analysis cleaning-data data dataanalysis dataanalyst matplotlib pandas python seaborn sql transformation visuali
Last synced: 19 Apr 2026
https://github.com/simranjeet97/leetcode_practice
Practicing LeetCode problems for competitive programming
algorithms amazon coding competitive-programming data data-structures facebook google leetcode python
Last synced: 03 Aug 2025
https://github.com/stephaniehicks/flowsorted.blood.wgbs.blueprint
A Bioconductor ExperimentHub data package for flow sorted purified whole blood cell types measured using DNA methylation on WGBS platform from BLUEPRINT
bioconductor bioconductor-package bisulfite-sequencing blood data dna-methylation flowsort wgbs
Last synced: 25 Sep 2025
https://github.com/intersystems-ib/workshop-healthcare-interop
Learn the basics in HealthCare Interoperability using InterSystems IRIS for Health
data fhir health hl7 interoperability
Last synced: 14 Apr 2026
https://github.com/open-i18n/data-iso-15924
Git mirror for ISO 15924, Codes for the representation of names of scripts data
data iso iso-15924 iso15924 open-i18n scripts unicode unicode-data writing-systems
Last synced: 14 Mar 2026
https://github.com/ajsalemo/python-pandas-datalib
Testing and experimenting with some simple Pandas functionality using Flask to serve the parsed data.
csv data flask json pandas pandas-dataframe pandas-series python tabular tabular-data terminal
Last synced: 09 Apr 2026
https://github.com/akv3sic/cryptocurrency-charts
Cryptocurrency API data visualizations 📈 with Matplotlib.
cryptocurrency data data-visualization matplotlib python
Last synced: 16 Oct 2025
https://github.com/potreic/etl-fashion-trend-analysis
✨ Automate fashion trend analysis with Apache Airflow! Extract data from X & Pinterest, transform into insights, and load into PostgreSQL. Predict seasonal styles & visualize trends. 💃📊
airflow airflow-dags data data-engineering etl etl-automation etl-pipeline fashion-trends
Last synced: 27 Jan 2026
https://github.com/nicolasbizzozzero/datagenerator
Randomly generate various commonly used data
data data-generation data-generator data-science
Last synced: 18 Oct 2025
https://github.com/dannyben/datamix
DSL for manipulating tabular data
csv data data-analysis data-engineering gem ruby tabular-data
Last synced: 31 Jul 2025
https://github.com/gematik/poc-isik-patient-merge
The repository contains a proof of concept (POC). The POC demonstrates how a FHIR subscription can be used to notify about merges that have occurred within the ISIK context.
Last synced: 19 Oct 2025
https://github.com/gappeah/london-housing-price-dashboard
This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.
data data-analysis data-visualization excel visual
Last synced: 31 Jul 2025
https://github.com/visenger/prada
Profiling Datasets
cleaning data dataset profiling
Last synced: 24 Aug 2025
https://github.com/mouneshgouda/learn_dsa
This repository explores fundamental data structures and their implementations. Learn how to organize and manipulate data efficiently for various programming tasks. (Feel free to add your specific focus areas here, e.g., algorithms, interview prep)
c data queue sorting-algorithms stack structured-data
Last synced: 30 Jul 2025
https://github.com/asuozzo/medicare-data-analysis
An analysis of Medicare Part D data in Vermont
Last synced: 04 May 2026
https://github.com/joeyism/py-cifar10
This library was created to allow easy use of the CIFAR 10 data. It is a wrapper around the instructions given on the CIFAR 10 site.
cifar cifar-10 cifar10 data machine-learning machinelearning
Last synced: 30 Jul 2025
https://github.com/ishaansathaye/cpe202-datastructalgos
CPE 202 Data Structures and Algorithms Winter 2022 Freshman at Cal Poly
algorithm binary binary-search-tree data graph hash heap python queue stack structures
Last synced: 12 May 2026
https://github.com/charliecm/meteorite-landings
Data visualization of meteorite landings on Earth.
astronomy d3 data data-visualization mapbox space visualization
Last synced: 18 Apr 2026
https://github.com/velocitatem/cellviz
Cellular Automata inspired by live-data visualization, designed to handle multidimensional and high-throughput data efficiently.
cellular-automata conways-game-of-life data economics
Last synced: 29 Jul 2025
https://github.com/lemniscate-world/stratai
This project analyzes financial assets using a Hidden Markov Model (HMM) to identify different market regimes and patterns. The analysis includes calculating daily returns, rolling volatility, and volume changes, and visualizing the hidden states identified by the HMM.
ai assets data data-science data-visualization finance financial-analysis fintech hmm-model hmmlearn machine-learning trading
Last synced: 23 Oct 2025
https://github.com/dicook/tutorial_effective_data_plots
Materials for WOMBAT 2024 tutorial
data graphics inference statistics tidyverse visualisation
Last synced: 23 Jan 2026
https://github.com/purarue/git_doc_history
copy/track file history in git, with python bindings to traverse and extract history/files/lines at some date
Last synced: 17 May 2026
https://github.com/mustika-putri-m/analysis-of-sales-transactions-in-an-online-shop---london
Key questions: 1. How did sales trend over the months? 2. What are the most frequently purchased products? 3. How many products does a customer purchase in each transaction? 4. Which customer segments are the most profitable? 5. Based on the findings, what strategy could be recommended to the business to increase profit?
data data-analysis-python data-analytics data-visualization ecommerce
Last synced: 24 Oct 2025
https://github.com/ginga1402/travego_travellers
MySQL Mini Project
college-project data mysql-database
Last synced: 27 Jul 2025
https://github.com/incubrain/awesome-maharashtra-data
A collection of datasets specific to Maharashtra, India. WIP
ai artificial-intelligence data data-analysis data-science datasets maharashtra marathi
Last synced: 23 May 2026
https://github.com/patelabhi574/hotel_reservation_analysis
Analyzing data collected by a hotel to predict for the owner which segments generate the most profit, which booking patterns and trends have appeared at different times of the year over past years, and how peak-time prices on the website are set according to the availability index.
data data-visualization datamodeling looker-studio powerbi reporting sql-query sql-server
Last synced: 19 Feb 2026
https://github.com/stonecharioteer/renfield
Synchronize and Search through Hard Drives
catalogue data search storage synchronization
Last synced: 09 Feb 2026
https://github.com/capire/xtravels-java
Travel booking app using master data from xflights built with CAP Java
cap cds data federation flights java reuse
Last synced: 23 Jan 2026
https://github.com/cmda-tt/course-24-25
🎓 tech track · 2024-2025 · curriculum and syllabus 📊
d3 data datavis datavisualization es6 functional javascript programming svelte
Last synced: 28 Jan 2026
https://github.com/sixarm/sixarm_ruby_fab
SixArm.com → Ruby → Fab gem to fabricate sample data for testing
data fabrication factory fake gem mock ruby
Last synced: 24 Jul 2025
https://github.com/itrauco/robots
ai, machine learning, and robots...
ai artificial-intelligence automation big-data cloud cloud-engineering data data-engineering data-science data-science-projects m machine machine-learning ml prompts robots
Last synced: 11 Jun 2026
https://github.com/qeeqbox/data-lifecycle-management
Data Lifecycle Management (DLM) is a policy-based model for managing data in an organization
data data-lifecycle-management infosecsimplified lifecycle management qeeqbox
Last synced: 07 Mar 2026
https://github.com/aruneshbasak/python-dsa-problems-geeksforgeeks-160-days
I will upload the Python DSA problems I solve daily on GeeksforGeeks here!
algorithms-and-data-structures and data data-structures dsa python python3 structure
Last synced: 08 May 2025
https://github.com/maccccd/wsoa3029a_2444372
This website serves as an extension of my portfolio work. It focuses specifically on showcasing my understanding of D3.js, a JavaScript library used to create interactive data visualizations. The visualizations here provide insights into two types of cybersecurity attacks: phishing and ransomware.
d3js data hacking visualization
Last synced: 24 Jan 2026
https://github.com/zoekelepiri/ota_observatory
A front-end web application that provides detailed information about the boundaries and statistical data of the regions and prefectures of Greece.
backend data database spring-boot
Last synced: 06 Feb 2026
https://github.com/ariqf1/learn_data
Currently learning and building projects related to data pipelines, ETL processes, and data processing using Python. Passionate about scalable data solutions and modern data stack tools.
Last synced: 15 Apr 2026
https://github.com/stdlib-js/ndarray-base-output-policy-str2enum
Return the enumeration constant associated with an output ndarray data type policy string.
array data dtype dtypes enum javascript multidimensional ndarray node node-js nodejs policy stdlib types util utilities utility utils
Last synced: 15 Apr 2026
https://github.com/fairspec/fairspec-typescript
Fairspec TypeScript is a fast data management framework built on top of the Fairspec standard and Polars DataFrames
ckan csv data dataframe dataset excel fair json ods polars quality schema sqlite table typescript validation zenodo
Last synced: 09 Feb 2026
https://github.com/real-veersandhu/cia-country-comparison
Data analysis system on the CIA World Factbook
Last synced: 25 Feb 2025
https://github.com/qbicsoftware/research-data-management
Documentation about the life science research data management at QBiC
data data-management data-stewardship documentation hacktoberfest life-science management metadata rdm reasearch-data-management
Last synced: 30 Jan 2026
https://github.com/jinsyin/datagovernance
WeChat official account: 「数据之道」 ("The Way of Data")
data data-governance datagovernance governance
Last synced: 30 Jan 2026
https://github.com/dmitriiweb/tr-data-getter
Tool to get market data from bitstamp.ne
Last synced: 14 May 2026
https://github.com/simranjeet97/quotes-analysis
Kaggle Dataset on Quotes Analysis and Visualization With Python, Pandas and MatplotLib Using Jupyter Notebook.
data data-science datavisualization jupyter-notebook kaggle kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python quotes quotes-application
Last synced: 15 Apr 2026
https://github.com/stdlib-js/ndarray-base-assert-is-real-data-type
Test if an input value is a supported ndarray real-valued data type.
array assert base check data dtype is javascript multidimensional ndarray node node-js nodejs stdlib test types util utilities utility utils
Last synced: 31 Jan 2026
https://github.com/tee8z/noaa-oracle
NOAA data oracle that is queryable from the browser and can attest to events for a Bitcoin DLC in dlctix style
data duckdb-wasm noaa-weather parquet-files sql weather
Last synced: 17 Feb 2026
https://github.com/tupizz/data-processing-pipeline-aws
This project is a serverless application built with the Serverless Framework, TypeScript, and AWS services. It provides an enrichment service that processes contact information and enriches it with additional data.
aws data pipeline serverless typescript
Last synced: 13 May 2026
https://github.com/cyberoctane29/cyclistic-bike-share--analyzing-rider-behavior
Analyzed Cyclistic's bike-share data to uncover usage differences between casual riders and annual members. Utilized SQL and MySQL for data processing, R for visualisation, and Kaggle for collaboration. Insights will guide marketing strategies to convert casual riders into annual members.
data dataanalysis dataanalytics database rlanguage rmarkdown spreadsheet sql
Last synced: 22 May 2026