data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum,
- Last updated: 2026-07-01 00:07:35 UTC
- JSON Representation
https://github.com/danielrosehill/monetised-ghg-emissions
Calculating monetised GHG emissions for various companies based upon disclosure data
data sustainability sustainability-data
Last synced: 07 Sep 2025
https://github.com/programmer-rd-ai/moviedatascraper
Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!
beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web
Last synced: 01 Mar 2025
https://github.com/basemax/buskool.com-data
This repository contains the collected product data from the Buskool website (باسکول). The data is stored in 20k+ JSON files, each containing detailed information about products available on the website.
buskool buskoolcom data farsi information ir iran json persian
Last synced: 03 Apr 2025
https://github.com/flowsynx/plugin-postgresql
FlowSynx plugin to interfaces with PostgreSQL for CRUD operations. Supports JSONB, full-text search, and advanced query features.
data database flowsynx postgresql postgresql-database sql
Last synced: 09 May 2026
https://github.com/rapter1990/data-visualization-examples
Data Visualization Examples
data data-analysis data-visualization folium matplotlib plot plotly python seaborn visualization
Last synced: 13 Apr 2026
https://github.com/dannyben/datamix
DSL for manipulating tabular data
csv data data-analysis data-engineering gem ruby tabular-data
Last synced: 31 Jul 2025
https://github.com/vincentlaucsb/csv-data
A curated repository of real and fake CSV data for use in testing suites
Last synced: 08 Mar 2026
https://github.com/avto-dev/static-references-data
Data for static references
Last synced: 05 Oct 2025
https://github.com/wangshouh/cryptofinancedata
An ipynb file containing data acquisition of futures, options and other financial derivatives
Last synced: 05 Oct 2025
https://github.com/derrickbaruga7/python-data-analysis
This project analyzes ORU’s off-season sewer usage using Python, with `pandas` for data handling, histograms and line plots for exploration, and a `scipy`-based model for prediction. Pearson’s correlation and visualizations help reveal key trends and relationships.
analytics data data-science visualization
Last synced: 31 Jul 2025
https://github.com/gappeah/london-housing-price-dashboard
This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.
data data-analysis data-visualization excel visual
Last synced: 31 Jul 2025
https://github.com/visenger/prada
Profiling Datasets
cleaning data dataset profiling
Last synced: 24 Aug 2025
https://github.com/anuveyatsu/cloudflare-data-fabric
Cloudflare Data Fabric: Use Cloudflare's global infrastructure to build a flexible, resilient framework for data solutions.
cloudflare data data-lake fabric lakehouse mesh
Last synced: 29 Jun 2026
https://github.com/mouneshgouda/learn_dsa
This repository explores fundamental data structures and their implementations. Learn how to organize and manipulate data efficiently for various programming tasks. (Feel free to add your specific focus areas here, e.g., algorithms, interview prep)
c data queue sorting-algorithms stack structured-data
Last synced: 30 Jul 2025
https://github.com/nikoshet/rust-dms-cdc-operator
The rust-dms-cdc-operator is a Rust-based utility for comparing the state of a list of tables in an Amazon RDS database with data stored in Parquet files on Amazon S3, particularly useful for change data capture (CDC) scenarios.
aws cdc data dms parquet pgdatadiff polars postgres rds rust s3 validation
Last synced: 18 Jan 2026
https://github.com/jakakokosar/bioinformatics-serverfiles
Knowledge base for Orange3-bioinformatics add-on
bioinformatics data dictybase gene genesets go homologene markergenes ncbi serverfiles
Last synced: 16 Apr 2026
https://github.com/yessasvini23/accenture_-social-buzz-data-analytics-virtual-programme-forage
Accenture Data Analytics and Visualization - Virtual Internship
accenture content data dataanalytics excel forge socialbuzz
Last synced: 18 Jan 2026
https://github.com/kamal-singh22/ai-driven-emotional-sentiments-analysis
This project leverages machine learning to analyze and classify the emotional sentiment of textual data. The goal is to accurately identify and categorize emotions, aiding applications in customer feedback analysis, social media sentiment analysis, and mental health monitoring.
analysis artificial-intelligence data emotion nlp-machine-learning python sentiment-analysis streamlit text-classification
Last synced: 14 Apr 2026
https://github.com/codeforafrica/ckanext-followy
[ARCHIVED] A CKAN extension to show the datasets a user is following.
ckan ckan-extension ckanext-followy data dataset followy-extension open-data
Last synced: 29 Jun 2026
https://github.com/definetlynotai/vulnscan_data
Logicytics VulnScan Module's Training Data and old model archive
ai data logicytics ml models pytorch sensitive-files text-processing tfidf-text-analysis training-data
Last synced: 11 Oct 2025
https://github.com/strata/data
Tools to help you read data from a range of different data providers.
Last synced: 27 Jan 2026
https://github.com/davorg/dmp
Data Munging with Perl
book data hacktoberfest munging perl
Last synced: 21 Jan 2026
https://github.com/stdlib-js/array-base-assert-is-real-floating-point-data-type
Test if an input value is a supported array real-valued floating-point data type.
array assert base check data dtype is javascript node node-js nodejs stdlib test types util utilities utility utils valid validate
Last synced: 12 Oct 2025
https://github.com/malvfr/zap
Fill your database with fake data.
cli csv data database generator hacktoberfest mock node populate populate-database seed sql
Last synced: 21 Jan 2026
https://github.com/asuozzo/medicare-data-analysis
An analysis of Medicare Part D data in Vermont
Last synced: 04 May 2026
https://github.com/mccarthy-m-g/alda
An R data package for the book "Applied longitudinal data analysis: Modeling change and event occurrence" by Singer and Willett (2003).
data growth-curves longitudinal-data mixed-models nonlinear-mixed-models r r-package structural-equation-modeling survival-analysis time-to-event
Last synced: 19 Jan 2026
https://github.com/joeyism/py-cifar10
This library was created to allow an easy usage of CIFAR 10 DATA. This is a wrapper around the instructions givn on the CIFAR 10 site
cifar cifar-10 cifar10 data machine-learning machinelearning
Last synced: 30 Jul 2025
https://github.com/simonbernarding/ml_project_simonbernarding
This project focuses on predicting flight delays using historical data from a Tunisian airline. We analyzed patterns in airport operations and flight schedules to build a machine learning model that can forecast potential delays.
data data-science flight-delay-prediction machine-learning machinelearning prediction
Last synced: 12 Oct 2025
https://github.com/R-Mahesh45/HR---Resume-Text-Classification
Text Classification for Resumes: Conducted Exploratory Data Analysis (EDA) on a vast collection of resumes. Organized the data using Bag of Words (BoW) and TF-IDF techniques. Built and evaluated multiple models, with Logistic Regression delivering standout performance. Created Word Clouds and Histograms.
data datacleaning extract-transform-load feature-extraction nlp nltk-tokenizer text-mining text-processing
Last synced: 13 Oct 2025
https://github.com/iamgmujtaba/github-python-daily-trending
This repository provides an automated, daily-updated list of the top trending Python repositories on GitHub. Using a GitHub Actions workflow, it scrapes data from GitHub's trending page, sorts the results by total stars, and generates a clean, well-structured README file
data data-scraping github-actions tranding tranding-bot
Last synced: 13 Oct 2025
https://github.com/tberey/social-stocks
A Graphical Data and Analysis Tool
data data-analysis data-science data-stream data-visualization database javascript mysql mysql-database node nodejs rest rest-api social-stocks stock-market stocks ticker-data tickers trends typescript
Last synced: 21 Jan 2026
https://github.com/geocollections/turvas
Database of peat geology
data data-visualization database estonia geology mineral-resources peat
Last synced: 05 Feb 2026
https://github.com/olamide100/capstone-project-llm-zoomcamp
Comparative Guide Assistant
argocd data dataengineering docker grafana kubernetes llm-agent mlops-workflow rag strreamlit
Last synced: 14 Feb 2026
https://github.com/stdlib-js/ndarray-base-dtypes2signatures
Transform a list of array argument data types into a list of signatures.
api array base data dtype dtypes interface javascript multidimensional ndarray node node-js nodejs sig signatures stdlib types utilities utility utils
Last synced: 14 Apr 2026
https://github.com/yeshunit/walmart-product-customer-sales-sql-analysis
This project aims to explore the Walmart Sales data to understand top performing branches and products, sales trend of of different products, customer behaviour. The aims is to study how sales strategies can be improved and optimized. The dataset was obtained from the Kaggle
data database mysql sql walmart
Last synced: 24 Feb 2026
https://github.com/morphaxthedeveloper/yokatlas-dataset-2025
yök atlas detaylı üniversite, bölüm, puan vb. datası..
data database liste scrape universite veri yok-atlas yok-atlas-api yok-atlas-data yokatlas yokatlas-crawler yokatlas-data
Last synced: 14 Oct 2025
https://github.com/ibilalkayy/covid-tracking-app
This repository contains the code of a covid tracking app that shows the data of covid-19 on Google Map.
Last synced: 14 Oct 2025
https://github.com/open-i18n/data-iso-15924
Git mirror for ISO 15924, Codes for the representation of names of scripts data
data iso iso-15924 iso15924 open-i18n scripts unicode unicode-data writing-systems
Last synced: 14 Mar 2026
https://github.com/nxank4/loclean
⚡️ The All-in-One Local AI Data Cleaning Library. No GPU or API keys required.
automated-cleaning data data-cleaning data-engineering data-preprocessing data-science data-wrangling etl llm normalization open-source polars privacy-preserving python semantic-analysis slm structured-data
Last synced: 22 Jan 2026
https://github.com/potreic/etl-fashion-trend-analysis
✨ Automate fashion trend analysis with Apache Airflow! Extract data from X & Pinterest, transform into insights, and load into PostgreSQL. Predict seasonal styles & visualize trends. 💃📊
airflow airflow-dags data data-engineering etl etl-automation etl-pipeline fashion-trends
Last synced: 27 Jan 2026
https://github.com/data-forge-notebook/javascript-cheat-sheet
Cheat sheet that accompanies my book Data Wrangling with JavaScript
cheatsheet data data-wrangling javascript nodejs
Last synced: 15 Apr 2026
https://github.com/velocitatem/cellviz
Cellular Automata inspired by live-data visualization, designed to handle multidimensional and high-throughput data efficiently.
cellular-automata conways-game-of-life data economics
Last synced: 29 Jul 2025
https://github.com/divithraju/divith-aju-hadoop-pyspark-pipeline
This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.
apache-hadoop-framework apache-spark bigdata client data database dataengineering dataingestionframework datapreprocessing documentation ecommerce-platform hdfs pipeline project project-repository pyspark python3 software-engineering
Last synced: 27 Jan 2026
https://github.com/dicook/tutorial_effective_data_plots
Materials for WOMBAT 2024 tutorial
data graphics inference statistics tidyverse visualisation
Last synced: 23 Jan 2026
https://github.com/purarue/git_doc_history
copy/track file history in git, with python bindings to traverse and extract history/files/lines at some date
Last synced: 17 May 2026
https://github.com/mustika-putri-m/analysis-of-sales-transactions-in-an-online-shop---london
Crucial Question 1. How was the sales trend over the months? 2. What are the most frequently purchased products? 3. How many products does the customer purchase in each transaction? 4. What are the most profitable segment customers? 5. Based on your findings, what strategy could you recommend to the business to gain more profit?
data data-analysis-python data-analytics data-visualization ecommerce
Last synced: 24 Oct 2025
https://github.com/ayushverma135/accenture-data-analytics-and-visualization
This program provided practical experience in advising a hypothetical social media client as a Data Analyst at Accenture. The simulation involved cleaning, modeling, and analyzing multiple datasets, culminating in the creation of a PowerPoint deck and video presentation to communicate key insights.
accenture analytics data data-visualization forage presentation
Last synced: 19 Sep 2025
https://github.com/garcane/Income-Prediction-ML
This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.
data data-science machine-learning ml numpy pandas python random-forest scikit-learn
Last synced: 24 Oct 2025
https://github.com/ginga1402/travego_travellers
MySQL Mini Project
college-project data mysql-database
Last synced: 27 Jul 2025
https://github.com/public-health-scotland/waiting_times_clinical_prioritisation
This repository contains the Reproducible Analytical Pipeline (RAP) to produce the quarterly statistics on clinical prioritisation, part of the Stage of Treatment (SoT) publication.
data healthcare nhs public-health scotland shiny shiny-app treatment waiting-time
Last synced: 26 Jul 2025
https://github.com/CheeseWithSauce/HadithsJSONFormat
Free, authentic Hadith data from sunnah.com organized bookwise specially for Muslim devs. Includes Arabic, English, and gradings. Use freely without credits. Collections: Bukhari, Muslim, Abu Dawud, Tirmidhi, Nasa'i, Ibn Majah, Malik, Riyad as-Salihin. Expanding soon, Inshallah.
api arabic data dev free hadith islam islamic muslim open-source quran sunnah
Last synced: 24 Feb 2026
https://github.com/fairspec/fairspec-typescript
Fairspec TypeScript is a fast data management framework built on top of the Fairspec standard and Polars DataFrames
ckan csv data dataframe dataset excel fair json ods polars quality schema sqlite table typescript validation zenodo
Last synced: 09 Feb 2026
https://github.com/elissorokin/data-analyst-portfolio-rus
Это репозиторий, в котором я демонстрирую свои навыки, делюсь проектами и отслеживаю прогресс в области анализа данных и Data Science.
ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis
Last synced: 25 Feb 2026
https://github.com/giladbarnea/to
A simple CLI tool to convert and diff between JSON, YAML, TOML, JSON5 and Python collections.
conversion data data-conversion json json5 parser script terminal toml yaml
Last synced: 08 Feb 2026
https://github.com/sixarm/sixarm_ruby_fab
SixArm.com → Ruby → Fab gem to fabricate sample data for testing
data fabrication factory fake gem mock ruby
Last synced: 24 Jul 2025
https://github.com/garcane/cookie-company-visual-dashboard
This Excel-based interactive dashboard provides a comprehensive overview of the Cookie Company's sales performance and key metrics.
dashboard data data-visualization excel microsoft-excel
Last synced: 09 Feb 2026
https://github.com/3squared/smoulder
Smoulder is a really good data pipe
composition data facade-pattern forge-framework object-oriented
Last synced: 25 Apr 2026
https://github.com/danielbello7/nosql-json-database
Simple and quick database to help development process and speed
data database json json-database models nosql nosql-database nosql-json-database schema
Last synced: 09 May 2026
https://github.com/codenoid/alodokter.com-database
a Alodokter.com Database, collected by Hofesh Bot (Scrapper)
alodokter data extraction hofesh
Last synced: 18 Mar 2026
https://github.com/jhpoelen/bats
self-documenting data publication on Bat (Chiroptera) specimen
biodiversity data natural-history-collections provenance specimen
Last synced: 18 Mar 2026
https://github.com/scottleechua/data
Public datasets under CC-BY-4.0 license.
Last synced: 18 Mar 2026
https://github.com/kerlossony/nested-formdata
Nested-FormData is a Function designed to handle nested form data structures in a simplified and efficient way. It helps in managing complex form data, making it easier to work with forms that require hierarchical data
data forms javascript nested-structures nextjs reactjs typescript
Last synced: 08 Mar 2026
https://github.com/tushar2704/applied-ai-playground
This repository serves as a comprehensive collection of resources and projects for Applied Artificial Intelligence (AI). Whether you're an AI enthusiast, a data scientist, or a developer looking to explore practical applications of AI, this repository aims to provide you with valuable materials and hands-on projects to deepen your understanding.
artificial-intelligence data data-science machine-learning machine-learning-algorithms
Last synced: 12 Feb 2026
https://github.com/qeeqbox/data-security
Safeguarding your personal information (How your info is protected)
data data-security infosecsimplified qeeqbox security
Last synced: 19 Mar 2026
https://github.com/sakan811/tekken-8-jun-kazama-data-analysis-showcase
Showcase visualizations about Jun Kazama from Tekken 8
data data-analysis data-visualization game games powerbi tekken tekken-8 video-game visualization web-scraping
Last synced: 28 Feb 2026
https://github.com/aruneshbasak/python-dsa-problems-geeksforgeeks-160-days
I will upload my daily Python DSA problems solved on GeeksforGeeks and post it here!
algorithms-and-data-structures and data data-structures dsa python python3 structure
Last synced: 08 May 2025
https://github.com/garcane/london-housing-price-dashboard
This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.
data data-analysis data-visualization excel visual
Last synced: 13 Feb 2026
https://github.com/frictionlessdata/extensiondp
Extension DP (Data Package Extension Template) is a Git repository template for rapid Data Package extension development
data datapackage exchange extension format
Last synced: 13 Feb 2026
https://github.com/luminati-io/twitter-x-dataset-samples
A sample dataset of over 1000 Twitter (X) posts, extracted using the Bright Data API, ideal for trend discovery, brand monitoring, and competitive insights.
api data dataset twitter twitter-api twitter-scraper web-scraping x
Last synced: 19 Mar 2026
https://github.com/ghonimo/diode-pn-junction-characterization-psu-ece515
A detailed analysis of the I-V characteristics of a PN junction diode (1N4148) under different temperatures, utilizing Excel for graphical analysis and parameter extraction. This study was conducted as part of the ECE 515: Fundamentals of Semiconductor Devices course at Portland State University.
analysis characterization data device diode diodes excel mosfet-transistor pn-junction
Last synced: 28 Feb 2026
https://github.com/eugenedakin/polyalphabeticcipher
PolyAlphabetic Cipher
data decryption encryption polyalphabetic polyalphabetic-cipher polyalphabetic-crypto polyalphabetic-substitution xojo
Last synced: 19 Mar 2026
https://github.com/theonlybeardedbeast/exercise-data
Datasets for workout exercises
data dataset fitness health healthcare
Last synced: 20 Mar 2026
https://github.com/efler/microservice-data-bus
Data bus based on Apache Kafka and consisting of separate components [copied from own private repos]
data data-bus deduplication enrichment filtering kafka microservice mongodb postgresql redis
Last synced: 16 Apr 2026
https://github.com/skywardai/paper_gallery
Papers gallery for using LLMs ability over dataset
ai data data-science llm medicine neural-network research security
Last synced: 19 Mar 2026
https://github.com/mohamedhany99/human-voice-identifier-counter
the application developed in (KIVY) it can identify the users imported into the dataset based on the support vector machine training model it has two features ( Importing new voice - Detection to detect the human voices and count them)
android android-app android-application automation automation-framework data data-analysis data-mining data-science data-visualization datascience kivy kivy-framework machine-learning python
Last synced: 27 Mar 2026
https://github.com/stdlib-js/utils-fifo
First-in-first-out (FIFO) queue.
collection data data-structure data-structures fifo first-in first-out javascript node node-js nodejs queue stdlib structure util utilities utility utils
Last synced: 16 Apr 2026
https://github.com/chompfoods/stub-go-server
Go server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.
api branded chomp data database food go-server go-swagger grocery ingredients nutrition raw recipe-api recipes
Last synced: 17 Apr 2026
https://github.com/sadmanca/uoft-pey-coop-job-postings
Code for parsing approximately 1.8k HTML pages of UofT PEY co-op job postings (from September 2023 to May 2024) to a single sqlite3 database file.
co-op data html python singlefile sqlite sqlite3 uoft uoft-pey
Last synced: 17 Apr 2026
https://github.com/nitrosh/nitro-validate
A powerful, standalone, dependency-free data validation library for Python with extensible rules and a clean, intuitive API.
data python3 validation validation-library
Last synced: 17 Apr 2026
https://github.com/timmymatten/spikeball-stat-tracker
Spikeball stat tracking web app built with Streamlit and Python, designed to easily log and analyze player performance over multiple games.
data data-analysis data-visualization dataset matplotlib-pyplot multipage python spikeball statistics streamlit
Last synced: 18 Apr 2026
https://github.com/azmisahin/azmisahin-software-web-package-storage-nodejs-javascript-v1
storage container logical partitions management.
conventional-commits data dev-container docker library linux module nodejs package partitions queue redis storage
Last synced: 05 Apr 2026
https://github.com/buchananja/dpyp
A convenience tool for small-scale data pipelines in Python
data data-analysis data-cleaning data-engineering data-pipeline data-preprocessing data-processing data-science pandas pipeline
Last synced: 18 Apr 2026