data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-07-03 00:07:49 UTC
https://github.com/kaijagahm/2023-10-20-stlzoo
Data Carpentry workshop, hosted at the St. Louis Zoo. Beta testing the new ecology data lesson.
data data-science ecology r rstudio
Last synced: 05 Feb 2026
https://github.com/psyteachr/sdg-data
Data relevant to the UN Sustainable Development Goals
Last synced: 09 Oct 2025
https://github.com/dhimmel/hgnc
Extracting human gene families from HGNC
data gene-families genes hgnc hugo human
Last synced: 01 May 2026
https://github.com/cburmeister/disc-golf-courses
All the disc golf courses I've played at. Maintained with http://geojson.io/.
Last synced: 21 Jan 2026
https://github.com/coderixc/rforai
Learn R Programming Language for Statistics & Data Science
artificial-neural-networks data data-science deep-neural-networks machine-learning probability quant-analyst r science
Last synced: 09 Oct 2025
https://github.com/leevilaukka/alkometriikka
Tool to search Alko database and see some fun stats about different beverages
data gh-pages svelte typescript xlsx
Last synced: 18 May 2026
https://github.com/benmizrahi/reactivejs
microservices event bus for async/sync communications
Last synced: 01 May 2026
https://github.com/udofia2/crudwithdatabase
A simple Node.js app that connects to a database.
Last synced: 08 Oct 2025
https://github.com/syedzaheerabbas/jamboree-education-linear-regression
Using data from Jamboree, this project explores the relationship between applicant profiles (GRE, TOEFL, GPA, etc.) and their chances of admission to Ivy League graduate programs. Linear regression, Ridge, and Lasso regression are employed to build predictive models and identify key factors.
data eda linear-regression python visualization
Last synced: 01 May 2026
https://github.com/ishaantek/leetcode-solutions
Collection of my solutions to LeetCode problems in Python.
algorithms coding-challenge coding-interviews competitive-programming data data-structures leetcode leetcode-python python python3
Last synced: 08 Oct 2025
https://github.com/lut-ful/ibm-capstone-project-stack-overflow-job-survey
Final project for the IBM Data Analyst Professional Certificate program.
cognos data data-analytics looker power-bi python sql statics
Last synced: 01 May 2026
https://github.com/mwiatrzyk/modelity
Data parsing and validation library for Python
data library model parsing python tool validation
Last synced: 18 Jan 2026
https://github.com/thanhleviet/vietnam_antibiotics_bidding
This repo contains bidding data for multiple drugs and antibiotics reported to the Vietnam Ministry of Health in 2015, 2016, and 2017.
Last synced: 23 Feb 2026
https://github.com/adithivs/prodigyy_ds_03
data data-visualization datapreprocessing decision-tree-classifier
Last synced: 07 Oct 2025
https://github.com/elimu-ai/analytics
📊 Android application which collects, provides and uploads learning event data
csv data data-science dataset edtech egma egra infrastructural learning-analytics
Last synced: 12 Oct 2025
https://github.com/piyushkumar2025/india-general-elections-2024_data-analyst
Analyzed election data for 540+ constituencies and 100+ parties using SQL. Calculated state-wise seat distributions, classified 30+ parties into alliances, identified top 10 candidates by EVM votes, calculated victory margins, and analyzed voting patterns for 300+ candidates to uncover key insights.
analytics data database mysql sql statistics
Last synced: 22 May 2026
https://github.com/shauryauppal/mydatatoolkit
A toolkit for data scientists to get work done faster, easier, and in a smarter way.
analytics awesome-list data data-science hacktoberfest
Last synced: 08 Jun 2026
https://github.com/abdellah-laassairi/thyroid-disease-analysis
Thyroid dataset visualization dashboard in R
dashboard data flexdashboard imputation-methods rshiny visualization
Last synced: 18 Jan 2026
https://github.com/pythoncoderunicorn/startrek
a repo for Star Trek data from Technical Manuals
data klingon-language star-trek vulcan
Last synced: 07 Oct 2025
https://github.com/iankitnegi/tableautales
"Discover my Tableau journey! Dive into data-driven stories, visualizations, and projects as I explore the power of data visualization."
data data-visualization tableau
Last synced: 21 Jan 2026
https://github.com/prajjwol09/sql_retail_analysis_project
This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.
data data-analysis datacleaning dataexploration pgadmin4 sql
Last synced: 15 Feb 2026
https://github.com/eharshit/end-to-end-vendor-insights
End-to-end analysis of vendor performance for wholesale/retail businesses, featuring data ingestion, cleaning, insights, and interactive Power BI dashboards.
analysis analysis-algorithms analytics dashboard data data-analysis datascience jupyter jupyter-notebook pandas powerbi powerbi-report retail wholesale
Last synced: 07 Oct 2025
https://github.com/drzax/light-up-brisbane
Where, what and why various public places in Brisbane are lit up.
Last synced: 19 Jan 2026
https://github.com/jigyasag18/sql-music-store-analysis
This repository contains an analysis of sales and customer data from a fictional music store. Using SQL, we explore trends in sales, popularity of artists and genres, and customer purchasing behavior. The project aims to derive actionable insights that can guide marketing strategies and inventory management decisions.
data dataanalysis dataanalytics database database-management dataset sql sqlqueries sqlquery
Last synced: 08 Jun 2026
https://github.com/vim89/flowforge
Let's be honest - most data pipeline frameworks treat types as suggestions. Config files are strings. Schemas are "validated" at runtime. Data quality is an afterthought. So let's do it differently.
archetype data data-contracts data-engineering data-pipelines data-quality data-science database dataengineering datapipeline etl etl-framework pipelines scala scalability spark spark-sql spark-streaming
Last synced: 14 Apr 2026
https://github.com/tyriek-cloud/nyc-dca-etl
Created an ETL pipeline that merges two CSV files (converted to JSON) into a Parquet file using Azure Data Factory. The data was extracted from NYC Open Data (https://opendata.cityofnewyork.us/), and I created a Blob Container within an existing storage account.
azure azure-data-factory blob-storage data data-engineering etl-pipeline
Last synced: 21 Jan 2026
https://github.com/rysteq/abstract-data-structures
This repository contains two programs written in C that implement the stack and queue ADTs.
abstract-data-structures c data queue stack
Last synced: 06 Oct 2025
https://github.com/badranalyst/data-cleaning-and-exploratory-data-analysis-project
This project uses SQL to clean and analyze a layoffs dataset. Data cleaning tasks include removing duplicates, standardizing values, and handling missing data. Exploratory analysis is performed to identify trends in layoffs across companies, industries, and time periods.
cleaning-data data database dataset mysql mysql-database sql
Last synced: 07 Apr 2025
https://github.com/luminati-io/httpx-web-scraping
Web scraping using HTTPX in Python, covering setup, advanced features, comparisons with Requests, and more.
beautifulsoup data html httpx python web-scraper web-scraping
Last synced: 13 Oct 2025
https://github.com/fatihemres/fruits
Fruit Details app built with SwiftUI, using data, models, animation, and a practical onboarding flow.
animations data models onboarding swift swiftui
Last synced: 01 May 2026
https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito
This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.
bigquery data data-analysis etl-pipeline tableau
Last synced: 21 Jan 2026
https://github.com/urvish-06/seaborn-dataset
Seaborn data sets
csv csv-files data data-science data-visualization dataset example jupyter-notebook jypyternotebook python seborn vacation
Last synced: 18 May 2026
https://github.com/flyconnectome/hnf
Documentation for the hierarchical neuron format
annotations data dotprops hdf5 mesh neurons skeleton storage
Last synced: 17 Jan 2026
https://github.com/anandvai/ai_rag_chatbot_multi_pdf_support
RAG (Retrieval-Augmented Generation) Chatbot built with Streamlit and LangChain, powered by Groq's blazing-fast LLaMA3-8B. It allows you to upload multiple PDFs, ask questions, and get precise, context-aware answers in a conversational format.
ai data data-science data-visualization data-visualizations dataengineering fastapi langchain langgraph python sql streamlit
Last synced: 01 May 2026
https://github.com/amethyst-php/catalogue-product
amethyst amethyst-catalogue-product api catalogue-product data laravel
Last synced: 20 May 2026
https://github.com/affan005-ai/tesla-stock-prediction
This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models
data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn
Last synced: 05 Oct 2025
https://github.com/pathilink/ebury_case
Technical case study in Analytics Engineering using BigQuery, focusing on dimensional modeling and SQL queries for payment and client analysis.
Last synced: 05 Oct 2025
https://github.com/rugwiroparfait/alx_sql
This repo is where I save my queries and learning materials from the ALX Data Science program.
anaconda data data-analysis jupyter-notebook sql
Last synced: 19 Aug 2025
https://github.com/deepanshkhurana/facebook-birthdays
Python script to create a .csv from Facebook's Event Data to list Birthdays.
Last synced: 14 Oct 2025
https://github.com/amethyst-php/courier
amethyst amethyst-package api courier data laravel
Last synced: 17 May 2026
https://github.com/tabarzin/dh
A collection of links to various resources on Digital Humanities
data digitalhumanities opensource
Last synced: 24 Jan 2026
https://github.com/samhollings/nhs_data_cleansing
A repo of reusable functions for cleansing data
cleansing data data-cleaning data-cleansing preprocessing pyspark python python3
Last synced: 05 Oct 2025
https://github.com/eshan-sud/secureit
A Blockchain-based Data Sovereignty Platform
blockchain data decentralised-application platform sovereignty
Last synced: 21 Jan 2026
https://github.com/albanecoiffe/jo2024_visualization
Streamlit dashboard on the Paris 2024 Olympic Games.
Last synced: 30 Apr 2026
https://github.com/corneliustanui/personal_quarto_website
This repo contains source files for my personal Quarto-based website.
data netlify programming quarto r rbind websites
Last synced: 02 Apr 2025
https://github.com/fnu-ankit/8-week-sql-challenge
My attempt at solving case studies from #8WeeksSQLChallenge
8-week-sql-challenge 8-weeks-sql-challenge 8weeksqlchallenge case-study data data-analysis data-analysis-sql data-analytics database datawithdanny sql sqlserver
Last synced: 19 Apr 2026
https://github.com/digital-media/cv_data
Datasets used for courses/tutorials at the Digital Media Department
computer-vision data image-processing images
Last synced: 14 Oct 2025
https://github.com/polyee99/kaggle-titanic-data-analytics
Jupyter notebook that predicts whether passengers survived the Titanic disaster.
data eda jupiter-notebook matplotlib numpy pandas python regression-analysis test-train-split visualization
Last synced: 05 Feb 2026
https://github.com/soenneker/soenneker.data.email.disposables
Simply adds a list of compiled disposable/temporary email domains, updated daily (if available)
csharp data disposable disposables domain dotnet email mailinator
Last synced: 29 May 2026
https://github.com/vdoninav/real_estate_analysis
real estate analysis
data data-analysis data-analysis-python data-science pandas pandas-dataframe pandas-python plotly plotly-express scipy seaborn streamlit streamlit-application streamlit-dashboard streamlit-webapp
Last synced: 12 Apr 2026
https://github.com/deliprofesor/health-score-prediction-model-the-impact-of-lifestyle-and-demographic-factors
A machine learning project predicting health scores based on lifestyle and demographic factors like age, BMI, diet, and exercise. Techniques include Random Forest, Polynomial Regression, and Linear Regression, with a focus on model performance and actionable health insights.
cross-validation data data-science data-visualization feature-engineering linear-regression machine-learning polynomial-regression random-forest
Last synced: 10 Apr 2025
https://github.com/arush-codes/lgmvip-data-science-task-1
data data-science iris-classification lgmvip virtual-internship
Last synced: 14 Oct 2025
https://github.com/mominurr/fire-gas-leak-detection-system
A real-time fire prevention system integrating IoT sensors and computer vision to trigger evacuations.
ai computer-vision data datascience machine-learning ml python yolo
Last synced: 27 Jan 2026
https://github.com/spajai/etl-sharepoint-data-uploader-pipeline
Custom Python script to pull specific data from a source and upload it to Microsoft SharePoint
data etl etl-pipeline microsoft microsoft365 python3 sharepoint sharepoint-online
Last synced: 11 Nov 2025
https://github.com/stupidcucumber/elephant-crawler
System for mining texts from websites.
data data-mining-python python
Last synced: 25 Apr 2026
https://github.com/rafie-b/data-analytics
Activities of Data Analysis.
apache-spark api aws business-analytics data data-analytics data-science database dataframe jupyter-notebook python scikit-learn sql
Last synced: 14 Apr 2026
https://github.com/stdlib-js/array-base-last-index-of-same-value
Return the index of the last element which equals a provided search element according to the same value algorithm.
array data find generic index javascript locate node node-js nodejs same scan search stdlib structure types
Last synced: 13 Apr 2026
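The "same value algorithm" named in the entry above refers to ECMAScript SameValue comparison, in which NaN equals NaN and +0 is distinct from -0. A minimal Python sketch of the idea follows. The function names `same_value` and `last_index_of_same_value` are hypothetical and are not the stdlib-js API, and returning -1 when nothing matches is an assumption made for illustration.

```python
import math


def same_value(a, b):
    """Approximate ECMAScript SameValue semantics for Python values."""
    if isinstance(a, float) and isinstance(b, float):
        # Unlike ==, SameValue treats NaN as equal to NaN.
        if math.isnan(a) and math.isnan(b):
            return True
        # Unlike ==, SameValue distinguishes +0.0 from -0.0.
        if a == 0.0 and b == 0.0:
            return math.copysign(1.0, a) == math.copysign(1.0, b)
    return a == b


def last_index_of_same_value(items, search):
    """Scan from the end; return the last matching index, or -1 (assumed)."""
    for i in range(len(items) - 1, -1, -1):
        if same_value(items[i], search):
            return i
    return -1
```

For example, `last_index_of_same_value([0.0, -0.0, 0.0], -0.0)` finds the negative zero at index 1, where a plain `==` scan from the end would stop at index 2.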
https://github.com/shef4793/hackerrank-sql-challenges-solutions
The solutions of all SQL challenges on HackerRank executed on either MySQL or MS SQL environment.
data data-engineering hackerrank hackerrank-challenges hackerrank-solutions mssql mssql-server mysql problem-solving solutions sql sql-challenges sql-query
Last synced: 11 Mar 2026
https://github.com/jpcurada/exploralytics
A python package for creating intermediate plotly visualizations
data eda plotly python visualization
Last synced: 05 Feb 2026
https://github.com/instagram-automations/scrape-data-from-instagram
Toolkit for scraping data from Instagram and automating tasks
api automation bot data doker instagram nodejs playwright procy scrape selenium toolkit
Last synced: 14 Oct 2025
https://github.com/desininja/food-delivery-realtime-data-analysis
ETL Pipeline in AWS for Real Time Data Analysis
airflow data data-engineering emr-cluster etl kinesis kinesis-strea real-time redshift
Last synced: 15 Oct 2025
https://github.com/ankitrai259/sales_insight_dashboard
Sales Insight: using SQL for data cleaning and Power BI for building an interactive dashboard
dashboard data data-visualization datacleaning postgresql powerbi sql
Last synced: 17 Mar 2025
https://github.com/roshaka/samplr
Samplr is a Python decorator for selecting a subset of items from a list, with options for customisation and informative console printouts.
data data-analysis data-engineering decorators list python sampling
Last synced: 14 Jan 2026
https://github.com/science-analyse/clv_model
customer lifetime value prediction
banking banking-applications clv clv-analysis data data-science machine-learning
Last synced: 15 Oct 2025
https://github.com/poojaharihar03/wellness-cities-case-study
A case study in data analysis of city health centers
Last synced: 11 Jun 2026
https://github.com/quonverbat/ordner
A simple, customizable and cross-platform data tracker.
data datatracker javafx management
Last synced: 07 Jul 2025
https://github.com/sorairolake/japanese-era-dataset
日本の元号のデータセット / Dataset of Japanese era names
data dataset date japanese-calendar japanese-era json toml wareki yaml
Last synced: 01 May 2026
https://github.com/stoyank7/football-prediction
This is my Semester 7 Project for my "AI for Society" minor at Fontys University of Applied Sciences.
ai betting data football machine-learning university-project
Last synced: 25 Mar 2025
https://github.com/paezha/bsantiago
A data package with the results of a travel and well-being survey conducted in Santiago in 2016
data equity package r santiago survey travel well-being
Last synced: 18 Mar 2025
https://github.com/jigyasag18/project-diwali-sales-analysis
This project analyzes retail sales data during the Diwali festival using exploratory data analysis (EDA) to identify buyer demographics and product preferences. The findings reveal that the primary purchasers are married women aged 26-35 from Uttar Pradesh, Maharashtra, and Karnataka, working in IT, Healthcare, and Aviation.
analysis data datapr datapro eda jupyter-notebook python realtimedata
Last synced: 01 Jun 2026
https://github.com/welli7ngton/mysql-server-formacao-alura
Repository for storing SQL code written for courses in Alura's MySQL Server training track
Last synced: 19 Apr 2026
https://github.com/2kabhishek/pybank
Data Analysis for the silliest Bank 💰🏦
csv data data-science learning pandas python topic1 topic2
Last synced: 12 May 2026
https://github.com/j-sephb-lt-n/personal-projects
A history of my personal projects and professional development
ai api auth cloud data llms personal-development web
Last synced: 24 Jan 2026
https://github.com/poissonconsulting/klexdatr
An R package of data from the Kootenay Lake Exploitation Study
cran data fish kootenay-lake rstats
Last synced: 16 Oct 2025
https://github.com/tyriek-cloud/statistical-work-sample
This study examines whether having siblings is independent of holding an opinion on whether patients with incurable diseases should be allowed to die.
analysis data spss statistics t-test
Last synced: 22 Jan 2026
https://github.com/fcoagz/rate-reader-epv
pyDolarVenezuela API utilities, image processing (EnParaleloVzla) to extract currency exchange rates from specific platforms, validating content against expected patterns
data finance json processing-images pydolarvenezuela
Last synced: 14 Jun 2025
https://github.com/fatihilhan42/nba-players-data-1950-to-2021
This project examines NBA player data from 1950 to 2021. Season, height, performance, average points, teams, and positions played were loaded from CSV files, and key tables and graphs were then produced through data cleaning and visualization.
data data-analysis data-engineering data-science data-visualization
Last synced: 16 Oct 2025
https://github.com/eshitakundu/disease-outbreak-predictor
Disease Outbreak Predictor: A Streamlit-based web application for predicting diabetes, heart disease, and Parkinson's disease using machine learning models.
data data-science disease-prediction healthcare-application jupyter-notebook machinelearning ml notebook prediction python streamlit streamlit-webapp
Last synced: 01 May 2026
https://github.com/prakhargpt/sql-data-warehouse-project
Building a data warehouse with SQL Server, including ETL processes, data modelling and analytics.
analytics data data-analysis data-cleaning data-engineering data-engineering-pipeline data-lakehouse data-science data-warehouse etl etl-job etl-pipeline medallion-architecture sql sql-server
Last synced: 12 Jun 2026
https://github.com/vatshayan/pokemon-analysis
Visualization, Analysis & Predicting the accuracy of finding Pokemon power, attack & speed through Machine Learning
artificial-intelligence data data-analysis data-science data-visualization dataset machine-learning machine-learning-algorithms pokemon scikit-learn
Last synced: 30 May 2026
https://github.com/gcoronelc/cepsuni-disbd-64505
Database Modeling Workshop with Gustavo Coronel
data database databases db2 db2-database modeling oracle oracle-database relational-database relational-database-design relational-databases relationships sql sql-server
Last synced: 02 May 2026
https://github.com/soenneker/soenneker.cloudflare.origincerts.thumbprints
The current Cloudflare origin certificate thumbprints
cloudflare csharp data dotnet origincerts thumbprint thumbprints
Last synced: 23 Apr 2026