data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum,
- Last updated: 2026-06-29 00:07:49 UTC
- JSON Representation
https://github.com/shadmanshaikh/data-analysis-and-ml-work
All of my work in Data Analysis and Machine learning
analytics artificial-intelligence data machine-learning
Last synced: 05 Jul 2025
https://github.com/veivel/f1-sentiment-analysis
An entiment analysis project on tweets about Formula 1. To be reworked.
data f1 nlp-library nlp-machine-learning
Last synced: 04 Jul 2025
https://github.com/white-gecko/lineage-dump
RDF dump of the device information from the lineage wiki
Last synced: 28 May 2026
https://github.com/quangandrei1003/france_air_pollution_pipeline
End-to-end air pollution data pipeline for French metropolitan cities using Airflow, Python, dbt, BigQuery.
airflow bigquery data data-analytics data-engineering data-modeling data-visualization dbt docker etl pandas python terraform
Last synced: 13 Apr 2026
https://github.com/nikolatechie/spotify-playlist
Data pipeline that fetches recently played songs in the past 24 hours using Spotify API and saves the data in the SQLite database. Scheduled to run daily using Apache Airflow.
apache-airflow api data data-engineering python spotify sql sqlite
Last synced: 30 Apr 2026
https://github.com/zcebeci/odetector
Outlier Detection Using Cluster Analysis
anomaly-detection cluster-analysis clustering clustering-methods data datapreparation datapreprocessing exception-handling fcm fraud-detection fuzzy-clustering novelty-detection outlier-detection outlier-removal outliers partitioning pcm r surprise-exploration
Last synced: 29 Oct 2025
https://github.com/rohitblaze10/netflix_analysis_using_tableau
The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.
data data-analysis data-science data-visualization netflix tableau
Last synced: 04 Feb 2026
https://github.com/frer0t/userverse
creating api for data analysis
data data-analytics spring-boot users
Last synced: 12 Apr 2026
https://github.com/j-hagedorn/locals
:globe_with_meridians: A collection of tidied, neighborhood-level public datasets
address-dataset census-data census-tract data neighborhood social-sciences
Last synced: 03 Feb 2026
https://github.com/pdoup/enegry
Time-Series dataset combining multiple sources to explain the broader Greek energy market
data dataset day-ahead-auction energy-markets exploratory-data-analysis forecasting futures-market greek-energy-market renewable-energy time-series-data weather-data
Last synced: 07 May 2025
https://github.com/grace-mengke-hu/redditpushshiftapi
This package is for collecting Reddit dataset and organize the data in Mongo Database
Last synced: 13 Jun 2025
https://github.com/thewillyhuman/willyos-java
willyOS for java developers
collections data data-structures java os structures
Last synced: 12 Jun 2025
https://github.com/tttardigrado/fq
Graffs for the MEDEA project
bokehplots data data-science dataanalysis pandas physics python3
Last synced: 12 Apr 2026
https://github.com/luciarevaliente/shell_script_data_cleaning
This project focuses on cleaning and processing datasets using Shell scripts. It is part of the Fundamentals of Informatics course (2022-23) and involves handling movie and show data to create cleaned and filtered datasets for further analysis.
data data-cleaning shell-script
Last synced: 04 Feb 2026
https://github.com/sakshamarora07/blinkit-sales-report-power-bi
This dashboard provides Blinkit with insights to optimize its grocery delivery operations and understand customer preferences. It evaluates sales trends, outlet performance, and item categories to identify key areas for improvement. The interactive visuals allow detailed exploration of sales distribution, customer ratings, and product popularity.
data data-science dataanalytics datavisualization excel powerbi sql
Last synced: 08 Jan 2026
https://github.com/koltyakov/pgcopy
🐘 PostgreSQL data migration tool
cli data database golang migration postgresql sync
Last synced: 29 Apr 2026
https://github.com/cassandrajm/reddit-dashboard
INTERACTIVE DASHBOARD: Analyzing Political Discourse on Reddit: A Multi-Faceted NLP Approach to Toxicity, Bias, and Political Stance
capstone data data-analysis data-science politics python reddit
Last synced: 09 Apr 2025
https://github.com/denisecase/620-mod6-web-scraping
Notes on how to get started scraping content from the web
beautifulsoup4 data mining python
Last synced: 11 Apr 2025
https://github.com/musamairshad/dsa-python
This repository contains all the material related to Data Structures and Algorithms implemented in Python.
algorithms data datastructures efficiency python searching-algorithms sorting-algorithms
Last synced: 25 Mar 2025
https://github.com/sajjad425/missingvalue
This repository provides a guide on handling missing values in Python, covering identification methods, imputation techniques (mean, median, mode, fill, interpolation), advanced methods (KNN, multiple imputation), and best practices. It includes practical examples for both numerical and categorical data.
data data-analysis-python data-science missing-value-handling missing-value-imputation
Last synced: 04 Apr 2025
https://github.com/tazeenrashid/orders-analysis-using-python-sql-server-and-tableau
I sourced some Orders data through Kaggle; did EDA using Python and then fetched some insights out of cleaned data using SQL Server (SSMS). Then, I built a Tableau Dashboard for some visual insights. Have a look and share your feedback!
analytics data eda jupyter-notebook python sql tableau
Last synced: 29 Apr 2026
https://github.com/mukhlishga/data-engineering
all about data engineering
airflow beam data data-engineering pyspark python
Last synced: 13 Apr 2026
https://github.com/keminghe/osu
Unofficial and publicly-available NPM data-package about The Ohio State University.
college data majors ohio-state organizations public students university unofficial
Last synced: 06 Jan 2026
https://github.com/entropyorg/p5-data-testimage
:notebook::camera: interface for retrieving test images
Last synced: 29 May 2026
https://github.com/luminati-io/ZoomInfo-dataset-samples
A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.
b2b business companies data data-extraction database dataset datasets web-scraping zoominfo
Last synced: 09 Apr 2025
https://github.com/filipnet/infoscreen
Arduino subscribes values by MQTT and view info on an OLED I2C display
arduino data display i2c mqtt oled-display-ssd1306 visualization weather weatherstation
Last synced: 12 Apr 2026
https://github.com/thingston/extractor
Collection of PHP classes to extract data from HTML pages.
Last synced: 14 Jan 2026
https://github.com/steveanik/kestra
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
data data-engineering data-integration data-pipeline data-quality elt etl low-code orchestration pipelines scheduler workflow workflow-engine
Last synced: 06 Jan 2026
https://github.com/mukul273/spring-data-rest-jpa-demo
Spring Data Rest JPA Demo
data jpa rest spring spring-boot spring-mvc
Last synced: 20 Apr 2026
https://github.com/alextanhongpin/node-github-api
:page_with_curl: sample github api queries with nodejs for scraping purposes
Last synced: 06 May 2026
https://github.com/lightdash/quickstart-github
Instant analytics for Github
analytics business-intelligence data dbt github
Last synced: 14 Sep 2025
https://github.com/shubhamsoni98/project_using_knn
This project applies the K-Nearest Neighbors (KNN) algorithm to predict iPhone purchases based on customer data. Using features like age, salary, and previous purchase behavior, the KNN model classifies customers into buyers and non-buyers.
anaconda analytics data data-science eda knn knn-classification machine-learning-algorithms predict project python scikit-learn tableau
Last synced: 03 Jan 2026
https://github.com/idhruvs/angular4-smart-table-demo
Angular4 Smart Table Demo Project
angular4 data tables typescript
Last synced: 21 Apr 2026
https://github.com/blackroad-os-inc/blackroad-portal
BlackRoad Portal — unified search routing to 30+ BlackRoad services.
blackroad cloudflare-workers data search
Last synced: 04 Apr 2026
https://github.com/dms-codes/scrape_tripsantai
Trip Santai Tour Data Scraper This Python script is a web scraper designed to extract and collect information about tours from the Trip Santai website. It utilizes the requests library to fetch web pages, BeautifulSoup for parsing HTML, and writes the collected data to a CSV file.
beautifulsoup4 data python requests scraper webscraper
Last synced: 21 May 2026
https://github.com/johnelliott/wb-web
Moved —> https://github.com/johnelliott/waybot
arduino browser data iot raspberry-pi web
Last synced: 12 Apr 2026
https://github.com/ludwing-mj/manipulacion_ej
Ejercicio utilizado en la seccion numero ocho del manual para ejemplificar las herramientas proporcionadas por el tydyverse para la manipulacion de datos.
data manipulate-data package r
Last synced: 01 Apr 2025
https://github.com/ipstack/wizard
Wizard for create ipstack databases
composer data geo geoip id-database info ip ipstack ipstack-wizard php wizard
Last synced: 29 Apr 2026
https://github.com/darkogamerz/dhis2heat
A Comprehensive data management and Health Equity Assessment and Analysis platform that fetches data from DHIS2, optimize, calculate, clean and visualize inequality data.
analytics data data-science dhis2 equality equity health heat inequality r shiny shinydashboard visualization
Last synced: 01 Apr 2025
https://github.com/eby8zevin/android-intent
Intent & Bundle - Android Studio
android android-development android-studio bundle data intent java xml
Last synced: 03 Sep 2025
https://github.com/merekat/flight-delay-prediction
This project focuses on predicting flight delays using historical data from a Tunisian airline. We analyzed patterns in airport operations and flight schedules to build a machine learning model that can forecast potential delays.
aviation data data-science machine-learning machine-learning-algorithms machinelearning prediction predictive-modeling
Last synced: 08 Apr 2025
https://github.com/microsoftcloudessentials-learninghub/demosscenarios-techtalks
This repository showcases demonstrations and scenarios using Microsoft Cloud technologies. Please note that these demos are intended as a guide and are based on my personal experiences.
ai analytics azure copilot data data-science fabric m365 microsoft-general ml powerapps powerbi privatebot security sharepoint
Last synced: 14 Mar 2026
https://github.com/shahsuvarli/election-voters-data-analysis-pandas
Educational project analyzing Azerbaijan voter demographics with pandas, focusing on data cleaning, grouping, and visualization.
cleaning data grouping matplotlib numpy pandas python visualization
Last synced: 12 Apr 2026
https://github.com/mumtaz4118/nlp-course
Programming Assignments and Lectures for Stanford's CS 224: Natural Language Processing with Deep Learning
course data data-analysis data-analytics data-science data-visualization deep-learning education machine-learning natural-language-processing neural-network transfer-learning
Last synced: 24 Nov 2025
https://github.com/rishitabansal9/adult-census-income-prediction
This is a project made for data analysis and income prediction using random forest classifier with 91% accuracy.
data data-analysis data-science feature-engineering random-forest-classifier
Last synced: 25 Mar 2025
https://github.com/juangesino/research-project
Course files for Research Project @ University of Amsterdam
data data-science economics stata
Last synced: 02 Jan 2026
https://github.com/atiqurcode/scrap-spec
Scrap data from the html to table html code / json
data html-table json-data scarp
Last synced: 05 Feb 2026
https://github.com/bkestelman/dasy-ml
DaSy DataSynthesizer - Create synthetic data with desired statistical properties for machine learning research.
data data-science machine-learning
Last synced: 14 Jan 2026
https://github.com/noedemange/orderedheatmapanalysis
OrderedHeatMapAnalysis (OHMA) is a direct data analysis framework allowing to simultaneously visualize and analyze the structure of complex datasets. An optimized seriation of rows and columns of the input data table is performed, resulting in a mapping of the whole dataset into an ordered heatmap.
analysis bi-seriation data dataanalysis heatmap r rstats seriation shiny shiny-apps
Last synced: 27 Feb 2025
https://github.com/vbhatsaccnt/retail-strategy-and-analytics-optimization-of-control-stores-for-sales-enhancement
In this project, we aim to optimize the performance of retail chain stores by establishing control stores based on their performance compared to selected trial stores. By leveraging data analytics and strategic insights, we seek to enhance sales revenue and drive growth within the retail chain.
customer-segmentation data data-science risk-analysis
Last synced: 13 May 2026
https://github.com/mladen/ds-ml-and-ai-experiments
:1234: My Data Science, Machine learning and Artificial Intelligence experiments and projects
data data-mining data-science datascience dataset
Last synced: 09 Jun 2026
https://github.com/woctezuma/recent-sales-data
Data available to estimate sales of Steam games during release week.
Last synced: 05 Feb 2026
https://github.com/matthewgferrari/covid-contextualizer
A Coronavirus Contextualizer for the USA
Last synced: 26 Jun 2026
https://github.com/vatshayan/pokemon-analysis
Visualization, Analysis & Predicting the accuracy of finding Pokemon power, attack & speed through Machine Learning
artificial-intelligence data data-analysis data-science data-visualization dataset machine-learning machine-learning-algorithms pokemon scikit-learn
Last synced: 30 May 2026
https://github.com/preranarao03/madhav_e-commerce_dashboard
This repository features the Madhav_E-Commerce_Dashboard built with Power BI. It provides interactive visualizations for analyzing e-commerce sales performance, product categories, customer segments, and geographic data, aiding in data-driven business decisions.
Last synced: 30 Jan 2026
https://github.com/yuvrajsaraogi/car-price-prediction-with-machine-learning
The price of a car depends on a lot of factors like the goodwill of the brand of the car, features of the car, horsepower and the mileage it gives and many more. Car price prediction is one of the major research areas in machine learning. So, if you want to learn how to train a car price prediction model then this project is for you.
car-price-prediction-with-machine-learning data data-science deep-learning deep-neural-networks engineer github learning machine-learning mini-project natural-language-processing prediction predictive-modeling project python3 sql
Last synced: 15 Apr 2026
https://github.com/rahult18/atmo-flow
AtmoFlow is a robust data engineering pipeline built on Google Cloud Platform (GCP) that processes and analyzes weather and air quality data in both batch and streaming modes
airflow data data-modeling data-science data-visualization dataengineering gcp-bigquery gcp-cloud-composer gcp-cloud-functions pyspark
Last synced: 23 Jun 2026
https://github.com/elijah-1994/pre-process-e-commerce-dataset
Importing, Cleaning, and Pre-Processing E-Commerce Data for Analysis Using MySQL.
analytics data dataanalytics datacleaning dataprocessing mysql mysql-database sql
Last synced: 11 Mar 2025
https://github.com/lamouchi-bayrem/data-matrix-scanner
A dual-interface tool that leverages AI to **detect and decode QR codes and Data Matrix codes** from images using computer vision
data datamatrix-scanner decoder flask qrcode scanner tkinter-gui webapp
Last synced: 30 Apr 2026
https://github.com/turner-kendall/turner-kendall
Turner Kendall - dev, opps, sec.
config data github-config go rust security
Last synced: 31 Oct 2025
https://github.com/itsmeyogesh22/Solved-8-Weeks-SQL-Challenge-Correct-Solutions
Included in Serious SQL Virtual apprenticeship program, this repository contains solutions for all eight different case studies crafted by Danny Ma. For more information please visit: https://8weeksqlchallenge.com/
8weeksqlchallenge data dataanalytics datawithdanny postgresql sql sqlserver-2022 t-sql
Last synced: 29 Aug 2025
https://github.com/amazingandyyy/dataviz
amazingandyyy data data-visualization
Last synced: 08 Jan 2026
https://github.com/jameshenderson12/chatbot-utils
Generic data and elements that can be reused or repurposed for chatbot development.
boilerplate chatbot data development elements intents template utterances
Last synced: 04 Mar 2026
https://github.com/omarsaad21/it-salary-eda
A python EDA project implemented on IT department salaries data we made data exploration and made data visulization for some questions on dataset
data explotary-data-analysis juypter-notebook numpy pandas python visualization
Last synced: 30 Apr 2026
https://github.com/anand-sony/mttr-dashboard
Streamlit dashboard for MTTR analysis with shift-wise loss insights and machine-level downtime tracking.
analytics business-analytics dashboard data python statistical-analysis
Last synced: 30 May 2026
https://github.com/nafisalawalidris/nafisalawalidris
Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest for blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and Bitcoin converge.
artifical-intelligence bitcoin config data data-science developer github-config github-pages machine-learning
Last synced: 16 May 2026
https://github.com/shudhanshusaurabh001/super_market-data-analysis-using-python
This project focuses on analyzing supermarket sales data using Python. The goal is to extract meaningful insights from the dataset, such as sales trends, customer purchasing behavior, and product performance.
analysis csv data insights matplotlib numpy pandas project python seaborn
Last synced: 06 Apr 2026
https://github.com/amethyst-php/activity
Someone just did something, should we save who did this and when?
activity amethyst amethyst-package api data laravel
Last synced: 17 May 2026
https://github.com/docuvesta/shiseido_skincare_usa_fr_infographics
Découvrir les indicateurs de performance liés aux avis d'un sérum très réputé de la marque de beauté luxe japonaise Shiseido. Cette comparaison concerne les sites web USA et FR 💯
analysis automatisation data datanalysis graphique infographie pandas plotly python skincare soins
Last synced: 11 Apr 2026
https://github.com/plnech/never2late
Never 2 Late - a reinterpretation of Everest Pipkin's 'i've never picked a protected flower'
dada dada-science data generative-art glitch-art installation nlp poetry spacy vector-similarity wallpaper
Last synced: 10 Jun 2025
https://github.com/dcmox/algorithms
General purpose data structures and algorithms
algorithms binary data hash linked list structures tree
Last synced: 10 Jun 2026
https://github.com/nolanbconaway/rollercoaster-tycoon-data
Every roller coaster I have built in RCT2 for iPad
Last synced: 24 Mar 2025
https://github.com/davitshahnazaryan3/data-management-web
Explore datasets with ease using taxonomy filtering, allowing you to quickly identify the specific experimental datasets you need and download them effortlessly
data environmental experiments filtering-data seismic taxonomy
Last synced: 17 Jan 2026
https://github.com/fehmitahsindemirkan/web-scrapper
Professional and high performance web scraping project.
data ecommerce emailsender fileexplorer logging python web webscraping
Last synced: 10 Jan 2026
https://github.com/mightymetrika/holi
holi: Higher Order Likelihood Inference Web Applications
data data-science r statistics
Last synced: 10 Feb 2026
https://github.com/stdlib-js/ndarray-vector-int8
Create a signed 8-bit integer vector (i.e., a one-dimensional ndarray).
constructor ctor data int8 javascript ndarray node node-js nodejs stdlib structure types vec vector
Last synced: 24 Apr 2026
https://github.com/jamiew/void-runners-analysis
basic data analysis for the Void Runners Genesis Fleet spaceships
Last synced: 29 Mar 2025
https://github.com/priyam-hub/covid-19-data-analysis
Explore COVID19 case numbers and deaths related to Coronavirus outbreak 2019/2020 in Pandas and in Jupyter notebook
analysis data data-visualization jupyter-notebook machine-learning python
Last synced: 08 Jun 2026
https://github.com/lablnet/alibaba_scraper
This is a robust web scraper that extracts data from the Alibaba website. It's multi-threaded and utilizes Playwright to efficiently scrape data from the website. This script is capable of scraping the entire Alibaba site, which would take approximately 4-6 months to complete.
alibaba data ecom mit-license open-source products scraper
Last synced: 15 Mar 2025
https://github.com/nmelgar/healthy_child_dataviz
Data visualization project to analyze what a healthy child is.
analysis data data-analysis data-science data-visualization dataviz research tableau visualization
Last synced: 23 Feb 2026
https://github.com/sadratehranian/data-collection-and-machine-learning
create a model using logistic regression to predict whether the fire alarm of a smoke detector should sound or not. Second, predicts whether an electric drive in a production plant may be faulty or not.
data data-analysis data-science datacollection logistic-regression machine-learning ml nn
Last synced: 05 Jan 2026
https://github.com/hemangsharma/dataanalysis
This repo contains analysis like a dashboard and time series forecast on NASDAQ data
analysis data data-analysis data-visualization python
Last synced: 10 Mar 2026
https://github.com/mmaithani/kaggle-projects
Collection of all the resources from competition, kernal And data section also all the magic code i have been using to get most of out of a problem
computer-vision data data-science image-processing machine-learning python
Last synced: 30 Apr 2026
https://github.com/vidushibhadana/covid19-data-exploration-using-sql
Deployed diverse SQL techniques to analyze COVID-19 data for an improved understanding of pandemic's regression.
data database database-management sql
Last synced: 19 Aug 2025
https://github.com/fatihilhan42/olympics-data-analysis-with-python
I will examine the Data Analysis of the Olympics between 1896-2016, which we have done on Python.
data data-science dataanalysis datavisualization jupyter-notebook olympics python
Last synced: 30 Apr 2026
https://github.com/omarcodex/data_analysis
My repository of past and present research and data-driven projects.
data ecodev ecology science sustainability yale
Last synced: 18 Jan 2026