data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-30 00:07:50 UTC
https://github.com/pchaparro/search-engine
Full-stack search engine built from YouTube videos obtained through web scraping
data opensearch python python3 react scraper scraping scraping-websites search search-engine semantic-search sentence-transformers typescript website
Last synced: 17 Apr 2026
https://github.com/trollmii/bunnybase
An efficient data management system
bunnybase data data-science data-structures database datascience python python3
Last synced: 22 Apr 2025
https://github.com/armand-sauzay/datasets
Datasets for machine learning
ai data datasets machine-learning ml
Last synced: 18 Jan 2026
https://github.com/sweta-kaundilya/power-bi-learning-projects
This repository contains completed exercises while learning Power BI
data datavisualization dax powerbi powerquery
Last synced: 27 Feb 2026
https://github.com/seif-elkateb/dataset-analysis-r
cu-boulder data data-analysis datamodeling datascience ms-ds msds434 r
Last synced: 01 Apr 2025
https://github.com/anandanraju/power_bi_dashboard_projects
The goal of this project is to provide insights into consumer behavior and purchasing trends across different platforms. By analyzing data from Amazon and other sources, we aim to uncover valuable insights that can inform marketing strategies, product development, and decision-making processes.
amazon dashboard data data-visualization healthcare powerbi project
Last synced: 11 Feb 2026
https://github.com/eudesgccunha/automated-management-panel
Automated management panel using Power BI
data data-analysis data-visualization database excel powerbi
Last synced: 04 Feb 2026
https://github.com/kunalthakur204/visualization-on-flower
🌸 Flower Dataset Visualization Visualizing patterns and relationships in flower data through charts and plots. Perfect for exploring floral characteristics and trends! 📊
data data-visualization dataanalysis flowerdataset python
Last synced: 16 Apr 2026
https://github.com/natanast/euroleaguebasketball
An R package providing data on Euroleague Basketball
Last synced: 01 Apr 2025
https://github.com/merekat/flight-delay-prediction
This project focuses on predicting flight delays using historical data from a Tunisian airline. We analyzed patterns in airport operations and flight schedules to build a machine learning model that can forecast potential delays.
aviation data data-science machine-learning machine-learning-algorithms machinelearning prediction predictive-modeling
Last synced: 08 Apr 2025
https://github.com/infinitode/pyautoplot
PyAutoPlot is an open-source Python library designed to make dataset analysis much easier by generating helpful detailed plots using matplotlib. It automatically generates appropriate plots based on the dataset you feed it.
analysis automatic csv data dataset dataset-analysis generation matplotlib pandas plots plotting-in-python plotting-library python
Last synced: 16 Mar 2025
https://github.com/ekoepplin/dbt-bigquery-core
How to load data into BigQuery (or DuckDB) and set up dbt tests for Soda Cloud monitoring
bigquery data data-quality dbt dlt duckdb gcp soda
Last synced: 06 May 2026
https://github.com/word2vect/beijing-new-house-data-visualization
Beijing New House Data Visualization for Python Programming 2024 Fall Data Visualization Lab
Last synced: 13 Jun 2026
https://github.com/iv4n-ga6l/functional-dataprocessing-pipeline
A functional data processing pipeline that accepts an input file, allows specifying both input and output formats, applies specified transformations, and produces a resulting output file.
csv data datapreprocessing excel json pandas parquet pipeline python
Last synced: 06 May 2026
https://github.com/shahsuvarli/election-voters-data-analysis-pandas
Educational project analyzing Azerbaijan voter demographics with pandas, focusing on data cleaning, grouping, and visualization.
cleaning data grouping matplotlib numpy pandas python visualization
Last synced: 12 Apr 2026
https://github.com/nouraalgohary/fifa-world-cup-data-analysis
data dataanalysis powerbi powerbi-visuals
Last synced: 19 Mar 2026
https://github.com/cpietsch/breitband
developer repo of breitband-berlin
d3js data threejs visualization
Last synced: 02 May 2026
https://github.com/kirillsemyonkin/lsd
LSD (Less Syntax Data) configuration/data transfer format.
configuration data java parsing rust
Last synced: 27 Feb 2026
https://github.com/fabsdevx/file-format-converter-handout
Data Engineering project for learning purposes. Credits to itversity
csv csv-import data data-engineering database pandas python
Last synced: 06 May 2026
https://github.com/fatihilhan42/hollywood-theatrical-market-synopsis-1995-to-2021
In this project, data on Hollywood film production companies from 1995 to 2021 was examined. Significant tables and graphs were created using data visualization algorithms, with the tickets sold divided into categories.
data data-analysis data-science data-visualization
Last synced: 23 Mar 2025
https://github.com/noedemange/orderedheatmapanalysis
OrderedHeatMapAnalysis (OHMA) is a direct data analysis framework for simultaneously visualizing and analyzing the structure of complex datasets. An optimized seriation of the rows and columns of the input data table is performed, resulting in a mapping of the whole dataset into an ordered heatmap.
analysis bi-seriation data dataanalysis heatmap r rstats seriation shiny shiny-apps
Last synced: 27 Feb 2025
https://github.com/pawamoy/keycut-data
Keyboard shortcuts data stored in YAML files
Last synced: 12 Feb 2026
https://github.com/45harry/potato_disease_classification
Potato Disease Classification - Training, REST API, and frontend for testing
cnn-classification data data-science datapreprocessing deep-learning fastapi flaskapi frontend keras restapi tensorflow
Last synced: 12 Apr 2026
https://github.com/foundationallm/.github
A platform accelerating delivery of secure, trustworthy enterprise copilots.
agent ai data enterprise generative-ai large-language-model llm ml tool
Last synced: 12 Feb 2026
https://github.com/bishtrishu/super_store_sales_dashboard
This repository contains a comprehensive sales analysis dashboard for a Superstore, created using Power BI. The objective is to contribute to the success of a business by using data analysis techniques, focusing especially on time series analysis, to provide valuable insights and accurate sales forecasting.
analytics data data-science dataanalysis dataanalyst datacleaning datascience datavisualization-project excel microsoft-azure microsoft-excel powerbi report sql
Last synced: 28 Feb 2026
https://github.com/rrwen/r-reference
Quick reference to learning R
analysis beginner data guide introduction learn r reference statistics stats syntax
Last synced: 02 Jul 2025
https://github.com/ashleydavis/brisjs-web-scraping-talk
Code to accompany my talk on web scraping for the Brisbane JavaScript meeting in September 2018
cheerio data data-acquisition data-acquisiton electron headless-browsers javascript nightmare nightmarejs nodejs web-scraping
Last synced: 06 May 2026
https://github.com/cannt39t/wylsacom-analysis-reflinks-datamining
data data-analysis data-mining python3 sql
Last synced: 13 Jun 2026
https://github.com/sumaiyyaf/british-airline-dashboard
This Tableau dashboard visualizes British Airways customer reviews, showcasing key metrics like average ratings for service, entertainment, and seat comfort. It features interactive filters for exploring ratings by aircraft type, country, and traveler type, along with trend analysis over time.
analysis dashboard data tableau visualization
Last synced: 13 Feb 2026
https://github.com/xp-forge/marshalling
Marshalling
data marshalling object-mapping xp-framework
Last synced: 02 Jul 2025
https://github.com/shantanujpk/bigdatacloud
Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.
big-data data jupyter-notebook pipeline pyspark python spark sparksql
Last synced: 07 May 2026
https://github.com/hackersandslackers/hackers-jupyter-posts
🔴 📕 Our repository of Jupyter Notebooks that serve as blog posts.
blog data data-engineering gatsbyjs jupyter jupyter-notebook python python3
Last synced: 07 May 2026
https://github.com/abhijeetdasbakshi/ecommerce-insights
A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce
airflow airflow-dags ci-cd-pipeline clustering dags data data-pipelines docker docker-compose docker-container dockerfile git great-expectations kafka mongodb pca-analysis postgresql pyspark t-sne umap-learn
Last synced: 04 Apr 2026
https://github.com/bastianolea/plebiscitos_chile
Election results data from the constitutional plebiscites of 2022 and 2023
chile comunas data elecciones politica social
Last synced: 15 Jun 2026
https://github.com/word2vect/beijing-pm2.5-data-process
Beijing PM2.5 Data Process for Python Programming 2024 Fall Data Visualization Lab 2
Last synced: 15 Jun 2026
https://github.com/ailixter/gears-dictionary
The project, which Gears Dictionary
arrays data dictionaries dictionary php struct utilities
Last synced: 19 Jul 2025
https://github.com/spiraldb/spiraldb-nemo-curator
SpiralDB connectors for NVIDIA NeMo Curator
computer-vision data data-curation data-prep data-preparation data-processing data-quality datacuration datarecipes deduplication fast-data-processing multimodal multimodal-ai nvidia-nemo physical-ai python spiral vortex
Last synced: 15 Jun 2026
https://github.com/smaug6739/data-bit
This project is a module for converting a structured dataset into a number that can be stored in a database taking up little space.
Last synced: 14 May 2026
https://github.com/pocketfullofdata/electric-vehicles-market-size-analysis
This project analyzes the growth, adoption trends, and future projections of the electric vehicle (EV) market. Using data analysis and visualization techniques, it examines key factors like sales trends, and consumer adoption to understand the evolving landscape of the EV industry.
analysis data jupyter-notebook matplotlib numpy python seaborn vscode
Last synced: 07 May 2026
https://github.com/bscript07/softuni-javascript-applications
JavaScript for Applications course at SoftUni - Oct 2023
architecture-component authentication client-side-rendering-seo data lit-html-template routing
Last synced: 15 Mar 2025
https://github.com/dvaser/heart-attact-analysis-prediction
DATA ANALYSIS
classification data data-analysis data-visualization jupyter jupyter-notebook lineer-regresyon machine-learning python regression
Last synced: 20 Jan 2026
https://github.com/chardos/get-git-data
Access git repository data in node.
Last synced: 07 May 2026
https://github.com/molinsagustin/cinedata
CineData: Group project for the Data Engineering I course at Universidad Argentina de la Empresa. It consisted of developing a relational database in Microsoft SQL Server Management Studio using the Agile Scrum methodology, which was applied from requirements gathering through to final implementation.
agile data data-modeling database diagram entity-relationship-diagram microsoft-sql-server relational-databases relational-model scrum scrum-agile sql sqlserver
Last synced: 28 Feb 2026
https://github.com/shudhanshusaurabh001/super_market-data-analysis-using-python
This project focuses on analyzing supermarket sales data using Python. The goal is to extract meaningful insights from the dataset, such as sales trends, customer purchasing behavior, and product performance.
analysis csv data insights matplotlib numpy pandas project python seaborn
Last synced: 06 Apr 2026
https://github.com/sakan811/honkai-star-rail-characters-damage-simulation
Honkai Star Rail Characters' Damage Simulation
data data-science data-visualization honkai honkai-star-rail honkai-starrail powerbi powerbi-visuals python sqlite
Last synced: 29 Jun 2026
https://github.com/tgorka/amplify-datastore-rxjs
RxJs Subjects to work with AWS Amplify and Amplify Datastore.
amplify amplifydatastore angular aws awsamplify data datastore fetch graphql graphql-client ionic rxjs scroll typescript
Last synced: 14 Feb 2026
https://github.com/bdr-pro/graphyml
A powerful, interactive Streamlit application to explore, edit, visualize, and query a graph-based database of YAML nodes — ideal for movie metadata, research articles, or structured knowledge graphs.
Last synced: 23 Jul 2025
https://github.com/murshidazher/client-side-data-storage
🚌 A workspace containing client-side data storage implementations
cache cache-storage client-side data indexeddb localstorage sessionstorage storage websql
Last synced: 02 Sep 2025
https://github.com/mikeqfu/network-rail-track-fixity-layer
This project develops a data mining tool for analysing and predicting track movements using asset data, environmental factors and track design knowledge to model key parameters and generate fixity values for the GB rail network.
data data-integration data-mining data-science information-management knowledge-discovery point-cloud rail rail-alignment rail-track track-fixity
Last synced: 02 Sep 2025
https://github.com/madhuresh2011/genai-powered-data-analytics-by-tata
I recently participated in Tata iQ's job simulation on the Forage platform, and it was incredibly useful to understand what it might be like to be on a data analytics team in an AI transformation consulting role.
chatgpt data dataanalytics eda excel gemini generative-ai internships powerpoint presentation
Last synced: 14 Feb 2026
https://github.com/danyal-faheem/project-logs-analyzer
This repo contains scripts to analyze project logs and display some charts related to the data
data data-visualization matplotlib pandas python streamlit
Last synced: 07 May 2026
https://github.com/bhenk/msdata-d
MySQL DAO
dao data data-layer database mysql mysql-database mysqli
Last synced: 07 May 2026
https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas
SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.
data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase
Last synced: 18 May 2026
https://github.com/lijesh010/roadaccidentanalysisproject
This data analysis project was completed using MS Excel, and includes the creation of a dashboard.
data data-analytics data-exploration data-visualization msexcel
Last synced: 15 Feb 2026
https://github.com/agdturner/ccg-data
A modularised Java library for processing data sets with classes for: data records; collections of data records; and identifiers.
Last synced: 12 Jan 2026
https://github.com/rubyonworld/ldpath
This is a Ruby implementation of LDPath, a language for selecting values from linked data resources.
Last synced: 12 Nov 2025
https://github.com/plateformeio/docs
The official documentation of the Plateforme framework
api app asgi async data db docs fastapi plateforme pydantic python restx services sqlalchemy
Last synced: 11 Apr 2026
https://github.com/mochsyahrizal/jkfkjabar_studycase
First Data Analytics Study Case
Last synced: 15 Feb 2026
https://github.com/stdlib-js/ndarray-vector-int8
Create a signed 8-bit integer vector (i.e., a one-dimensional ndarray).
constructor ctor data int8 javascript ndarray node node-js nodejs stdlib structure types vec vector
Last synced: 24 Apr 2026
https://github.com/zoekelepiri/winedataprediction
A machine learning application for wine quality prediction
data descriptive-statistics machine-learning-algorithms
Last synced: 05 Jan 2026
https://github.com/jigyasag18/iit-guhawati
Empower Sakhi is a data-driven platform that uses machine learning to identify women at risk of domestic violence in India. It offers confidential self-assessments, survivor stories, and emergency resources through a trauma-informed, privacy-focused web app. The project also provides NGOs with actionable insights via Power BI dashboard for support.
aiml data dataset datavisualization domestic-violence eda jupyter-notebook label-encoding machine-learning machine-learning-algorithms machine-learning-models machinelearning machinelearningprojects powerbi python python-app random-forest random-forest-classifier streamlit streamlit-webapp
Last synced: 08 May 2026
https://github.com/aidan-zamfir/the-iliad
Data analysis & relationship network for the characters of Homer's Iliad
data data-analysis dataframes networks networkx python selenium spacy webscraping
Last synced: 08 May 2026
https://github.com/sharoonjoseph321/insurance_fraud_detection
Fraud detection using the k-nearest neighbors (KNN) machine learning algorithm. Data exploration using PySpark and matplotlib.
analytics data data-science eda high-performance knn-algorithm knn-classification machine-learning matplotlib-pyplot pyspark python seaborn spark statistics
Last synced: 23 Mar 2025
https://github.com/loosenthedark/going-for-gold
A fairer, more measured look at the Tokyo 2020 Olympic medal count. Countries are ranked in relative (per capita) instead of absolute medal-winning terms. Users can toggle between two different ranking breakdowns, search for countries, contact the site owner and enable dark mode. Mobile-first React application leveraging the REST Countries API as well as a local JSON Olympic dataset. EmailJS and React Context API integration with custom form validation and error handling.
api create-react-app css data es6 fetch-api frontend html5 interactive-front-end-development javascript mobile-first olympics react react-components react-context-api react-hooks react-router react-router-dom reactjs responsive-web-design
Last synced: 07 May 2026
https://github.com/mohammad-malik/covid-visualizations-d3
This project provides a dashboard with five different perspectives on the pandemic, from patient-infection relationships to regional trends and hierarchical distributions. This was developed as part of a project for the course Data Analysis and Visualization (DS3001).
covid-19 d3 d3-visualization d3js data data-analysis data-analytics data-science visualization
Last synced: 28 May 2026
https://github.com/zsvoboda/olympics
Self service analytics of 120 years of Olympics data
analytics dashboards data datavisualization dataviz olympics open-data open-datasets opendata reports
Last synced: 08 May 2026
https://github.com/amir76717/healthai-pro
HealthAI Pro revolutionizes the healthcare experience by leveraging cutting-edge AI technologies to provide intelligent, personalized healthcare solutions to patients and medical professionals alike. This platform incorporates machine learning, natural language processing, and robust data management to enhance the quality of healthcare services.
Last synced: 31 Mar 2025
https://github.com/jamiew/void-runners-analysis
basic data analysis for the Void Runners Genesis Fleet spaceships
Last synced: 29 Mar 2025
https://github.com/abhash-rai/regression-car-price-prediction
This repository contains my first complete data science project, from web scraping for data to data preprocessing, cleaning, exploratory data analysis, model training, and deployment.
data data-science data-visualization eda exploratory-data-analysis machine-learning neural-network prediction prediction-model regression
Last synced: 08 May 2026
https://github.com/lablnet/alibaba_scraper
This is a robust web scraper that extracts data from the Alibaba website. It's multi-threaded and utilizes Playwright to efficiently scrape data from the website. This script is capable of scraping the entire Alibaba site, which would take approximately 4-6 months to complete.
alibaba data ecom mit-license open-source products scraper
Last synced: 15 Mar 2025
https://github.com/reshmaaiman/liver-patient-prediction
Liver Disease Prediction
data data-science data-visualization dataanalysis jupyter-notebook numpy pandas python seaborn
Last synced: 16 Apr 2026
https://github.com/soenneker/soenneker.attributes.mapto
A C# attribute for generic data mapping translation
attributes columns csharp data datatables dotnet mapping mapto maptoattribute object
Last synced: 02 Mar 2026
https://github.com/hemangsharma/dataanalysis
This repo contains analyses of NASDAQ data, such as a dashboard and a time series forecast
analysis data data-analysis data-visualization python
Last synced: 10 Mar 2026
https://github.com/blackhatdevx/leetcode
LeetCode Solutions by Jash Gro
algorithm algorithms dart data datastructures datastructures-algorithms dsa java javascript leetcode leetcode-java leetcode-python leetcode-solutions neetcode
Last synced: 08 May 2026
https://github.com/omarcodex/data_analysis
My repository of past and present research and data-driven projects.
data ecodev ecology science sustainability yale
Last synced: 18 Jan 2026
https://github.com/anuppm9917/data-processing-and-csv-to-json-using-python-project
This project guides you through processing data from CSV to JSON format using Python. You'll learn to cleanse, validate, and transform data with pandas, numpy, csv, and json libraries, ensuring it's ready for POS system integration. This will help improve data integrity and streamline integration.
csv-files data data-analysis data-cleaning data-collection data-transformation data-validation python3 transformation
Last synced: 16 Apr 2026
https://github.com/chaewonkong/kaggle-competitions
Kaggle competitions and lessons
Last synced: 15 Mar 2025
https://github.com/hupili/djworkshop-cuc2018
data data-journalism data-visualization
Last synced: 27 Mar 2026
https://github.com/anburocky3/cbse-schools-data
Fetch CBSE schools in seconds and use them for your data projects
cbse data data-analysis data-science grabber nextjs
Last synced: 24 Jun 2026
https://github.com/juanpablo70/pgad-assignment01
Breast Cancer Coimbra data set analysis
data data-science dataframe dataset jupyter-notebook matplotlib numpy pandas python
Last synced: 08 May 2026
https://github.com/coderjolly/spotify-api-data-analysis
The project leverages Apache Airflow for automating Spotify API data analysis, focusing on user activity. Extracting, transforming, and loading data efficiently, it provides insights via PowerBI dashboards.
airflow airflow-dags data data-engineering etl etl-pipeline microsoft-sql-server power-bi python scripting sql
Last synced: 27 Mar 2026