data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-07-03 00:07:49 UTC
https://github.com/rishitabansal9/adult-census-income-prediction
A project for data analysis and income prediction using a random forest classifier with 91% accuracy.
data data-analysis data-science feature-engineering random-forest-classifier
Last synced: 25 Mar 2025
https://github.com/Coko7/vegapull-records
Cards dataset for One Piece TCG
data one-piece one-piece-card-game one-piece-tcg tcg
Last synced: 28 Apr 2025
https://github.com/nitheshgoutham/sentinel-2-data-processing-for-pichavaram-mangrove-forest-using-cnn
Image Processing using CNN
cnn cnn-classification cnn-keras data deep-learning matplotlib ploty python seaborn-python visualization
Last synced: 29 Jun 2026
https://github.com/austinhartzheim/career-fair-backend
Backend for ECS Career Fair app
Last synced: 13 Apr 2026
https://github.com/rosette-api/mock-data
Mock data that is used for unit testing of the Babel Street Analytics bindings
data entity-extraction entity-level-sentiment entity-linking entity-relationship entity-resolution language-detection machine-learning mock-data morphology natural-language-processing nlp relation-extraction sentiment-analysis test-framework testing text-mining text-processing tokenization
Last synced: 04 Mar 2026
https://github.com/jstafford5380/provausio.testing.generators
Generate fake data for testing and/or mocking
data fake-data generator testing
Last synced: 14 Jan 2026
https://github.com/blackroad-os-inc/blackroad-portal
BlackRoad Portal — unified search routing to 30+ BlackRoad services.
blackroad cloudflare-workers data search
Last synced: 04 Apr 2026
https://github.com/romaintailhurat/dagster-playground
Playing with Dagster 🐙
Last synced: 14 Jun 2025
https://github.com/deliprofesor/health-score-prediction-model-the-impact-of-lifestyle-and-demographic-factors
A machine learning project predicting health scores based on lifestyle and demographic factors like age, BMI, diet, and exercise. Techniques include Random Forest, Polynomial Regression, and Linear Regression, with a focus on model performance and actionable health insights.
cross-validation data data-science data-visualization feature-engineering linear-regression machine-learning polynomial-regression random-forest
Last synced: 10 Apr 2025
https://github.com/vatshayan/youtube-user-analysis
Analysis of YouTube users' choices and preferences
data data-analysis data-mining data-science data-visualization dataset machine-learning machine-learning-algorithms
Last synced: 05 Feb 2026
https://github.com/vdoninav/real_estate_analysis
Real estate analysis
data data-analysis data-analysis-python data-science pandas pandas-dataframe pandas-python plotly plotly-express scipy seaborn streamlit streamlit-application streamlit-dashboard streamlit-webapp
Last synced: 12 Apr 2026
https://github.com/musamairshad/dsa-python
This repository contains all the material related to Data Structures and Algorithms implemented in Python.
algorithms data datastructures efficiency python searching-algorithms sorting-algorithms
Last synced: 25 Mar 2025
https://github.com/pathilink/ebury_case
Technical case study in Analytics Engineering using BigQuery, focusing on dimensional modeling and SQL queries for payment and client analysis.
Last synced: 05 Oct 2025
https://github.com/ompreetham/fylo-data-storage-component
Fylo Data Storage Component Challenge on Frontend Mentor.
component css data front-end front-end-development frontend frontend-mentor frontendmentor-challenge fylo html react render scss storage vite website
Last synced: 11 Apr 2026
https://github.com/amethyst-php/catalogue-product
amethyst amethyst-catalogue-product api catalogue-product data laravel
Last synced: 20 May 2026
https://github.com/quangandrei1003/france_air_pollution_pipeline
End-to-end air pollution data pipeline for French metropolitan cities using Airflow, Python, dbt, BigQuery.
airflow bigquery data data-analytics data-engineering data-modeling data-visualization dbt docker etl pandas python terraform
Last synced: 13 Apr 2026
https://github.com/tsbarr/belly-button-challenge
Using front-end development tools (JavaScript, HTML, and CSS), I built an interactive dashboard to explore the Belly Button Biodiversity dataset, which catalogs the microbes that colonize human navels.
data data-visualization javascript
Last synced: 04 Mar 2026
https://github.com/abdullahashfaqvirk/earth-engine-data-scraper
A Python-based web scraper designed to extract and organize dataset metadata from the Google Earth Engine Datasets Catalog for research and analysis purposes.
beautifulsoup data data-science python requests scraper web-scraping
Last synced: 10 May 2026
https://github.com/andykee/aurora
A lightweight tool for indexing, cataloging, and browsing data.
catalog data data-catalog data-discovery indexing metadata metadata-extraction search-and-discovery
Last synced: 17 Jan 2026
https://github.com/white-gecko/lineage-dump
RDF dump of the device information from the lineage wiki
Last synced: 28 May 2026
https://github.com/prajjwol09/sql_retail_analysis_project
This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.
data data-analysis datacleaning dataexploration pgadmin4 sql
Last synced: 15 Feb 2026
https://github.com/iankitnegi/tableautales
Discover my Tableau journey! Dive into data-driven stories, visualizations, and projects as I explore the power of data visualization.
data data-visualization tableau
Last synced: 21 Jan 2026
https://github.com/maximiliancw/completely
Measure your data completeness
data data-cleaning data-quality data-science missing-data
Last synced: 25 Jun 2025
https://github.com/danieljdufour/fast-b64
Quickly Convert between B64 and Binary Strings
b64 base64 base64-decoding base64-encoding binary bits compression data
Last synced: 08 Oct 2025
https://github.com/poojaharihar03/wellness-cities-case-study
A case study for data analysis of city health centers
Last synced: 11 Jun 2026
https://github.com/shubhamsoni98/classification-with-random-forest-1
Classifies sales into categories (Low, Moderate, High) using random forests to inform strategic decisions and optimize marketing strategies.
algorithms anaconda data data-science datacleaning eda jupyter-notebook machine-learning pyhton random-forest scikit-learn visualization
Last synced: 18 Jan 2026
https://github.com/prakhargpt/sql-data-warehouse-project
Building a data warehouse project using SQL Server, including ETL processes, data modelling, and analytics.
analytics data data-analysis data-cleaning data-engineering data-engineering-pipeline data-lakehouse data-science data-warehouse etl etl-job etl-pipeline medallion-architecture sql sql-server
Last synced: 12 Jun 2026
https://github.com/stefanpietrusky/factsv2
Repository for the article in the online magazine TDS.
ai arxiv-papers beautifulsoup data flask-application gensim llama matplotlib ollama plotly pyldavis python selenium webdriver
Last synced: 09 Apr 2025
https://github.com/cburmeister/disc-golf-courses
All the disc golf courses I've played at. Maintained with http://geojson.io/.
Last synced: 21 Jan 2026
https://github.com/psyteachr/sdg-data
Data relevant to the UN Sustainable Development Goals
Last synced: 09 Oct 2025
https://github.com/lulloooo/article-fromfitto55tofittoeveryone
Analysis behind an article, published in the EcoSprinter 2024 annual edition, that looks at the EU "Fit for 55" packages from a different perspective 🔎
analysis data environment european-union
Last synced: 12 Jun 2026
https://github.com/ismailhakkii/digital_vault
This project can be used for securing data, similar to a real vault.
data digital security-data vault
Last synced: 25 Mar 2025
https://github.com/steventhompson6460-stack/octoparse-government-listings-scraper
Octoparse workflow for structured government data
data extraction government listings octoparse public-records python scraper scrapy structured web-crawling workflow
Last synced: 31 May 2026
https://github.com/rod-persky/sungrowdatacollector
Data collector for a SunGrow SG8.0RT Inverter
Last synced: 19 Jan 2026
https://github.com/redgoose-dev/baguni
A web program for storing and browsing images
data explorer file management upload
Last synced: 14 Apr 2026
https://github.com/theopenwebjp/theopenweb-data-loader
Package for loading data to local project
data downloader import javascript typings
Last synced: 10 Oct 2025
https://github.com/amirreza81/kaggle-pandas-course-solutions
Kaggle Pandas Course: exercises solved in a different way from the sample solutions
data data-analysis data-cleaning data-manipulation data-science dataframe jupyter-notebook kaggle machine-learning open-source pandas
Last synced: 14 Apr 2026
https://github.com/chowington/bg-counter-tools
A set of tools that can pull data from Biogents BG-Counter smart mosquito traps and convert them into a Darwin Core compliant format.
bg-counter biogents darwin-core data internet-of-things mosquito-prevalence population-dynamics
Last synced: 10 Oct 2025
https://github.com/ikcede/hinge-data-ts-wrapper
Typescript wrapper for exported Hinge data
Last synced: 10 Oct 2025
https://github.com/myavuzokumus/simplemodelcomparison
This application allows users to upload datasets, handle missing data, and compare different imputation strategies.
algorithm data data-science machine-learning preprocessing streamlit
Last synced: 21 Jan 2026
https://github.com/jerboaburrow/uk-counties-and-unitary-authorities-may-2023-geojson
UK "Counties" Extracted from Office for National Statistics data
Last synced: 29 Mar 2025
https://github.com/ahmad-ali-rafique/linear-regression-modeling
In-depth exploration of linear regression models, including data cleaning, model building, and performance evaluation on various datasets.
artificial-intelligence data dataanalytics linear-models linear-regression model multilinear-regression regression regression-models
Last synced: 19 Apr 2026
https://github.com/nsandoya/python_scrp_project
This is a tool made specifically for the Dipaso e-commerce website. You can extract data from it, analyze it, and see the frequency of keywords, brands, and categories, the price distribution, and other market trends, all in a set of friendly statistical tables and charts (exported from a Jupyter notebook) :)
beautifulsoup4 data data-analysis jupyter-notebook pandas python3
Last synced: 28 Apr 2026
https://github.com/eng-gabrielscardoso/data-science-formation
Data science course walkthrough
data data-science data-visualisation google-colab google-colaboratory google-colaboratory-notebooks python r r-lang
Last synced: 28 Feb 2025
https://github.com/rikiitokazu/dataprojects
Data analysis practice using SQL and Python
Last synced: 12 Apr 2026
https://github.com/dahsie/machine_learning_from_scratch
This project aims to implement some basic machine learning techniques (e.g. MinMaxScaler, StandardScaler, TF-IDF, PCA, Logistic Regression, LDA, KNN, Naive Bayes Classifier) using only Python, NumPy, and pandas. This will help me hone my data science skills.
classification clustering data data-processing datascience machienlearning nlp nltk numpy pandas python regression
Last synced: 04 May 2026
https://github.com/0xnu/nfl-picks
NFL match prediction with scores using historical data (1999-Present).
american-football data nfl prediction
Last synced: 12 Oct 2025
https://github.com/adadalshabab/data-engineering-gcp-project
An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.
bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi
Last synced: 19 Jan 2026
https://github.com/davorg/cookingvinyl
Web site with info about Cooking Vinyl records
cooking-vinyl data hacktoberfest music perl
Last synced: 02 Apr 2025
https://github.com/nikolatechie/spotify-playlist
Data pipeline that fetches songs played in the past 24 hours using the Spotify API and saves the data in a SQLite database. Scheduled to run daily using Apache Airflow.
apache-airflow api data data-engineering python spotify sql sqlite
Last synced: 30 Apr 2026
https://github.com/ginga1402/data_visualization_on_honey_production_dataset
Data Visualization using Matplotlib & Seaborn Libraries
college-project data data-visualization
Last synced: 25 Aug 2025
https://github.com/nrrso/ex_quickfs
A wrapper / Elixir client / SDK to access the quickfs.net API.
data elixir financial financial-data
Last synced: 04 Sep 2025
https://github.com/fnu-ankit/8-week-sql-challenge
My attempt at solving case studies from #8WeeksSQLChallenge
8-week-sql-challenge 8-weeks-sql-challenge 8weeksqlchallenge case-study data data-analysis data-analysis-sql data-analytics database datawithdanny sql sqlserver
Last synced: 19 Apr 2026
https://github.com/digital-media/cv_data
Datasets used for courses/tutorials at the Digital Media Department
computer-vision data image-processing images
Last synced: 14 Oct 2025
https://github.com/yash-chauhan-dev/sf_analytics
Business teams often rely on data analysts to extract insights using SQL. This tool eliminates that dependency by bridging the gap between humans and data using AI.
aiml analytics data dbt langchain llm python snowflake streamlit
Last synced: 07 May 2026
https://github.com/arush-codes/lgmvip-data-science-task-1
data data-science iris-classification lgmvip virtual-internship
Last synced: 14 Oct 2025
https://github.com/mominurr/fire-gas-leak-detection-system
A real-time fire prevention system integrating IoT sensors and computer vision to trigger evacuations.
ai computer-vision data datascience machine-learning ml python yolo
Last synced: 27 Jan 2026
https://github.com/jurooravec/knwldg
Datasets, scrapers, pipelines
companies crawler data dataset non-profit-organizations scraper scrapy
Last synced: 13 Jun 2026
https://github.com/cassandrajm/reddit-dashboard
INTERACTIVE DASHBOARD: Analyzing Political Discourse on Reddit: A Multi-Faceted NLP Approach to Toxicity, Bias, and Political Stance
capstone data data-analysis data-science politics python reddit
Last synced: 09 Apr 2025
https://github.com/jigyasag18/project-diwali-sales-analysis
This project analyzes retail sales data during the Diwali festival using exploratory data analysis (EDA) to identify buyer demographics and product preferences. The findings reveal that the primary purchasers are married women aged 26-35 from Uttar Pradesh, Maharashtra, and Karnataka, working in IT, Healthcare, and Aviation.
analysis data datapr datapro eda jupyter-notebook python realtimedata
Last synced: 01 Jun 2026
https://github.com/coko7/vegapull-records
Cards dataset for One Piece TCG
data dataset one-piece one-piece-card-game one-piece-tcg tcg
Last synced: 26 Feb 2025
https://github.com/j-sephb-lt-n/personal-projects
A history of my personal projects and professional development
ai api auth cloud data llms personal-development web
Last synced: 24 Jan 2026
https://github.com/tyriek-cloud/statistical-work-sample
The purpose of this study is to test whether having siblings is independent of a person's opinion on whether patients with incurable diseases should be allowed to die.
analysis data spss statistics t-test
Last synced: 22 Jan 2026
https://github.com/bkataru/spotigo
AI-powered local music intelligence platform with a task runner server core that retrieves and backs up Spotify account data to one or more storage targets at set intervals
ai backup cron data go intelligence local-llm music ollama rag runner spotify task-runner tool-calling
Last synced: 16 Jan 2026
https://github.com/nadahamdy217/movies-data-etl-using-python-gcp
Developed a comprehensive ETL pipeline for movie data using Python, Docker, and a GCP Pub/Sub emulator. Successfully processed and published the data in a local Docker environment, showcasing advanced data engineering skills.
analytics data data-engineering data-ingestion data-preparation data-preprocessing data-processing data-project docker etl etl-pipeline gcp matplotlib matplotlib-pyplot numpy pandas pubsub python scipy seaborn
Last synced: 06 Jan 2026
https://github.com/neuro-mechatronics-interfaces/ros2_data_agent
Code for a multipurpose file explorer specializing in reading ROS2 topic data from '.bag' or '.db3' files
Last synced: 13 Jun 2026
https://github.com/lorenzobloise/client_satisfaction_classification
Jupyter notebook that analyzes the satisfaction of clients reviewing European hotels using Python libraries such as pandas, NumPy, and scikit-learn. Various classification models are trained and tested to predict client satisfaction.
classification data data-mining jupyter jupyter-notebook machine-learning pandas python
Last synced: 21 Feb 2026
https://github.com/prajakta1321/streetml-a-cityscape-traffic-volume-prognostication
StreetML leverages machine learning techniques to revolutionize urban traffic prediction through precise volume prognostication, aiming to enhance cityscape mobility through data-driven insights.
catboostregressor data datavisualisation exploratory-data-analysis lightgbm-regressor linearregression machine-learning machine-learning-algorithms predictive-analytics random-forest-regression xgboost-regression
Last synced: 08 Apr 2025
https://github.com/gabrieldim/complete-analysis-covid-19
Analysis of COVID-19.
analysis covid-19 covid19 data data-science science virus
Last synced: 23 Jan 2026
https://github.com/ompreetham/data-structures
binary-search-tree c data data-structures datastructures graph linked-list list stack structures tree
Last synced: 25 Mar 2025
https://github.com/alextanhongpin/node-github-api
📃 Sample GitHub API queries with Node.js for scraping purposes
Last synced: 06 May 2026
https://github.com/mattjesc/ddo-semiconductor
Data-Driven Optimization of Semiconductor Processes and Forecasting
ai artificial-intelligence data data-science data-visualization deep-learning keras machine-learning manufacturing ml prophet python pytorch semiconductor semiconductor-manufacturing semiconductors tensorflow
Last synced: 23 Feb 2026
https://github.com/thais81/gamesbox
Another desktop app in Java SE/Swing with a hangman game and a tic-tac-toe game. This project was made at LDNR school with 4 friends.
data database hangman-game jse tictactoe tictactoe-game
Last synced: 28 Jan 2026
https://github.com/lananolana/test_data_generator
Generate test data with Telegram bot in one click: random users, files, texts and credit cards.
credit-card data data-generation fake-data random telegram-bot test-data test-data-generator test-file-generator testing testing-tools text-generation user-generator
Last synced: 18 Jan 2026
https://github.com/johnelliott/wb-web
Moved to https://github.com/johnelliott/waybot
arduino browser data iot raspberry-pi web
Last synced: 12 Apr 2026
https://github.com/harmanveer-2546/reducing-data-entries
A way to delete data entries from a CSV/Excel file. For an Excel file, use excel instead of csv in the code.
csv data data-entry delete-data excel numpy pandas python
Last synced: 05 May 2026
https://github.com/ztgx/muvera
MUVERA: Making multi-vector retrieval as fast as single-vector search
algorithms data google muvera retrieval rust search structure vector
Last synced: 25 Oct 2025
https://github.com/prajjwol09/power-bi-project
The Data Survey Breakdown is an interactive Power BI dashboard designed to present insights gathered from a survey of professionals and enthusiasts in the data industry.
dashboard data interactive powerbi survey
Last synced: 15 Mar 2026