data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-07-02 00:07:45 UTC
https://github.com/dms-codes/scrape-kesaintblanc-id
Kesaintblanc Data Scraper: a Python script that scrapes product data from the Kesaintblanc website. It collects product information, including name, URL, price, image URLs, status, stock, and more. The scraped data is saved to a CSV file for further analysis.
data kesaintblanc python webscraper
Last synced: 27 May 2026
https://github.com/sajjad425/missingvalue
This repository provides a guide on handling missing values in Python, covering identification methods, imputation techniques (mean, median, mode, fill, interpolation), advanced methods (KNN, multiple imputation), and best practices. It includes practical examples for both numerical and categorical data.
data data-analysis-python data-science missing-value-handling missing-value-imputation
Last synced: 04 Apr 2025
https://github.com/coko7/vegapull-records
Cards dataset for One Piece TCG
data dataset one-piece one-piece-card-game one-piece-tcg tcg
Last synced: 26 Feb 2025
https://github.com/survi218/angular-http-service
Client-server communication using the HTTP service in Angular
angularjs client-server communication data get http-client http-requests http-response http-server post
Last synced: 16 Mar 2025
https://github.com/plandes/datdesc
Describe and optimize data
data hyperparameter-optimization hyperparameter-tuning latex table
Last synced: 04 Sep 2025
https://github.com/yorkearwaker/data
Data things; representation, transformation, pipelines, governance,
actuality data epistemology information knowledge ontology
Last synced: 07 Apr 2025
https://github.com/tttardigrado/fq
Graphs for the MEDEA project
bokehplots data data-science dataanalysis pandas physics python3
Last synced: 12 Apr 2026
https://github.com/ashu3291/blinkit-app-store-
A comprehensive analysis of Blinkit's sales performance, customer satisfaction, and inventory distribution, aimed at improving sales.
cleaning-data data dataanalysis-projects powerbi-visuals powerbidashboard sql
Last synced: 05 Jan 2026
https://github.com/rohitblaze10/netflix_analysis_using_tableau
The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.
data data-analysis data-science data-visualization netflix tableau
Last synced: 04 Feb 2026
https://github.com/zcebeci/odetector
Outlier Detection Using Cluster Analysis
anomaly-detection cluster-analysis clustering clustering-methods data datapreparation datapreprocessing exception-handling fcm fraud-detection fuzzy-clustering novelty-detection outlier-detection outlier-removal outliers partitioning pcm r surprise-exploration
Last synced: 29 Oct 2025
https://github.com/sanad343/complete-data-analyst
Data analysis is the process of turning raw data into useful information for decision-making.
data data-visualization datamanipulation eda excel exploratory-data-analysis powerbi python-3 sql tableau
Last synced: 30 Jun 2025
https://github.com/srgchrksv/datacamp-projects
DataCamp projects
analytics data data-science dataanalysis education jupyter-notebook learning pandas projects python sql
Last synced: 06 May 2026
https://github.com/dahsie/machine_learning_from_scratch
This project aims to implement some basic machine learning techniques (e.g. MinMaxScaler, StandardScaler, TF-IDF, PCA, Logistic Regression, LDA, KNN, Naive Bayes Classifier) using only Python, NumPy, and pandas. This will enable me to hone my data science skills.
classification clustering data data-processing datascience machienlearning nlp nltk numpy pandas python regression
Last synced: 04 May 2026
https://github.com/publici/state-integrity-data
Data from a comprehensive assessment of state government accountability and transparency
Last synced: 04 Feb 2026
https://github.com/muhamedlabs/muhamed_onedrive
Muhamed_OneDrive is a reliable and convenient cloud file storage service, designed for secure storage and easy data sharing.
data html5 onedrive programming style
Last synced: 04 Jan 2026
https://github.com/amethyst-php/user
amethyst amethyst-package api data laravel user
Last synced: 12 Apr 2026
https://github.com/ahmad-ali-rafique/linear-regression-modeling
In-depth exploration of linear regression models, including data cleaning, model building, and performance evaluation on various datasets.
artificial-intelligence data dataanalytics linear-models linear-regression model multilinear-regression regression regression-models
Last synced: 19 Apr 2026
https://github.com/neha-adnani/sql_music-store-analysis
SQL-based data analysis of a digital music store's sales and customer data.
business-analysis data data-analysis database follow-along-projects pgadmin4 portfolio-project postgres queries sql
Last synced: 18 Jun 2025
https://github.com/bcongdon/nid-data
National Inventory of Dams Data
data datasette government-data
Last synced: 21 Apr 2026
https://github.com/lexiortiz/advanced-data-analytics
Structured learning notes, code snippets, and key takeaways from the Google Advanced Data Analytics Professional Certificate. Serves as a personal reference for reinforcing concepts and as a resource for others on a similar learning journey.
data data-analysis data-engineering google python-3 sql
Last synced: 29 May 2026
https://github.com/so-cool/junction
My solution to the University of Bristol "Bristol Journey Time" Data Challenge https://So-Cool.github.io/junction
competition data modelling timeseries
Last synced: 02 Apr 2025
https://github.com/fuzzt/location-analyzer
The Location Data Analyzer is a Spring Boot application that offers insights on location data, such as counting locations by type, calculating average ratings, and identifying the most reviewed and incomplete entries. It features a simple frontend (HTML, CSS, JavaScript) and is deployed on Render.
analysis api average css data deployment docker fetch-api frontend html javascript location maven ratings render restful-api reviews spring-boot techstack
Last synced: 11 Apr 2026
https://github.com/shadmanshaikh/data-analysis-and-ml-work
All of my work in Data Analysis and Machine learning
analytics artificial-intelligence data machine-learning
Last synced: 05 Jul 2025
https://github.com/oliver021/helppad-net
Versatile .NET Toolkit: A Comprehensive Set of Miscellaneous Helpers, Classes, and Utilities
assert async checks cryptographic-algorithms data date dotnet fluent functional functional-programming hash helpers parallel pipe pipeline pointers review supports tasks
Last synced: 15 Jun 2026
https://github.com/iqbalmind/learn-python-data-scientist
IqbalMind Playground for python data scientist
data data-analysis data-visualization datascience datascientist datascientisttraining python python-playground
Last synced: 16 Mar 2025
https://github.com/chrisabruce/scrapling-rs
Adaptive web scraping, built in Rust. A high-performance port of Python Scrapling.
ai ai-scraping automation crawler crawling crawling-rust data data-extraction mcp mcp-server playwright rust-lang scraping selectors stealth web-scraper web-scraping web-scraping-rust webscraping xpath
Last synced: 26 Jun 2026
https://github.com/vidushibhadana/eda-on-nyc-taxi-data
An Exploratory Data Analysis (EDA) of New York City taxi data, visualized through countplots, distribution plots (displot), and histograms using Python and its libraries.
data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn
Last synced: 11 Apr 2026
https://github.com/jigyasag18/fake-news-prediction-app
The Fake News Prediction App repository offers a machine learning project that classifies news articles as fake or real. It uses a dataset of 20,000 articles and employs methods such as TF-IDF vectorization and lemmatization, achieving ~95% classification accuracy with a random forest classifier.
data datapreprocessing logistic-regression machine-learning machine-learning-algorithms numpy pandas prediction stemming streamlit streamlit-webapp vectorization
Last synced: 11 Apr 2026
https://github.com/mukhlishga/data-engineering
All about data engineering
airflow beam data data-engineering pyspark python
Last synced: 13 Apr 2026
https://github.com/stdlib-js/ndarray-vector-uint32
Create an unsigned 32-bit integer vector (i.e., a one-dimensional ndarray).
constructor ctor data javascript ndarray node node-js nodejs stdlib structure types uint32 vec vector
Last synced: 25 Apr 2026
https://github.com/luminati-io/ZoomInfo-dataset-samples
A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.
b2b business companies data data-extraction database dataset datasets web-scraping zoominfo
Last synced: 09 Apr 2025
https://github.com/mukul273/spring-data-rest-jpa-demo
Spring Data Rest JPA Demo
data jpa rest spring spring-boot spring-mvc
Last synced: 20 Apr 2026
https://github.com/vishwas-chakilam/twitter-sentiment-analysis
Twitter Sentiment Analysis is a Python project that analyzes the sentiment of tweets based on a user-defined keyword. It uses Tweepy to fetch tweets from the Twitter API and TextBlob for sentiment analysis. The application features a user-friendly GUI with Tkinter, displaying tweet sentiment as positive, negative, or neutral.
api data data-science dataanalysis python3 textblob-sentiment-analysis tkinter tweepy-api
Last synced: 11 Mar 2025
https://github.com/agustinmusanti/sqlchallenge-7
Solution to an extensive SQL challenge set by Professor Diego Moisset De Espanes, who shares exercises for learning and practicing SQL Server through his YouTube channel.
challenge data learning sqlserver
Last synced: 15 Apr 2025
https://github.com/j-sephb-lt-n/plotly-dash-dashboard-template
A data dashboard template
dash data data-visualisation data-vizualization dataviz google-cloud google-cloud-platform plotly plotly-dash python responsive-design responsive-web-design
Last synced: 18 Jun 2025
https://github.com/tomcardoso/journalism-data-intersection
A talk on working at the intersection of journalism and data science
data data-journalism journalism
Last synced: 15 May 2025
https://github.com/allanotieno254/powerbi-dax-filter-context
This repository contains a Power BI project that explores DAX Filter Context, a crucial concept in DAX calculations. The project focuses on Bank Loan Analysis, demonstrating how different filter contexts affect DAX formulas.
business-intelligence data data-analysis dax dax-functions powerbi powerbi-visuals visualization
Last synced: 08 Jan 2026
https://github.com/microsoftcloudessentials-learninghub/demosscenarios-techtalks
This repository showcases demonstrations and scenarios using Microsoft Cloud technologies. Please note that these demos are intended as a guide and are based on my personal experiences.
ai analytics azure copilot data data-science fabric m365 microsoft-general ml powerapps powerbi privatebot security sharepoint
Last synced: 14 Mar 2026
https://github.com/rishitabansal9/adult-census-income-prediction
A data analysis and income prediction project using a random forest classifier, with 91% accuracy.
data data-analysis data-science feature-engineering random-forest-classifier
Last synced: 25 Mar 2025
https://github.com/primetdmomega/webscraper
A data web scraper that looks for jobs on Glassdoor.com
Last synced: 25 Mar 2025
https://github.com/fiedsch/data_util
Miscellaneous utilities for data files, such as variable name lists
Last synced: 14 Jun 2025
https://github.com/buffdelta/basketball_ref_webscraper
Python package to make webscraping from basketball-reference easy
basketball data python python-library webscraping
Last synced: 14 Jan 2026
https://github.com/mladen/ds-ml-and-ai-experiments
My Data Science, Machine Learning and Artificial Intelligence experiments and projects
data data-mining data-science datascience dataset
Last synced: 09 Jun 2026
https://github.com/nukopian/shell-flatten
Flatten a series into a single record
Last synced: 18 Jun 2025
https://github.com/dahmansphi/analysis_from_start_to_end
The Big Bang of Data Science: Analysis from the Start to the End [Book Two]
analysis data data-analytics data-mining data-science hypothesis-testing jamovi machine-learning
Last synced: 08 Jan 2026
https://github.com/open-geodata/sp_bh_pcj-2020-2035
Spatial data from the Agência das Bacias PCJ, with information presented in the 2020-2035 Basin Plan (Plano de Bacias 2020-2035)
Last synced: 16 Jan 2026
https://github.com/ahmad-ali-rafique/heart-disease-detection-model
A comprehensive project for detecting heart disease using machine learning, including data processing, model training, and evaluation metrics with AUC curve analysis.
artificial-intelligence data datascience heart-disease machine-learning modeling prediction-model
Last synced: 11 Aug 2025
https://github.com/elijah-1994/pre-process-e-commerce-dataset
Importing, Cleaning, and Pre-Processing E-Commerce Data for Analysis Using MySQL.
analytics data dataanalytics datacleaning dataprocessing mysql mysql-database sql
Last synced: 11 Mar 2025
https://github.com/laguer/jupyt-nb
Ratios of mathematical and physical constants in cosmology and microphysics
analysis constants cosmology data dimensional julia mathematical micro notebook physical physics python ratios science
Last synced: 13 Apr 2026
https://github.com/amazingandyyy/dataviz
amazingandyyy data data-visualization
Last synced: 08 Jan 2026
https://github.com/anand-sony/mttr-dashboard
Streamlit dashboard for MTTR analysis with shift-wise loss insights and machine-level downtime tracking.
analytics business-analytics dashboard data python statistical-analysis
Last synced: 30 May 2026
https://github.com/nafisalawalidris/nafisalawalidris
Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and Bitcoin converge.
artifical-intelligence bitcoin config data data-science developer github-config github-pages machine-learning
Last synced: 16 May 2026
https://github.com/iamfrerot/userverse
Creating an API for data analysis
data data-analytics spring-boot users
Last synced: 23 Mar 2025
https://github.com/fehmitahsindemirkan/web-scrapper
Professional, high-performance web scraping project.
data ecommerce emailsender fileexplorer logging python web webscraping
Last synced: 10 Jan 2026
https://github.com/mightymetrika/holi
holi: Higher Order Likelihood Inference Web Applications
data data-science r statistics
Last synced: 10 Feb 2026
https://github.com/welli7ngton/mysql-server-formacao-alura
Repository for storing SQL code written for courses in Alura's MySQL Server training track
Last synced: 19 Apr 2026
https://github.com/gagolews/datafusion
Data Fusion (open-access research monograph, 2015)
aggregation data fusion fuzzy-logic mean multidimensional-analysis multidimensional-data spread statistics strings variance
Last synced: 16 Mar 2025
https://github.com/nmelgar/healthy_child_dataviz
Data visualization project to analyze what a healthy child is.
analysis data data-analysis data-science data-visualization dataviz research tableau visualization
Last synced: 23 Feb 2026
https://github.com/diordany/spicemill
Tool for plotting Ngspice simulation results with Pyplot.
analysis data electrical-engineering electronics frontend integrated-circuit integrated-circuits ngspice plot plotting post-processing pyplot python raw simulation spice
Last synced: 13 Jan 2026
https://github.com/austinhartzheim/career-fair-backend
Backend for ECS Career Fair app
Last synced: 13 Apr 2026
https://github.com/blueheron786/quranic-universal-library-mushaf-layouts
The Quranic Universal Library (QUL)'s Qur'an mushaf 15-line layouts (madini, uthmani)
data database layout mushaf quran sqlite uthmani uthmani-quran
Last synced: 13 Apr 2026
https://github.com/stupidcucumber/elephant-crawler
System for mining texts from websites.
data data-mining-python python
Last synced: 25 Apr 2026
https://github.com/andrii04/ga4-gcs-to-bigquery-etl
Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.
automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql
Last synced: 18 May 2026
https://github.com/sharoonjoseph321/insurance_fraud_detection
Fraud detection using the k-nearest neighbors (KNN) machine learning algorithm. Data exploration using PySpark and Matplotlib.
analytics data data-science eda high-performance knn-algorithm knn-classification machine-learning matplotlib-pyplot pyspark python seaborn spark statistics
Last synced: 23 Mar 2025
https://github.com/anilanadella/facebook_user_data_analysis_project
Jupyter project for analyzing Facebook user data
data data-visualization dataanalysis dataanalysis-projects dataanalysisusingpython jupyter jupyter-notebook pandas-library pandas-python python python3
Last synced: 13 May 2026
https://github.com/vdoninav/real_estate_analysis
Real estate analysis
data data-analysis data-analysis-python data-science pandas pandas-dataframe pandas-python plotly plotly-express scipy seaborn streamlit streamlit-application streamlit-dashboard streamlit-webapp
Last synced: 12 Apr 2026
https://github.com/cityofnewyork/nyco-wp-open-data-transients
Interface for saving Open Data endpoints as WordPress Transients. Maintained by @NYCOpportunity
civic-tech composer data nycopportunity open-data plugin transients wordpress
Last synced: 10 Apr 2026
https://github.com/samhollings/nhs_data_cleansing
A repo of reusable functions for cleansing data
cleansing data data-cleaning data-cleansing preprocessing pyspark python python3
Last synced: 05 Oct 2025
https://github.com/lukakerr/us-surnames
US Surname data visualisation using R. Displays top 25 US surnames and race/ethnic percentage per name.
Last synced: 05 Oct 2025
https://github.com/affan005-ai/tesla-stock-prediction
This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models
data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn
Last synced: 05 Oct 2025
https://github.com/hit07/fitgpt-hacksc
AI-Powered Fitness Coach; 🥈 Runner up at HackSC's SoCal Tech Week hackathon
data elasticsearch gpt-4o-mini llm pipeline
Last synced: 28 Feb 2025
https://github.com/sebastianhochreiter/sql-projects
business-intelligence data datascience microsoft microsoft-sql-server sql
Last synced: 22 Feb 2026
https://github.com/rysteq/abstract-data-structures
This repository contains two programs written in C that implement the stack and queue ADTs
abstract-data-structures c data queue stack
Last synced: 06 Oct 2025
https://github.com/prajjwol09/sql_retail_analysis_project
This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.
data data-analysis datacleaning dataexploration pgadmin4 sql
Last synced: 15 Feb 2026
https://github.com/rahulthedevil/metric-converter
A simple utility package for converting between metric units such as meters, kilometers, grams, kilograms, liters, and more. A simple and powerful solution for unit conversion.
convert converter data fraction imperial length mass measurements metric metrics ratio system temperature unit unit-conversion unit-converter units uom utilities weight
Last synced: 08 Oct 2025
https://github.com/cmda-tt/course-25-26
🎓 tech track · 2025-2026 · curriculum and syllabus 📊
d3 data datavis functional javascript programming research svelte visualization
Last synced: 20 Jan 2026
https://github.com/rahul1582/bank-loan-classification
Classifying whether or not a person takes a personal loan, using various classification algorithms.
algorithm analysis classi data
Last synced: 08 Oct 2025
https://github.com/shubhamsoni98/classification-with-random-forest-1
To classify sales into categories (Low, Moderate, High) using Random Forests to inform strategic decisions and optimize marketing strategies.
algorithms anaconda data data-science datacleaning eda jupyter-notebook machine-learning pyhton random-forest scikit-learn visualization
Last synced: 18 Jan 2026
https://github.com/kaijagahm/2023-10-20-stlzoo
Data Carpentry workshop, hosted at the St. Louis Zoo. Beta testing the new ecology data lesson.
data data-science ecology r rstudio
Last synced: 05 Feb 2026
https://github.com/quetz-al/quetzal-openapi-client
Autogenerated Python client for the Quetzal API
client data data-science openapi-client openapi3 python quetzal
Last synced: 10 Oct 2025
https://github.com/loaiwalid07/automation_data_overviwe
A Streamlit app that gives an overview of a dataset you upload
automation data data-analysis data-exploration data-science data-transformation data-visualization
Last synced: 19 May 2026
https://github.com/theopenwebjp/theopenweb-data-loader
Package for loading data to local project
data downloader import javascript typings
Last synced: 10 Oct 2025
https://github.com/amirreza81/kaggle-pandas-course-solutions
Kaggle Pandas Course: exercises solved differently from the sample solutions
data data-analysis data-cleaning data-manipulation data-science dataframe jupyter-notebook kaggle machine-learning open-source pandas
Last synced: 14 Apr 2026
https://github.com/chowington/bg-counter-tools
A set of tools that can pull data from Biogents BG-Counter smart mosquito traps and convert them into a Darwin Core compliant format.
bg-counter biogents darwin-core data internet-of-things mosquito-prevalence population-dynamics
Last synced: 10 Oct 2025
https://github.com/badranalyst/data-professional-survey-breakdown-power-bi-dashboard
This project presents an interactive Power BI dashboard analyzing data professionals' insights. Key focus areas include job satisfaction, challenges in entering the data field, career priorities, demographics, and more. The visualization helps uncover trends and factors impacting data professionals globally.
charts dashboard dashboards data data-cleaning data-visualization dataset dax power-bi powerbi
Last synced: 23 Feb 2026
https://github.com/dumkydewilde/mcp-memory-layer
A template for building your own BI MCP with dbt, LLMs and multi-user corrections
Last synced: 13 Mar 2026
https://github.com/jatin-mehra119/paris_housing_price-kaggle-
Paris Housing Price Kaggle Competition
data data-visualization kaggle-competition machine-learning numpy pandas predictive-modeling scikit-learn
Last synced: 29 Apr 2026
https://github.com/aldro61/mmit-data
The data used in the Maximum Margin Interval Trees paper
data machine-learning machine-learning-algorithms reproducible-research
Last synced: 19 Feb 2026