data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-07-02 00:07:45 UTC
https://github.com/darrendavy12/earthquake-events-and-risks-project---azure-data-pipeline---api-connection-
Earthquake Events and Risks Project - Azure Data Pipeline - API Connection
azure blob-storage cloud cloudstorage data databricks databricks-notebooks databricks-workspace dataengineer dataengineering microsoft python
Last synced: 28 Apr 2026
https://github.com/sgbasaraner/cs50
my cs50 solutions
algorithms c cs50 cs50x data harvard python structures
Last synced: 29 Apr 2026
https://github.com/sn0wfree/factor_table
A universal connector for all kinds of data sources that manages all kinds of data as factor types in one package
connector data database factor
Last synced: 29 Apr 2026
https://github.com/sushmashreeps/data-science-with-python
This repository showcases a comprehensive data science project utilizing Python, demonstrating expertise in data analysis, visualization, and machine learning. Built with Python 3.x, the project leverages popular libraries like Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, and TensorFlow. The project features data preprocessing, feature engine
cnn data dataanalysis datascience keras linear-regression matplotlib python python3 regression rnn visualization
Last synced: 14 Apr 2026
https://github.com/mr-dhan/eda-sales-customer-transactions
In the competitive retail business, a deep understanding of customer behavior is an important foundation for strategic decision-making. However, customer transaction data is often large and complex, so it requires an effective analysis process to uncover valuable insights.
dashboard data data-analysis data-analysis-python data-science data-visualization eda python
Last synced: 29 Apr 2026
https://github.com/chandansoren/financial-budget-analysis
Financial budget for 2021
Last synced: 29 Apr 2026
https://github.com/mirzayasirabdullahbaig07/advanced-sql-in-python
This repository covers advanced SQL concepts implemented using Python. It demonstrates how to interact with databases, run complex queries, perform joins, aggregations, window functions, and more using libraries like sqlite3, SQLAlchemy, or pandas. Ideal for data analysts and developers looking to integrate SQL power into Python workflows.
data databases dbms mysql nosql programing-language python sql
Last synced: 29 Apr 2026
https://github.com/ozgrozer/electron-store-data
A Node.js module to store Electron data on the computer
Last synced: 29 Apr 2026
https://github.com/supunlakmal/coronavirus-covid-19-status
COVID-19 case and death counts for each country in a JSON file.
coronavirus count country covid-19 covid-data covid19 data data-science data-visualization geographical geographical-information-system json
Last synced: 21 Jun 2026
https://github.com/johnelliott/wb-web
Moved to https://github.com/johnelliott/waybot
arduino browser data iot raspberry-pi web
Last synced: 12 Apr 2026
https://github.com/dhruvsrikanth/superconductor-regression-kaggle-challenge
Kaggle challenge based on superconductor dataset.
data data-science jupyter-notebook kaggle kaggle-challenge kaggle-competition lasso-regression linear-regression machine-learning python random-forest regression sklearn support-vector-regression
Last synced: 30 Apr 2026
https://github.com/peterhellberg/bugsnag-data
Dump Bugsnag data using the Data access API
Last synced: 22 Jun 2026
https://github.com/chompfoods/sdk-jaxrs-cxf
JAXRS-CXF SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.
apache-cxf api branded chomp cxf data database food grocery ingredients java jax-rs nutrition raw recipe-api recipes sdk
Last synced: 30 Apr 2026
https://github.com/ddeepanshu-997/datascience-e-commerce-shopping-details-
In this project I apply data preprocessing techniques to clean the dataset, derive insights about the most popular shopping picks, and use data visualization libraries to surface trends and other patterns, as a small contribution to the field of data science.
cleaning-data data data-science data-visualization dataframe datapreprocessing dataset libraries matplotlib-pyplot numpy pandas plots python visualization
Last synced: 30 Apr 2026
https://github.com/dantetrb/diabetes-readmission-dbt
Predictive analytics on diabetic patient readmissions using dbt, DuckDB and Python – with explainability and clustering.
clustering data dataengineering dbt diabetes duckdb hdbscan healthcare jupyter lime readmission-prediction sql
Last synced: 01 May 2026
https://github.com/shauryauppal/mydatatoolkit
A toolkit for data scientists to get work done faster, easier, and in a smarter way.
analytics awesome-list data data-science hacktoberfest
Last synced: 08 Jun 2026
https://github.com/linguini1/edueval
The BorealisAI Let's Solve It mentorship project: summarizing student feedback submissions on their professor into one cohesive paragraph for faculty consideration during performance reviews.
ai data data-analysis data-science machine-learning machinelearning nlp python pytorch sentiment-analysis
Last synced: 01 May 2026
https://github.com/dineshdhamodharan24/data-analysis
Probability analysis of customers and basic analysis
analysis data powerbi probability python visualization
Last synced: 23 Jun 2026
https://github.com/anchanung/til
Computer Science
computer-science data database docker infra k8s kafka operating-system
Last synced: 01 May 2026
https://github.com/eshitakundu/disease-outbreak-predictor
Disease Outbreak Predictor: A Streamlit-based web application for predicting diabetes, heart disease, and Parkinson's disease using machine learning models.
data data-science disease-prediction healthcare-application jupyter-notebook machinelearning ml notebook prediction python streamlit streamlit-webapp
Last synced: 01 May 2026
https://github.com/pchaparro/search-engine
Full-stack search engine built from YouTube videos obtained using web scraping
data opensearch python python3 react scraper scraping scraping-websites search search-engine semantic-search sentence-transformers typescript website
Last synced: 17 Apr 2026
https://github.com/gcoronelc/cepsuni-disbd-64505
Database Modeling Workshop with Gustavo Coronel
data database databases db2 db2-database modeling oracle oracle-database relational-database relational-database-design relational-databases relationships sql sql-server
Last synced: 02 May 2026
https://github.com/0xhericles/spamdetector
A Simple Python Spam Detector with Scikit-Learn
data ham machine-learning python sklearn spam
Last synced: 02 May 2026
https://github.com/gcoronelc/ucv_gdi-1_202302-a2
Data and Information Management I Workshop with Gustavo Coronel.
data data-science database databases machine-learning machinelearning oracle sql sql-server
Last synced: 02 May 2026
https://github.com/hafs96/prediction_consommation-de-carburant
In this project, the goal is to develop a model that predicts whether a car has high or low fuel consumption based on its technical characteristics.
analysis data data-visualization machine-learning testing training
Last synced: 09 Jun 2026
https://github.com/radekbednarik/covid-czech-data-api
Library to make it easy to work with REST API of official Czech Covid data.
api covid-19 data deno library typescript
Last synced: 02 May 2026
https://github.com/jesuscc1993/data-cleaner-extension
Clears browser data in a single click.
application-data chrome chrome-extension data
Last synced: 02 May 2026
https://github.com/viniddev/active_finance
In this project I set out to solve a common problem: the difficulty of staying up to date on movements in the stock and real estate fund markets. I used Selenium WebDriver to gather information and a Telegram API to send reports to the user.
automation data data-analisis rpa selenium-webdriver telegram-bot
Last synced: 03 May 2026
https://github.com/antoineaugusti/youtubers-tips
Collecting data about tips given to Youtubers
data economy youtube youtubers
Last synced: 03 May 2026
https://github.com/charityeverett/gobackfetchit
Award-winning WebXR data journalism storytelling project
3d aframe ar css data html html-css-javascript nodejs visuzalization vr webxr xr
Last synced: 03 May 2026
https://github.com/arnavk-09/phishing-detection
🎣 Detect Phishing URLs with Data Pre-fitted... API & Web UI
csv data fastapi flask python scikit-learn
Last synced: 03 May 2026
https://github.com/yash-chauhan-dev/spark_cluster_docker
Set up a local Spark cluster, Hadoop (HDFS), Airflow, and PostgreSQL on Docker with ease, without any local installations
apache-spark data data-engineering data-engineering-pipeline deployment docker docker-compose hadoop hdfs local-development localhost pyspark python
Last synced: 04 May 2026
https://github.com/fallaciousreasoning/nz-mountains
A list of mountains in NZ, scraped from https://climbnz.org.nz
alpine climbing climbnz data json json-api maps mountaineering scraping
Last synced: 04 May 2026
https://github.com/soham7998/data-analysis-projects
Data analysis projects that I completed to gain hands-on experience. The projects showcase different concepts, visualizations, and more.
data data-analysis data-science machine-learning nlp python soham visualization
Last synced: 04 May 2026
https://github.com/armand-sauzay/datasets
Datasets for machine learning
ai data datasets machine-learning ml
Last synced: 18 Jan 2026
https://github.com/dimitryzub/russo-ukraine-war-prediction-losses
Highlights Russian losses with predictions based on historical data from the Ministry of Defence of Ukraine
data dataanalysis dataanalytics matplotlib pandas prophet python
Last synced: 04 May 2026
https://github.com/jdanielgoh/cobertura-campanias
In a democracy, is there room for every voice? A project to visualize the radio and TV monitoring that the INE carries out of the 2024 presidential candidacies
d3js data datavisualization vue
Last synced: 09 Jun 2026
https://github.com/gabya06/twitter_models
Repository used for Twitter impression models
data data-science impressions machinelearning python ridge-regression sklearn twitter
Last synced: 04 May 2026
https://github.com/kasunjayasanka/simple-backend-database-data-retrieval
Simple HTML form with inserting and retrieving data from Firebase Realtime Database
bootstrap css3 data firebase firebase-realtime-database html5 insert-data javascript retrieve-data
Last synced: 05 May 2026
https://github.com/munas-git/codm-review-analysis-and-predictions
Sentiment analysis on Call of Duty Mobile Google Play Store user reviews with ML model to classify new reviews.
data flask machine-learning python sentiment-analysis
Last synced: 05 May 2026
https://github.com/suchi25sathavara/r-projects
R projects on real-world scenarios for data analysis
data data-analysis datavisualization r
Last synced: 01 Apr 2025
https://github.com/anburocky3/cbse-schools-data
Fetch CBSE Schools in seconds and use it for your data projects
cbse data data-analysis data-science grabber nextjs
Last synced: 24 Jun 2026
https://github.com/mito-ds/mitosheet_helper_config
The mitosheet_helper_config package used by enterprises to configure the mitosheet package.
data data-analytics data-science data-visualization jupyter pandas python
Last synced: 05 May 2026
https://github.com/donmaruko/python-eda-toolkit
CLI-run EDA with 30 commands covering text-related functions, statistical calculations, data visualization, and data manipulation.
data data-analysis data-science data-visualization matplotlib pandas scipy seaborn statistical-analysis statistics wordcloud
Last synced: 06 May 2026
https://github.com/lexz-08/sharpdata
Easily manage DataGridViews or create one with the struct 'DataGridManager' provided.
csharp data datagridview ui user-interface windows windows-forms winforms
Last synced: 06 May 2026
https://github.com/beriberikix/senml-zephyr
A codec for encoding and decoding Sensor Measurement Lists (SenML) for Zephyr
codec data iot senml sensor zephyr-rtos
Last synced: 24 Mar 2025
https://github.com/fabsdevx/file-format-converter-handout
Data Engineering project for learning purposes. Credits to itversity
csv csv-import data data-engineering database pandas python
Last synced: 06 May 2026
https://github.com/sakan811/honkai-star-rail-characters-damage-simulation
Honkai Star Rail Characters' Damage Simulation
data data-science data-visualization honkai honkai-star-rail honkai-starrail powerbi powerbi-visuals python sqlite
Last synced: 29 Jun 2026
https://github.com/shantanujpk/bigdatacloud
Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.
big-data data jupyter-notebook pipeline pyspark python spark sparksql
Last synced: 07 May 2026
https://github.com/whis99/data_analysis_journey
A repository of my data analysis projects.
data data-analysis data-analysis-python data-visualization dataset jupyter-notebook matplotlib python visualization
Last synced: 07 May 2026
https://github.com/hudson-newey/data-miner
A simple data miner that collects information from an API and stores it in a file
api api-client big-data bigdata data logger logging
Last synced: 10 Jun 2026
https://github.com/murtaza-arif/all-you-need-to-know-for-data-engineer
This repository is designed to showcase various aspects of data engineering, including tools, frameworks, and end-to-end projects. It covers everything from data ingestion and transformation to data warehousing and cloud-based solutions.
cassandra data data-engineering data-science kafka kafka-consumer kafka-streams pyarrow spark
Last synced: 07 May 2026
https://github.com/jigyasag18/iit-guhawati
Empower Sakhi is a data-driven platform that uses machine learning to identify women at risk of domestic violence in India. It offers confidential self-assessments, survivor stories, and emergency resources through a trauma-informed, privacy-focused web app. The project also provides NGOs with actionable insights via a Power BI dashboard for support.
aiml data dataset datavisualization domestic-violence eda jupyter-notebook label-encoding machine-learning machine-learning-algorithms machine-learning-models machinelearning machinelearningprojects powerbi python python-app random-forest random-forest-classifier streamlit streamlit-webapp
Last synced: 08 May 2026
https://github.com/suchi25sathavara/data-wrangling-with-r
Analyzing Road Accidents in Victoria, Australia
data r reporting rstudio wrangling-data
Last synced: 01 Apr 2025
https://github.com/blackhatdevx/leetcode
LeetCode Solutions by Jash Gro
algorithm algorithms dart data datastructures datastructures-algorithms dsa java javascript leetcode leetcode-java leetcode-python leetcode-solutions neetcode
Last synced: 08 May 2026
https://github.com/vaxdata22/cyclistic-ride-sharing-company
This is my Google Data Analytics Certificate case study for the Cyclistic ride-sharing company
actionable-insights business-analytics business-intelligence data data-analytics data-cleaning data-mining data-visualization data-wrangling exploratory-data-analysis google-data-analytics spreadsheets sql sql-server sql-server-management-studio statistical-analysis t-sql tableau transact-sql
Last synced: 10 Jun 2026
https://github.com/miniql/miniql-csv
A MiniQL query resolver that loads data from CSV files.
comma-separated-values csv data query query-language
Last synced: 08 May 2026
https://github.com/taquece/goals-per-match
Basic script to calculate average football goals per match from a CSV file
beginner csv data football nodejs python sports-analytics
Last synced: 09 May 2026
https://github.com/tupizz/python-data-manipulation
Data manipulation and visualization with Python 2.x
Last synced: 09 May 2026
https://github.com/flexthink/matricize
A convenience library to convert between pure Python objects and their vectorized representations
data machine-learning numpy python
Last synced: 09 May 2026
https://github.com/dilkushsingh/webscraping-with-selenium-and-beautifulsoup
Scraped a popular tech gadgets website using Selenium and BeautifulSoup, and performed data analysis on the scraped data.
beautifulsoup data datacleaning datagathering eda exploratory-data-analysis python selenium webscraping
Last synced: 24 Feb 2026
https://github.com/cemc-oper/nmc-typhoon-db-client
A CLI client for NMC Typhoon Database.
Last synced: 01 Jun 2026
https://github.com/erencelik/binance-public-data-node
Node.js downloader and unzipper script for Binance Public Data
binance data downloader nodejs public script
Last synced: 15 May 2026
https://github.com/ssanthosh010303/collection-data-training
A collection of challenges completed during a data training program.
airflow apache azure azure-data-factory azure-databricks azure-logic-apps bigdata data hadoop spark
Last synced: 27 Jan 2026
https://github.com/athari22/analyzing-the-yelp-dataset
SQL for Data Science
analytics data data-science data-structures er sql
Last synced: 27 Jan 2026
https://github.com/gabrieldim/complete-analysis-covid-19
Analysis of COVID-19.
analysis covid-19 covid19 data data-science science virus
Last synced: 23 Jan 2026
https://github.com/shubhamsoni98/prediction-with-binomial-logistic-regression
To predict client subscription to term deposits and optimize marketing strategies by identifying potential subscribers.
binomial data data-science eda machine-learning matplotlib pipeline python scikit-learn seaborn sklearn sql visualization
Last synced: 06 Feb 2026
https://github.com/andrewl/danelaw
Geopackage containing the boundary of the Danelaw
data geospatial medieval viking
Last synced: 23 Jan 2026
https://github.com/meokullu/colorizenumber
ColorizeNumber - Bodrum Papatya: visualizes numeric data as colors to create an image.
color colorize colors data data-visualization visualization vizualize-data
Last synced: 01 Jun 2026
https://github.com/j-sephb-lt-n/data-warehouse-and-etl-best-practice
A catalogue of best practices for managing data
data data-cleaning data-engineering data-validation data-warehouse etl
Last synced: 23 Jan 2026
https://github.com/vanduc1102/parse-stackoverflow-data
Parse Stack Overflow data
Last synced: 16 Oct 2025
https://github.com/thais81/gamesbox
Another desktop app in JSE/Jswing with a hangman game and a tic-tac-toe game. This project was made at LDNR school with 4 friends.
data database hangman-game jse tictactoe tictactoe-game
Last synced: 28 Jan 2026
https://github.com/tyriek-cloud/statistical-work-sample
The purpose of this study is to test whether having siblings is independent of holding an opinion on whether patients with incurable diseases should be allowed to die.
analysis data spss statistics t-test
Last synced: 22 Jan 2026
https://github.com/sumitkundu102022/air-quality-report
Air Quality Report using PowerBI
data data-analysis data-visualization powerbi
Last synced: 23 Jan 2026
https://github.com/intersystems-ib/workshop-smart-data-fabric
Learn the main ideas involved in developing a Smart Data Fabric using InterSystems IRIS
analytics data datafabric interoperability smart
Last synced: 14 Apr 2026
https://github.com/dhanish03/reliance-sales-report-dashboard
This project, Reliance Sales Report Dashboard, showcases a dynamic and interactive Power BI dashboard designed to analyze sales performance. The dashboard provides key insights into various aspects of sales data, including product-wise performance, region-based revenue, and profitability trends.
data datavisualization-project powerbi visualization
Last synced: 23 Jan 2026
https://github.com/harmanveer-2546/reducing-data-entries
A way to delete data entries from a CSV/Excel file. For an Excel file, use excel instead of csv in the code.
csv data data-entry delete-data excel numpy pandas python
Last synced: 05 May 2026
https://github.com/knowcnu12/metamask-wallet-recovery-funds-phrase-data-seed-token
This repository provides tools and guidelines for securely recovering MetaMask Wallet funds using recovery phrases, seed data, and tokens. It ensures safe and reliable methods for recovering access to your wallet and managing your cryptocurrency assets.
bitcoin blockchain cryptocurrencies cryptocurrency data ethereum funds metamask metamask-bot metamask-desktop metamask-extension metamask-plugin metamask-snap metamask-wallet phrase recovery seed token wallet wallet-security
Last synced: 08 Mar 2026
https://github.com/soenneker/soenneker.data.email.disposables
Simply adds a list of compiled disposable/temporary email domains, updated daily (if available)
csharp data disposable disposables domain dotnet email mailinator
Last synced: 29 May 2026
https://github.com/fnu-ankit/8-week-sql-challenge
My attempt at solving case studies from #8WeeksSQLChallenge
8-week-sql-challenge 8-weeks-sql-challenge 8weeksqlchallenge case-study data data-analysis data-analysis-sql data-analytics database datawithdanny sql sqlserver
Last synced: 19 Apr 2026
https://github.com/amethyst-php/courier
amethyst amethyst-package api courier data laravel
Last synced: 17 May 2026
https://github.com/ztgx/muvera
MUVERA: Making multi-vector retrieval as fast as single-vector search
algorithms data google muvera retrieval rust search structure vector
Last synced: 25 Oct 2025
https://github.com/byndyusoft/byndyusoft.data.relational
Relational abstractions for Byndyusoft.Data.Relational.
byndyusoft data dataaccess db relational-databases
Last synced: 25 Oct 2025
https://github.com/luminati-io/httpx-web-scraping
Web scraping using HTTPX in Python, covering setup, advanced features, comparisons with Requests, and more.
beautifulsoup data html httpx python web-scraper web-scraping
Last synced: 13 Oct 2025
https://github.com/uznetdev/smoking-prediction
This project focuses on analyzing the "Smoking" dataset and building a predictive model for smoking status based on various health metrics. The goal is to identify factors influencing smoking behavior and develop a reliable model for prediction.
ai classification data data-science kaggle-competition machine-learning ml roc-auc sklearn smoking
Last synced: 17 Apr 2026
https://github.com/tyriek-cloud/nyc-dca-etl
Created an ETL pipeline to merge two CSV files (converted to JSON) into a Parquet file using Azure Data Factory. The data was extracted from NYC Open Data (https://opendata.cityofnewyork.us/), and I created a Blob container within an existing storage account.
azure azure-data-factory blob-storage data data-engineering etl-pipeline
Last synced: 21 Jan 2026