data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-23 00:07:41 UTC
https://github.com/codbex/codbex-number-generator-data
Number Generator for Documents Module - Data
Last synced: 05 Apr 2026
https://github.com/mi7773/advanced_sql_data_analytics_project
A hands-on SQL project simulating data analysis using fact and dimension tables, covering trends over time, cumulative metrics, performance breakdowns, segmentation, and reporting via SQL.
analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics database query reporting sql sql-queries sql-query sql-server window-functions window-functions-in-sql
Last synced: 18 Apr 2026
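The "cumulative metrics" and window-function themes of the project above can be illustrated with a minimal sketch. It is not taken from that repository: the fact_sales table, its columns, and the cumulative_sales helper are hypothetical, and it uses Python's built-in sqlite3 module rather than SQL Server. It assumes SQLite 3.25 or newer for window functions.

```python
import sqlite3


def cumulative_sales(rows):
    """Return (month, monthly_total, running_total) tuples ordered by month.

    `rows` is an iterable of (order_month, amount) pairs. The table and
    column names are hypothetical and exist only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE fact_sales (order_month TEXT, amount REAL)")
        conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", rows)
        query = """
            WITH monthly AS (
                SELECT order_month, SUM(amount) AS monthly_total
                FROM fact_sales
                GROUP BY order_month
            )
            SELECT order_month,
                   monthly_total,
                   -- Running total: sum of all months up to the current one.
                   SUM(monthly_total) OVER (ORDER BY order_month) AS running_total
            FROM monthly
            ORDER BY order_month
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

Aggregating per month in a common table expression first keeps the window function simple: it only has to accumulate one row per month.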
https://github.com/thicclatka/tetration
New file format for tensors
cli data fileformat mmap tensors
Last synced: 26 May 2026
https://github.com/anuragagarwal96/hospital-mortality-rate-sql-analysis
In this project, I have taken a hospital dataset from Kaggle, analysed it and predicted the mortality rate of patients who have been admitted to hospitals. I have utilised a combination of SQL, Tableau and Microsoft Excel for this project.
data data-visualization dataanalysis dataanalysisusingsql excel msexcel mssqlserver sql tableau tableau-public
Last synced: 09 Mar 2026
https://github.com/ahmad-ali-rafique/decision-tree-classifier-modeling
👏Comprehensive exploration of decision tree classifiers, including data cleaning, model building🏩, and performance evaluation on various datasets.
analytics classification classification-models data data-science dataanalytics datacleaning dataset decision-tree-classifier models
Last synced: 20 Apr 2026
https://github.com/montanaz0r/suicide-rate-analysis
Testing the significance of the correlation between the suicide rate and the number of psychiatrists and psychologists working in the mental health sector
analysis correlation data data-analysis data-science jupyter-notebook jupyter-notebooks matplotlib numpy pandas psychology python python-3 seaborn statistics suicide-rate
Last synced: 20 Apr 2026
https://github.com/crypt596-rubykz/metaai-data-explorer-scraping-tool
MetaAI data explorer tool
api-research automation data explorer html-parsing metaai playwright python rate-limiting scraping
Last synced: 20 Apr 2026
https://github.com/gianlucatruda/titanic
An exhibition of my experience in data processing and visualisation. Python script to process and visualise the Titanic survivor data.
data database flask info matplotlib python science scrape server titanic visualisation web
Last synced: 10 Apr 2026
https://github.com/machinecyc/lotteryinsight
Uses a crawler to collect Taiwan Lotto data and saves it to a local MySQL server.
crawler data docker lottery mysql-database python3 taiwan
Last synced: 09 May 2026
https://github.com/prashhhant213/data_analysis_and_visualization-_for_streaming_platform
Data analysis and visualization for a streaming platform, providing insights and recommendations to grow its user base.
colab-notebook data datavisualization matplotlib numpy pandas python seaborn
Last synced: 20 Apr 2026
https://github.com/petermeissner/suuntor
Data from a Suunto watch extracted by R - !because!
automation data r rstats suunto windows
Last synced: 20 Apr 2026
https://github.com/open-geodata/sp_bh_pcj-2020-2035
Spatial data from the Agência das Bacias PCJ, with information presented in the 2020-2035 Basin Plan (Plano de Bacias 2020-2035)
Last synced: 16 Jan 2026
https://github.com/fastpix/android-data-kaltura
This SDK enables seamless integration with Kaltura Player, offering advanced video analytics via the FastPix Dashboard
analytics android-sdk data fastpix kaltura kaltura-player metrics sdk video video-metrics
Last synced: 21 Apr 2026
https://github.com/critocrito/data-scores-map
Data scores in the UK web app.
algorithmic-decision-making data data-investigation data-scores investigation
Last synced: 21 Apr 2026
https://github.com/vck9521/traffic-accidents
In this project, we analyze the effects of various factors that correlate with traffic fatalities in the United States. Logistic regression is used, with the y variable being Fatality Rate (coded 0 for Survived, 1 for Fatality).
analysis data fatalities r regression rstudio traffic visualization
Last synced: 05 Jun 2026
https://github.com/schijioke-uche/data-analysis-with-python-an-spss-model
With this Python notebook, you can use an SPSS Model notebook to build machine learning pipelines and iterate rapidly during model building in data analysis. Whether you're trying to find the right algorithm or experimenting with different ways of preparing your data, you can create reproducible research, with a hypothesis definition, that any member of your team can easily understand.
anova cp4a cp4d cp4i cp4s data ibm ibm-cloud jeffrey-chijioke-uche jeffrey-solomon-chijioke-uche openshift python python3 redhat t-test
Last synced: 22 Apr 2026
https://github.com/nicholas-owen/rdm-calendar
A small utility to manage conference and event information
calendar conference data event research
Last synced: 26 May 2026
https://github.com/checco9811/data-engineering-bootcamp-homework
Homework solutions for DataExpert.io data engineering bootcamp
apache-spark data data-engineering sql
Last synced: 14 Mar 2025
https://github.com/marielachirinosr/cyclistic-data-analytics-project
This project explores user behavior within a fictional bike-sharing system, modeled after Cyclistic, operating in Chicago.
data data-visualization pandas powerbi-report powerbi-visuals python
Last synced: 24 Apr 2026
https://github.com/rubix982/product-quality-classification
This is an implementation for the CIKM AnalytiCup 2017, around the topic of "Product Title Quality". The goal is to take SKUs and rank its title's clarity and conciseness. Referenced papers are attached to this repository. And as such, the aim is to craft ensemble models that either try to replicate results or find new methods for classification.
data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp
Last synced: 25 Apr 2026
https://github.com/thanh-wutan/chess-opening-comparator
Interactive web app using R to visualize and compare chess opening performance and popularity.
chess-openings data databases datavisualisation r
Last synced: 09 May 2026
https://github.com/sebastianbrzustowicz/flight-quality-overview-microservice
Go + Docker. Microservice with parallel computations to convert raw vehicle flight data into an overview report with visualisation.
container control csv data docker drone flight go goroutines http microservice parallel-computing pdf quadcopter raport rms sse vehicle
Last synced: 10 May 2026
https://github.com/datannur/datannur
datannur is an open source, lightweight and sovereign data catalog
catalog data data-catalog data-governance data-management dcat dcat-ap dcat-ap-ch metadata open-data open-source public-sector svelte swiss switzerland
Last synced: 07 Jun 2026
https://github.com/gman-au/white-knight-neo4j
Neo4j implementation of White Knight data abstraction library
abstractions data datastore dotnet neo4j repository-pattern specification-pattern
Last synced: 20 Jan 2026
https://github.com/diordany/spicemill
Tool for plotting Ngspice simulation results with Pyplot.
analysis data electrical-engineering electronics frontend integrated-circuit integrated-circuits ngspice plot plotting post-processing pyplot python raw simulation spice
Last synced: 13 Jan 2026
https://github.com/ioanzicu/batch_loading_one-to-many_data_model
Unesco Batch Loading One-to-Many Data using Django
Last synced: 27 Apr 2026
https://github.com/demkeys/lazydatatransfer
Lazy method to transfer up to 64 KB of data over the network using UDP
data data-trans network python transfer udp
Last synced: 07 Jun 2026
https://github.com/bhumitbedse/machine-learning-projects
AI, machine learning, deep learning, computer vision, and NLP projects with code
computer-vision data data-science deep-learning machine-learning natural-language-processing python
Last synced: 27 Apr 2026
https://github.com/dushansenadheera/web_scraper
web scraper using Python along with BeautifulSoup and Selenium
beautifulsoup data python selenium web-scraping
Last synced: 19 Jun 2026
https://github.com/mohamedezzeldeenhassanmohamed/data-mining-project
Data mining GUI project to predict laptop prices; I use most ML algorithms here
data data-mining-assignments datamining-algorithms datapreprocessing decision-trees entropy gini k-means-clustering knn-classification laptop-dataset laptop-price-prediction linear-regression logistic-regression ml mlalgotithms naive-bayes-classifier pca python svm-classifier visualization
Last synced: 27 Apr 2026
https://github.com/andrii04/ga4-gcs-to-bigquery-etl
Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.
automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql
Last synced: 18 May 2026
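The "validates and loads" step in the pipeline description above can be sketched as follows. This is a minimal illustration and not the repository's code: the validate_csv function and the REQUIRED_COLUMNS names are assumptions made for the example, and the actual load into BigQuery is deliberately left out. Only the Python standard library is used.

```python
import csv
import io

# Assumed, hypothetical set of required columns for a GA4-style export.
REQUIRED_COLUMNS = ("event_date", "event_name", "user_pseudo_id")


def validate_csv(text, required=REQUIRED_COLUMNS):
    """Split CSV rows into (valid, rejected) lists.

    Raises ValueError if a required column is missing from the header.
    A row is rejected when any required field is empty or absent.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in required if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    valid, rejected = [], []
    for row in reader:
        if all((row.get(col) or "").strip() for col in required):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected
```

Returning the rejected rows instead of silently dropping them makes it possible to log or quarantine bad input before anything reaches the warehouse.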
https://github.com/ccworld1000/cccomposition
CCComposition for code style. Accepts code style conversion work.
cccomposition composit construction data structure visual
Last synced: 04 Jan 2026
https://github.com/sagarkhese40/python-assginment
python assignment
assignment data data-science data-visualization python seaborn-plots
Last synced: 28 Apr 2026
https://github.com/schoolsquirrel/holiday-data
Automatically updated holiday data for SchoolSquirrel
data holidays schoolsquirrel scripts vacation
Last synced: 03 Oct 2025
https://github.com/mrlynn/sizing-exercise-data-generator
Data Generator for December 2017 Sizing Exercise
Last synced: 28 Apr 2026
https://github.com/mnkanout/patients_medication_prediction
The aim of the project is to create a model that can help medical professionals select the proper medication for patients based on their symptoms. The model uses historical data of other patients to predict what could be the most suitable medication based on the patient's symptoms.
data data-analysis data-science data-visualization decision-tree-classifier machine-learning python3
Last synced: 29 Jun 2025
https://github.com/shef4793/hackerrank-sql-challenges-solutions
Solutions to all SQL challenges on HackerRank, executed in either a MySQL or MS SQL environment.
data data-engineering hackerrank hackerrank-challenges hackerrank-solutions mssql mssql-server mysql problem-solving solutions sql sql-challenges sql-query
Last synced: 11 Mar 2026
https://github.com/zevio/acl
ACL Anthology corpus sample
data dataset scholarly-articles
Last synced: 01 Mar 2026
https://github.com/albanecoiffe/jo2024_visualization
Streamlit dashboard on the Paris 2024 Olympic Games.
Last synced: 30 Apr 2026
https://github.com/stoyank7/football-prediction
This is my Semester 7 Project for my "AI for Society" minor at Fontys University of Applied Sciences.
ai betting data football machine-learning university-project
Last synced: 25 Mar 2025
https://github.com/vidushibhadana/covid19-data-exploration-using-sql
Deployed diverse SQL techniques to analyze COVID-19 data for an improved understanding of the pandemic's regression.
data database database-management sql
Last synced: 19 Aug 2025
https://github.com/sn0wfree/factor_table
A universal connector for all kinds of data sources that manages all kinds of data as factor types in one package
connector data database factor
Last synced: 29 Apr 2026
https://github.com/paezha/bsantiago
A data package with the results of a travel and well-being survey conducted in Santiago in 2016
data equity package r santiago survey travel well-being
Last synced: 18 Mar 2025
https://github.com/anisimov-anthony/data_forest
Implementation of various types of trees
algorithms-and-data-structures data lib rust tree
Last synced: 28 Apr 2025
https://github.com/amethyst-php/catalogue-product
amethyst amethyst-catalogue-product api catalogue-product data laravel
Last synced: 20 May 2026
https://github.com/danielrosehill/global-value-factors-explorer-dataset
Derivative database of IFVI Global Value Factors for data analysis and visualization use cases.
data environmental-data sustainability-data
Last synced: 23 Feb 2026
https://github.com/rse/nebulize
Nebulize Security-Sensitive Information
data dsgvo gdpr information nebulize security sensitive
Last synced: 16 Mar 2025
https://github.com/hit07/fitgpt-hacksc
AI-Powered Fitness Coach; 🥈 Runner up at HackSC's SoCal Tech Week hackathon
data elasticsearch gpt-4o-mini llm pipeline
Last synced: 28 Feb 2025
https://github.com/kashifkhan7/cleaning-analysis_cli
Analyze sales data easily with our CLI app. Gain insights on revenue trends and visualize results using Python, Pandas, and Matplotlib. 🚀📊
conditional-statements css data datacleaning exception-handling exiftool html json matplotlib-pyplot metadata metadata-extraction pandas-python python sales-analysis seaborn-python speech-to-text transcription youtube
Last synced: 13 Apr 2026
https://github.com/zazza123/hamana
A python library for seamless data extraction, storage, and SQL-based analysis using pandas and SQLite.
Last synced: 14 Jan 2026
https://github.com/soenneker/soenneker.datatables.attributes.column
A C# attribute for Datatables.js column building
attributes column columns csharp data datatablecolumnattribute datatables dotnet mapping object
Last synced: 12 Mar 2026
https://github.com/soenneker/soenneker.cloudflare.origincerts.thumbprints
The current Cloudflare origin certificate thumbprints
cloudflare csharp data dotnet origincerts thumbprint thumbprints
Last synced: 23 Apr 2026
https://github.com/ychaaby/text-classification-chat
ChatBot Boutique USPN
classification data python pytorch
Last synced: 05 Feb 2026
https://github.com/abdellah-laassairi/thyroid-disease-analysis
Thyroid dataset visualization dashboard in R
dashboard data flexdashboard imputation-methods rshiny visualization
Last synced: 18 Jan 2026
https://github.com/ohspc89/better_call_jin
A repository containing mentoring materials for a Ph.D. student in Neuroscience
data matlab spss-statistics visualization visualization-tools wrangling-data
Last synced: 08 Oct 2025
https://github.com/q-aware-labs/bias-insights
Bias detection project for the Chicago Face Database (CFD)
ai chicago-data-portal data data-science llm statistical-analysis
Last synced: 21 Jan 2026
https://github.com/bmcollier/contiguous
Provides COBOL-style contiguous data structures in Python
Last synced: 14 Jan 2026
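The idea of a COBOL-style contiguous record, where fields sit at fixed offsets in one flat buffer, can be sketched with Python's standard struct module. This is not the API of the library listed above. The record layout and the pack_record and unpack_record helpers are hypothetical and chosen only for illustration.

```python
import struct

# Hypothetical layout: 10-byte name, 4-byte unsigned id, 2-byte unsigned age.
# "<" selects little-endian with no padding, so the record is exactly 16 bytes.
RECORD = struct.Struct("<10sIH")


def pack_record(name, ident, age):
    """Pack one record into a fixed-width, contiguous bytes object."""
    # Space-pad the name to the field width, as a fixed-width text field would be.
    return RECORD.pack(name.encode("ascii").ljust(10), ident, age)


def unpack_record(buf):
    """Unpack a 16-byte buffer back into (name, ident, age)."""
    raw_name, ident, age = RECORD.unpack(buf)
    return raw_name.decode("ascii").rstrip(), ident, age
```

Because every record has the same size, the n-th record in a file or buffer starts at offset n * RECORD.size, which is the main practical benefit of a contiguous layout.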
https://github.com/idhruvs/angular4-smart-table-demo
Angular4 Smart Table Demo Project
angular4 data tables typescript
Last synced: 21 Apr 2026
https://github.com/karosi12/ng-data-share
Angular communication with input and output properties
angular communication data data-binding input output sharing typescript
Last synced: 16 Jan 2026
https://github.com/djdhairya/whatsapp-chat-analysis
WhatsApp chat analysis is a multidimensional process that delves into the content, structure, and dynamics of conversations within the platform. It provides valuable insights for personal reflection, organizational decision-making, and improving communication strategies.
data data-science dataanalytics datapreprocessing machine-learning ml
Last synced: 08 Oct 2025
https://github.com/luminati-io/ZoomInfo-dataset-samples
A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.
b2b business companies data data-extraction database dataset datasets web-scraping zoominfo
Last synced: 09 Apr 2025
https://github.com/reshmaaiman/fifa
FIFA20
data data-science data-visualization dataanalysisusingpython github jupyter-notebook matplotlib numpy pandas python seaborn-python
Last synced: 10 Apr 2026
https://github.com/equinor/fmu-sumo-uploader
Upload to Sumo in the FMU context
data fmu python subsurface sumo
Last synced: 06 May 2026
https://github.com/diegoperea20/pytorch-vs-tensorflow
Testing the differences between the PyTorch and TensorFlow libraries across prediction and classification applications; each offers improvements depending on the problem or data set assigned.
classification data images prediction pytorch tensorflow
Last synced: 29 Apr 2026
https://github.com/deliprofesor/breast-cancer-detection-using-svm-with-smote-and-model-optimization
This project analyzes health and lifestyle factors influencing heart attack risk using statistical methods and machine learning, with Ridge Regression identified as the best predictive model.
classification data data-preprocessing data-science data-visualization gridsearchcv machine-learning python roc-curve smote svm
Last synced: 10 Apr 2025
https://github.com/istinnew/eniac_ab_insight
Dive into a comprehensive analysis aimed at boosting iPhone 13 sales by optimizing the Click-Through Rate (CTR) of the “SHOP NOW” button, comparing different button designs and determining the most effective strategy for increasing engagement.
ab-testing data data-analysis data-engineering data-science data-visualization google googlecolab libraries python testing testing-tools visual-studio-code
Last synced: 29 Apr 2026
https://github.com/alimghmi/bdlc
Bloomberg API integration, handling data requests, processing, and SQL database insertion.
api-client bloomberg data data-processing financial-data oauth2 python sql-database transformation
Last synced: 10 Jun 2026
https://github.com/raghavendranhp/youtube_data_harvesting
The "YouTube Data Analyzer" is a versatile tool for businesses and content creators, enabling them to gather, analyze, and harness valuable insights from multiple YouTube channels. With streamlined data collection, storage in MongoDB, migration to SQL, and a user-friendly Streamlit interface, it empowers users to make data-driven decisions.
apiintegration data datacollection eda googleapi googleapiclient matplotlib mongodb mysql mysqlconnector numpy oops pandas pymongo python pythonoops sql sqlalchemy streamlit youtube-api
Last synced: 13 Apr 2026
https://github.com/srgchrksv/stream-crypto
Crypto trades streaming with azure services
azure binance crypto data databricks dataengineering pyspark python streaming websocket
Last synced: 30 Apr 2026
https://github.com/white-gecko/lineage-dump
RDF dump of the device information from the lineage wiki
Last synced: 28 May 2026
https://github.com/sillyash/untappd-viz
A data visualisation page using public datasets and HTML/CSS/JS with D3.js.
beer beer-statistics data data-analysis data-visualization kaggle kaggle-dataset public-dataset school-project
Last synced: 18 May 2026
https://github.com/redgoose-dev/baguni
A web program for storing and browsing images
data explorer file management upload
Last synced: 14 Apr 2026
https://github.com/j-sephb-lt-n/joes_giant_toolbox
A large collection of general python functions and classes that I use in my daily work
ascii browser classifier data dataviz gcp mime nlp python regex search statistics supervised web-scraping
Last synced: 10 Oct 2025
https://github.com/stefanocoretta/aelfric-relatives
data old-english research-project
Last synced: 23 Feb 2026
https://github.com/myavuzokumus/simplemodelcomparison
This application allows users to upload datasets, handle missing data, and compare different imputation strategies.
algorithm data data-science machine-learning preprocessing streamlit
Last synced: 21 Jan 2026
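Comparing imputation strategies, as the application above describes, can be sketched in a few lines. This is an illustration only, not that application's code: the impute function is hypothetical, missing values are assumed to be represented as None, and only mean and median imputation are shown.

```python
import statistics


def impute(values, strategy="mean"):
    """Return a copy of `values` with None entries replaced.

    `strategy` is "mean" or "median"; the fill value is computed from the
    observed (non-None) entries only.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    if strategy == "mean":
        fill = statistics.fmean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]
```

On a column such as [1.0, None, 3.0, 100.0], the median fill is 3.0 while the mean fill is pulled toward the outlier, which is the kind of difference such a comparison is meant to surface.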
https://github.com/dumkydewilde/mcp-memory-layer
A template for building your own BI MCP with dbt, LLMs and multi-user corrections
Last synced: 13 Mar 2026
https://github.com/ayush-raj8/godata
Writes data to a file. Standardizes the format for easy parsing and reading by other programs.
Last synced: 18 Jan 2026
https://github.com/aldro61/mmit-data
The data used in the Maximum Margin Interval Trees paper
data machine-learning machine-learning-algorithms reproducible-research
Last synced: 19 Feb 2026
https://github.com/burythehammer/foosbot-results
Foosball results for the OpenCredo foosbot
data foosball machine-learning python
Last synced: 13 Apr 2026
https://github.com/miraclx/split-merge
Efficient, flexible data stream chunker and merger
chunk data efficient merge middleware nodejs pipeline split stream
Last synced: 07 May 2026
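Splitting a stream into chunks and merging them back can be sketched as below. The listed project targets Node.js; this is a Python illustration of the general idea and not that project's API. The split_chunks and merge_chunks names are hypothetical.

```python
def split_chunks(data, size):
    """Yield consecutive slices of `data`, each at most `size` bytes long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(data), size):
        yield data[start:start + size]


def merge_chunks(chunks):
    """Concatenate an iterable of bytes chunks back into one bytes object."""
    return b"".join(chunks)
```

Using a generator for the split means chunks are produced lazily, so a consumer can process each one without holding a second full copy of the input.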
https://github.com/patrickdavies100/pipeline38
An application to automate the creation and execution of SQL queries.
data pandas-dataframe pipeline postgresql psycopg2 sqlalchemy
Last synced: 30 Apr 2026
https://github.com/afnanenayet/ds-a
Some interview prep I've been doing. This repo contains reimplementations of algorithms and data structures in Python 3
algorithms data interview prep python structures
Last synced: 05 Apr 2025
https://github.com/ayushverma135/dbms-labfile
Created for practical learning, this DBMS lab file offers hands-on exercises covering SQL queries, normalization, indexing, and more. With clear instructions and sample datasets, students gain invaluable experience in database design and management.
Last synced: 04 Feb 2026
https://github.com/davorg/towerbridge
When is Tower Bridge lifting?
data hacktoberfest london perl web-scraping
Last synced: 25 Oct 2025
https://github.com/nikolatechie/spotify-playlist
Data pipeline that fetches songs played in the past 24 hours using the Spotify API and saves the data in a SQLite database. Scheduled to run daily using Apache Airflow.
apache-airflow api data data-engineering python spotify sql sqlite
Last synced: 30 Apr 2026
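The storage step of a daily pipeline like the one above can be sketched as an idempotent insert into SQLite, so that re-running the job does not duplicate rows. This is an assumption-laden illustration, not the repository's code: the played_tracks table, its columns, and the save_plays function are hypothetical, and fetching from the Spotify API and scheduling with Airflow are omitted.

```python
import sqlite3


def save_plays(conn, plays):
    """Insert (played_at, track, artist) tuples; return how many were new.

    `played_at` is the primary key, so rows already stored are skipped
    by INSERT OR IGNORE instead of raising an error.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS played_tracks ("
        "played_at TEXT PRIMARY KEY, track TEXT NOT NULL, artist TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM played_tracks").fetchone()[0]
    conn.executemany("INSERT OR IGNORE INTO played_tracks VALUES (?, ?, ?)", plays)
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM played_tracks").fetchone()[0]
    return after - before
```

Keying on the play timestamp is one possible choice for deduplication; a real schema might need a composite key if two records can share a timestamp.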