An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/unkaktus/pktconn

wrapper around io.ReadWriteCloser that implements gopacket's 'device'

connection data gopacket packet

Last synced: 29 May 2026

https://github.com/chaewonkong/kaggle-competitions

Kaggle competitions and lessons

ai data kaggle-competition ml

Last synced: 15 Mar 2025

https://github.com/dahsie/machine_learning_from_scratch

This project aims to implement some basic machine learning techniques (e.g. MinMaxScaler, StandardScaler, TF-IDF, PCA, Logistic Regression, LDA, KNN, Naive Bayes Classifier) using only Python, NumPy, and pandas. This will enable me to hone my data science skills.

classification clustering data data-processing datascience machienlearning nlp nltk numpy pandas python regression

Last synced: 04 May 2026

https://github.com/rikiitokazu/dataprojects

Data analysis practice using SQL and Python

data python sql web-scraping

Last synced: 12 Apr 2026

https://github.com/theanujsinha01/mcdonalds-customer-analysis

This project analyzes customer feedback data to understand what drives people to like or dislike McDonald’s. Using Python and data visualization tools in a Jupyter Notebook, we explore how different factors—such as taste, price, health, and visit frequency—affect customer satisfaction.

case-study data data-visualization dataanalysis

Last synced: 05 Sep 2025

https://github.com/ashishsingh789/titanic_dataset_eda_and_visualization

This repository contains an exploratory data analysis (EDA) of the Titanic dataset. Key analyses include survival rates by gender, passenger class, age distribution, family size, and correlation heatmaps.

data data-science dataanalysis matplotlib numpy pandas pandas-dataframe python seborn visualisation

Last synced: 11 Apr 2026

https://github.com/sakan811/honkai-star-rail-a-few-fun-insights-with-data-analysis

This project offers insights into the stats of all Honkai: Star Rail characters available as of the given date.

data data-analysis data-science data-visualization docker flask game honkai honkai-star-rail honkai-starrail seaborn webscraping webscraping-data webscraping-selenium

Last synced: 10 Jun 2026

https://github.com/mateogiuffra/estrd2024s1

Coursework completed for the Data Structures (Estructura de Datos) course at the Universidad Nacional de Quilmes (UNQ)

c cpp data data-structures-and-algorithms eficiency functional-programming haskell unq

Last synced: 12 May 2026

https://github.com/lotfiferaga/instagram-reach-analysis

The Instagram Reach Analysis project aims to develop a Python-based tool to analyze the reach and engagement metrics of Instagram posts.

analytics data data-science datavisualization python

Last synced: 18 Jun 2026

https://github.com/rishikesh-jadhav/track_deep_learning

Data collected from the Udacity simulator comprising RGB images with steering and throttle annotations for each frame, specifically gathered for behavioral cloning purposes.

data datacollection udacity-self-driving-car

Last synced: 03 Jan 2026

https://github.com/praxtube/dogg

CLI tool to log data manually

data data-logger log logger

Last synced: 10 Jun 2026

https://github.com/suryadev99/stream_processing_website_click_data

Stream processing of website click data using Kafka, monitored and visualised with Prometheus and Grafana

clickdata data dataengineering docker flink-kafka flink-metrics flink-stream-processing git grafana kafka kafka-streams kafka-topic prometheus psql python

Last synced: 10 Mar 2026

https://github.com/so-cool/junction

My solution to the University of Bristol "Bristol Journey Time" Data Challenge https://So-Cool.github.io/junction

competition data modelling timeseries

Last synced: 02 Apr 2025

https://github.com/mecha-cms/x.route

Custom route files.

custom data extension file folder path route url

Last synced: 23 Mar 2025

https://github.com/etmendz/mendz.data.oracle

Provides a generic Mendz.Data-aware context for ADO.Net-compatible access to Oracle databases.

ado-net context data database datasettings mendz oracle

Last synced: 13 Apr 2026

https://github.com/unknownsoup/budget_tracker

A personal budget tracker built to deepen my knowledge of databases and data analysis, using SQL and Python for the analysis.

data data-science databases python sql

Last synced: 26 Jan 2026

https://github.com/ayush-raj8/godata

Writes data to a file in a standardized format for easy parsing and reading by other programs.

data golang

Last synced: 18 Jan 2026

https://github.com/white-gecko/lineage-dump

RDF dump of the device information from the lineage wiki

data dataset lineageos rdf

Last synced: 28 May 2026

https://github.com/equinor/fmu-sumo

Interaction with Sumo in the FMU context

analytics data fmu python subsurface sumo visualization

Last synced: 01 May 2025

https://github.com/quangandrei1003/france_air_pollution_pipeline

End-to-end air pollution data pipeline for French metropolitan cities using Airflow, Python, dbt, BigQuery.

airflow bigquery data data-analytics data-engineering data-modeling data-visualization dbt docker etl pandas python terraform

Last synced: 13 Apr 2026

https://github.com/codegouvfr/codegouvfr-data

🧢 Data for code.gouv.fr

bluehats codegouvfr data

Last synced: 05 Mar 2026

https://github.com/karo23361/toy-store-kpi-power-bi

PowerBI Portfolio Project

csv data data-visualization powerbi

Last synced: 03 Feb 2026

https://github.com/ngofilho/scripts-db

Repository containing several sample database scripts.

cache data database db mariadb mongodb mysql oracle redis sql-server

Last synced: 11 Apr 2026

https://github.com/pdoup/enegry

Time-Series dataset combining multiple sources to explain the broader Greek energy market

data dataset day-ahead-auction energy-markets exploratory-data-analysis forecasting futures-market greek-energy-market renewable-energy time-series-data weather-data

Last synced: 07 May 2025

https://github.com/charon25/weatherdata

17 000 weather measurements collected by a weather station created for a college project.

csv data dataset datasets json measurements strasbourg weather weather-data

Last synced: 16 Jan 2026

https://github.com/gabrielcsapo/bluse

⚗️ blend and fuse data with ease

data normalize utility

Last synced: 15 Mar 2025

https://github.com/cvinicius987/projetos-bigdata

Case studies involving Big Data and data engineering projects.

bigdata data data-engineering spark

Last synced: 13 May 2026

https://github.com/purarue/HPI-personal

Personal HPI modules/scripts

data history lifelogging

Last synced: 30 Mar 2025

https://github.com/ashu3291/blinkit-app-store-

A comprehensive analysis of Blinkit's sales performance, customer satisfaction, and inventory distribution, aimed at improving sales.

cleaning-data data dataanalysis-projects powerbi-visuals powerbidashboard sql

Last synced: 05 Jan 2026

https://github.com/vlamug/ratibor

Ratibor is a service for making metrics from data

data metrics prometheus

Last synced: 10 Mar 2026

https://github.com/nmsud/formdata

🗃️ Data from the NMSUD Form submissions

api data json unification-day

Last synced: 16 May 2026

https://github.com/barbosa89/vue-table

A classical data table component in VueJS and Bootstrap 4, optimized for Laravel applications.

bootstrap4 data datatable javascript laravel php table vuejs

Last synced: 11 Apr 2026

https://github.com/nyo16/megas_pinakas

Bigtable Elixir gRPC client

bigtable data elixir grpc

Last synced: 13 Jan 2026

https://github.com/luminati-io/ZoomInfo-dataset-samples

A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.

b2b business companies data data-extraction database dataset datasets web-scraping zoominfo

Last synced: 09 Apr 2025

https://github.com/mitevpi/poli-parse

Political news scraping & NLP parsing from web pages.

data election javascript library module nlp npm package parse politics scrape sentiment

Last synced: 13 May 2026

https://github.com/m0nica/datalogues-refresh

📊 Programming blog focused on data with an emphasis on exploration in Python.

data jekyll python technical-writing

Last synced: 14 May 2026

https://github.com/gappeah/layoffs-exploratory-data-analysis

This project uses MySQL to perform data cleaning and exploratory data analysis (EDA) on a dataset detailing company layoffs. The primary goal is to process, clean, and explore the data to gain insights into trends and patterns related to layoffs across various sectors.

data dataanalysis eda mysql sql

Last synced: 12 Jul 2025

https://github.com/zulfachafidz/titanic_explorer_predicting_survival_with_classification_using_knn_algorithm

Tracking Life Safety with the KNN Predictive Analysis Approach. Leveraging the Titanic Dataset, we apply classification analysis to predict the fate of passengers based on a variety of features.

algorithm algorithms data data-analysis data-mining data-science datamodeling datapreprocessing dataset knn-algorithm knn-classification machine-learning machine-learning-algorithms prediction-model

Last synced: 01 Sep 2025

https://github.com/muhamedlabs/muhamed_onedrive

Muhamed_OneDrive is a reliable and convenient cloud file storage service designed for secure storage and easy data sharing.

data html5 onedrive programming style

Last synced: 04 Jan 2026

https://github.com/karosi12/ng-data-share

Angular communication with input and output properties

angular communication data data-binding input output sharing typescript

Last synced: 16 Jan 2026

https://github.com/filiprokita/tobase64

This Python program encodes a file in base64 format and saves the result to a new file with a ".b64" extension. It is a command-line tool that can be used to automate file encoding tasks.

base64 command-line data data-conversion data-manipulation data-privacy data-prottection data-security encoding file file-conversion file-handling python python-script python3 tobase64

Last synced: 30 Jun 2025

https://github.com/justinjjlee/simulation-discrete

Employing data transformations and simulations to answer random questions

analytics data data-science julia python simulation spark

Last synced: 30 Apr 2026

https://github.com/fuzzt/location-analyzer

The Location Data Analyzer is a Spring Boot application that offers insights on location data, such as counting locations by type, calculating average ratings, and identifying the most reviewed and incomplete entries. It features a simple frontend (HTML, CSS, JavaScript) and is deployed on Render.

analysis api average css data deployment docker fetch-api frontend html javascript location maven ratings render restful-api reviews spring-boot techstack

Last synced: 11 Apr 2026

https://github.com/oliver021/helppad-net

Versatile .NET Toolkit: A Comprehensive Set of Miscellaneous Helpers, Classes, and Utilities

assert async checks cryptographic-algorithms data date dotnet fluent functional functional-programming hash helpers parallel pipe pipeline pointers review supports tasks

Last synced: 15 Jun 2026

https://github.com/shadeglare/genum

ESNext tools for processing data in a LINQ-like manner

data linq processing typescript

Last synced: 13 Apr 2026

https://github.com/mnz1365/saving-record-time-text

Saving the date to a text file with Python

data python txt-files writefile

Last synced: 18 Jul 2025

https://github.com/rishitabansal9/adult-census-income-prediction

A data analysis and income prediction project using a random forest classifier, with 91% accuracy.

data data-analysis data-science feature-engineering random-forest-classifier

Last synced: 25 Mar 2025

https://github.com/juangesino/research-project

Course files for Research Project @ University of Amsterdam

data data-science economics stata

Last synced: 02 Jan 2026

https://github.com/atiqurcode/scrap-spec

Scrape data from HTML into an HTML table or JSON

data html-table json-data scarp

Last synced: 05 Feb 2026

https://github.com/bkestelman/dasy-ml

DaSy DataSynthesizer - Create synthetic data with desired statistical properties for machine learning research.

data data-science machine-learning

Last synced: 14 Jan 2026

https://github.com/fiedsch/data_util

Miscellaneous utilities for data files, such as variable name lists

data helper management php

Last synced: 14 Jun 2025

https://github.com/poojaharihar03/wellness-cities-case-study

A case study for data analysis of city health centers

analytics data r rstudio

Last synced: 11 Jun 2026

https://github.com/vidushibhadana/eda-on-nyc-taxi-data

An exploratory data analysis (EDA) of New York City taxi data, visualized through count plots, distribution plots (displot), and histograms using Python and its libraries.

data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 11 Apr 2026

https://github.com/buffdelta/basketball_ref_webscraper

Python package to make webscraping from basketball-reference easy

basketball data python python-library webscraping

Last synced: 14 Jan 2026

https://github.com/sakshamarora07/whatsapp-chat-analyser

This repository contains code for a WhatsApp Chat Analyzer that uses Python libraries to extract insights from chat messages.

chat data dataanalytics datascience matplotlib pandas python seaborn statistics streamlit whatsapp

Last synced: 04 Jan 2026

https://github.com/dahmansphi/analysis_from_start_to_end

The Big Bang of Data Science- Analysis from the Start to The End- [Book Two]

analysis data data-analytics data-mining data-science hypothesis-testing jamovi machine-learning

Last synced: 08 Jan 2026

https://github.com/isaacmaffeis/imad-2023

Model Identification and Data Analysis (IMAD) | University course

data data-analysis data-science model model-identification

Last synced: 09 May 2026

https://github.com/2kabhishek/pybank

Data Analysis for the silliest Bank 💰🏦

csv data data-science learning pandas python topic1 topic2

Last synced: 12 May 2026

https://github.com/illustratien/toolphd

Make your analysis simple and reproducible

academic analysis data phd publications r r-package reproducible-research scientific

Last synced: 26 Jan 2026

https://github.com/elijah-1994/pre-process-e-commerce-dataset

Importing, Cleaning, and Pre-Processing E-Commerce Data for Analysis Using MySQL.

analytics data dataanalytics datacleaning dataprocessing mysql mysql-database sql

Last synced: 11 Mar 2025

https://github.com/yeti-robotics/past-scouting-data

❄️ Scouting Data from Previous Events/Seasons ❄️

data first frc

Last synced: 06 Jan 2026

https://github.com/agustinmusanti/sqlchallenge-7

Solution to an extensive SQL challenge set by Professor Diego Moisset De Espanes, who shares exercises for learning and practicing SQL Server on his YouTube channel.

challenge data learning sqlserver

Last synced: 15 Apr 2025

https://github.com/kunalshelke90/kunalshelke90

💻 Machine Learning Enthusiast | Data Science Explorer | Eager to solve problems with the help of data.

data data-science dataanalysis database machine-learning mlops

Last synced: 06 Jul 2025

https://github.com/nafisalawalidris/nafisalawalidris

Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and Bitcoin converge.

artifical-intelligence bitcoin config data data-science developer github-config github-pages machine-learning

Last synced: 16 May 2026

https://github.com/tomcardoso/journalism-data-intersection

A talk on working at the intersection of journalism and data science

data data-journalism journalism

Last synced: 15 May 2025

https://github.com/dcmox/algorithms

General purpose data structures and algorithms

algorithms binary data hash linked list structures tree

Last synced: 10 Jun 2026

https://github.com/robthree/cfnreader

Provides a simple way to read FNIRSI's CFN files (*.cfn) produced by the FNIRSI UsbMeter tool

cfn csv data fnirsi usb usb-tester

Last synced: 01 Mar 2025

https://github.com/sanchittechnogeek/overscripted-analysis

Geolocation and user language extraction analysis from Mozilla Overscripted dataset

analysis data data-analysis mozilla

Last synced: 23 Mar 2025

https://github.com/sadratehranian/data-collection-and-machine-learning

First, creates a logistic regression model to predict whether a smoke detector's fire alarm should sound. Second, predicts whether an electric drive in a production plant may be faulty.

data data-analysis data-science datacollection logistic-regression machine-learning ml nn

Last synced: 05 Jan 2026

https://github.com/stoyank7/football-prediction

This is my Semester 7 Project for my "AI for Society" minor at Fontys University of Applied Sciences.

ai betting data football machine-learning university-project

Last synced: 25 Mar 2025

https://github.com/asjadnaqvi/stata-tidytuesday

A Stata package for fetching Tidy Tuesday metadata and files

ado data r stata tidytuesday

Last synced: 13 Jun 2026

https://github.com/arkanovicz/skorm

Simple Kotlin Object Relational Mapping

data database model orm sql

Last synced: 19 Apr 2026

https://github.com/mnkanout/patients_medication_prediction

The aim of the project is to create a model that can help medical professionals select the proper medication for patients based on their symptoms. The model uses historical data of other patients to predict what could be the most suitable medication based on the patient's symptoms.

data data-analysis data-science data-visualization decision-tree-classifier machine-learning python3

Last synced: 29 Jun 2025

https://github.com/themost-framework/mysql

Most Web Framework MySQL Adapter

data database mariadb mysql orm query sql

Last synced: 07 Mar 2026

https://github.com/anuppm9917/super-store-sales-analysis-power-bi-project

Driven to learn which products, regions, categories, and customer segments a company should target or avoid, I searched for and selected a Kaggle dataset that matches a standard superstore's requirements.

data data-analysis data-visualization datacleansing excel exploratory-data-analysis jupyter-notebook numpy pandas plotly powerbi python3

Last synced: 10 Apr 2026

https://github.com/mbrsagor/mysql

MySQL database command line

data mysql mysql-database sql

Last synced: 14 Jun 2025

https://github.com/stdlib-js/array-base-last-index-of-same-value

Return the index of the last element which equals a provided search element according to the same value algorithm.

array data find generic index javascript locate node node-js nodejs same scan search stdlib structure types

Last synced: 13 Apr 2026

https://github.com/henryssondaniel/teacup-service-report-mysql-java

Connect your Teacup report data to a MySQL database

data logs mysql reports teacup

Last synced: 13 Apr 2026

https://github.com/krakozaure/pyzzy

Set of packages to simplify development in Python

configuration data formats json library logging logs python3 toml utils yaml

Last synced: 14 Jan 2026

https://github.com/zevio/acl

ACL Anthology corpus sample

data dataset scholarly-articles

Last synced: 01 Mar 2026