An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/dahsie/machine_learning_from_scratch

This project aims to implement some basic machine learning techniques (e.g. MinMaxScaler, StandardScaler, TF-IDF, PCA, Logistic Regression, LDA, KNN, Naive Bayes Classifier) using only Python, NumPy and pandas. This will enable me to hone my data science skills.

classification clustering data data-processing datascience machienlearning nlp nltk numpy pandas python regression

Last synced: 04 May 2026

https://github.com/rod-persky/sungrowdatacollector

Data collector for a SunGrow SG8.0RT Inverter

data opentelemetry sungrow

Last synced: 19 Jan 2026

https://github.com/redgoose-dev/baguni

A web program for storing and browsing images

data explorer file management upload

Last synced: 14 Apr 2026

https://github.com/theopenwebjp/theopenweb-data-loader

Package for loading data to local project

data downloader import javascript typings

Last synced: 10 Oct 2025

https://github.com/bastianolea/mineduc_desvinculacion

Incidence rates of student disengagement (dropout) in primary and secondary education, by year, commune, and gender.

chile comunas data educacion social tiempo

Last synced: 10 Oct 2025

https://github.com/pietrapaz/bootcamp_dio_ciencia_de_dados

Bootcamp Potência Tech powered by iFood | Data Science - Dio ⚠️

cienciadedados dados data datascience python

Last synced: 09 Apr 2025

https://github.com/thesfinox/dup-backup

Simple script to back up data with Duplicity to a personal WebDAV server.

backup bash data duplicity script server webdav

Last synced: 28 Apr 2026

https://github.com/code-str8/time-series-forecasting

Developing a model that effectively forecasts the unit sales of numerous items across various Favorita stores with precision.

data dataanalysis forcasting machine-learning time-series visualizations

Last synced: 31 Mar 2025

https://github.com/kaiepi/ra-annotations

Thread-safe static buffer

data type

Last synced: 13 Jul 2025

https://github.com/azkarmoulana/winter-of-data-2019

:snowflake: :snowman: Winter of Data is coming..... :wolf:

data data-science machine-learning mathematics

Last synced: 05 Feb 2026

https://github.com/chowington/bg-counter-tools

A set of tools that can pull data from Biogents BG-Counter smart mosquito traps and convert them into a Darwin Core compliant format.

bg-counter biogents darwin-core data internet-of-things mosquito-prevalence population-dynamics

Last synced: 10 Oct 2025

https://github.com/badranalyst/data-professional-survey-breakdown-power-bi-dashboard

This project presents an interactive Power BI dashboard analyzing data professionals' insights. Key focus areas include job satisfaction, challenges in entering the data field, career priorities, demographics, and more. The visualization helps uncover trends and factors impacting data professionals globally.

charts dashboard dashboards data data-cleaning data-visualization dataset dax power-bi powerbi

Last synced: 23 Feb 2026

https://github.com/syed-bakhtawar-fahim/dsa_algorithm_code

Assalam o Alaikum, everyone. This repo covers data structures and algorithms in the C programming language. I hope it helps you learn data structures and algorithms in C. I'm also learning data structures and algorithms in Python, in a better and easier way; you can explore that too.

algorithm algorithms-and-data-structures c data data-structures-and-algorithms dsa-algorithm dsa-learning-series dsa-practice

Last synced: 12 Apr 2025

https://github.com/myavuzokumus/simplemodelcomparison

This application allows users to upload datasets, handle missing data, and compare different imputation strategies.

algorithm data data-science machine-learning preprocessing streamlit

Last synced: 21 Jan 2026

https://github.com/axetroy/stone

build data stuck like a stone, Sturdy!

axetroy data stone stuck

Last synced: 04 Jul 2025

https://github.com/rohitblaze10/netflix_analysis_using_tableau

The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.

data data-analysis data-science data-visualization netflix tableau

Last synced: 04 Feb 2026

https://github.com/getconversio/dig-the-data

Data visualizations for the Conversio blog

d3 data data-visualization

Last synced: 12 Apr 2026

https://github.com/yash-chauhan-dev/sf_analytics

Business teams often rely on data analysts to extract insights using SQL. This tool eliminates that dependency by bridging the gap between humans and data using AI.

aiml analytics data dbt langchain llm python snowflake streamlit

Last synced: 07 May 2026

https://github.com/berviantoleo/bervdata

Temporary data definition as db

data

Last synced: 01 Apr 2025

https://github.com/dev88jerry/cs304

Bishop's University - CS304 Data Structures

bishops bu data data-structures python structure university

Last synced: 11 Jun 2026

https://github.com/aldro61/mmit-data

The data used in the Maximum Margin Interval Trees paper

data machine-learning machine-learning-algorithms reproducible-research

Last synced: 19 Feb 2026

https://github.com/ghomashudson/ao3_style_change

Style change detection dataset using AO3 fics

ao3 data dataset datasets fanfiction long-document style-change-detection

Last synced: 11 Oct 2025

https://github.com/mitevpi/poli-parse

Political news scraping & NLP parsing from web pages.

data election javascript library module nlp npm package parse politics scrape sentiment

Last synced: 13 May 2026

https://github.com/brayflex/spy-sector-rotation-google-sheet

Creates a dynamic spreadsheet to visualize SPY and its 11 largest sector ETFs. See market trends and identify potential sector rotation opportunities.

data etf google-sheets index price rotation script sector spreadsheet spy stock-market

Last synced: 29 Jun 2026

https://github.com/dhruvil-26/tableau-projects

This repository contains Tableau visualization projects focused on data analysis across different domains. Projects include: 1. IPL Visualization - Insights into IPL match, team, and player statistics. 2. EV Analysis - Visualizations exploring the adoption of electric vehicles. 3. Road Accident Analysis - Analysis of road accident patterns.

analysis data data-analysis data-analytics electric-vehicles ipl road-accident-analysis tableau tableau-public

Last synced: 19 Jan 2026

https://github.com/palutz/rust_nextstep

A series of exercises to play with more advanced topics in Rust

data deltalake graphql grpc p2p protobuf rust rust-lang xml

Last synced: 01 May 2026

https://github.com/sorairolake/japanese-era-dataset

Dataset of Japanese era names

data dataset date japanese-calendar japanese-era json toml wareki yaml

Last synced: 01 May 2026

https://github.com/mr-chang95/udacity-starbucks-challenge

Data Science Project for Udacity's Data Scientist Program. Using Python in Jupyter Notebook.

data data-science data-visualization numpy pandas sklearn

Last synced: 14 Apr 2026

https://github.com/cassandrajm/reddit-dashboard

INTERACTIVE DASHBOARD: Analyzing Political Discourse on Reddit: A Multi-Faceted NLP Approach to Toxicity, Bias, and Political Stance

capstone data data-analysis data-science politics python reddit

Last synced: 09 Apr 2025

https://github.com/sajjad425/missingvalue

This repository provides a guide on handling missing values in Python, covering identification methods, imputation techniques (mean, median, mode, fill, interpolation), advanced methods (KNN, multiple imputation), and best practices. It includes practical examples for both numerical and categorical data.

data data-analysis-python data-science missing-value-handling missing-value-imputation

Last synced: 04 Apr 2025

https://github.com/elimu-ai/ml-event-simulator

🤖 Simulation of learning events and assessment events

data learning-analytics machine-learning ml

Last synced: 28 Feb 2025

https://github.com/denisecase/dc-mailer

Send an email using Python

alerts data email python streaming

Last synced: 11 Apr 2025

https://github.com/sebastianbrzustowicz/github-data

Java + Spring Boot. Application for sending requests to GitHub API and collecting received data.

api ci data github json junit mapping parallel repository rest-api stream

Last synced: 01 May 2026

https://github.com/acovaci/orbit

ORBIT: an Open source Rust-based implementation of a data Build Tool, inspired by DBT

cargo clap-rs data data-warehouse dbt rust rust-lang tokio-rs

Last synced: 16 Mar 2025

https://github.com/jneidel/animal-names

Dataset of 100 common animal names

animals data dataset json names opendata

Last synced: 25 Mar 2025

https://github.com/eshitakundu/disease-outbreak-predictor

Disease Outbreak Predictor: A Streamlit-based web application for predicting diabetes, heart disease, and Parkinson's disease using machine learning models.

data data-science disease-prediction healthcare-application jupyter-notebook machinelearning ml notebook prediction python streamlit streamlit-webapp

Last synced: 01 May 2026

https://github.com/equinor/sumo-wrapper-python

Thin python wrapper to interact with Sumo API

analytics data fmu python subsurface sumo

Last synced: 19 Jan 2026

https://github.com/keminghe/osu

Unofficial and publicly-available NPM data-package about The Ohio State University.

college data majors ohio-state organizations public students university unofficial

Last synced: 06 Jan 2026

https://github.com/thedevreda/jadaerospace

A real-life project showing how to improve aircraft parts sales and help salespeople focus on the most effective products at JadAero.

data data-analysis data-cleaning data-visualization jupyter-notebook powerbi python

Last synced: 02 Aug 2025

https://github.com/robwiederstein/covid-19-ky

Monitor US covid-19 cases w/ Johns Hopkins data

data data-visualization leaflet plotly r shell

Last synced: 02 May 2026

https://github.com/anct-cartographie-nationale/mednum-cli

✨ Command-line interface for transforming data on digital mediation locations, collected in a non-standard format, into the mednum schema and publishing it on data.gouv

anct betagouv data donnees gouvernement mediation-numerique nodejs open-data transformation

Last synced: 02 Aug 2025

https://github.com/bkataru/spotigo

AI-powered local music intelligence platform with a task runner server core to retrieve and backup spotify account data to storage(s) at set periodic intervals

ai backup cron data go intelligence local-llm music ollama rag runner spotify task-runner tool-calling

Last synced: 16 Jan 2026

https://github.com/entropyorg/p5-data-testimage

:notebook::camera: interface for retrieving test images

cpan data image-analysis

Last synced: 29 May 2026

https://github.com/thanhleviet/vietnam_antibiotics_bidding

This repo contains data of bidding for multiple drugs and antibiotics reported to Vietnam Ministry of Health in 2015, 2016, 2017.

antibiotics data vietnam

Last synced: 23 Feb 2026

https://github.com/filipnet/infoscreen

Arduino subscribes to values via MQTT and shows the information on an I2C OLED display

arduino data display i2c mqtt oled-display-ssd1306 visualization weather weatherstation

Last synced: 12 Apr 2026

https://github.com/plurid/defocus

Apophatic User Content Resolution [Desearch Concept]

data

Last synced: 08 Nov 2025

https://github.com/plurid/delog

Cloud Service for Centralized Logging

cloud data logging

Last synced: 08 Nov 2025

https://github.com/igor-starostenko/sabre

Slice your files like a champ with **sabre**

data golang package

Last synced: 28 Mar 2025

https://github.com/nadahamdy217/movies-data-etl-using-python-gcp

Developed a comprehensive ETL pipeline for movie data using Python, Docker, and a GCP Pub/Sub emulator. Successfully processed and published the data in a local Docker environment, showcasing advanced data engineering skills.

analytics data data-engineering data-ingestion data-preparation data-preprocessing data-processing data-project docker etl etl-pipeline gcp matplotlib matplotlib-pyplot numpy pandas pubsub python scipy seaborn

Last synced: 06 Jan 2026

https://github.com/team-hydrogen/nasa-adc-data

All files relating to the computation of the data provided

data jupyter-notebook nasa-app-development-challenge

Last synced: 25 Mar 2025

https://github.com/0xnu/nfl-picks

NFL match prediction with scores using historical data (1999-Present).

american-football data nfl prediction

Last synced: 12 Oct 2025

https://github.com/lorenzobloise/client_satisfaction_classification

Jupyter notebook in which satisfaction from clients reviewing European hotels is analyzed using Python libraries such as pandas, numpy and scikit-learn. Various classification models are trained and tested to predict client satisfaction.

classification data data-mining jupyter jupyter-notebook machine-learning pandas python

Last synced: 21 Feb 2026

https://github.com/ournet/ournet.web.data

Ournet web data module

data ournet web

Last synced: 04 Apr 2025

https://github.com/drzax/light-up-brisbane

Where, what and why various public places in Brisbane are lit up.

brisbane data git-scraping

Last synced: 19 Jan 2026

https://github.com/0xhericles/spamdetector

:email: A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 02 May 2026

https://github.com/etmendz/mendz.data

Provides tools and guidance for creating data access contexts and repositories.

context data datasettings entity-framework mendz paginginfo repository resultinfo

Last synced: 11 Jun 2025

https://github.com/tyriek-cloud/nyc-dca-etl

Created an ETL pipeline to merge two CSV files (converted to JSON) into a Parquet file using Azure Data Factory. The data was extracted from NYC Open Data (https://opendata.cityofnewyork.us/), and I created a Blob container within an existing storage account.

azure azure-data-factory blob-storage data data-engineering etl-pipeline

Last synced: 21 Jan 2026

https://github.com/jhpoelen/bees

Content-based iDigBio prototype

biodiversity data ecololgical informatics provenance

Last synced: 18 Mar 2026

https://github.com/rbreeze/dashboard

My personal health dashboard, with daily stats on food and sleep. It has undergone several redesigns since 2015.

css dashboard data data-visualization design front-end google-sheets google-sheets-api health html javascript personal-health-record personal-website running static static-site visualization

Last synced: 02 May 2026

https://github.com/flexiui-labs/flexi-grid

Flexi Grid is an advanced, lightweight, and customizable Angular 19 data grid component

angular data filter grid search select sort table

Last synced: 14 May 2026

https://github.com/bhojpur/dlm

The Bhojpur DLM is a software-as-a-service product used for Data Lifecycle Management based on Bhojpur.NET Platform for data delivery.

data lifecycle-management

Last synced: 19 Feb 2026

https://github.com/luminati-io/httpx-web-scraping

Web scraping using HTTPX in Python, covering setup, advanced features, comparisons with Requests, and more.

beautifulsoup data html httpx python web-scraper web-scraping

Last synced: 13 Oct 2025

https://github.com/plurid/datasign

Single Source of Truth Data Contract Specifier

data file-format

Last synced: 08 Nov 2025

https://github.com/srvanderplas/statistical_atlas

Framed Charts and the Statistical Atlas of 1870

census data ggplot2 graphics r statistics visualization

Last synced: 29 May 2026

https://github.com/prajakta1321/streetml-a-cityscape-traffic-volume-prognostication

StreetML leverages machine learning techniques to revolutionize urban traffic prediction through precise volume prognostication, aiming to enhance cityscape mobility through data-driven insights.

catboostregressor data datavisualisation exploratory-data-analysis lightgbm-regressor linearregression machine-learning machine-learning-algorithms predictive-analytics random-forest-regression xgboost-regression

Last synced: 08 Apr 2025

https://github.com/mubashirsidiki/olympics-data-enigeering

Worked with Azure Data Factory, Databricks, Data Lake Storage, and Synapse Analytics to build an ETL pipeline for processing and analyzing Olympic Games data from Kaggle.

analytics azure big-data data dataengineering devops pipeline

Last synced: 02 May 2026

https://github.com/jacoblincool/moodle-export

A streamlined library for retrieving data from Moodle.

data moodle

Last synced: 07 May 2025

https://github.com/thingston/extractor

Collection of PHP classes to extract data from HTML pages.

data html php

Last synced: 14 Jan 2026

https://github.com/zulfachafidz/telco_churn_insight_customer_loss_prediction_with_random_forest_and_decision_tree-algorithms

The main problem in the business world is customer churn, or losing customers, especially in the telecommunications industry, which experiences very tight competition. To overcome this problem, an analysis was carried out to help the company understand how many customers have the potential to switch providers.

data data-science data-visualization dataanalysis dataanalyst dataanalytics datadrivenwithdataprovider decision-tree decision-tree-classifier decision-trees random-forest random-forest-classifier

Last synced: 01 May 2026

https://github.com/donghquinn/gopandas

gopandas

data go golang

Last synced: 14 Oct 2025

https://github.com/lohithgsk/dynamic-qr-generator

A Python-based QR generator application was developed using the qrcode and Pillow libraries, dynamically generating QR codes for custom data inputs. Designed for a college grievance management system, the application creates QR codes containing block, floor, room, and machine numbers, allowing easy placement and identification on each floor.

data pillow python qrcode qrcode-generator

Last synced: 16 Mar 2025

https://github.com/lightdash/quickstart-github

Instant analytics for GitHub

analytics business-intelligence data dbt github

Last synced: 14 Sep 2025