An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/bastianolea/minsal_suicidios

Cases of attempted and completed suicide in Chile

chile comunas data genero salud tiempo

Last synced: 19 Jan 2026

https://github.com/chowington/bg-counter-tools

A set of tools that can pull data from Biogents BG-Counter smart mosquito traps and convert them into a Darwin Core compliant format.

bg-counter biogents darwin-core data internet-of-things mosquito-prevalence population-dynamics

Last synced: 10 Oct 2025

https://github.com/ikcede/hinge-data-ts-wrapper

TypeScript wrapper for exported Hinge data

data hinge typescript

Last synced: 10 Oct 2025

https://github.com/myavuzokumus/simplemodelcomparison

This application allows users to upload datasets, handle missing data, and compare different imputation strategies.

algorithm data data-science machine-learning preprocessing streamlit

Last synced: 21 Jan 2026

https://github.com/alexmcvay/uber-data

Uber SQL clone

data data-visualization sql

Last synced: 19 Jan 2026

https://github.com/mcraiha/datagensharp

C# managed library for generating data

csharp data generator

Last synced: 11 Aug 2025

https://github.com/ashita-ai/ashita-ai.github.io

Ashita AI - The island of misfit data tools

ai data

Last synced: 19 Feb 2026

https://github.com/0xnu/nfl-picks

NFL match prediction with scores using historical data (1999-Present).

american-football data nfl prediction

Last synced: 12 Oct 2025

https://github.com/mikeschinkel/go-testdata-defaulter

Simple package for Go to set table-driven test data defaults so that tables in tests only need to include data that differs from the defaults.

data defaults package testing tests

Last synced: 13 Oct 2025

https://github.com/ahmad-ali-rafique/heart-disease-detection-model

A comprehensive project for detecting heart disease using machine learning, including data processing, model training, and evaluation metrics with AUC curve analysis.

artificial-intelligence data datascience heart-disease machine-learning modeling prediction-model

Last synced: 11 Aug 2025

https://github.com/srindot/fwuav-average-flight-data-collection

This repository is designed for collecting average data for a flapping wing UAV. The script acg_coeff_data_collection.py runs the necessary data collection, and the resulting data is saved into a CSV file called AverageFlightData.csv.

data flaping-uav

Last synced: 10 Aug 2025

https://github.com/digital-media/cv_data

Datasets used for courses/tutorials at the Digital Media Department

computer-vision data image-processing images

Last synced: 14 Oct 2025

https://github.com/isandyawan/simplelinearregression

An application to analyze data using simple linear regression. It can build a regression model from variables and advise the user if the model breaks a regression assumption.

data linear r regression rstudio shiny statistic

Last synced: 14 Oct 2025

https://github.com/mominurr/fire-gas-leak-detection-system

A real-time fire prevention system integrating IoT sensors and computer vision to trigger evacuations.

ai computer-vision data datascience machine-learning ml python yolo

Last synced: 27 Jan 2026

https://github.com/jpcurada/exploralytics

A Python package for creating intermediate Plotly visualizations

data eda plotly python visualization

Last synced: 05 Feb 2026

https://github.com/ometman/vet-clinic

This is a database project for veterinary data management covering animals, owners, clinic employees and visits; it is applicable to any data management need. It uses PostgreSQL, a relational database management system, and supports storing, updating and querying data.

data database normalization postgresql postgresql-database queries sql sql-server-database tables transactions

Last synced: 13 May 2026

https://github.com/fabsdevx/files-to-database-loader-handout

Data Engineering project for learning purposes. Credits to itversity

csv data data-engineering database json pandas python

Last synced: 09 Apr 2026

https://github.com/mchenryspagg/wrangle-and-analyze-data

This project, known as 'wrangle and analyze data', involves wrangling WeRateDogs Twitter archive data from 2015 to 2017.

api data dataanalysis datacollection datawrangling datetime json numpy os pandas pil python requests tweepy-api visualization

Last synced: 09 Apr 2026

https://github.com/sourceduty/clock_metadata

🕒 Recording time data and statistical metadata to .csv files.

clock data data-science metadata practice python time timing

Last synced: 08 Aug 2025

https://github.com/psgebeline/harvard-data-science

My work for the nine courses in Harvard's data science program, each with notes/assignments. Work in progress.

data linear-regression machine-learning modeling probability-theory r visualization wrangling

Last synced: 19 Oct 2025

https://github.com/parvezk/d3-fundamentals

D3 library API fundamentals

charts d3 data graphs visualization

Last synced: 19 Oct 2025

https://github.com/sourceduty/data_marketer

💰 Analyze uploaded data and prepare a data marketing plan for selling data. Create data product plans.

ai ai-data ai-tool artificial-intelligence business chatgpt company custom-gpt customgpts data data-business data-market data-marketer data-marketing data-tool gpt gpt-store gpts gptstore openai

Last synced: 03 Sep 2025

https://github.com/Alpine418/DataHandler

Data handler for PHP arrays.

data data-handler php73

Last synced: 01 Oct 2025

https://github.com/dilkushsingh/webscraping-with-selenium-and-beautifulsoup

Scraped a popular tech gadgets website using Selenium and BeautifulSoup, then performed data analysis on the scraped data.

beautifulsoup data datacleaning datagathering eda exploratory-data-analysis python selenium webscraping

Last synced: 24 Feb 2026

https://github.com/politicaargentina/opinar

📈 ICG toolbox for R - Indice de Confianza en el Gobierno (Government Confidence Index) 🇦🇷 (Universidad Torcuato Di Tella)

argentina data political-science politics public-opinion

Last synced: 22 Oct 2025

https://github.com/ddeepanshu-997/support_vector_regression--svr-

In this repository, I performed support vector regression on real-life data. I first applied data preprocessing techniques to filter out flaws in the data, then built the model, i.e. an SVM regression, to produce a machine learning regression model.

data data-science regression-analysis regression-models svm-model svm-regression

Last synced: 03 Aug 2025

https://github.com/brianlesko/r_data_science_stat5730

Written by Brian Lesko, the repository contains R scripts demonstrating data science topics, largely originating from study at Ohio State. Contents are written in RStudio using R Markdown files. As of 1/21/23, future projects concerning data science, statistics, and machine learning will be in Python in my machine learning repository.

data data-analysis flight-data ggplot2 olympics-data r-markdown tidyverse

Last synced: 23 Jan 2026

https://github.com/harmanveer-2546/reducing-data-entries

A way to delete data entries from a CSV/Excel file. For an Excel file, use excel instead of csv in the code.

csv data data-entry delete-data excel numpy pandas python

Last synced: 05 May 2026

https://github.com/louis-heraut/dataverseur

🫖 A dataverse API R wrapper to enhance the deposit procedure using only R variable declarations

data data-repository data-science datascience dataset dataverse dataverse-api json metadata metadata-management metadata-parser r

Last synced: 24 Oct 2025

https://github.com/uznetdev/smoking-prediction

This project focuses on analyzing the "Smoking" dataset and building a predictive model for smoking status based on various health metrics. The goal is to identify factors influencing smoking behavior and develop a reliable model for prediction.

ai classification data data-science kaggle-competition machine-learning ml roc-auc sklearn smoking

Last synced: 17 Apr 2026

https://github.com/thedevreda/jadaerospace

A real-life project showing how to improve aircraft parts sales and help salespeople focus on more effective products at JadAero

data data-analysis data-cleaning data-visualization jupyter-notebook powerbi python

Last synced: 02 Aug 2025

https://github.com/merekat/hb-oil-assets

An analysis of asset performance in connection with shock-like increases in the oil price since Brent oil entered the market in 1986.

analyze asset data datajournalism oil python

Last synced: 16 Mar 2026

https://github.com/bhojpur/dlm

The Bhojpur DLM is a software-as-a-service product used for Data Lifecycle Management based on Bhojpur.NET Platform for data delivery.

data lifecycle-management

Last synced: 19 Feb 2026

https://github.com/alsult/alsult

Aliia Sultanova Portfolio

data datascience programming python

Last synced: 23 Jan 2026

https://github.com/zainea-bogdan/data_engineer_project_wowcinema

WoWCinema is a project based on a fictional scenario where I stepped into the role of a Data Engineer, designing and building an end-to-end data infrastructure. An ETL pipeline ingests data from multiple sources, transforms it, and loads it into a centralized PostgreSQL data warehouse to power analytics, KPI tracking, and reporting.

analytics big-data data datawarehousing etl-pipeline postgres python sql

Last synced: 19 May 2026

https://github.com/theipster/property-data

Tooling to track real estate / property market events, analyse trends and generate insights.

data property real-estate

Last synced: 24 Jan 2026

https://github.com/woctezuma/hidden-gems-data

Data available to compute regional rankings of hidden gems.

data hidden-gems steam steam-reviews

Last synced: 06 Feb 2026

https://github.com/atharvapathak/twitter_sentiment_analysis_project

Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.

api bag-of-words bert cnn data gbm nltk rnn spacy twitter

Last synced: 28 Jan 2026

https://github.com/cmdrvl/rvl

rvl reveals the smallest set of numeric changes that explain what actually changed between two datasets — or confidently tells you nothing changed.

cli csv data data-quality data-validation diff finance numerical-analysis open-source ops rust tooling

Last synced: 25 Feb 2026

https://github.com/aimin-nur/data-analyst

A data analyst (machine learning) project that analyzes used Ford car prices based on an existing dataset and identifies which features or columns affect those prices.

analytics data mechine-learing visualization

Last synced: 29 Jan 2026

https://github.com/aaronspindler/selfdrivingcar

Learning deep learning and making a self driving car in the process

car data deep deep-learning driving keras learning machine machine-learning python self self-driving-car

Last synced: 09 Apr 2026

https://github.com/kayahr/datastream

Data stream classes for writing and reading all kinds of data types, even single bits

data datastream input output stream typescript

Last synced: 01 Aug 2025

https://github.com/themost-framework/cache

MOST Web Framework Caching Module

cache caching data

Last synced: 12 Feb 2026

https://github.com/mreshboboyev/elastic-search-dotnet

A powerful and easy-to-use .NET library for integrating Elasticsearch, enabling fast full-text search, scalable indexing, and advanced data analytics in your applications.

analytics c-sharp data dotnet-core elastic-search full-text indexing open-source scalable search

Last synced: 30 Jan 2026

https://github.com/cunfuu/network-bubbles

Makes it easier to manage organizations and keep notes about them, in order to organize events and easily look up their needs

data data-visualization organizations organizations-volunteer

Last synced: 31 Jul 2025

https://github.com/bubblymaps/bubblymaps

The open source bubbler map. Mapping the world's water fountains. Open Code, Open Data.

bubbler bubbly-maps data fountain map open-source water

Last synced: 31 Jan 2026

https://github.com/opendatach/alds

A collaborative list of resources and ideas to enable "Amt Local Data Stewards" to manage the (open) data of their respective federal office

awesome-list data datagovernance dataliteracy datamanagement datastewardship opendata opengovernmentdata

Last synced: 31 Jan 2026

https://github.com/beastbytes/n6l-phone-number-data-php

NationalPhoneNumerInterface implementation using PHP for storage

data itu-t0202 phone-number php yii3

Last synced: 08 Feb 2026

https://github.com/michaelfromyeg/lyrics

Lyric-store and API hosted on Git.

data lyrics

Last synced: 08 Feb 2026

https://github.com/bishtrishu/netflix_movies_dashboard

This project is a comprehensive dashboard for analyzing Netflix movies and shows. Using a combination of Power BI, Python, and Excel, this dashboard provides insights into various aspects of Netflix's content library.

ai artifical-intelligense dashboard data dataanalysis dataanalyst dataanalytics datacleaning datahandling datascience datavisualization excel machine-learning msexcel powerbi report

Last synced: 09 Feb 2026

https://github.com/neurazum-ai-department/tumor-stages-dataset---v1

Synthetic MRI data generated by the 'HF' and 'Vbai' models based on real data.

brain data dataset datasets image mri neuroscience tumor tumor-segmentation

Last synced: 18 Mar 2026

https://github.com/haroontrailblazer/machine_learning

A curated resource hub for learning machine learning, featuring tutorials, code examples, datasets, and hands-on projects to build foundational skills and explore real-world applications.

data data-analysis data-visualization database dataset gradient-descent machine-learning pandas python3 random-forest sklearn statistics

Last synced: 16 Apr 2026

https://github.com/dysnomia-studio/achieve-games-dump

Public dump of parts of the achieve.games database, including the Steam games list

data dump games steam steam-api steam-game steam-games

Last synced: 27 Feb 2026

https://github.com/softloud/spunk

Nutritional interventions for male infertility: a systematic review and meta-analysis

cochrane data evisynth living

Last synced: 18 Mar 2026

https://github.com/davorg/towerbridge

When is Tower Bridge lifting?

data hacktoberfest london perl web-scraping

Last synced: 29 Jun 2026

https://github.com/vatshayan/songs-datasets

Datasets for Songs and Music for Dancing, Emotional, Happy and scenic view

1000dataset classfication csv data datapackage datapackages dataset datasets excel free freedata freedatasets genre machine music sgenre song songs

Last synced: 18 Mar 2026

https://github.com/utrechtuniversity/momentum-dataflow

Repository for publishing website about data management practices of the Momentum project

data datageneration datamanagement

Last synced: 27 Feb 2026

https://github.com/sweta-kaundilya/power-bi-learning-projects

This repository contains completed exercises while learning Power BI

data datavisualization dax powerbi powerquery

Last synced: 27 Feb 2026

https://github.com/praveendecode/retail-revenue-forecasting

Designed an end-to-end ML model pipeline, forecasting department-wide sales by accounting for holiday markdown effects, spanning data collection to inferencing.

azure collection data datapreprocessing docker exploratory-data-analysis feature-engineering featureimportance model modelbuilding modeldeployment modelselction python report tableau

Last synced: 16 Apr 2026

https://github.com/kunalthakur204/visualization-on-flower

🌸 Flower Dataset Visualization: visualizing patterns and relationships in flower data through charts and plots. Perfect for exploring floral characteristics and trends! 📊

data data-visualization dataanalysis flowerdataset python

Last synced: 16 Apr 2026

https://github.com/vianneymi/amplifai

Amplifai is a package that allows you to transform your raw unstructured text into structured data in a few lines of code.

data data-mining extraction langchain llm pydantic

Last synced: 27 Feb 2026

https://github.com/soenneker/soenneker.dtos.requestdataoptions

A flexible request options object for paging, sorting, and filtering queryable data, similar to OData-style parameters.

controller coordinator csharp data dotnet dto dtos http manager object odata options request requestdataoptions

Last synced: 12 Mar 2026

https://github.com/sumaiyyaf/british-airline-dashboard

This Tableau dashboard visualizes British Airways customer reviews, showcasing key metrics like average ratings for service, entertainment, and seat comfort. It features interactive filters for exploring ratings by aircraft type, country, and traveler type, along with trend analysis over time.

analysis dashboard data tableau visualization

Last synced: 13 Feb 2026

https://github.com/sanand0/iss-location

Tracks the International Space Station position. A demo of how to use GitHub Actions to schedule commits weekly.

data

Last synced: 14 Feb 2026

https://github.com/e-kotov/albofr-data-archive

Tiger Mosquito Colonisation in France data

aedes-albopictus colonisation data france tiger-mosquito

Last synced: 23 May 2026

https://github.com/gusenov/open-data-scripts

Scripts to explore public datasets and work with open data.

charts data data-visualisation data-visualization datavisualization highcharts kazakhstan open-data opendata qazaqstan

Last synced: 28 Feb 2026

https://github.com/nia-cloud-official/influx-agents

Influx-CRD is a web application designed to facilitate data collection, recovery, and distribution for agents uploading data to a centralized database. It provides an intuitive interface for managing data collection from various sources and recovering lost or corrupted data.

broker collection data data- influx influx-agent

Last synced: 30 Jul 2025

https://github.com/anuppm9917/data-processing-and-csv-to-json-using-python-project

This project guides you through processing data from CSV to JSON format using Python. You'll learn to cleanse, validate, and transform data with pandas, numpy, csv, and json libraries, ensuring it's ready for POS system integration. This will help improve data integrity and streamline integration.

csv-files data data-analysis data-cleaning data-collection data-transformation data-validation python3 transformation

Last synced: 16 Apr 2026

https://github.com/thomasjewson/cci-data-science-textbook

This is a short, interactive textbook aimed at introducing data science to non-IT university undergraduates. Funded by Erasmus+.

data data-science learning python textbook

Last synced: 16 Apr 2026

https://github.com/arjunrao87/world-countries-graphql-api

GraphQL API for retrieving information about countries of the world

countries data database geographic-data geography graphql world

Last synced: 10 May 2026

https://github.com/derhuerst/uic-codes

UIC country codes.

data dataviz i18n transit

Last synced: 05 Mar 2026

https://github.com/jwszolek/accelerated-data-generator

Ultra-fast random data generator. It can generate almost 1M rows in around a second.

bash csv data data-generator generator shell

Last synced: 02 Apr 2026

https://github.com/rawdaabdelsalam42/data-cleaning-sql-python-powerbi

Data cleaning project for an e-commerce sales dataset using Python (Pandas) for preprocessing, SQL Server for queries, and Power BI for building an interactive dashboard visualization.

dashboard data data-engineering pandas powerbi python sql

Last synced: 17 Apr 2026

https://github.com/umrlastig/global-local

The Global-Local loop: bridging the gap between geospatial communities

challenges communities data fusion gaps geospatial perspectives

Last synced: 03 Apr 2026

https://github.com/madhuresh2011/50-days-sql-challenge

Start a 50days-sql-challenge journey to SQL mastery and transform how we interact with data!

consistency data data-analytics database problem-solving query question-answering real-world-data sql

Last synced: 03 Jun 2026

https://github.com/shsiddhant/womens-wc

ML project to predict match outcomes for Women's Cricket World Cup 2025.

cricket-prediction data feature-engineering postgresql python

Last synced: 04 Apr 2026