An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/rohitblaze10/netflix_analysis_using_tableau

The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.

data data-analysis data-science data-visualization netflix tableau

Last synced: 04 Feb 2026

https://github.com/metapsy-project/data-psychosis-psyctr

Database of psychological interventions for schizophrenia and psychosis compared to control conditions.

data

Last synced: 16 Mar 2026

https://github.com/elissorokin/data-analyst-portfolio

This is a repository where I demonstrate my skills, share projects, and track my progress in data analysis and Data Science.

ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis

Last synced: 09 Apr 2026

https://github.com/rabeal21/tea

Generate random TEA wallet addresses in bulk with this simple utility. Perfect for testing and exploring the TEA blockchain. 🌱💻

bucklescript bucklescript-tea chinese-translation cli data earlgrey educators hacking ios-automation ios-test ocaml peer-evaluations php red-team teachyourselfcs test-framework translation tui

Last synced: 04 May 2026

https://github.com/miriswisdom/coral.bells

Guiding and Reassuring Safety, Holistically and Empathetically

civic community data engagement govhack open safety

Last synced: 28 Jan 2026

https://github.com/karashiiro/lodestone-character-data-scraper

Lodestone character data scraper.

data ffxiv ffxiv-character lodestone

Last synced: 23 Apr 2026

https://github.com/tomquirk/sunshine-coast-council-rates-data

Rates data for the Sunshine Coast, Australia

australia data property rates real-estate

Last synced: 24 Feb 2026

https://github.com/johndelatto/automate-your-job-search-ai-applies-to-1000-positions

Automate Your Job Search: AI Applies to 1000 Positions Overnight & Get 100+ Interviews! In today’s fast-paced and highly competitive job market, finding and securing your dream job can be both time-consuming and exhausting.

ai data non-profit open-ai open-source

Last synced: 28 Jan 2026

https://github.com/dimitryzub/russo-ukraine-war-prediction-losses

Highlights Russian losses with predictions based on historical data from the Ministry of Defence of Ukraine 🐱‍👤

data dataanalysis dataanalytics matplotlib pandas prophet python

Last synced: 04 May 2026

https://github.com/getconversio/dig-the-data

Data visualizations for the Conversio blog

d3 data data-visualization

Last synced: 12 Apr 2026

https://github.com/zainea-bogdan/data_engineer_project_wowcinema

WoWCinema is a project based on a fictional scenario where I stepped into the role of a Data Engineer, designing and building an end-to-end data infrastructure. An ETL pipeline ingests data from multiple sources, transforms it, and loads it into a centralized PostgreSQL data warehouse to power analytics, KPI tracking, and reporting.

analytics big-data data datawarehousing etl-pipeline postgres python sql

Last synced: 19 May 2026

https://github.com/word2vect/beijing-new-house-data-visualization

Beijing New House Data Visualization for Python Programming 2024 Fall Data Visualization Lab

data python visualization

Last synced: 13 Jun 2026

https://github.com/raulmaulidhino-dev/ml_modelling_regression

Many factors influence students' grades and scores; one of them is study hours. In this mini analysis project, three models learn and predict the relationship between students' study hours and their scores in an exam or test. The project identifies the best ML model for the problem.

data data-analysis-python data-science eda machine-learning scikit-learn

Last synced: 28 Jan 2026

https://github.com/yash-chauhan-dev/sf_analytics

Business teams often rely on data analysts to extract insights using SQL. This tool eliminates that dependency by bridging the gap between humans and data using AI.

aiml analytics data dbt langchain llm python snowflake streamlit

Last synced: 07 May 2026

https://github.com/adilsaid64/real-time-data-monitoring

Exploring what a real-time data drift monitoring solution could look like within MLOps

data datadrift grafana machine-learning mlops mlops-workflow prometheus python software-engineering

Last synced: 04 Aug 2025

https://github.com/ddeepanshu-997/support_vector_regression--svr-

In this repository I perform support vector regression on real-life data. I first apply data preprocessing techniques to filter out flaws in the data, then build a machine learning regression model using SVM regression.

data data-science regression-analysis regression-models svm-model svm-regression

Last synced: 03 Aug 2025

https://github.com/iamfrerot/userverse

Creating an API for data analysis

data data-analytics spring-boot users

Last synced: 23 Mar 2025

https://github.com/emanoelcampos/power-bi-fundamentals

Datacamp's Power BI Fundamentals Skill Track

data data-analyst data-analyst-power-bi datacamp power-bi powerbi

Last synced: 24 Jan 2026

https://github.com/dataglyder/data-analysis-tools-to-get-you-started

This repository describes a few tools for a beginner Data Analyst.

analytics data python r sql

Last synced: 18 Apr 2026

https://github.com/semcod/code2llm

Python Code Flow Analysis Tool - Static analysis for control flow graphs (CFG), data flow graphs (DFG), and call graph extraction

ast cfg code code2data code2logic code2process data dfg diagram flow graphs llm

Last synced: 01 Jun 2026

https://github.com/sanchittechnogeek/overscripted-analysis

Geolocation and user language extraction analysis from Mozilla Overscripted dataset

analysis data data-analysis mozilla

Last synced: 23 Mar 2025

https://github.com/atharvapathak/twitter_sentiment_analysis_project

Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.

api bag-of-words bert cnn data gbm nltk rnn spacy twitter

Last synced: 28 Jan 2026

https://github.com/word2vect/beijing-pm2.5-data-process

Beijing PM2.5 Data Process for Python Programming 2024 Fall Data Visualization Lab 2

data python visualization

Last synced: 15 Jun 2026

https://github.com/eva-kaushik/data-clustering

Clustering Accelerators for hard and soft clustering, including implementations of K-means, K-medoids, hierarchical clustering, fuzzy C-means, and Gaussian mixture models. Demonstrates text clustering using both hard and soft clustering algorithms.

clustering clustering-algorithm data datascience machine-learning-algorithms

Last synced: 09 Apr 2025

https://github.com/srevenant/data-science-alpine

A docker container for data science, using alpine linux and python3

alpine data numpy pandas python3 science scipy xgboost

Last synced: 05 May 2026

https://github.com/kasunjayasanka/simple-backend-database-data-retrieval

Simple HTML form for inserting and retrieving data from Firebase Realtime Database

bootstrap css3 data firebase firebase-realtime-database html5 insert-data javascript retrieve-data

Last synced: 05 May 2026

https://github.com/bhar2254/sobershift

Simple attendance tracking application

data form ifc jambi java qt tracking utility

Last synced: 05 May 2026

https://github.com/thicclatka/tetration

New file format for tensors

cli data fileformat mmap tensors

Last synced: 26 May 2026

https://github.com/badranalyst/covid-deaths-and-vaccinations-sql-data-exploration

This project involves exploratory data analysis on COVID-19 deaths and vaccinations data using SQL. It aims to uncover trends, patterns, and insights related to vaccination rates and their impact on mortality. The analysis provides a clearer understanding of the pandemic's dynamics, facilitating data-driven decisions in public health.

covid-19 data data-exploration dataset sql

Last synced: 19 Feb 2026

https://github.com/spatialcurrent/go-counter

Simple library and command line program for generating frequency distributions.

big-data bigdata data

Last synced: 29 Jan 2026

https://github.com/audeering/datasets

Data cards for public audb datasets

audb audio data management

Last synced: 29 Jan 2026

https://github.com/anct-cartographie-nationale/mednum-cli

✨ Command-line interface for transforming data on digital mediation locations, collected in a non-standard format, into the mednum schema and publishing it on data.gouv

anct betagouv data donnees gouvernement mediation-numerique nodejs open-data transformation

Last synced: 02 Aug 2025

https://github.com/aimin-nur/data-analyst-model-predictive

A data analyst project that aims to identify the characteristics of customers who accept marketing campaign offers.

analyst data mechine-learning visualization

Last synced: 29 Jan 2026

https://github.com/denisecase/dc-mailer

Send an email using Python

alerts data email python streaming

Last synced: 11 Apr 2025

https://github.com/acovaci/orbit

ORBIT: an Open source Rust-based implementation of a data Build Tool, inspired by DBT

cargo clap-rs data data-warehouse dbt rust rust-lang tokio-rs

Last synced: 16 Mar 2025

https://github.com/tomcardoso/journalism-data-intersection

A talk on working at the intersection of journalism and data science

data data-journalism journalism

Last synced: 15 May 2025

https://github.com/ashishsingh789/titanic_dataset_eda_and_visualization

This repository contains an exploratory data analysis (EDA) of the Titanic dataset. Key analyses include survival rates by gender, passenger class, age distribution, family size, and correlation heatmaps.

data data-science dataanalysis matplotlib numpy pandas pandas-dataframe python seborn visualisation

Last synced: 11 Apr 2026

https://github.com/contawo/travel-journal

This is a travel journal application for storing all the places that you have visited. I built this project to learn React by doing; I learnt a lot from it and improved my ReactJS skills.

data learning-by-doing props reactjs

Last synced: 05 May 2026

https://github.com/chenxingqiang/modeling_tabular_data

# modeling_tabular_data | Keywords: modeling_tabular_data focusing on modeling_tabular_data.

data modeling tabular

Last synced: 30 Jan 2026

https://github.com/rosacarla/databases

Databases used in practical activities of the MBA in Data Analytics at IGTI.

data data-analytics dataset

Last synced: 19 Mar 2026

https://github.com/bearaujus/bdatamatrix

Structured Tabular Data Management in Go

data go golang matrix

Last synced: 30 Jan 2026

https://github.com/restricted/redis-data-cache

TypeScript implementation of data cache management by class name

cache data object redis state typesript

Last synced: 30 Jan 2026

https://github.com/jneidel/animal-names

Dataset of 100 common animal names

animals data dataset json names opendata

Last synced: 25 Mar 2025

https://github.com/lut-ful/pizza-sales-report

This Pizza Sales Report provides valuable insights into sales performance through detailed analysis and visualizations. By leveraging Power BI and SQL Server

data data-wrangling microsoft-sql-server power-bi power-bi-dax python

Last synced: 30 Jan 2026

https://github.com/rubyonworld/ldpath

This is a Ruby implementation of LDPath, a language for selecting values from linked data resources.

data ldpath resource ruby

Last synced: 12 Nov 2025

https://github.com/reubano/devcraft-workshop

Materials for the DevCraft workshop on stream processing

data functional-programming meza python riko stream-processing tutorial

Last synced: 04 May 2026

https://github.com/jrmi/pyfiles

Big file collection manager

big-data data opendata

Last synced: 31 Jan 2026

https://github.com/illustratien/toolphd

Make your analysis simple and reproducible

academic analysis data phd publications r r-package reproducible-research scientific

Last synced: 26 Jan 2026

https://github.com/team-hydrogen/nasa-adc-data

All files relating to the computation of the data provided

data jupyter-notebook nasa-app-development-challenge

Last synced: 25 Mar 2025

https://github.com/82luli02/sakila_dvd_rental_database_analysis

Analysis of the Sakila DVD Rental database using SQL

data data-analysis data-science data-visualization sql

Last synced: 10 Mar 2026

https://github.com/mierune/tinybufr

[WIP] A Rust library for decoding BUFR (Binary Universal Form for the Representation of meteorological data) files.

bufr data meteorology rust weather wmo

Last synced: 15 May 2025

https://github.com/ournet/ournet.web.data

Ournet web data module

data ournet web

Last synced: 04 Apr 2025

https://github.com/plurid/datasign

Single Source of Truth Data Contract Specifier

data file-format

Last synced: 08 Nov 2025

https://github.com/charlenry/python_data_science

My hands-on lab notebooks on Python for Data Science

analysis data dataframe jupyter kaggle matplotlib notebook numpy pandas pyplot python science seaborn visualisation

Last synced: 25 Jun 2026

https://github.com/pythoncoderunicorn/jamesbeardaward

a repo for James Beard Award data

data dataset jamesbeard

Last synced: 07 Feb 2026

https://github.com/rissh/titanicsurvivalpredictionusingml

Predicting Titanic passenger survival through machine learning. This project includes data preprocessing, exploratory data analysis, feature engineering, and model training using Python. 🚢

data data-analysis data-science data-visualization dataanalysis jupiter-notebook machine-learning machine-learning-algorithms machinelearning matplotlib numpy pandas prediction prediction-model python python3 seaborn tenserflow tflearn titanic

Last synced: 01 Feb 2026

https://github.com/julienmalka/shiftgenerator

ShiftGenerator WeSki 2018

data data-science latex python

Last synced: 06 May 2026

https://github.com/okieraised/rke2-deployment

Single-node RKE2 deployment

data helm helm-charts helm-deployment rke2

Last synced: 17 Mar 2026

https://github.com/ms140569/loki-example-store

Test data for the loki password manager

data

Last synced: 26 Feb 2026

https://github.com/mahtabranjbar/onlineshopping_analysis_dashboard

This project analyzes online shopper behavior using various machine learning models and EDA techniques.

dashboard data dataanalysis eda machine-learning streamlit

Last synced: 08 Feb 2026

https://github.com/ksm26/ml-ai-data-science-jobs-in-canada

Explore the latest machine learning, artificial intelligence, and data science job opportunities in Canada. Stay informed about Canadian tech job market trends and find your next career move.

ai-canada ai-careers canada canadian-tech-companies canadian-tech-job-market data data-analysis data-engineering data-science data-science-careers machine-learning prompt-engineering robotics

Last synced: 06 May 2026

https://github.com/dineshram0212/youtube-analysis

This YouTube Analysis Package provides tools for analyzing YouTube video data, including metrics on views, likes, comments, and engagement trends. Ideal for gaining insights into video performance and audience interaction patterns.

data data-visualization pandas python webscraping youtube-api-v3

Last synced: 19 Jun 2026

https://github.com/etmendz/mendz.data

Provides tools and guidance for creating data access contexts and repositories.

context data datasettings entity-framework mendz paginginfo repository resultinfo

Last synced: 11 Jun 2025

https://github.com/tadiusfrank2001/pythonprojects

Compilation of some fun Introduction to Python lab coding projects introducing the fundamentals of data science, databases, and Python libraries

data data-science databases gamedesign python pythonlibrarires sorting-algorithms sqlite string-manipulation

Last synced: 06 May 2026

https://github.com/srvanderplas/statistical_atlas

Framed Charts and the Statistical Atlas of 1870

census data ggplot2 graphics r statistics visualization

Last synced: 29 May 2026

https://github.com/rrwen/twitter2mongodb-cli

Command line tool for extracting Twitter data to MongoDB databases

api cli cmd command data database get interface line mdb media mongo mongod mongodb post social stream tool tweet twitter

Last synced: 06 May 2026

https://github.com/bishtrishu/netflix_movies_dashboard

This project is a comprehensive dashboard for analyzing Netflix movies and shows. Using a combination of Power BI, Python, and Excel, this dashboard provides insights into various aspects of Netflix's content library.

ai artifical-intelligense dashboard data dataanalysis dataanalyst dataanalytics datacleaning datahandling datascience datavisualization excel machine-learning msexcel powerbi report

Last synced: 09 Feb 2026

https://github.com/myles-parfeniuk/esp32_sdlogger

C++ esp-idf driver component for SD cards interfaced via SPI. WIP

card data esp-idf esp32 logger sd sdcard sdmmc sdspi spi

Last synced: 09 Feb 2026

https://github.com/ralzz/dibimbing_datascience

This project contains an Exploratory Data Analysis (EDA) of the Estonia Passenger List dataset. I handled missing values, removed duplicate data, and created basic visualizations to find insights.

data data-science eda google-colab kaggle pandas python

Last synced: 06 May 2026

https://github.com/iv4n-ga6l/functional-dataprocessing-pipeline

A functional data processing pipeline that accepts an input file, allows specifying both input and output formats, applies specified transformations, and produces a resulting output file.

csv data datapreprocessing excel json pandas parquet pipeline python

Last synced: 06 May 2026

https://github.com/jigyasag18/airline-performance-and-passenger-satisfaction-project-using-big-data-analytics

This project analyzes 10 years of U.S. domestic airline data (~3GB) using Hadoop (Cloudera) and Hive for data processing. Power BI dashboards visualize key metrics like delays, on-time rates, air time, and diversions. The solution includes Hive queries, DAX measures, HDFS ingestion scripts, and year-wise insights with recommendations.

big-data big-data-analytics bigdata cloudera cloudera-hadoop cloudera-hadoop-framework data data-analysis data-visualization database hadoop hive power-bi powerbi powerbi-dashboard powerbi-dashboards powerbi-report powerbi-visuals powerbi-visuals-tools powerbidashboard

Last synced: 01 Aug 2025

https://github.com/adamouization/python-machine-learning-data-science-notes

📙 Jupyter notebooks containing useful Python code and notes for general Machine Learning and Data Science projects.

data data-science data-visualization guide jupyter jupyter-notebook machine-learning matplotlib notes numpy pandas pandas-dataframe python seaborn

Last synced: 11 Apr 2026

https://github.com/enescidem/twitter-topic-modeling

Topic modeling is an unsupervised method to identify topics in text. This project analyzes tweets from prominent Turkish accounts to uncover underlying themes in their shared content.

data data-science machine-learning nlp topic-modeling twitter x

Last synced: 10 Feb 2026

https://github.com/kena0ki/dddl

Generates test data from DDL.

data database db ddl generator sql table test

Last synced: 30 Apr 2026

https://github.com/paladini/aa-daily-reflections-database

Alcoholics Anonymous (AA) Daily Reflections in English, Spanish, French and Brazilian Portuguese

aa alcoholics-anonymous daily-reflections data database reflections

Last synced: 16 Apr 2026

https://github.com/prajakta1321/streetml-a-cityscape-traffic-volume-prognostication

StreetML leverages machine learning techniques to revolutionize urban traffic prediction through precise volume prognostication, aiming to enhance cityscape mobility through data-driven insights.

catboostregressor data datavisualisation exploratory-data-analysis lightgbm-regressor linearregression machine-learning machine-learning-algorithms predictive-analytics random-forest-regression xgboost-regression

Last synced: 08 Apr 2025

https://github.com/vatshayan/songs-datasets

Datasets for Songs and Music for Dancing, Emotional, Happy and scenic view

1000dataset classfication csv data datapackage datapackages dataset datasets excel free freedata freedatasets genre machine music sgenre song songs

Last synced: 18 Mar 2026

https://github.com/utrechtuniversity/momentum-dataflow

Repository for publishing the website about data management practices of the Momentum project

data datageneration datamanagement

Last synced: 27 Feb 2026