An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/justinyahin/wpdf

Create, filter, sort and display user data on your WordPress site.

data filtering wordpress

Last synced: 18 Apr 2026

https://github.com/ffatahillah7/eda-dsf-dibimbing-titanic-accident

Data Science Fair 3.0 Dibimbing Portfolio - Analytics and Learning from the Titanic dataset

data numpy pandas python science seaborn

Last synced: 17 Apr 2026

https://github.com/etmendz/mendz.data.sqlserver

Provides a generic Mendz.Data-aware context for ADO.Net-compatible access to SQL Server databases.

ado-net context data database datasettings mendz sql-server

Last synced: 10 May 2026

https://github.com/ahmad-ali-rafique/decision-tree-regressor-modeling

Comprehensive exploration of decision tree regressors, including data cleaning, model building, and performance evaluation on various datasets.

artificial-intelligence data data-analysis dataanalytics decision-trees decisiontreeregressor modeling models regression-models

Last synced: 17 Apr 2026

https://github.com/yuvrajsaraogi/sales-prediction-using-python

Sales prediction involves estimating future product sales based on factors like advertising spend, target audience, and platform. Businesses rely on data scientists to forecast sales and optimize advertising costs. Machine learning in Python can be used for this task.

data data-analysis data-science data-visualization machine-learning matplotlib natural-language-processing numpy pandas prediction python sales-prediction-using-python sql

Last synced: 19 Apr 2026

https://github.com/bhavanachitragar/layoff_analysis

This Streamlit app is designed for Layoff Analysis. It allows users to explore and analyze layoff data from different perspectives, including overall analytics, country-specific insights, and individual company details.

data dataanalysis streamlit streamlit-webapp

Last synced: 18 Apr 2026

https://github.com/als8446/tripleten-data-science-projects

Projects Overview: projects made in the Data Scientist course from TripleTen LatAm

data data-analysis hypothesis-tests machine matplotlib numpy pandas python scipy sklearn

Last synced: 10 Apr 2026

https://github.com/codbex/codbex-hestia-data-sample

Sample data for codbex-hestia

data module sample

Last synced: 05 Apr 2026

https://github.com/lafkpages/minecraft-crafting-info

Scrapes https://www.minecraftcrafting.info for crafting recipes.

api crafting data minecraft

Last synced: 17 Jun 2026

https://github.com/prakashjha1/loan-eligibility-prediction

This repository contains the codebase and resources for a machine learning-based project aimed at predicting loan eligibility for individuals. The project utilizes various algorithms and data preprocessing techniques to build predictive models that assess the likelihood of an applicant being eligible for a loan based on historical data.

data data-visualization exploratory-data-analysis loan-prediction-analysis machine-learning-algorithms naive-bayes-classification parameter-tuning python random-forest

Last synced: 19 Apr 2026

https://github.com/carlosrs14/parallel-data-preprocessig-system

A parallel data preprocessing system using threads and synchronization mechanisms (barrier, busy-waiting, condition variables) to clean and prepare data for AI training.

barrier-method c condition-variable data operative-systems parallel-computing posix preprocessing synchronization threads

Last synced: 24 Jul 2025

https://github.com/snimmagadda1/luigi-etl-example

🔍 Example of an ETL pipeline using Spotify's Luigi

data luigi luigi-pipeline python spotify

Last synced: 30 Mar 2025

https://github.com/prashhhant213/data_analysis_and_visualization-_for_streaming_platform

Data Analysis and Visualization for streaming platform to provide insights and recommendations to improve their userbase.

colab-notebook data datavisualization matplotlib numpy pandas python seaborn

Last synced: 20 Apr 2026

https://github.com/zhukovanan/stepik_

Completed tasks from various data science and computer science courses on Stepik

data statistical-learning statistics stepik-course

Last synced: 21 Apr 2026

https://github.com/davorg/towerbridge

When is Tower Bridge lifting?

data hacktoberfest london perl web-scraping

Last synced: 29 Jun 2026

https://github.com/stefen-taime/llm-rag-mtl-public-hospital

This project develops a Retrieve-Augment-Generate (RAG) model to answer questions using public data from Google reviews of hospitals in Montreal

data google-reviews hopital hospital hub ia llm montreal open-source quebec rag

Last synced: 21 Apr 2026

https://github.com/schijioke-uche/data-analysis-with-python-an-spss-model

With this Python notebook algorithm, you can use SPSS Model notebook to build machine learning pipelines that you can use to iterate rapidly during the model building process in data analysis. Whether you're trying to find the right algorithm or experimenting with different ways of preparing your data, you can create reproducible research that's easily understood by any member of your team with Hypothesis definition.

anova cp4a cp4d cp4i cp4s data ibm ibm-cloud jeffrey-chijioke-uche jeffrey-solomon-chijioke-uche openshift python python3 redhat t-test

Last synced: 22 Apr 2026

https://github.com/grimen/python-humanizer

A human/developer friendly value humanizer - for Python.

data debug debugging format formatting humanize humanizer log logging print printing value

Last synced: 05 Jun 2026

https://github.com/ppatrzyk/heatmap

Display CSV as a heatmap in terminal

csv data data-visualization terminal

Last synced: 24 Apr 2026

https://github.com/yuvrajsaraogi/-iris-flower-classification

The iris flower has three species: setosa, versicolor, and virginica, which differ in their measurements. Now assume that you have the measurements of the iris flowers according to their species, and the task is to train a machine learning model that can learn from the measurements of the iris species and classify them.

classification data data-analysis data-science data-visualization flower flower-classification iris iris-classification iris-flower iris-flower-classification knn knn-classification machine-learning machine-learning-algorithms ml natural-language-processing nlp python

Last synced: 24 Apr 2026

https://github.com/hruth-vik/sales-analysis-report

SalesScope is a powerful sales analytics dashboard that extracts insights, reveals trends, and drives strategy from raw data.

analytics data powerbi-report powerbi-visuals python

Last synced: 24 Apr 2026

https://github.com/rylan12/apscores

A quick way to visualize how the AP score distributions have changed from year to year.

advanced-placement analysis ap-exam data scores

Last synced: 19 Jun 2026

https://github.com/piyushkumar2025/india-general-elections-2024_data-analyst

Analyzed election data for 540+ constituencies and 100+ parties using SQL. Calculated state-wise seat distributions, classified 30+ parties into alliances, identified top 10 candidates by EVM votes, calculated victory margins, and analyzed voting patterns for 300+ candidates to uncover key insights.

analytics data database mysql sql statistics

Last synced: 22 May 2026

https://github.com/xjwllmsx/hacker-news-engagement

Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.

data data-analysis jupyter python

Last synced: 25 Apr 2026

https://github.com/carlos-levi/twitterbots_analise_redesneurais

Project for the AI course: exploratory analysis and application of machine learning techniques to detect automated accounts (bots) on the 𝕏 (Twitter) platform

data machine-learning twitter-bot

Last synced: 06 Jun 2026

https://github.com/badranalyst/data-cleaning-and-exploratory-data-analysis-project

This project uses SQL to clean and analyze a layoffs dataset. Data cleaning tasks include removing duplicates, standardizing values, and handling missing data. Exploratory analysis is performed to identify trends in layoffs across companies, industries, and time periods.

cleaning-data data database dataset mysql mysql-database sql

Last synced: 07 Apr 2025

https://github.com/tsbarr/citi-bikes-challenge

Citibikes NYC Data Analysis: Uncover insights from over a decade of ride data. Jupyter notebook for data aggregation/cleaning & Tableau dashboards for interactive visualization.

data data-visualization pandas-python python tableau

Last synced: 27 Apr 2026

https://github.com/bilgehangecici/datatypeconverter

Converting integer and floating numbers to appropriate bit-level representation.

data datatypeconverter java machine-level variables

Last synced: 30 Mar 2025

https://github.com/gurpreet0022/crop-fertilizers-recommendation-system-using-ml-

This repository is a part of AICTE - Shell Internship on 'Green Skills using AI technologies' Cycle 3.

data datapreprocessing datavisualization jupyter-notebook machine-learning python

Last synced: 27 Apr 2026

https://github.com/bhumitbedse/machine-learning-projects

AI, machine learning, deep learning, computer vision, and NLP projects with code

computer-vision data data-science deep-learning machine-learning natural-language-processing python

Last synced: 27 Apr 2026

https://github.com/tacticalnuclearraccoon/dataviz_with_js

Sample data visualization as part of a training on JavaScript frameworks for dataviz

d3 data datawrapper echarts javascript visualization

Last synced: 27 Apr 2026

https://github.com/oguzhanfatihkucuk/data-analytics-project-kafka-spark

The data in this project was collected in a database using Apache Kafka and processed with Apache Spark Streaming. The project aims to create a forecasting model and analyze sales forecasts per customer.

big-data data data-visualization hadoop kafka ml mlpipeline plt pyhton spark

Last synced: 28 Apr 2026

https://github.com/leonardomusini/mbe-growth-nexus-converter

Python tool to convert laboratory text files into NeXus files for Molecular Beam Epitaxy (MBE) data.

data data-engineering nexus python

Last synced: 28 Apr 2026

https://github.com/sirmaxx/log_manager

log manager services for microservices

data fastapi logging microservice mongodb

Last synced: 09 Apr 2026

https://github.com/roggersanguzu/weather-medical-expense-prediction-ml-models

This repo contains one model for determining rainfall patterns and another for predicting medical expenses

data data-analysis data-science datasets joblib machine-learning machine-learning-algorithms scikitlearn-machine-learning

Last synced: 30 Aug 2025

https://github.com/kitpymes/netcore-serialize-data

The goal is to protect secret data by encrypting and serializing .json files and converting them into .dat files.

csharp data decrypt encrypt json net netcore2 serialize

Last synced: 29 Apr 2026

https://github.com/gcoronelc/uni-epies-das-2022-2

Systems Analysis and Design course at UNI-EPIES.

dao data datos gcoronelc java jdbc mvc mvc-pattern sql sqlserver

Last synced: 29 Apr 2026

https://github.com/sn0wfree/factor_table

A universal connector for all kinds of data sources that manages all kinds of data as factor types in one package

connector data database factor

Last synced: 29 Apr 2026

https://github.com/barkintopcu/apple-stock-prediction-edu

The purpose of this project is to demonstrate time series analysis techniques using real-world stock data, without offering any form of financial advice or investment suggestion.

data deep-learning forecasting machine-learning python

Last synced: 29 Apr 2026

https://github.com/arunabhagit/data-driven-e-commerce-sales-analysis

This is an e-commerce data analysis that surfaces insights with statistical charts using Python. I used the pandas and Plotly libraries to show insights and statistical charts.

analysis charts data insights pandas-python plotly python statistics

Last synced: 29 Apr 2026

https://github.com/tazeenrashid/orders-analysis-using-python-sql-server-and-tableau

I sourced some Orders data through Kaggle; did EDA using Python and then fetched some insights out of cleaned data using SQL Server (SSMS). Then, I built a Tableau Dashboard for some visual insights. Have a look and share your feedback!

analytics data eda jupyter-notebook python sql tableau

Last synced: 29 Apr 2026

https://github.com/svetlanam/kbl-to-csv-s3

Keboola extractor that converts Excel to CSV based on input mapping criteria and uploads the result to an S3 bucket

data data-cleaning data-transformation etl keboola s3-bucket

Last synced: 20 Jun 2026

https://github.com/axnjr/csv-parser-utils

My own Pandas in Go, Python & Rust: utility methods for handling CSV files in core Go & Rust, with bindings for Python.

csv data dataanalysis datatools go golang golang-application pandas python rs rust

Last synced: 29 Apr 2026

https://github.com/gvatsal60/ds-on-kaggle

A collection of data science projects, experiments, and insights from Kaggle competitions and datasets

data data-science data-visualization numpy pandas python3

Last synced: 29 Apr 2026

https://github.com/patrickdavies100/pipeline38

An application to automate the creation and execution of SQL queries.

data pandas-dataframe pipeline postgresql psycopg2 sqlalchemy

Last synced: 30 Apr 2026

https://github.com/omarsaad21/it-salary-eda

A Python EDA project on IT department salary data: we explored the data and built visualizations to answer some questions about the dataset

data explotary-data-analysis juypter-notebook numpy pandas python visualization

Last synced: 30 Apr 2026

https://github.com/g-schumacher44/analyst_resource_hub

A collection of guidebooks, quickref, and resources for data analysis

analytics bigquery data lookerstudio machine-learning model python sql yaml-configuration

Last synced: 20 Jun 2026

https://github.com/raphcodec/rand-org-generator

Rand-Org-Generator attempts to mimic real company structures. The dummy data generated by this project is intended to be used in analytics projects or web projects.

data duckdb factory-boy faker org-chart polars python3

Last synced: 30 Apr 2026

https://github.com/miguelmedinacastro/trabalho-dados-r

Final project for the Exploratory Data Analysis course

data data-science data-science-projects data-visualization database r rstudio

Last synced: 01 May 2026

https://github.com/dantetrb/diabetes-readmission-dbt

Predictive analytics on diabetic patient readmissions using dbt, DuckDB and Python – with explainability and clustering.

clustering data dataengineering dbt diabetes duckdb hdbscan healthcare jupyter lime readmission-prediction sql

Last synced: 01 May 2026

https://github.com/shauryauppal/mydatatoolkit

A toolkit for data scientists to get work done faster, easier, and in a smarter way.

analytics awesome-list data data-science hacktoberfest

Last synced: 08 Jun 2026

https://github.com/skygenesisenterprise/aether-meet

Aether Meet is a lightweight, open-source client built for privacy, speed, and seamless integration within the Aether Office ecosystem

applications data docker javascript meeting nextjs notes typescript voip

Last synced: 01 May 2026

https://github.com/vbshuliar/ktor-http-request-response

This project is part of my Android Development Specialization provided by Meta on Coursera. In this project I practised HTTP requests and responses using Ktor.

android compose data http https json kotlin ktor request response

Last synced: 01 May 2026

https://github.com/sandygcabanes/etl-earthquake-data-from-usgs-google-cloud-composer-airflow

Airflow, Google Cloud Composer, GCS, BigQuery, Python. This automated pipeline pulls daily earthquake data from a trusted public source, stores it securely in the cloud, and organizes it into clean, searchable tables for analysis.

cloud composer dag data engineering etl etl-pipeline google json python

Last synced: 01 May 2026

https://github.com/muhammadadilnaeem/bcg-data-science-job-simulation-on-forage-august-2024

This repository contains all the tasks, code, and documentation completed during the BCG Data Science job simulation on The Forage platform. The simulation focused on analyzing customer churn, building predictive models, and presenting insights for a major utility company.

bcg customer-churn-prediction-with-machine-learning data data-science forage numpy pandas

Last synced: 01 May 2026

https://github.com/mtnzorlu/quiz-content-builder

Structured JSON quiz data builder for developers

builder data education json vue

Last synced: 23 Jun 2026

https://github.com/0xhericles/spamdetector

:email: A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 02 May 2026

https://github.com/franckalbinet/maris-crawlers

Automated data harvesting of MARIS data sources

automation data marine-radioactivity

Last synced: 25 Aug 2025

https://github.com/jesuscc1993/data-cleaner-extension

Clears browser data in a single click.

application-data chrome chrome-extension data

Last synced: 02 May 2026

https://github.com/asacxyz/flutter_aplicando_persistencia_de_dados

For following along with the course "Flutter: applying data persistence"

dart data data-storage flutter persistence persistent-storage sqflite sql sqlite

Last synced: 03 May 2026

https://github.com/bastianolea/sinim_municipal_genero

Commune-level gender data from the Sistema Nacional de Información Municipal (National Municipal Information System)

chile comunas data genero laboral tiempo

Last synced: 23 Jun 2026

https://github.com/fallaciousreasoning/nz-mountains

A list of mountains in NZ, scraped from https://climbnz.org.nz

alpine climbing climbnz data json json-api maps mountaineering scraping

Last synced: 04 May 2026

https://github.com/soham7998/data-analysis-projects

Data analysis projects that I completed, gaining hands-on experience from each one. The projects showcase different concepts, visualizations, and more.

data data-analysis data-science machine-learning nlp python soham visualization

Last synced: 04 May 2026

https://github.com/dimitryzub/russo-ukraine-war-prediction-losses

Highlights Russian losses with predictions based on historical data from the Ministry of Defence of Ukraine 🐱‍👤

data dataanalysis dataanalytics matplotlib pandas prophet python

Last synced: 04 May 2026

https://github.com/jdanielgoh/cobertura-campanias

In a democracy, is there room for every voice? A project to visualize the INE's radio and TV monitoring of the 2024 presidential candidacies

d3js data datavisualization vue

Last synced: 09 Jun 2026

https://github.com/parzibyte/jsonp-php

JSONP example with PHP

data example json jsonp php request

Last synced: 04 May 2026

https://github.com/gabya06/twitter_models

Repository used for twitter impression models

data data-science impressions machinelearning python ridge-regression sklearn twitter

Last synced: 04 May 2026

https://github.com/srevenant/data-science-alpine

A docker container for data science, using alpine linux and python3

alpine data numpy pandas python3 science scipy xgboost

Last synced: 05 May 2026

https://github.com/guardias-eu/reasin

Interface to the European Alien Species Information Network API

api biodiversity biodiversity-data biodiversity-informatics data invasive-species oscibio r r-package

Last synced: 04 Oct 2025

https://github.com/edjoukou/pizza-sales-report

A data analysis project using SQL with MySQL database

analysis data mysql powerbi visualization

Last synced: 05 May 2026

https://github.com/contawo/travel-journal

This is a travel journal application for storing all the places that you have visited. I was learning React by doing while creating this project. I learnt a lot from it and improved my React skills.

data learning-by-doing props reactjs

Last synced: 05 May 2026

https://github.com/rdmurphy/deno-quaff

A port of the quaff Node.js library to Deno.

archieml csv data deno json toml yaml

Last synced: 05 May 2026

https://github.com/muthupillai1204/diwali_sales_analysis

The Diwali sales analysis reviews past data to identify trends, peak buying times, popular products, and customer demographics. It assesses sales volume, revenue growth, and promotional effectiveness, helping businesses optimize marketing and inventory for future seasons.

data datacleaning eda excel jupyter-notebook matlplotlib numpy pandas python seaborn visualization

Last synced: 05 May 2026

https://github.com/mito-ds/mitosheet_helper_config

The mitosheet_helper_config package is used by enterprises to configure the mitosheet package.

data data-analytics data-science data-visualization jupyter pandas python

Last synced: 05 May 2026

https://github.com/julienmalka/shiftgenerator

ShiftGenerator WeSki 2018

data data-science latex python

Last synced: 06 May 2026

https://github.com/parthds02/analyzing-student-success-with-data

Discover key factors influencing student performance through data analysis and visualization. Explore gender, parental education, sports, and ethnicity impacts.

data datascience jupyter-notebook kaggle python pythonlibraries

Last synced: 06 May 2026

https://github.com/ekoepplin/dbt-bigquery-core

How to get data to BigQuery (or DuckDB) and set up dbt tests for SODA cloud monitoring

bigquery data data-quality dbt dlt duckdb gcp soda

Last synced: 06 May 2026

https://github.com/juanpablodiaz/beertv

A Next.js full-stack app that displays funny beer TV ads

api-routes data next tailwindcss

Last synced: 07 May 2026

https://github.com/hudson-newey/data-miner

A simple data miner that collects information from an API and stores it in a file

api api-client big-data bigdata data logger logging

Last synced: 10 Jun 2026

https://github.com/tjas/postgrad-ai-ddv-plotly

Jupyter Notebook to analyze the salaries of Federal District government public servants, using Python, Pandas and Plotly Express, to solve the proposed exercise in the "Data Discovery and Visualization" discipline.

analysis analytics data data-analytics data-discovery data-science data-visualization graph graphs jupyter-notebook jupyter-notebooks pandas plotly plotly-express python

Last synced: 07 May 2026

https://github.com/zsvoboda/olympics

Self service analytics of 120 years of Olympics data

analytics dashboards data datavisualization dataviz olympics open-data open-datasets opendata reports

Last synced: 08 May 2026

https://github.com/writetome51/page-load-access

A TypeScript/Javascript class that loads a batch (array) of data from a larger set too big to be loaded all at once.

batch class data javascript load loader typescript

Last synced: 16 May 2026

https://github.com/taquece/goals-per-match

basic script to calculate average football goals per match from .CSV

beginner csv data football nodejs python sports-analytics

Last synced: 09 May 2026