An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/mominurr/fire-gas-leak-detection-system

A real-time fire prevention system integrating IoT sensors and computer vision to trigger evacuations.

ai computer-vision data datascience machine-learning ml python yolo

Last synced: 27 Jan 2026

https://github.com/thingston/extractor

Collection of PHP classes to extract data from HTML pages.

data html php

Last synced: 14 Jan 2026

https://github.com/zulfachafidz/telco_churn_insight_customer_loss_prediction_with_random_forest_and_decision_tree-algorithms

The main problem in the business world is customer churn, or losing customers, especially in the telecommunications industry, which experiences very tight competition. To overcome this problem, an analysis was carried out to help the company understand how many customers have the potential to switch providers.

data data-science data-visualization dataanalysis dataanalyst dataanalytics datadrivenwithdataprovider decision-tree decision-tree-classifier decision-trees random-forest random-forest-classifier

Last synced: 01 May 2026

https://github.com/nel-zi/climainsights

Developed an automated ETL pipeline using Apache Airflow and Python to collect, process, and store weather data from multiple cities via Weatherstack API. Implemented data cleaning, orchestration, and error handling to ensure accuracy and scalability.

airflow apache-spark data data-engineering engineering etl-pipeline

Last synced: 01 May 2026

https://github.com/vbshuliar/ktor-http-request-response

This project is part of my Android Development Specialization provided by Meta on Coursera. In this project I practised HTTP requests and responses using Ktor.

android compose data http https json kotlin ktor request response

Last synced: 01 May 2026

https://github.com/brayflex/spy-sector-rotation-google-sheet

Creates a dynamic spreadsheet to visualize SPY and its 11 largest sector ETFs. See market trends and identify potential sector rotation opportunities.

data etf google-sheets index price rotation script sector spreadsheet spy stock-market

Last synced: 29 Jun 2026

https://github.com/juanandres-montero/dataanalysis

Dedicated to data analysis.

costa-rica data

Last synced: 10 Aug 2025

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/docuvesta/shiseido_skincare_usa_fr_infographics

Explore performance indicators drawn from reviews of a highly regarded serum by the Japanese luxury beauty brand Shiseido. The comparison covers the USA and FR websites 💯

analysis automatisation data datanalysis graphique infographie pandas plotly python skincare soins

Last synced: 11 Apr 2026

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 24 Mar 2025

https://github.com/mikeqfu/network-rail-track-fixity-layer

This project develops a data mining tool for analysing and predicting track movements using asset data, environmental factors and track design knowledge to model key parameters and generate fixity values for the GB rail network.

data data-integration data-mining data-science information-management knowledge-discovery point-cloud rail rail-alignment rail-track track-fixity

Last synced: 02 Sep 2025

https://github.com/heyimsteve/solnftdatadash

This is a React-based web application that provides detailed information about NFT collections on the Solana blockchain. It uses the HelloMoon API to fetch and display data about NFT collections, including statistics, loan summaries, ownership information, and floor prices.

dashboard data hellomoon nft react solana solana-nft

Last synced: 30 Jan 2026

https://github.com/bdr-pro/graphyml

A powerful, interactive Streamlit application to explore, edit, visualize, and query a graph-based database of YAML nodes — ideal for movie metadata, research articles, or structured knowledge graphs.

data database yaml yml

Last synced: 23 Jul 2025

https://github.com/shudhanshusaurabh001/super_market-data-analysis-using-python

This project focuses on analyzing supermarket sales data using Python. The goal is to extract meaningful insights from the dataset, such as sales trends, customer purchasing behavior, and product performance.

analysis csv data insights matplotlib numpy pandas project python seaborn

Last synced: 06 Apr 2026

https://github.com/jameshenderson12/chatbot-utils

Generic data and elements that can be reused or repurposed for chatbot development.

boilerplate chatbot data development elements intents template utterances

Last synced: 04 Mar 2026

https://github.com/turner-kendall/turner-kendall

Turner Kendall - dev, ops, sec.

config data github-config go rust security

Last synced: 31 Oct 2025

https://github.com/awpala/udemy-my-courses-data-parser

Download Udemy lists and course metadata for an authenticated student user.

data scripts udemy

Last synced: 07 May 2026

https://github.com/seldszar/piccha

Another tree data structure

data tree

Last synced: 16 Jul 2025

https://gitlab.com/pommalabs/htmlark

HtmlArk packs a webpage into a single HTML file: https://htmlark-docs.pommalabs.xyz/

audios css data embed fonts html images javascript uri videos

Last synced: 03 Sep 2025

https://github.com/jigyasag18/aircraft-data-management

This repository offers a comprehensive simulation of global military air deployments involving 10 countries, aircraft models, mission types, and strategic zones. It analyzes air power distribution, mission intent (offensive, defensive, support), and geopolitical positioning. The project provides structured insights into regional & zone level threat

aircraft-data aircraft-performance data data-analysis data-visualization database database-management dataset datavisualisation mysql powerbi powerbi-report powerbi-visuals sql

Last synced: 04 Feb 2026

https://github.com/ailixter/gears-dictionary

The project, which Gears Dictionary

arrays data dictionaries dictionary php struct utilities

Last synced: 19 Jul 2025

https://github.com/abhijeetdasbakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce

airflow airflow-dags ci-cd-pipeline clustering dags data data-pipelines docker docker-compose docker-container dockerfile git great-expectations kafka mongodb pca-analysis postgresql pyspark t-sne umap-learn

Last synced: 04 Apr 2026

https://github.com/0xHericles/SpamDetector

:email: A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 24 Mar 2025

https://github.com/heitang/fcu-classid

Feng Chia University: college IDs, department IDs, and class IDs

data fcu project

Last synced: 30 Mar 2025

https://github.com/noedemange/orderedheatmapanalysis

OrderedHeatMapAnalysis (OHMA) is a direct data analysis framework for simultaneously visualizing and analyzing the structure of complex datasets. It performs an optimized seriation of the rows and columns of the input data table, mapping the whole dataset into an ordered heatmap.

analysis bi-seriation data dataanalysis heatmap r rstats seriation shiny shiny-apps

Last synced: 27 Feb 2025

https://github.com/fatihilhan42/hollywood-theatrical-market-synopsis-1995-to-2021

In this project, data on Hollywood film production companies from 1995 to 2021 were examined. Significant tables and graphs were created using data visualization algorithms, with the tickets sold divided into categories.

data data-analysis data-science data-visualization

Last synced: 23 Mar 2025

https://github.com/petzi53/repair

R Datasets of the Open Repair Alliance (ORA).

data r repair repair-cafe

Last synced: 19 May 2026

https://github.com/shahsuvarli/election-voters-data-analysis-pandas

Educational project analyzing Azerbaijan voter demographics with pandas, focusing on data cleaning, grouping, and visualization.

cleaning data grouping matplotlib numpy pandas python visualization

Last synced: 12 Apr 2026

https://github.com/jsanz/kart-test

Testing Kart repository

data geospatial kart

Last synced: 26 Jan 2026

https://github.com/arthurdanjou/studies

💼 This is the repository containing all my projects done during my studies in Python and R.

ai data data-science data-visualization jupyter jupyter-notebook ml python r

Last synced: 08 Apr 2025

https://github.com/jigyasag18/movie-recommendation-system-project

This repository features a personalized movie recommendation system that offers tailored suggestions to users. It leverages a dataset of 5,000 English-language films and utilizes data processing, feature engineering, and a cosine similarity algorithm to analyze user preferences. The system includes an intuitive user interface for easy navigation.

data datacleaning datapreprocessing machine-learning machine-learning-algorithms python streamlit streamlit-webapp

Last synced: 28 May 2026
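
The cosine similarity step mentioned in the entry above can be sketched as follows. This is a minimal illustration only, not the repository's code: the function names, the `CATALOG` dictionary, and the three-element genre vectors are all hypothetical, and a real system would derive its features from the film dataset.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def recommend(liked, catalog, top_n=2):
    """Rank every other title by similarity to the liked title's features."""
    target = catalog[liked]
    scores = [
        (title, cosine_similarity(target, features))
        for title, features in catalog.items()
        if title != liked
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in scores[:top_n]]


# Hypothetical genre features: [action, comedy, drama]
CATALOG = {
    "Film A": [1, 0, 1],
    "Film B": [1, 0, 0],
    "Film C": [0, 1, 0],
    "Film D": [1, 1, 1],
}

print(recommend("Film A", CATALOG))
```

Cosine similarity compares the direction of two feature vectors rather than their magnitude, which is why it is a common choice when feature counts differ widely between items.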

https://github.com/keanteng/kaggledata

📊Data Source For Program Testing

data dataset excel

Last synced: 24 Mar 2025

https://github.com/giuleo129/dataanalysis

This folder contains two projects focused on data analysis and statistical learning using R, covering exploratory data analysis, modeling, and predictive techniques.

data data-analysis data-science statistical-learning

Last synced: 25 Jan 2026

https://github.com/natanast/euroleaguebasketball

An R package providing data on Euroleague Basketball

data data-science package r

Last synced: 01 Apr 2025

https://github.com/v-mayya/quantitative-analysis-data-dashboard

Quantitative survey data analysis using R

data data-analysis data-visualization flourish r

Last synced: 01 Apr 2025

https://github.com/armand-sauzay/datasets

Datasets for machine learning

ai data datasets machine-learning ml

Last synced: 18 Jan 2026

https://github.com/h-sutiwas/r2de-2025

This repository is related to the Road To Data Engineer Bootcamp by DataTH. It contains all related coursework, some mini projects and other resources within the field of Data Engineering.

data data-engineering data-visualization docker gcp pipeline spark

Last synced: 30 Apr 2026

https://github.com/makcymal/silvera

My research on ML and statistics, optimization methods, CS algorithms, and numerical methods

algorithms data data-structures machine-learning numerical-methods statistics

Last synced: 01 Apr 2025

https://github.com/inist-cnrs/ws-data

Models and data for web services

data dvc models

Last synced: 03 Sep 2025

https://github.com/shubhamsoni98/project_using_knn

This project applies the K-Nearest Neighbors (KNN) algorithm to predict iPhone purchases based on customer data. Using features like age, salary, and previous purchase behavior, the KNN model classifies customers into buyers and non-buyers.

anaconda analytics data data-science eda knn knn-classification machine-learning-algorithms predict project python scikit-learn tableau

Last synced: 03 Jan 2026
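
As a rough illustration of the K-Nearest Neighbors classification described in the entry above, here is a minimal standard-library sketch. It is not the repository's code (the tags suggest it uses scikit-learn); `knn_predict`, the `TRAIN` rows, and the pre-scaled (age, salary) values are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training rows by Euclidean distance from the query point.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training rows: ((scaled age, scaled salary), label)
TRAIN = [
    ((0.20, 0.10), "non-buyer"),
    ((0.30, 0.20), "non-buyer"),
    ((0.25, 0.30), "non-buyer"),
    ((0.80, 0.90), "buyer"),
    ((0.70, 0.80), "buyer"),
    ((0.90, 0.70), "buyer"),
]

print(knn_predict(TRAIN, (0.75, 0.85)))
```

Features are assumed to be scaled to a common range beforehand, since raw salary values would otherwise dominate the distance.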

https://github.com/alextanhongpin/node-github-api

:page_with_curl: sample github api queries with nodejs for scraping purposes

data github-api nodejs

Last synced: 06 May 2026

https://github.com/ellisgl/geeklab-arraytranslation

Convert an array to another data format or convert a data format to an array.

array data format php php7-2 php72

Last synced: 25 Mar 2025

https://github.com/purarue/scramble-history

Parses Rubik's Cube scramble history and solve times from cstimer.net, cubers.io, and twistytimer, then merges them to give uniform averages, data, and graphs

cstimer cubing data rubiks-cube speedsolving

Last synced: 11 Jun 2025

https://github.com/etmendz/mendz.data

Provides tools and guidance for creating data access contexts and repositories.

context data datasettings entity-framework mendz paginginfo repository resultinfo

Last synced: 11 Jun 2025

https://github.com/lorenzobloise/client_satisfaction_classification

Jupyter notebook in which satisfaction from clients reviewing European hotels is analyzed using Python libraries such as pandas, numpy and scikit-learn. Various classification models are trained and tested to predict client satisfaction.

classification data data-mining jupyter jupyter-notebook machine-learning pandas python

Last synced: 21 Feb 2026

https://github.com/team-hydrogen/nasa-adc-data

All files relating to the computation of the data provided

data jupyter-notebook nasa-app-development-challenge

Last synced: 25 Mar 2025

https://github.com/nadahamdy217/movies-data-etl-using-python-gcp

Developed a comprehensive ETL pipeline for movie data using Python, Docker, and a GCP Pub/Sub emulator. Successfully processed and published the data in a local Docker environment, showcasing advanced data engineering skills.

analytics data data-engineering data-ingestion data-preparation data-preprocessing data-processing data-project docker etl etl-pipeline gcp matplotlib matplotlib-pyplot numpy pandas pubsub python scipy seaborn

Last synced: 06 Jan 2026

https://github.com/igor-starostenko/sabre

Slice your files like a champ with **sabre**

data golang package

Last synced: 28 Mar 2025

https://github.com/entropyorg/p5-data-testimage

:notebook::camera: interface for retrieving test images

cpan data image-analysis

Last synced: 29 May 2026

https://github.com/jneidel/animal-names

Dataset of 100 common animal names

animals data dataset json names opendata

Last synced: 25 Mar 2025

https://github.com/4strium/data-analysis-france

🔍 Script for analyzing and retrieving precise data on French cities.

cities csv data france python research

Last synced: 01 Apr 2025

https://github.com/denisecase/buzzline-05-case

Kafka pipelines with data storage

consumer data kafka producer python

Last synced: 11 Apr 2025

https://github.com/tdjsnelling/hermes

Hermes is a real-time data framework for React + MongoDB

data docker framework mongodb nodejs react react-hooks reactjs real-time typescript websocket

Last synced: 12 Apr 2026

https://github.com/sajjad425/missingvalue

This repository provides a guide on handling missing values in Python, covering identification methods, imputation techniques (mean, median, mode, fill, interpolation), advanced methods (KNN, multiple imputation), and best practices. It includes practical examples for both numerical and categorical data.

data data-analysis-python data-science missing-value-handling missing-value-imputation

Last synced: 04 Apr 2025
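
The simple imputation techniques listed in the entry above (mean, median, mode, interpolation) can be sketched with the standard library alone. This is an illustrative sketch, not the guide's code: `impute` and `interpolate` are hypothetical helper names, and `None` is assumed to mark a missing value.

```python
from statistics import mean, median, mode


def impute(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]


def interpolate(values):
    """Linearly fill interior None gaps; leading and trailing gaps stay None."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result


print(impute([1, None, 3]))                 # numerical data, mean
print(impute(["a", None, "b", "a"], "mode"))  # categorical data, mode
print(interpolate([1.0, None, None, 4.0]))
```

Mode imputation is the only one of these strategies that applies to categorical data; mean, median, and interpolation require numeric values.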

https://github.com/yash-chauhan-dev/sf_analytics

Business teams often rely on data analysts to extract insights using SQL. This tool eliminates that dependency by bridging the gap between humans and data using AI.

aiml analytics data dbt langchain llm python snowflake streamlit

Last synced: 07 May 2026

https://github.com/tttardigrado/fq

Graphs for the MEDEA project

bokehplots data data-science dataanalysis pandas physics python3

Last synced: 12 Apr 2026

https://github.com/frer0t/userverse

Creating an API for data analysis

data data-analytics spring-boot users

Last synced: 12 Apr 2026

https://github.com/matheusafonseca/deploy-ml-models-with-streamlit-udemy

This repository is dedicated to storing the code developed during the "Machine Learning Model Deployment with Streamlit" course on Udemy. The course covers basic to advanced techniques for deploying machine learning models using Streamlit.

data data-science data-visualization interface joblib layout machine-learning optimization-algorithms python python3 sklearn sklearn-datasets sklearn-library sklearn-pipeline streamlit

Last synced: 19 Apr 2026

https://github.com/rohitblaze10/netflix_analysis_using_tableau

The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.

data data-analysis data-science data-visualization netflix tableau

Last synced: 04 Feb 2026

https://github.com/axetroy/stone

build data stuck like a stone, Sturdy!

axetroy data stone stuck

Last synced: 04 Jul 2025

https://github.com/astridlyre/offhand

A Random Data Generator Library for JavaScript.

data generator javascript library random typescript

Last synced: 20 May 2026

https://github.com/davorg/cookingvinyl

Web site with info about Cooking Vinyl records

cooking-vinyl data hacktoberfest music perl

Last synced: 02 Apr 2025

https://github.com/syed-bakhtawar-fahim/dsa_algorithm_code

Assalam o Alikum guys, this is the repo for Data Structures and Algorithms in the C programming language. I hope it helps you learn Data Structures and Algorithms in C. I'm also learning Data Structures and Algorithms in Python in a better and easier way, which you can explore as well.

algorithm algorithms-and-data-structures c data data-structures-and-algorithms dsa-algorithm dsa-learning-series dsa-practice

Last synced: 12 Apr 2025

https://github.com/pietrapaz/bootcamp_dio_ciencia_de_dados

Potência Tech Bootcamp powered by iFood | Data Science - Dio ⚠️

cienciadedados dados data datascience python

Last synced: 09 Apr 2025

https://github.com/dahsie/machine_learning_from_scratch

This project aims to implement some basic machine learning techniques (e.g. MinMaxScaler, StandardScaler, TF-IDF, PCA, Logistic Regression, LDA, KNN, Naive Bayes Classifier) using only Python, numpy and pandas. This will enable me to hone my data science skills.

classification clustering data data-processing datascience machienlearning nlp nltk numpy pandas python regression

Last synced: 04 May 2026

https://github.com/rikiitokazu/dataprojects

Data analysis practice using SQL and Python

data python sql web-scraping

Last synced: 12 Apr 2026

https://github.com/romtaug/scoring-stoxx

Scoring and portfolio construction for the STOXX, CAC, and DAX via Wikipedia scraping, with results sent by email - yfinance

api data emailing portfolio scoring stoxx wikipedia yfinance

Last synced: 05 Sep 2025

https://github.com/ayushverma135/dbms-labfile

Created for practical learning, this DBMS lab file offers hands-on exercises covering SQL queries, normalization, indexing, and more. With clear instructions and sample datasets, students gain invaluable experience in database design and management.

data dbms dbms-lab

Last synced: 04 Feb 2026

https://github.com/vara-co/tech-certifications

These are the certifications that back up some of my skills.

certificates certifications data data-analytics skills

Last synced: 07 Jan 2026

https://github.com/miraclx/split-merge

Efficient, flexible data stream chunker and merger

chunk data efficient merge middleware nodejs pipeline split stream

Last synced: 07 May 2026

https://github.com/bcongdon/nid-data

National Inventory of Dams Data

data datasette government-data

Last synced: 21 Apr 2026

https://github.com/rishikesh-jadhav/track_deep_learning

Data collected from the Udacity simulator comprising RGB images with steering and throttle annotations for each frame, specifically gathered for behavioral cloning purposes.

data datacollection udacity-self-driving-car

Last synced: 03 Jan 2026

https://github.com/sysread/skewer

A priority queue for Go implemented using a skew heap

binary data go heap min minqueue priority queue skew structure

Last synced: 26 Aug 2025
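
For readers unfamiliar with the structure named in the entry above, a skew heap is a self-adjusting binary heap whose push and pop are both built on a single merge operation. The sketch below is a generic Python illustration of a min-priority queue on a skew heap; it is not the Go package's API, and `SkewHeap`, `_Node`, and `_merge` are hypothetical names.

```python
class _Node:
    __slots__ = ("value", "left", "right")

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def _merge(a, b):
    """Merge two skew heaps and return the root of the result."""
    if a is None:
        return b
    if b is None:
        return a
    if b.value < a.value:
        a, b = b, a  # keep the smaller root on top
    # Merge b into a's right subtree, then swap a's children.
    # Recursive for clarity; very deep heaps would need an iterative merge.
    a.left, a.right = _merge(a.right, b), a.left
    return a


class SkewHeap:
    """Min-priority queue backed by a skew heap."""

    def __init__(self):
        self._root = None
        self._size = 0

    def push(self, value):
        self._root = _merge(self._root, _Node(value))
        self._size += 1

    def pop(self):
        if self._root is None:
            raise IndexError("pop from empty heap")
        value = self._root.value
        self._root = _merge(self._root.left, self._root.right)
        self._size -= 1
        return value

    def __len__(self):
        return self._size


heap = SkewHeap()
for item in [5, 1, 4, 2, 3]:
    heap.push(item)
print([heap.pop() for _ in range(len(heap))])
```

The unconditional child swap after each merge is what keeps the amortized cost of push and pop logarithmic without storing any balance information in the nodes.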

https://github.com/etmendz/mendz.data.oracle

Provides a generic Mendz.Data-aware context for ADO.Net-compatible access to Oracle databases.

ado-net context data database datasettings mendz oracle

Last synced: 13 Apr 2026

https://github.com/shadmanshaikh/data-analysis-and-ml-work

All of my work in Data Analysis and Machine learning

analytics artificial-intelligence data machine-learning

Last synced: 05 Jul 2025

https://github.com/white-gecko/lineage-dump

RDF dump of the device information from the lineage wiki

data dataset lineageos rdf

Last synced: 28 May 2026