An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/istinnew/etl-pipeline-ganz-project

End-to-end ETL pipeline project for collecting, transforming, and loading data into a cloud-based database using Python, MySQL, and Google Cloud Analytics

cloud cloud-engineering cloud-services data data-science dataanalytics database database-schema googlecloud mysql mysql-database python python-lambda

Last synced: 20 Apr 2026

https://github.com/roovedot/unet-cnn-for-road-segmentation

(In Progress) Unet architecture with CNNs (Convolutional Neural Networks) aimed at Road Segmentation

cnn cnn-for-visual-recognition cnn-pytorch computer-vision data data-engineering data-science unet unet-image-segmentation unet-pytorch

Last synced: 01 Jul 2025

https://github.com/omers/sre-devops-tools

Tools and useful sources for SRE and DevOps

awsome awsome-list data devops monitoring sre tools

Last synced: 20 Apr 2026

https://github.com/rezapace/newbash

This project involves managing various application shortcuts and configurations primarily for a Linux environment. It includes scripts for creating .desktop entries for applications, managing system configurations, and handling application processes.

automation backup bash data dekstop linux newbash ohmyzsh script testing zsh

Last synced: 11 Apr 2026

https://github.com/beriberikix/senml-zephyr

A codec for encoding and decoding Sensor Measurement Lists (SenML) for Zephyr

codec data iot senml sensor zephyr-rtos

Last synced: 24 Mar 2025
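
As a rough illustration of what a SenML codec does, the sketch below encodes and decodes a SenML JSON pack in Python. It is not the senml-zephyr API (that library targets Zephyr). The helper names `encode_pack` and `decode_pack` and the sample device name are hypothetical. The field names `bn` (base name), `n` (name), `u` (unit), and `v` (value) come from the SenML specification.

```python
import json


def encode_pack(base_name, records):
    """Serialize records as a SenML JSON pack.

    The base name ("bn") is written once, on the first record.
    """
    pack = []
    for index, record in enumerate(records):
        entry = dict(record)
        if index == 0:
            entry = {"bn": base_name, **entry}
        pack.append(entry)
    return json.dumps(pack)


def decode_pack(text):
    """Parse a SenML JSON pack and resolve each name against the base name."""
    base_name = ""
    resolved = []
    for entry in json.loads(text):
        # A "bn" field applies to this record and to the records after it.
        base_name = entry.get("bn", base_name)
        resolved.append({
            "n": base_name + entry.get("n", ""),
            "u": entry.get("u"),
            "v": entry.get("v"),
        })
    return resolved


# Hypothetical readings from one device
text = encode_pack(
    "urn:dev:ow:10e2073a01080063:",
    [{"n": "temp", "u": "Cel", "v": 23.1}, {"n": "hum", "u": "%RH", "v": 67.0}],
)
decoded = decode_pack(text)
```

Writing the base name once keeps the pack small on constrained devices; the decoder restores the full name for each record.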

https://github.com/boratechlife/tensorflow-questions-datasets

A collection of TensorFlow question datasets to help you practice machine learning and train models

data datapreprocessing datasets machinelearning modeltrain questions tensorflow

Last synced: 23 Mar 2025

https://github.com/adamouization/python-machine-learning-data-science-notes

:orange_book: Jupyter notebooks containing useful Python code and notes for general Machine Learning and Data Science projects.

data data-science data-visualization guide jupyter jupyter-notebook machine-learning matplotlib notes numpy pandas pandas-dataframe python seaborn

Last synced: 11 Apr 2026

https://github.com/yashkp1234/movie-recommendation-engine

My project on analyzing the movie data set, and creating a recommendation engine using that analysis.

analysis data notebook python recommendation-engine

Last synced: 04 May 2025

https://github.com/nel-zi/nuga_bank

Developed an automated data exploration and cleaning pipeline for Nuga Bank to streamline data preparation, ensure consistent data quality, and normalize datasets into structured databases for efficient analysis and reporting.

data data-automation data-visualization datacleaning datatransformation etl-automation etl-pipeline

Last synced: 16 May 2025

https://github.com/rick-does/json-razor

Reduces JSON, YAML, and NDJSON volume by collapsing repeated structures while preserving the schema, making the schema easier for you to read.

cli data devtools json logs ndjson schema yaml

Last synced: 20 Apr 2026

https://github.com/hormcodes/data

Terraform configuration for public data storage hosted on data.horm.codes

aws cloudfront content-management data github-actions s3-bucket terraform

Last synced: 20 Apr 2026

https://github.com/machinecyc/lotteryinsight

Uses a crawler to collect Taiwan Lotto data and saves it to a local MySQL server.

crawler data docker lottery mysql-database python3 taiwan

Last synced: 09 May 2026

https://github.com/jigyasag18/movie-recommendation-system-project

This repository features a personalized movie recommendation system that offers tailored suggestions to users. It leverages a dataset of 5,000 English-language films and utilizes data processing, feature engineering, and a cosine similarity algorithm to analyze user preferences. The system includes an intuitive user interface for easy navigation.

data datacleaning datapreprocessing machine-learning machine-learning-algorithms python streamlit streamlit-webapp

Last synced: 28 May 2026
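
A minimal sketch of how a cosine similarity ranking of this kind can work, assuming each film is represented by a numeric feature vector. The film titles, the vectors, and the `recommend` helper are hypothetical and are not taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def recommend(title, features, top_n=2):
    """Return the top_n other titles most similar to `title`."""
    target = features[title]
    scores = [
        (other, cosine_similarity(target, vector))
        for other, vector in features.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:top_n]]


# Hypothetical feature vectors (for example: action, comedy, romance weights)
FEATURES = {
    "Film A": [1, 0, 0],
    "Film B": [1, 1, 0],
    "Film C": [0, 0, 1],
    "Film D": [1, 0, 2],
}

print(recommend("Film A", FEATURES))  # prints ['Film B', 'Film D']
```

Cosine similarity compares the direction of the vectors rather than their length, so a film with many tagged features is not favored merely for having larger values.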

https://github.com/stdlib-js/wasm-base-dtype2wasm

Return the WebAssembly data type associated with a provided array data type value.

array base data dtype javascript node node-js nodejs stdlib type types util utilities utility utils wasm webassembly

Last synced: 09 May 2026

https://github.com/thanh-wutan/chess-opening-comparator

Interactive web app using R to visualize and compare chess opening performance and popularity.

chess-openings data databases datavisualisation r

Last synced: 09 May 2026

https://github.com/infinitode/pyautoplot

PyAutoPlot is an open-source Python library designed to make dataset analysis much easier by generating helpful detailed plots using matplotlib. It automatically generates appropriate plots based on the dataset you feed it.

analysis automatic csv data dataset dataset-analysis generation matplotlib pandas plots plotting-in-python plotting-library python

Last synced: 16 Mar 2025

https://github.com/schluppeck/2024-abdsa-notes

some notes related to DS's presentation

abdsa data python rstats science

Last synced: 21 Apr 2026

https://github.com/mozzo1000/web-analytics

Website analysis tools and data

analysis analytics data website

Last synced: 21 Apr 2026

https://github.com/fastpix/android-data-kaltura

This SDK enables seamless integration with Kaltura Player, offering advanced video analytics via the FastPix Dashboard

analytics android-sdk data fastpix kaltura kaltura-player metrics sdk video video-metrics

Last synced: 21 Apr 2026

https://github.com/jsanz/kart-test

Testing Kart repository

data geospatial kart

Last synced: 26 Jan 2026

https://github.com/kalaspuff/ready

🎟 [not yet built] Take control of the event loop with simplified task management, queueing and data loading.

asyncio data dataloading event futures python python3 resolver tasks

Last synced: 10 May 2026

https://github.com/stefen-taime/llm-rag-mtl-public-hospital

This project develops a Retrieve-Augment-Generate (RAG) model to answer questions using public data from Google reviews of hospitals in Montréal

data google-reviews hopital hospital hub ia llm montreal open-source quebec rag

Last synced: 21 Apr 2026

https://github.com/jdenn0514/surveycore

Core Survey Analysis Infrastructure

data r resear survey-analysis

Last synced: 21 Apr 2026

https://github.com/khushi-sabarad/data_analysis

LinkedIn Learning capstone project

data data-engineering matplotlib pandas python

Last synced: 10 May 2026

https://github.com/sebastianbrzustowicz/flight-quality-overview-microservice

Go + Docker. Microservice that uses parallel computation to convert raw vehicle flight data into an overview report with visualisation.

container control csv data docker drone flight go goroutines http microservice parallel-computing pdf quadcopter raport rms sse vehicle

Last synced: 10 May 2026

https://github.com/gcoronelc/ucv_gdi-2_202202-a1

Advanced Database Workshop with Gustavo Coronel

data database datos function gcoronelc procedure sql sqlserver t-sql transact transact-sql

Last synced: 22 Apr 2026

https://github.com/beeracs/llama

Run Llama models in your web browser using JavaScript and WebAssembly. Explore light and dark modes easily. 🌐🐱👤

ai data fine-tuning framework gpt langchain large-language-models llama3 llamaindex llm lora machine-learning nlp peft qlora qwen rlhf vllm

Last synced: 10 May 2026

https://github.com/sauravsrivastav/githubreposearcher

GitHub Repo Searcher 🔍 is a Streamlit web application designed to help you search for GitHub repositories based on a query and view the results in a tabular format. You can also download the results in CSV or Excel format for further analysis. 📊📈

data data-export excel github-api python repository-searcher streamlit webapp

Last synced: 20 Jan 2026

https://github.com/syed-nihaal/car-price-prediction-and-performance-analysis

A data science notebook project focused on analyzing car features and building a model for car price prediction.

data data-analysis data-visualization jupyter-notebook python

Last synced: 23 Apr 2026

https://github.com/cpietsch/breitband

developer repo of breitband-berlin

d3js data threejs visualization

Last synced: 02 May 2026

https://github.com/coryson/osm-mla-finder

Python script to locate institutions employing Medical Laboratory Assistants in Germany, developed for BTZ – Berufliche Bildung Köln GmbH. It uses OpenStreetMap, SerpAPI, and web scraping to find and verify relevant labs, clinics, and diagnostic centers.

beautifulsoup data openstreetmap osm python scraping serpapi webscraping

Last synced: 24 Apr 2026

https://github.com/yuvrajsaraogi/-iris-flower-classification

The iris flower has three species (setosa, versicolor, and virginica), which differ in their measurements. Given the measurements of iris flowers labeled by species, the task is to train a machine learning model that learns from those measurements and classifies the species.

classification data data-analysis data-science data-visualization flower flower-classification iris iris-classification iris-flower iris-flower-classification knn knn-classification machine-learning machine-learning-algorithms ml natural-language-processing nlp python

Last synced: 24 Apr 2026

https://github.com/cyberoctane29/python-for-data-analysis

A repository dedicated to learning Python for data analysis, data science, and data analytics. This collection of Jupyter notebooks covers practical exercises and concepts from the Google Advanced Data Analytics Professional Certificate program.

data data-analysis data-analytics data-science python

Last synced: 24 Apr 2026

https://github.com/issacto/kowloonwestparking

Deployed Web App

data hongkong react

Last synced: 24 Apr 2026

https://github.com/marielachirinosr/cyclistic-data-analytics-project

This project explores user behavior within a fictional bike-sharing system, modeled after Cyclistic, operating in Chicago.

data data-visualization pandas powerbi-report powerbi-visuals python

Last synced: 24 Apr 2026

https://github.com/petzi53/repair

R Datasets of the Open Repair Alliance (ORA).

data r repair repair-cafe

Last synced: 19 May 2026

https://github.com/mehmetkahya0/gallstone_dataset_analysis_project

Gallstone Disease (Gallstone-1) Dataset Analysis (https://archive.ics.uci.edu/dataset/1150/gallstone-1)

analysis analytics data data-analysis data-science data-visualization database graph matplotlib python

Last synced: 25 Apr 2026

https://github.com/rubix982/product-quality-classification

An implementation for the CIKM AnalytiCup 2017 on the topic of "Product Title Quality". The goal is to take SKUs and rank the clarity and conciseness of their titles. Referenced papers are attached to this repository. The aim is to craft ensemble models that either replicate the published results or find new methods for classification.

data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp

Last synced: 25 Apr 2026

https://github.com/thinkphp/my-react-tictactoeai-app

React Tic Tac Toe app component based on artificial intelligence

ai algoirthms data datastructures games javascript react

Last synced: 25 Apr 2026

https://github.com/carlos-levi/twitterbots_analise_redesneurais

Project for the AI course: exploratory analysis and application of machine learning techniques to detect automated accounts (bots) on the platform 𝕏 (Twitter)

data machine-learning twitter-bot

Last synced: 06 Jun 2026

https://github.com/marielachirinosr/bellabeat-wellness-data-trends

Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.

data data-analysis data-visualization pandas python python3 tableau tableau-public

Last synced: 25 Apr 2026

https://github.com/shwetajanwekar/prediction-with-regression

Prediction with regression for the salary_hike and delivery time datasets

data data-science datset exploratory-data-analysis matplotlib pandas plot prediction r2-score seaborn sns

Last synced: 25 Apr 2026

https://github.com/richardlitt/bird-watching

My birdwatching list and repo

birding data ebird

Last synced: 26 Jan 2026

https://github.com/jigyasag18/multiple-disease-detection-app

This repository contains the implementation of a Multiple Disease Detection System, which employs advanced machine learning techniques for early detection and prediction of prevalent diseases, including diabetes, heart disease, and Parkinson's disease. The system utilizes a variety of patient health metrics such as demographics and medical history.

data datapreprocessing machine-learning machine-learning-algorithms machinelearningmodel prediction python streamlit streamlit-webapp

Last synced: 07 Jun 2026

https://github.com/lotfiferaga/instagram-reach-analysis

The Instagram Reach Analysis project aims to develop a Python-based tool to analyze the reach and engagement metrics of Instagram posts.

analytics data data-science datavisualization python

Last synced: 18 Jun 2026

https://github.com/adrianoleitedasilva/adrianoleitedasilva

My name is Adriano. I am 35 years old and have dedicated 18 years to the fields of Information Technology and Education.

adrianoleitedasilva automation ceo cio cto data data-science dev diretor github mobile professor python readme techlead web

Last synced: 10 May 2026

https://github.com/f-ssemwanga/pandas-numpy-repo

This repo contains extensive work I did with the pandas and NumPy modules during the Advanced Programming module

cleaning-data-in-python data numpy-arrays pandas visualization

Last synced: 27 Apr 2026

https://github.com/notthestallion/pca__3d-and-from-scratch__principal-component-analysis

In this project, I will be implementing Principal Component Analysis (PCA) from scratch on an ecological footprint consumption database for countries and a three-dimensional scale using a movie database. The goal of this project is to gain a deeper understanding of PCA and to demonstrate its capabilities in exploring complex datasets.

data data-science database pca pca-analysis principal-component-analysis principal-component-analysis-pca principle-component-analysis

Last synced: 10 May 2026

https://github.com/doppelgunner/baby

A program for storing data just for fun

data doppelgunner java note storing

Last synced: 12 Jun 2026

https://github.com/luminati-io/seleniumbase-with-proxy

SeleniumBase with authenticated proxies to bypass restrictions, enhance web scraping, and manage rotating proxies for better data extraction.

data data-collection proxy-server python residential-proxy selenium seleniumwire web-scraping

Last synced: 27 Apr 2026

https://github.com/ioanzicu/batch_loading_one-to-many_data_model

Unesco Batch Loading One-to-Many Data using Django

batch data django sqlite3

Last synced: 27 Apr 2026

https://github.com/bbfh-dev/protox

Go library for (de-)serializing custom protocols

binary data format go library parsing protocol reader writer

Last synced: 01 Jul 2025

https://github.com/0xHericles/ufcg-geojson

GeoJSON file containing the blocks and buildings of the Federal University of Campina Grande.

data data-visualization geojson map open-source ufcg university

Last synced: 24 Mar 2025

https://github.com/gurpreet0022/crop-fertilizers-recommendation-system-using-ml-

This repository is a part of AICTE - Shell Internship on 'Green Skills using AI technologies' Cycle 3.

data datapreprocessing datavisualization jupyter-notebook machine-learning python

Last synced: 27 Apr 2026

https://github.com/santiagoenriquega/custom_database

Python-based database library for database management, indexing, transactions, and constraints, showcasing foundational database concepts.

data data-engineering database database-design python

Last synced: 27 Apr 2026

https://github.com/tacticalnuclearraccoon/dataviz_with_js

Sample data visualization as part of a training course on JavaScript frameworks for dataviz

d3 data datawrapper echarts javascript visualization

Last synced: 27 Apr 2026

https://github.com/chompfoods/stub-inflector

Inflector server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery inflector ingredients nutrition raw recipe-api recipes server stub stub-inflector stub-server

Last synced: 27 Apr 2026

https://github.com/drkane/area-profiles

Produce UK area profiles based on various data sources

dash-plotly data flask statistics uk

Last synced: 27 Apr 2026

https://github.com/gngdb/llamass

LLAMASS is an arbitrary collection of tools I've put together to deal with motion data

amass data pose pytorch

Last synced: 28 Apr 2026

https://github.com/oguzhanfatihkucuk/data-analytics-project-kafka-spark

The data in this project was collected in a database using Apache Kafka and processed with Apache Spark Streaming. The project aims to create a forecasting model and analyze sales forecasts per customer.

big-data data data-visualization hadoop kafka ml mlpipeline plt pyhton spark

Last synced: 28 Apr 2026

https://github.com/infinitode/crsd

A synthetic customer review sentiment dataset for sentiment analysis generated using different AI models.

ai data dataset datasets huggingface-datasets mit-license ml nlp open-source python sentiment sentiment-analysis sentiment-classification text-data

Last synced: 10 Jun 2026

https://github.com/leonardomusini/mbe-growth-nexus-converter

Python tool to convert laboratory text files into NeXus files for Molecular Beam Epitaxy (MBE) data.

data data-engineering nexus python

Last synced: 28 Apr 2026

https://github.com/dhimmel/adeptus

ADEPTUS -- differential gene expression signatures of disease

adeptus data differential-expression disease gene-expression genes rephetio

Last synced: 05 Jan 2026

https://github.com/n-ce/localstorage-data-interchange-manager

Implementation of local storage data interchange using map data structure.

data export import javascript js-maps json localstorage

Last synced: 28 Apr 2026

https://github.com/0xHericles/SpamDetector

:email: A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 24 Mar 2025

https://github.com/i-am-uchenna/sql-data-warehouse-project

The Data Warehouse and Analytics Project is a comprehensive initiative designed to demonstrate the end-to-end process of building a modern data warehouse and deriving actionable insights through SQL-based analytics.

architecture business-intelligence crm data data-analysis database database-management datawarehouse erp etl etl-pipeline model sql sqlserver

Last synced: 15 May 2026

https://github.com/jigyasag18/aircraft-data-management

This repository offers a comprehensive simulation of global military air deployments involving 10 countries, aircraft models, mission types, and strategic zones. It analyzes air power distribution, mission intent (offensive, defensive, support), and geopolitical positioning. The project provides structured insights into regional & zone level threat

aircraft-data aircraft-performance data data-analysis data-visualization database database-management dataset datavisualisation mysql powerbi powerbi-report powerbi-visuals sql

Last synced: 04 Feb 2026

https://github.com/howz1t/ptypes

This package provides useful data types for use in PHP.

badges composer computer-science data data-structures data-types packagist php types

Last synced: 29 Apr 2026

https://github.com/gcoronelc/uni-epies-das-2022-2

Systems Analysis and Design course at UNI-EPIES.

dao data datos gcoronelc java jdbc mvc mvc-pattern sql sqlserver

Last synced: 29 Apr 2026

https://github.com/iammahesh123/spring-annotations-demo

This project serves as a demonstration of various annotations used in the Spring Framework.

autowire bean component configuration controller data document postmapping repository requestmapping scope service spring

Last synced: 29 Apr 2026

https://github.com/mumtaz4118/scraping-medium-and-data-analytics

The file DataExtraction.py extracts information from the JSON files scraped by the scraper medium_scrapper_post.py. To extract information from JSON files scraped by medium_scrapper_tag_archive.py (which scrapes the tags archive), use Data_Extraction_Archive_Tags.py.

data data-analysis data-analytics data-extraction data-preprocessing data-science data-scraping deep-learning machine-learning python

Last synced: 29 Apr 2026

https://github.com/stdlib-js/array-struct-factory

Return a constructor for creating arrays having a fixed-width composite data type.

array composite data factory javascript node node-js nodejs stdlib struct structure typed typed-array types

Last synced: 29 Apr 2026

https://github.com/faster-games/dynamic-components

Dynamic Runtime Components for Unity3D

data framework unity3d

Last synced: 11 Apr 2026

https://github.com/shoaib1522/data-aggregator-tool-in-python

Illustrations of the concepts used in the "Data Aggregation Tool" scenario for a Data Science Engineer, as described in the accompanying PDF document

data data-science dataaggregation lists python-script python3 sets-python tuples

Last synced: 29 Apr 2026

https://github.com/martgro/datagrabber

Tool for extracting data points from plots

data extract image plots python3

Last synced: 29 Apr 2026

https://github.com/arunabhagit/data-driven-e-commerce-sales-analysis

An e-commerce data analysis that draws out insights with statistical charts using Python. I used the pandas and plotly libraries to present the insights and charts.

analysis charts data insights pandas-python plotly python statistics

Last synced: 29 Apr 2026