An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/plurid/defocus

Apophatic User Content Resolution [Desearch Concept]

data

Last synced: 08 Nov 2025

https://github.com/caiorss/julia-box-docker

Docker-based development environment for the Julia language, Octave, Python, and R (Rlang), with Jupyter Notebook, Jupyter QtConsole, and so on.

data datascience deveops docker julia jupyter octave python rlang scientific

Last synced: 09 May 2026

https://github.com/plurid/delog

Cloud Service for Centralized Logging

cloud data logging

Last synced: 08 Nov 2025

https://github.com/opdev1004/crumbdbjs

JSON file-based database for JavaScript

data data-storage data-store database database-management nodejs

Last synced: 18 Apr 2026

https://github.com/bhojpur/dlm

Bhojpur DLM is a software-as-a-service product for Data Lifecycle Management, based on the Bhojpur.NET Platform, for data delivery.

data lifecycle-management

Last synced: 19 Feb 2026

https://github.com/plurid/datasign

Single Source of Truth Data Contract Specifier

data file-format

Last synced: 08 Nov 2025

https://github.com/jigyasag18/iit-guhawati-final-capstone-project

Smart Dynamic Parking Price Optimization System that adjusts parking fees in real-time based on demand, traffic, and competition. It employs adaptive pricing models and rerouting logic to enhance parking utilization and reduce congestion. The system is visualized via an interactive Streamlit dashboard, enabling users to simulate dynamic pricing.

bokeh bokeh-server bokehplots capstone-project data dataset deployment machine-learning machine-learning-algorithms matplotlib matplotlib-pyplot mlproject normalisation numpy pandas pathway python streamlit

Last synced: 05 Apr 2026

https://github.com/josericodata/josericodata.github.io

Welcome to my portfolio website. This site showcases my skills, experience, education, and projects as a Data Analyst.

awesine-latex big-data career-development data data-analyst data-science database dublin ireland job-seeking jose-maria-rico-leal jose-rico jose-rico-data latex latex-cv portfolio portfolio-website python sql

Last synced: 18 Apr 2026

https://github.com/master-helix/ibm-data-analyst-certification-stock-analysis-project

A mini-project repository from my IBM certification, involving stock analysis and plotting for Tesla and GameStop.

analytics data data-analysis data-visualization ibm matplotlib pandas python web-scraping

Last synced: 09 May 2026

https://github.com/phelipe-sempreboni/certificates

Tutorial providing information about the licenses and certificates I have acquired over time.

certificate certificates certification course data database datascience licences license-management marketing marketing-analytics python sql

Last synced: 16 May 2026

https://github.com/ciyer/altair-matplotlib

Ports of examples from a Matplotlib tutorial to Altair/Vega

altair data dataviz vega vega-lite

Last synced: 29 Jul 2025

https://github.com/henryssondaniel/teacup-java-report-mysql

Report Teacup data to a MySQL database

data logs mysql reports teacup

Last synced: 20 Apr 2026

https://github.com/danielrosehill/value-factors-data-vis

Streamlit app containing visualisations of the Global Value Factors Database (GVFD) released by the IFVI in 2024

data data-visualization sustainability sustainability-data

Last synced: 29 Jul 2025

https://github.com/anjaliwork20/moodify

Mood-based music recommendation system that considers a user's emotional state to recommend songs, genres, artists and playlists using Machine learning

artificial-intelligence cnn-keras cnn-model convolutional-neural-networks data data-analysis data-science data-structures data-visualization database deep-learning machine-learning machine-learning-algorithms python recommended song songs

Last synced: 20 Apr 2026

https://github.com/nushratjabenaurnima/cse_477_data_mining

A collection of labs, reports, Jupyter notebooks, and project outputs for the CSE 477 Data Mining course. This repository tracks my learning journey through data preprocessing, association rules, clustering, classification, and real-world data analysis with Python.

data data-analysis data-mining data-science google-colab-notebook jupyter-notebook machine-learning python python-3

Last synced: 09 Apr 2026

https://github.com/edjoukou/human_resources

A data analysis project using a MySQL Server database

analysis data mysql powerbi sql visualization

Last synced: 25 Sep 2025

https://github.com/dataglyder/Data-Analysis-Tools-to-Get-You-Started

This repository describes a few tools for a beginner Data Analyst.

analytics data python r sql

Last synced: 29 Jul 2025

https://github.com/stdlib-js/array-base-symmetric-banded-filled2d-by

Create a filled two-dimensional symmetric banded nested array according to a provided callback function.

alloc allocate array callback data fill filled foreach generic javascript map matrix multidimensional node node-js nodejs stdlib strided structure types

Last synced: 20 Apr 2026

https://github.com/hormcodes/data

Terraform configuration for public data storage hosted on data.horm.codes

aws cloudfront content-management data github-actions s3-bucket terraform

Last synced: 20 Apr 2026

https://github.com/yashkp1234/movie-recommendation-engine

My project on analyzing a movie data set and creating a recommendation engine from that analysis.

analysis data notebook python recommendation-engine

Last synced: 04 May 2025

https://github.com/i-rzr-i/domaincommonextensions

This repository/library provides the most relevant and commonly used extension methods in the application development life cycle. They help improve code and writing speed, freeing dev team time for more complex functionality.

api class data datatype extension helper object parser type util

Last synced: 20 Sep 2025

https://github.com/nxion/sql-data-warehouse-project

Building a modern data warehouse with MS SQL Server, ETL processes, data modeling, and analytics.

data data-analysis data-analytics data-engineering data-lakehouse data-warehouse datalake datascience etl etl-job medallion-architecture ms mssql sql sql-query sql-server

Last synced: 05 Jun 2026

https://github.com/vishwas-chakilam/movies-review-scraping-analysis

A project for collecting, cleaning, and analyzing movie data. Includes scripts for web scraping (deprecated) and using the OMDb API to fetch movie details. Analyze and visualize data with Python and Power BI to uncover insights and trends in movie ratings and genres.

data dataanalysis datacleaning datavisualization matplotlib-python numpy-library pandas python webscraping

Last synced: 21 Apr 2026

https://github.com/baranasoftware/curricular-api

The design and implementation of a REST API for student and course data for a Higher Ed institution.

aws data data-pipeline go golang lambda rest rest-api sqlite3 system-design terraform

Last synced: 09 May 2026

https://github.com/stefen-taime/llm-rag-mtl-public-hospital

This project develops a Retrieve-Augment-Generate (RAG) model to answer questions using public data from Google reviews of hospitals in Montreal.

data google-reviews hopital hospital hub ia llm montreal open-source quebec rag

Last synced: 21 Apr 2026

https://github.com/jdenn0514/surveycore

Core Survey Analysis Infrastructure

data r resear survey-analysis

Last synced: 21 Apr 2026

https://github.com/thanh-wutan/chess-opening-comparator

Interactive web app using R to visualize and compare chess opening performance and popularity.

chess-openings data databases datavisualisation r

Last synced: 09 May 2026

https://github.com/grimen/python-humanizer

A human/developer-friendly value humanizer for Python.

data debug debugging format formatting humanize humanizer log logging print printing value

Last synced: 05 Jun 2026

https://github.com/ryanga09/digitalent_fundamentaldatascience-selfpractice

A repository of hands-on projects from DigiTalent’s Fundamental Data Science training, covering web scraping, data exploration, data cleaning, and data annotation. Includes Jupyter notebooks and example code for practical learning.

data data-analysis data-science data-visualization dataset digitalent komdigi notebook-jupyter notebooks

Last synced: 02 Aug 2025

https://github.com/howwohmm/fetchgram

era-adjusted Instagram content intelligence — scrape any public profile, OCR every image, measure what actually works. free, local, no API keys.

analytics cli content-strategy data instagram ocr python scraper

Last synced: 06 Jun 2026

https://github.com/jigyasag18/airline-performance-and-passenger-satisfaction-project-using-big-data-analytics

This project analyzes 10 years of U.S. domestic airline data (~3GB) using Hadoop (Cloudera) and Hive for data processing. Power BI dashboards visualize key metrics like delays, on-time rates, air time, and diversions. The solution includes Hive queries, DAX measures, HDFS ingestion scripts, and year-wise insights with recommendations.

big-data big-data-analytics bigdata cloudera cloudera-hadoop cloudera-hadoop-framework data data-analysis data-visualization database hadoop hive power-bi powerbi powerbi-dashboard powerbi-dashboards powerbi-report powerbi-visuals powerbi-visuals-tools powerbidashboard

Last synced: 01 Aug 2025

https://github.com/issacto/kowloonwestparking

Deployed Web App

data hongkong react

Last synced: 24 Apr 2026

https://github.com/jigyasag18/global-terrorism-1970-2017-analysis-using-big-data

This repository explores over 180,000 terrorist incidents across 205 countries using Hadoop and Power BI. The project identifies global and regional patterns in terrorism, analyzes the impact on civilians, and highlights high-risk areas. Key insights include attack trends, weapon usage, top terror groups, and country-specific risks like those in India.

big-data big-data-analytics data data-analysis data-visualization dataanalytics dataset hadoop hive hive-database hive-db hivedb power-bi powerbi powerbi-dashboards powerbi-desktop powerbi-report powerbi-report-validation powerbi-visuals powerbidashboard

Last synced: 19 Feb 2026

https://github.com/khushi-sabarad/data_analysis

LinkedIn Learning capstone project

data data-engineering matplotlib pandas python

Last synced: 10 May 2026

https://github.com/jigyasag18/ai-ml-salaries-and-ai-tools-usage-trends

This repository presents an in-depth Power BI analytics report on AI job market trends and student AI tool usage from 2020 to 2025. It combines structured datasets (job postings, salaries, surveys) with custom DAX measures to uncover key patterns in salaries, remote work, industry demand, and student engagement. It includes 5 interactive dashboards.

analysis data data-analysis data-visualization dataanalysis dataanalytics dataset datavisualization power-bi powerbi powerbi-dashboards powerbi-desktop powerbi-report powerbi-visuals powerbidashboard visualization

Last synced: 16 Feb 2026

https://github.com/mysociety/sync-ep-to-jkan

Syncs EveryPolitician data to mySociety's data portal.

data everypolitician jkan politicians

Last synced: 27 Jul 2025

https://github.com/rubix982/product-quality-classification

This is an implementation for the CIKM AnalytiCup 2017, on the topic of "Product Title Quality". The goal is to take SKUs and rank their titles' clarity and conciseness. Referenced papers are attached to this repository. The aim is to craft ensemble models that either try to replicate results or find new methods for classification.

data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp

Last synced: 25 Apr 2026

https://github.com/carlos-levi/twitterbots_analise_redesneurais

Project for the AI course: exploratory analysis and application of machine learning techniques to detect automated accounts (bots) on the 𝕏 (Twitter) platform

data machine-learning twitter-bot

Last synced: 06 Jun 2026

https://github.com/marielachirinosr/bellabeat-wellness-data-trends

Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.

data data-analysis data-visualization pandas python python3 tableau tableau-public

Last synced: 25 Apr 2026

https://github.com/shwetajanwekar/prediction-with-regression

Prediction with regression for the salary_hike and delivery time datasets

data data-science datset exploratory-data-analysis matplotlib pandas plot prediction r2-score seaborn sns

Last synced: 25 Apr 2026

https://github.com/jigyasag18/multiple-disease-detection-app

This repository contains the implementation of a Multiple Disease Detection System, which employs advanced machine learning techniques for early detection and prediction of prevalent diseases, including diabetes, heart disease, and Parkinson's disease. The system utilizes a variety of patient health metrics such as demographics and medical history.

data datapreprocessing machine-learning machine-learning-algorithms machinelearningmodel prediction python streamlit streamlit-webapp

Last synced: 07 Jun 2026

https://github.com/arunabhagit/bank-customer-churn-analysis-and-risk-tracker

This project analyzes customer churn using machine learning and visual storytelling through Power BI. A Random Forest model identifies high-risk customers, while interactive dashboards reveal key churn patterns, enabling targeted retention strategies and data-driven decision-making for business improvement.

analysis data powerbi predictive-modeling sql

Last synced: 28 Jul 2025

https://github.com/f-ssemwanga/pandas-numpy-repo

This repo contains extensive work I have done with the Pandas and NumPy modules during the Advanced Programming module.

cleaning-data-in-python data numpy-arrays pandas visualization

Last synced: 27 Apr 2026

https://github.com/fatihemres/africa

Africa app built with SwiftUI, using AVFoundation, MapKit, data, models, animations, and stickers.

animations avfoundation data mapkit models swift swift-animations swiftui

Last synced: 27 Apr 2026

https://github.com/yuweaec/project-scidatapipeline

A comprehensive toolkit for processing, simulating, and analyzing scientific data, integrating Python, Fortran, and Jupyter notebooks for seamless workflows.

analysis data pipeline processing scientific simulation

Last synced: 27 Apr 2026

https://github.com/demkeys/lazydatatransfer

Lazy method to transfer up to 64 KB of data over the network using UDP

data data-trans network python transfer udp

Last synced: 07 Jun 2026

https://github.com/gurpreet0022/crop-fertilizers-recommendation-system-using-ml-

This repository is a part of AICTE - Shell Internship on 'Green Skills using AI technologies' Cycle 3.

data datapreprocessing datavisualization jupyter-notebook machine-learning python

Last synced: 27 Apr 2026

https://github.com/schenkd/tweetminer

Data Miner for Twitter Streaming API

data dataminer datamining java twitter twitter-api twitter4j

Last synced: 07 Jun 2026

https://github.com/o-rumiantsev/exchange

Data Exchange System (Prototype)

chat css data exchange system websocket

Last synced: 27 Apr 2026

https://github.com/chompfoods/stub-inflector

Inflector server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery inflector ingredients nutrition raw recipe-api recipes server stub stub-inflector stub-server

Last synced: 27 Apr 2026

https://github.com/gngdb/llamass

LLAMASS is an arbitrary collection of tools I've put together to deal with motion data

amass data pose pytorch

Last synced: 28 Apr 2026

https://github.com/oguzhanfatihkucuk/data-analytics-project-kafka-spark

The data in this project was collected in a database using Apache Kafka and processed with Apache Spark Streaming. The project aims to create a forecasting model and analyze sales forecasts per customer.

big-data data data-visualization hadoop kafka ml mlpipeline plt pyhton spark

Last synced: 28 Apr 2026

https://github.com/hemangsharma/assignment-2---classification-models

The Assignment 2 - Classification Models repository contains the project for 36106 Machine Learning Algorithms and Applications.

data datascience-machinelearning machine-learning ml

Last synced: 10 Jun 2026

https://github.com/shreeparab1890/indian-elections-2019-analysis-eda

This IPython notebook is an exploratory data analysis (EDA) of the Indian Lok Sabha Elections 2019.

data data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib numpy pandas plotly python python3 visualization

Last synced: 28 Apr 2026

https://github.com/canadaluke888/speedtable

Ultra-fast terminal table renderer written in C

c data datasets fast python python-wrapper python3 tables

Last synced: 01 Mar 2026

https://github.com/weskal/vexus_pipeline

Automated pipeline for generating, ingesting, and validating realistic data, designed to simulate real-world workflows with scheduling, data quality checks, and version control.

airflow data pipeline python sqlserver workflow

Last synced: 20 Jan 2026

https://github.com/n-ce/localstorage-data-interchange-manager

Implementation of local storage data interchange using a map data structure.

data export import javascript js-maps json localstorage

Last synced: 28 Apr 2026

https://github.com/aaronspindler/selfdrivingcar

Learning deep learning and making a self-driving car in the process

car data deep deep-learning driving keras learning machine machine-learning python self self-driving-car

Last synced: 09 Apr 2026

https://github.com/abhishekn1947/samgov-scraper

Automated Python scraper for sam.gov contracts

analytics automation aws data pandas postgresql rds selenium webscraper

Last synced: 09 Apr 2026

https://github.com/mtalhaofc/nutrition_system

A simple AI-powered web app built using Streamlit that provides personalized weekly meal plans and nutrition recommendations based on user demographics, health goals, and nutritional preferences.

cosine-similarity data data-science food machine-learning model nutrition pandas python streamlit

Last synced: 29 Apr 2026

https://github.com/mumtaz4118/scraping-medium-and-data-analytics

The file DataExtraction.py extracts information from the JSON files scraped by the scraper medium_scrapper_post.py. To extract information from JSON files scraped by medium_scrapper_tag_archive.py (scraping from the tags archive), use Data_Extraction_Archive_Tags.py.

data data-analysis data-analytics data-extraction data-preprocessing data-science data-scraping deep-learning machine-learning python

Last synced: 29 Apr 2026

https://github.com/apsalverda/ebird-hotspot-menu-bar-python

🪶 Retrieve recent hotspot observations using eBird API

data ebird ebird-api hotspot instructions live livedata macos menubar observations platypus

Last synced: 29 Apr 2026

https://github.com/mr-dhan/eda-sales-customer-transactions

In the competitive world of retail business, a deep understanding of customer behavior is an important foundation for strategic decision-making. However, customer transaction data is often large and complex, so it requires an effective analysis process to uncover valuable insights.

dashboard data data-analysis data-analysis-python data-science data-visualization eda python

Last synced: 29 Apr 2026

https://github.com/kayahr/datastream

Data stream classes for writing and reading all kinds of data types, even single bits

data datastream input output stream typescript

Last synced: 01 Aug 2025

https://github.com/afeiship/data-arary

Data array with some new methods.

array data data-structure js list

Last synced: 11 May 2026

https://github.com/mirzayasirabdullahbaig07/advanced-sql-in-python

This repository covers advanced SQL concepts implemented using Python. It demonstrates how to interact with databases, run complex queries, perform joins, aggregations, window functions, and more using libraries like sqlite3, SQLAlchemy, or pandas. Ideal for data analysts and developers looking to integrate SQL power into Python workflows.

data databases dbms mysql nosql programing-language python sql

Last synced: 29 Apr 2026

https://github.com/ozgrozer/electron-store-data

A Node.js module to store Electron data on the computer

data electron store

Last synced: 29 Apr 2026

https://github.com/istinnew/eniac_ab_insight

Dive into a comprehensive analysis aimed at boosting iPhone 13 sales by optimizing the Click-Through Rate (CTR) of the “SHOP NOW” button. Compare different button designs and determine the most effective strategy for increasing engagement.

ab-testing data data-analysis data-engineering data-science data-visualization google googlecolab libraries python testing testing-tools visual-studio-code

Last synced: 29 Apr 2026

https://github.com/ipstack/wizard

Wizard for creating ipstack databases

composer data geo geoip id-database info ip ipstack ipstack-wizard php wizard

Last synced: 29 Apr 2026

https://github.com/dxtaner/graphql_events

Graphql-Events

data events graphql

Last synced: 29 Apr 2026

https://github.com/petzi53/repairdata

Open Repair Alliance Datasets 2021

data open-data open-datasets r repair repair-cafe repairs

Last synced: 22 Jun 2026

https://github.com/sehaj003/boston-bruins-roster-planning-mysql-nosql

Repository for Data Management project, Boston Bruins Roster Planning using MySQL and NoSQL along with data analysis using Python

data data-management mongodb mysql project-repository python

Last synced: 11 May 2026

https://github.com/axnjr/csv-parser-utils

My own Pandas in Go, Python & Rust: utility methods for handling CSV files in core Go and Rust, with bindings for Python.

csv data dataanalysis datatools go golang golang-application pandas python rs rust

Last synced: 29 Apr 2026

https://github.com/creativecuriositystudio/cruddle

(DEPRECATED) Simplifying CRUDL screen development using ModelSafe

angular2 crud data html model typescript ui web

Last synced: 09 Apr 2026

https://github.com/grace-mengke-hu/redditpushshiftapi

This package collects a Reddit dataset and organizes the data in a Mongo database.

collection data reddit

Last synced: 13 Jun 2025

https://github.com/raghavendranhp/youtube_data_harvesting

The "YouTube Data Analyzer" is a versatile tool for businesses and content creators, enabling them to gather, analyze, and harness valuable insights from multiple YouTube channels. With streamlined data collection, storage in MongoDB, migration to SQL, and a user-friendly Streamlit interface, it empowers users to make data-driven decisions

apiintegration data datacollection eda googleapi googleapiclient matplotlib mongodb mysql mysqlconnector numpy oops pandas pymongo python pythonoops sql sqlalchemy streamlit youtube-api

Last synced: 13 Apr 2026

https://github.com/mustafaozvardar/selenium-eksisozluk

This project is a simple web scraper built with Python using Selenium. It extracts and prints the content of popular entries from a specific EksiSozluk page.

data python selenium selenium-python

Last synced: 29 Apr 2026

https://github.com/quangandrei1003/france_air_pollution_pipeline

End-to-end air pollution data pipeline for French metropolitan cities using Airflow, Python, dbt, BigQuery.

airflow bigquery data data-analytics data-engineering data-modeling data-visualization dbt docker etl pandas python terraform

Last synced: 13 Apr 2026

https://github.com/sakshamarora07/blinkit-sales-report-power-bi

This dashboard provides Blinkit with insights to optimize its grocery delivery operations and understand customer preferences. It evaluates sales trends, outlet performance, and item categories to identify key areas for improvement. The interactive visuals allow detailed exploration of sales distribution, customer ratings, and product popularity.

data data-science dataanalytics datavisualization excel powerbi sql

Last synced: 08 Jan 2026

https://github.com/living-with-machines/zoonyper

Code to make it easy to import and process Zooniverse annotations and their metadata in Python/Jupyter Notebooks

crowdsourcing data data-processing data-science python zooniverse

Last synced: 04 Jul 2025

https://github.com/abdiasarsene/edusight-data-driven-insights-for-smarter-education

EduSight transforms educational data into actionable insights, helping NGOs, schools, and policymakers improve academic performance, optimize resources, and evaluate learning programs for better outcomes.

data excel github powerbi

Last synced: 26 Jan 2026

https://github.com/cljoly/data

📊 Data sets to populate some parts of my website (mostly https://cj.rs/open-source/).

data open-source sqlite wip

Last synced: 03 May 2026

https://github.com/revolutionarybukhari/datawarehouse_meshjoin_superstore

A data warehouse generated for the streaming data of a superstore using extended mesh join, by Syed Husnain Haider Bukhari

data data-science data-warehousing meshjoin

Last synced: 23 May 2026

https://github.com/sysread/skewer

A priority queue for Go implemented using a skew heap

binary data go heap min minqueue priority queue skew structure

Last synced: 26 Aug 2025