An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/fatihilhan42/nba-players-data-1950-to-2021

In this project, the data of the NBA players between the years 1950-2021 were examined. After the NBA players' season, height, performance, averages of points, teams and positions they played were obtained through csv files, important tables and graphs were created using data cleaning and data visualization algorithms.

data data-analysis data-engineering data-science data-visualization

Last synced: 16 Oct 2025

https://github.com/charon25/weatherdata

17 000 weather measurements collected by a weather station created for a college project.

csv data dataset datasets json measurements strasbourg weather weather-data

Last synced: 16 Jan 2026

https://github.com/elcarrillo/structpy

StructPy is a Python-based command-line tool designed for academics and scientists to manage data projects effectively. It simplifies workflows by creating structured project directories, generating timestamped filenames, validating datasets, and backing up projects seamlessly.

command-line-tool data database file-structure organization python science-tool

Last synced: 24 Apr 2026

https://github.com/abdullahashfaqvirk/Earth-Engine-Data-Scraper

A Python based web scraper designed to extract and organize dataset metadata from the Google Earth Engine Datasets Catalog for research, and analysis purposes.

beautifulsoup data data-science python requests scraper web-scraping

Last synced: 27 Sep 2025

https://github.com/howwohmm/fetchgram

era-adjusted Instagram content intelligence β€” scrape any public profile, OCR every image, measure what actually works. free, local, no API keys.

analytics cli content-strategy data instagram ocr python scraper

Last synced: 06 Jun 2026

https://github.com/eva-kaushik/data-clustering

Clustering Accelerators for hard and soft clustering, including implementations of K-means, K-medoids, hierarchical clustering, fuzzy C-means, and Gaussian mixture models. Demonstrates text clustering using both hard and soft clustering algorithms.

clustering clustering-algorithm data datascience machine-learning-algorithms

Last synced: 09 Apr 2025

https://github.com/saboye/sales-performance-analysis

A dashboard that presents monthly sales performance by product segment and product category to help clients identifying the segments and categories that have met or exceeded their sales targets, as well as those that have not met their sales targets.

dashboard data data-science eda tableau visualization

Last synced: 27 Jan 2026

https://github.com/mat06mat/matbot

My discord bot code

data discord-bot discord-py py-cord

Last synced: 17 Oct 2025

https://github.com/yuvrajsaraogi/-iris-flower-classification

Iris flower has three species; setosa, versicolor, and virginica, which differs according to their measurements. Now assume that you have the measurements of the iris flowers according to their species, and the task is to train a machine learning model that can learn from the measurements of the iris species and classify them.

classification data data-analysis data-science data-visualization flower flower-classification iris iris-classification iris-flower iris-flower-classification knn knn-classification machine-learning machine-learning-algorithms ml natural-language-processing nlp python

Last synced: 24 Apr 2026

https://github.com/cyberoctane29/python-for-data-analysis

A repository dedicated to learning Python for data analysis, data science, and data analytics. This collection of Jupyter notebooks covers practical exercises and concepts from the Google Advanced Data Analytics Professional Certificate program.

data data-analysis data-analytics data-science python

Last synced: 24 Apr 2026

https://github.com/ronknight/user-data-dashboard

πŸ“ˆ A data visualization tool for analyzing user data using an Excel-based data source.

dashboard data excel ga4 screenshot

Last synced: 17 Oct 2025

https://github.com/haideratgh/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as changes over time, cumulative, performance, data segmentation, part-to-whole analysis

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-query sql-server window-functions-in-sql

Last synced: 29 Jun 2025

https://github.com/enoch208/eventmaster

A user-friendly application that helps you easily record and play back your keyboard and mouse actions. With its modern design using `tkinter` and `ttkthemes`, it provides a smooth and easy-to-use interface. The app combines reliable technical features to give you a great experience.

automation data key keylogging-python replay spy tools

Last synced: 01 Jun 2026

https://github.com/hruth-vik/sales-analysis-report

SalesScope is a powerful sales analytics dashboard that extracts insights, reveals trends, and drives strategy from raw data.

analytics data powerbi-report powerbi-visuals python

Last synced: 24 Apr 2026

https://github.com/jun-labs/jq

🧷 Let's practice jq.

data jq json json-data parse

Last synced: 27 Sep 2025

https://github.com/saikatharryc/motionchart-d3js

A dynamic Motion chart Built with D3 js.

chart d3js data data-science

Last synced: 23 Dec 2025

https://github.com/issacto/kowloonwestparking

Deployed Web App

data hongkong react

Last synced: 24 Apr 2026

https://github.com/stdlib-js/ndarray-vector-bool

Create a boolean vector (i.e., a one-dimensional ndarray).

bool boolean constructor ctor data javascript ndarray node node-js nodejs stdlib structure types vec vector

Last synced: 24 Apr 2026

https://github.com/psgebeline/harvard-data-science

My work for the nine courses in Harvard's data science program, each with notes/assignments. Work in progress.

data linear-regression machine-learning modeling probability-theory r visualization wrangling

Last synced: 19 Oct 2025

https://github.com/vidupriya/aws-glue--data-copy

The function for copying data like CSV, Parquet, avro etc., from a source S3 bucket to a destination S3 bucket using AWS Glue. It includes the necessary setup for the Glue job, logging, reading data from the source bucket, and writing it to the destination bucket

aws awsglue awss3 data data-copying glue glue-job pyspark python3 s3 s3-bucket s3-buckets s3-storage spark

Last synced: 02 May 2026

https://github.com/equinor/fmu-sumo-uploader

Upload to Sumo in the FMU context

data fmu python subsurface sumo

Last synced: 06 May 2026

https://github.com/entorb/analyze-ha-energy

Analyze Home Assistant Solar Production Data

data home-assistant pandas photovoltaic pv python

Last synced: 08 May 2026

https://github.com/marielachirinosr/cyclistic-data-analytics-project

This project explores user behavior within a fictional bike-sharing system, modeled after Cyclistic, operating in Chicago.

data data-visualization pandas powerbi-report powerbi-visuals python

Last synced: 24 Apr 2026

https://github.com/vishwas-chakilam/hr-dashboard

This project involves creating an interactive HR Dashboard using Power BI for visualization and MySQL for data cleaning and analysis. It provides insights into employee performance, attrition, salary distribution, and hiring trends.

dashboard data datac datacleaning datavisualization mysql powerbi

Last synced: 23 Mar 2025

https://github.com/tdjsnelling/hermes

Hermes is a real-time data framework for React + MongoDB

data docker framework mongodb nodejs react react-hooks reactjs real-time typescript websocket

Last synced: 12 Apr 2026

https://github.com/rrwen/py-examples

Collection of python examples in each branch

beginner data download excel guide introduction links processing python reference spreadsheet url xls

Last synced: 05 May 2026

https://github.com/badranalyst/covid-deaths-and-vaccinations-sql-data-exploration

This project involves exploratory data analysis on COVID-19 deaths and vaccinations data using SQL. It aims to uncover trends, patterns, and insights related to vaccination rates and their impact on mortality. The analysis provides a clearer understanding of the pandemic's dynamics, facilitating data-driven decisions in public health.

covid-19 data data-exploration dataset sql

Last synced: 19 Feb 2026

https://github.com/erencelik/binance-public-data-node

Nodejs downloader and unzipper script for Binance Public Data

binance data downloader nodejs public script

Last synced: 15 May 2026

https://github.com/elimu-ai/ml-event-simulator

πŸ€– Simulation of learning events and assessment events

data learning-analytics machine-learning ml

Last synced: 28 Feb 2025

https://github.com/thedevreda/jadaerospace

A Real life project showing how to improve selling aircraftparts and helping salers to focus more on effective products at JadAero

data data-analysis data-cleaning data-visualization jupyter-notebook powerbi python

Last synced: 02 Aug 2025

https://github.com/ybelenko/openapi-data-mocker-interfaces

Package with OpenApiDataMocker interfaces.

data fake faker interface mock mocker oas oas3 openapi swagger

Last synced: 05 Jan 2026

https://github.com/anct-cartographie-nationale/mednum-cli

✨ Interface en ligne de commande pour la transformation des données de lieux de médiation numériques collectées dans un format non standard vers le schéma de la mednum et leur publication sur data.gouv

anct betagouv data donnees gouvernement mediation-numerique nodejs open-data transformation

Last synced: 02 Aug 2025

https://github.com/denisecase/dc-mailer

Send an email using Python

alerts data email python streaming

Last synced: 11 Apr 2025

https://github.com/octoenergy/tentaclio-snowflake

A python project containing all the dependencies for snowflake tentaclio schema.

data

Last synced: 20 Oct 2025

https://github.com/hit07/fitgpt-hacksc

AI-Powered Fitness Coach; πŸ₯ˆ Runner up at HackSC's SoCal Tech Week hackathon

data elasticsearch gpt-4o-mini llm pipeline

Last synced: 28 Feb 2025

https://github.com/dilkushsingh/webscraping-with-selenium-and-beautifulsoup

Web Scrapped a popular tech gadgets website using Selenium and BeautifulSoup, also performed Data Analysis on scrapped data.

beautifulsoup data datacleaning datagathering eda exploratory-data-analysis python selenium webscraping

Last synced: 24 Feb 2026

https://github.com/plurid/defocus

Apophatic User Content Resolution [Desearch Concept]

data

Last synced: 08 Nov 2025

https://github.com/rubyonworld/ldpath

This is a ruby implementation of LDPath, a language for selecting values linked data resources.

data ldpath resource ruby

Last synced: 12 Nov 2025

https://github.com/plurid/delog

Cloud Service for Centralized Logging

cloud data logging

Last synced: 08 Nov 2025

https://github.com/mohibmirza-py/email-verifier-script

Streamlit app to verify emails in bulk

ai analysis data streamlit

Last synced: 29 Apr 2026

https://github.com/cosmos-loops/cosmos-data

Cosmos.Data is a inline project of COSMOS LOOPS PROGRAMME to provide several SQL-Query, RMDB/ORM and No-SQL components' extensions.

connection-pool data mysql mysqlconnector oracle postgresql sqlite sqlkata sqlserver transaction uow

Last synced: 12 Apr 2026

https://github.com/bhojpur/dlm

The Bhojpur DLM is a software-as-a-service product used for Data Lifecycle Management based on Bhojpur.NET Platform for data delivery.

data lifecycle-management

Last synced: 19 Feb 2026

https://github.com/mehmetkahya0/gallstone_dataset_analysis_project

Safra Taşı Hastalığı (Gallstone-1) Veri Seti Analizi (https://archive.ics.uci.edu/dataset/1150/gallstone-1)

analysis analytics data data-analysis data-science data-visualization database graph matplotlib python

Last synced: 25 Apr 2026

https://github.com/acovaci/orbit

ORBIT: an Open source Rust-based implementation of a data Build Tool, inspired by DBT

cargo clap-rs data data-warehouse dbt rust rust-lang tokio-rs

Last synced: 16 Mar 2025

https://github.com/plurid/datasign

Single Source of Truth Data Contract Specifier

data file-format

Last synced: 08 Nov 2025

https://github.com/nushratjabenaurnima/cse_477_data_mining

A collection of labs, reports, Jupyter notebooks, and project outputs for the CSE 477 Data Mining course. This repository tracks my learning journey through data preprocessing, association rules, clustering, classification, and real-world data analysis with Python.

data data-analysis data-mining data-science google-colab-notebook jupyter-notebook machine-learning python python-3

Last synced: 09 Apr 2026

https://github.com/politicaargentina/opinar

πŸ“ˆ ICG toolbox for R - Indice de Confianza en el Gobierno πŸ‡¦πŸ‡· (Universidad Torcuato Di Tella)

argentina data political-science politics public-opinion

Last synced: 22 Oct 2025

https://github.com/cityofnewyork/nyco-wp-open-data-transients

Interface for saving Open Data endpoints as WordPress Transients. Maintained by @NYCOpportunity

civic-tech composer data nycopportunity open-data plugin transients wordpress

Last synced: 10 Apr 2026

https://github.com/robertoostenveld/dcn.dsc_62002071_01_114_v1

Simon task M/EEG data [Data set].

data datalad open-data

Last synced: 23 Jan 2026

https://github.com/ssanthosh010303/collection-data-training

A collection of challenges exercised during data training program.

airflow apache azure azure-data-factory azure-databricks azure-logic-apps bigdata data hadoop spark

Last synced: 27 Jan 2026

https://github.com/maluscat/reactive-storage

[MIRROR] Register, observe and intercept deeply reactive data without the need for proxies

data javascript reactive typescript

Last synced: 10 Mar 2026

https://github.com/rubix982/product-quality-classification

This is an implementation for the CIKM AnalytiCup 2017, around the topic of "Product Title Quality". The goal is to take SKUs and rank its title's clarity and conciseness. Referenced papers are attached to this repository. And as such, the aim is to craft ensemble models that either try to replicate results or find new methods for classification.

data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp

Last synced: 25 Apr 2026

https://github.com/edjoukou/human_resources

A data analysis project using MySQL Server database

analysis data mysql powerbi sql visualization

Last synced: 25 Sep 2025

https://github.com/andrewl/danelaw

Geopackage containing the boundary of the Danelaw

data geospatial medieval viking

Last synced: 23 Jan 2026

https://github.com/xjwllmsx/hacker-news-engagement

Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.

data data-analysis jupyter python

Last synced: 25 Apr 2026

https://github.com/jigyasag18/bird-strikes-in-aviation-project

This project analyzes over a decade of U.S. bird strike data (2000–2011) to evaluate safety risks, damage trends, and cost implications in aviation. Using PostgreSQL for database management and Power BI for dashboard visualization, it uncovers critical insights into when, where, and how wildlife impacts aircraft. Key findings inform strategically.

bird-strike-prevention bird-strike-prevention-in-real-airport data data-analysis data-analysis-project data-visualisation data-visualization data-visualization-project data-visualizations database dataset dax-query postgresql postgresql-database powerbi powerbi-desktop powerbi-report powerbi-visuals sql sql-database

Last synced: 09 May 2026

https://github.com/mlkav/tri-hita-karana

Project Tri Hita Karana - Future Knowledge G20 Bali. DTS Kominfo x Binar Academy.

bali data data-science g20 science

Last synced: 06 Jun 2026

https://github.com/carlos-levi/twitterbots_analise_redesneurais

Projeto para a disciplina de IA - anΓ‘lise exploratΓ³ria e aplicaΓ§Γ£o de tΓ©cnicas de aprendizado de mΓ‘quina para detectar contas automatizadas (bots) na plataforma 𝕏 (Twitter)

data machine-learning twitter-bot

Last synced: 06 Jun 2026

https://github.com/asma-hachaichi/imdb-movies-rating-prediction

This project collects movies information from IMDb using web scraping, then uses this data to guess movie ratings. It combines the skills of gathering data from the internet to predict how well movies are liked.

beautifulsoup4 data data-science machine-learning movies movies-reviews prediction python scraping

Last synced: 31 Mar 2025

https://github.com/ryanga09/digitalent_fundamentaldatascience-selfpractice

A repository of hands-on projects from DigiTalent’s Fundamental Data Science training, covering web scraping, data exploration, data cleaning, and data annotation. Includes Jupyter notebooks and example code for practical learning.

data data-analysis data-science data-visualization dataset digitalent komdigi notebook-jupyter notebooks

Last synced: 02 Aug 2025

https://github.com/viniddev/active_finance

Nesse projeto busquei solucionar um problema corriqueiro que Γ© a dificuldade de se manter atualizado sobre as variaΓ§Γ΅es do mercado de aΓ§Γ΅es e fundos imobiliΓ‘rios. Usei selenium webdriver para buscar informaΓ§Γ΅es e uma API do Telegram para enviar relatΓ³rios para o usuΓ‘rio

automation data data-analisis rpa selenium-webdriver telegram-bot

Last synced: 03 May 2026

https://github.com/sankooc/validatez

object validation for node

data validate

Last synced: 13 May 2026

https://github.com/marielachirinosr/hotel-data-analysis

Pandas & Matplotlib Learning Analysis. Repository featuring data analysis projects using Pandas and Matplotlib libraries

data data-analysis matplotlib pandas python

Last synced: 25 Apr 2026

https://github.com/jneidel/animal-names

Dataset of 100 common animal names

animals data dataset json names opendata

Last synced: 25 Mar 2025

https://github.com/plateformeio/docs

The official documentation of the Plateforme framework

api app asgi async data db docs fastapi plateforme pydantic python restx services sqlalchemy

Last synced: 11 Apr 2026

https://github.com/remcostoeten/github-and-vercel-api-showcase-dashboard

Showcase results of possible fetched data from the Github and Vercel API built in all vanilla js.

api-rest da data express-js github-api nodejs vercel-api

Last synced: 07 Mar 2026

https://github.com/sharoonjoseph321/insurance_fraud_detection

Fraud Detection using machine learning algorithm-KN Neighbors .Data exploration using Pyspark and matplotlib.

analytics data data-science eda high-performance knn-algorithm knn-classification machine-learning matplotlib-pyplot pyspark python seaborn spark statistics

Last synced: 23 Mar 2025

https://github.com/dhanish03/reliance-sales-report-dashboard

This project, Reliance Sales Report Dashboard, showcases a dynamic and interactive Power BI dashboard designed to analyze sales performance. The dashboard provides key insights into various aspects of sales data, including product-wise performance, region-based revenue, and profitability trends.

data datavisualization-project powerbi visualization

Last synced: 23 Jan 2026

https://github.com/harmanveer-2546/reducing-data-entries

Way to delete data entries from csv/excel file using. For excel file, use excel instead of csv in the code.

csv data data-entry delete-data excel numpy pandas python

Last synced: 05 May 2026

https://github.com/anuraganalog/blog

Data Science Blog

anuraganalog blog data science

Last synced: 26 Apr 2026

https://github.com/keminghe/osu

Unofficial and publicly-available NPM data-package about The Ohio State University.

college data majors ohio-state organizations public students university unofficial

Last synced: 06 Jan 2026

https://github.com/entropyorg/p5-data-testimage

:notebook::camera: interface for retrieving test images

cpan data image-analysis

Last synced: 29 May 2026

https://github.com/filipnet/infoscreen

Arduino subscribes values by MQTT and view info on an OLED I2C display

arduino data display i2c mqtt oled-display-ssd1306 visualization weather weatherstation

Last synced: 12 Apr 2026

https://github.com/louis-heraut/dataverseur

πŸ«– A dataverse API R wrapper to enhance the deposit procedure using only R variable declarations

data data-repository data-science datascience dataset dataverse dataverse-api json metadata metadata-management metadata-parser r

Last synced: 24 Oct 2025

https://github.com/moscatellimarco/webscrap-tinydeal

"WebScrap-TinyDeal" is a Scrapy-powered πŸ•·οΈ tool for harvesting product information 🏷️ from TinyDeal. It outputs structured CSV data πŸ“, ready for analysis. Explore the scripts πŸ‘¨β€πŸ’» for an interactive scraping adventure or leverage the data for competitive pricing strategies πŸ“ˆ.

css data datascience html pandas python scrapy web webscraper webscraping

Last synced: 14 Apr 2026

https://github.com/ztgx/muvera

MUVERA: Making multi-vector retrieval as fast as single-vector search

algorithms data google muvera retrieval rust search structure vector

Last synced: 25 Oct 2025

https://github.com/jigyasag18/airline-performance-and-passenger-satisfaction-project-using-big-data-analytics

This project analyzes 10 years of U.S. domestic airline data (~3GB) using Hadoop (Cloudera) and Hive for data processing. Power BI dashboards visualize key metrics like delays, on-time rates, air time, and diversions. The solution includes Hive queries, DAX measures, HDFS ingestion scripts, and year-wise insights with recommendations.

big-data big-data-analytics bigdata cloudera cloudera-hadoop cloudera-hadoop-framework data data-analysis data-visualization database hadoop hive power-bi powerbi powerbi-dashboard powerbi-dashboards powerbi-report powerbi-visuals powerbi-visuals-tools powerbidashboard

Last synced: 01 Aug 2025

https://github.com/byndyusoft/byndyusoft.data.relational

Relational abstractions for Byndyusoft.Data.Relational.

byndyusoft data dataaccess db relational-databases

Last synced: 25 Oct 2025

https://github.com/jigyasag18/global-terrorism-1970-2017-analysis-using-big-data

This repository explores over 180,000 terrorist incidents across 205 countries using Hadoop and Power BI. The project identifies global and regional patterns in terrorism, analyzes the impact on civilians, and highlights high-risk areas. Key insights include attack trends,weapon usage,top terror groups,& country-specific risks like those in India.

big-data big-data-analytics data data-analysis data-visualization dataanalytics dataset hadoop hive hive-database hive-db hivedb power-bi powerbi powerbi-dashboards powerbi-desktop powerbi-report powerbi-report-validation powerbi-visuals powerbidashboard

Last synced: 19 Feb 2026

https://github.com/uznetdev/smoking-prediction

This project focuses on analyzing the "Smoking" dataset and building a predictive model for smoking status based on various health metrics. The goal is to identify factors influencing smoking behavior and develop a reliable model for prediction.

ai classification data data-science kaggle-competition machine-learning ml roc-auc sklearn smoking

Last synced: 17 Apr 2026

https://github.com/jigyasag18/ai-ml-salaries-and-ai-tools-usage-trends

This repository presents an in-depth Power BI analytics report on the AI job market trends and student AI tool usage from 2020 to 2025. It combines structured datasets (job postings, salaries, surveys) with custom DAX measures to uncover key patterns in salaries, remote work, industry demand, and student engagement. 5 interaractive dashboards made.

analysis data data-analysis data-visualization dataanalysis dataanalytics dataset datavisualization power-bi powerbi powerbi-dashboards powerbi-desktop powerbi-report powerbi-visuals powerbidashboard visualization

Last synced: 16 Feb 2026