An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/sandygcabanes/etl-earthquake-data-from-usgs-google-cloud-composer-airflow

Airflow, Google Cloud Composer, GCS, BigQuery, Python. This automated pipeline pulls daily earthquake data from a trusted public source, stores it securely in the cloud, and organizes it into clean, searchable tables for analysis.

cloud composer dag data engineering etl etl-pipeline google json python

Last synced: 01 May 2026

https://github.com/muhammadadilnaeem/bcg-data-science-job-simulation-on-forage-august-2024

This repository contains all the tasks, code, and documentation completed during the BCG Data Science job simulation on The Forage platform. The simulation focused on analyzing customer churn, building predictive models, and presenting insights for a major utility company.

bcg customer-churn-prediction-with-machine-learning data data-science forage numpy pandas

Last synced: 01 May 2026

https://github.com/anburocky3/cbse-schools-data

Fetch CBSE schools in seconds and use them in your data projects

cbse data data-analysis data-science grabber nextjs

Last synced: 24 Jun 2026

https://github.com/0xhericles/spamdetector

A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 02 May 2026

https://github.com/gcoronelc/ucv_gdi-1_202302-a2

Data and Information Management Workshop I with Gustavo Coronel.

data data-science database databases machine-learning machinelearning oracle sql sql-server

Last synced: 02 May 2026

https://github.com/vidushibhadana/eda-on-nyc-taxi-data

Conducting an Exploratory Data Analysis (EDA) on New York City taxi data and visualizing it through countplots, distribution plots (displot), and histograms using Python and its libraries.

data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 11 Apr 2026

https://github.com/s1dewalker/electric-future

Visual Analysis: Future of Automotive Industry

data data-visualization machine-learning python3 regression-analysis tableau

Last synced: 02 May 2026

https://github.com/jesuscc1993/data-cleaner-extension

Clears browser data in a single click.

application-data chrome chrome-extension data

Last synced: 02 May 2026

https://github.com/badranalyst/movie-correlation-analysis-in-python

This project analyzes movie data correlations using Python libraries like Pandas, NumPy, Seaborn, and Matplotlib. It examines relationships between attributes such as ratings, genres, and box office performance to uncover trends that inform recommendations and enhance understanding of movie success factors.

data data-analysis dataset jupyter jupyter-notebook matplotlib matplotlib-pyplot numpy pandas python seaborn

Last synced: 03 May 2026

https://github.com/prakashpandey16/sql_data_warehouse_project

Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.

cleaning-data data data-engineering data-science database etl-pipeline sqlserver

Last synced: 03 May 2026

https://github.com/arnavk-09/phishing-detection

🎣 Detect Phishing URLs with Data Pre-fitted... API & Web UI

csv data fastapi flask python scikit-learn

Last synced: 03 May 2026

https://github.com/yash-chauhan-dev/spark_cluster_docker

Set up a local Spark cluster, Hadoop (HDFS), Airflow, and PostgreSQL on Docker with ease, without any local installations

apache-spark data data-engineering data-engineering-pipeline deployment docker docker-compose hadoop hdfs local-development localhost pyspark python

Last synced: 04 May 2026

https://github.com/soham7998/data-analysis-projects

My data analysis projects, each of which gave me hands-on experience. The projects showcase different concepts, visualizations, and more.

data data-analysis data-science machine-learning nlp python soham visualization

Last synced: 04 May 2026

https://github.com/maxwelllzh/gis-tutorial-

Tutorials for Columbia University GIS Club

data python

Last synced: 04 May 2026

https://github.com/muhammadadilnaeem/student-performance-indicater-end-to-end-data-science-project

This project leverages data science techniques to build a predictive model that estimates a student's exam performance. The project follows a structured data science workflow, including data collection, preprocessing, model building, evaluation, and deployment.

data machine-learning-algorithms pandas pymysql python sql

Last synced: 11 Apr 2026

https://github.com/jdanielgoh/cobertura-campanias

In a democracy, is there room for every voice? A project to visualize the INE's radio and TV monitoring of the 2024 presidential candidacies

d3js data datavisualization vue

Last synced: 09 Jun 2026

https://github.com/parzibyte/jsonp-php

Example of JSONP with PHP

data example json jsonp php request

Last synced: 04 May 2026

https://github.com/gabya06/twitter_models

Repository used for Twitter impression models

data data-science impressions machinelearning python ridge-regression sklearn twitter

Last synced: 04 May 2026

https://github.com/a-poor/datatransform.jl

A package for defining (and performing) tabular-data transformations with JSON.

data data-science data-transformation etl feature-engineering json julia julia-package tabular-data

Last synced: 05 May 2026

https://github.com/chompfoods/stub-nodejs-server

Node.js server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery ingredients node node-js node-server nodejs nutrtion raw recipe-api recipes server server-stub stub stub-server

Last synced: 05 May 2026

https://github.com/rdmurphy/deno-quaff

A port of the quaff Node.js library to Deno.

archieml csv data deno json toml yaml

Last synced: 05 May 2026

https://github.com/chanchalsoorma/web-scraping

This repo aims to provide a straightforward, easy-to-use scraping code written in Python.

beautifulsoup beautifulsoup4 data python request selenium webscraping

Last synced: 05 May 2026

https://github.com/mito-ds/mitosheet_helper_config

The mitosheet_helper_config package is used by enterprises to configure the mitosheet package.

data data-analytics data-science data-visualization jupyter pandas python

Last synced: 05 May 2026

https://github.com/shibbbbs/fastapi_project

A FastAPI application that reads financial data from an Excel file (capbudg.xls) and provides API endpoints to list available tables (sheet names), fetch row names from a selected table, and calculate the sum of numerical values from a specified row. The API is accessible via a web-based interactive documentation at /docs

data dataanalysis fastapi pandas python

Last synced: 06 May 2026

https://github.com/ksm26/ml-ai-data-science-jobs-in-canada

Explore the latest machine learning, artificial intelligence, and data science job opportunities in Canada. Stay informed about Canadian tech job market trends and find your next career move.

ai-canada ai-careers canada canadian-tech-companies canadian-tech-job-market data data-analysis data-engineering data-science data-science-careers machine-learning prompt-engineering robotics

Last synced: 06 May 2026

https://github.com/ekoepplin/dbt-bigquery-core

How to get data into BigQuery (or DuckDB) and set up dbt tests for Soda Cloud monitoring

bigquery data data-quality dbt dlt duckdb gcp soda

Last synced: 06 May 2026

https://github.com/bryanhe24/data_analysis_app

A full-stack web application that allows users to upload CSV datasets, analyze the data with statistical summaries and visualizations, and interact with an AI-powered assistant for querying the dataset.

ai data data-analysis data-visualization fullstack-development javascript math python reactjs

Last synced: 07 May 2026

https://github.com/chardos/get-git-data

Access Git repository data in Node.

data git javascript node

Last synced: 07 May 2026

https://github.com/danyal-faheem/project-logs-analyzer

This repo contains scripts to analyze project logs and display some charts related to the data

data data-visualization matplotlib pandas python streamlit

Last synced: 07 May 2026

https://github.com/zsvoboda/olympics

Self-service analytics of 120 years of Olympics data

analytics dashboards data datavisualization dataviz olympics open-data open-datasets opendata reports

Last synced: 08 May 2026

https://github.com/writetome51/page-load-access

A TypeScript/JavaScript class that loads a batch (array) of data from a larger set too big to be loaded all at once.

batch class data javascript load loader typescript

Last synced: 16 May 2026

https://github.com/basemax/okala-product-ids

A PHP script to fetch and save product IDs from Okala's online store API across multiple categories and store branches.

crawler crawler-okala crawler-php crawlers data database ids ir iran json okala okala-crawler php php-crawler product

Last synced: 09 May 2026

https://github.com/caiorss/julia-box-docker

Docker-based development environment for the Julia language, Octave, Python, and R (Rlang), with Jupyter Notebook, Jupyter QtConsole, and so on.

data datascience deveops docker julia jupyter octave python rlang scientific

Last synced: 09 May 2026

https://github.com/master-helix/ibm-data-analyst-certification-stock-analysis-project

A mini-project repository from my IBM certification, involving stock analysis and plotting for Tesla and GameStop

analytics data data-analysis data-visualization ibm matplotlib pandas python web-scraping

Last synced: 09 May 2026

https://github.com/mohamedbilal1800/olympic_history_data_analysis

This project delves into the 120 Years of Olympic History: Athletes and Results dataset, analyzing athlete demographics, medal achievements, and country performances across the Summer and Winter Olympics from 1896 to 2016.

analysis data eda matplotlib-pyplot pandas python seaborn visulaization

Last synced: 09 May 2026

https://github.com/tomcardoso/journalism-data-intersection

A talk on working at the intersection of journalism and data science

data data-journalism journalism

Last synced: 15 May 2025

https://github.com/baranasoftware/curricular-api

The design and implementation of a REST API for student and course data for a Higher Ed institution.

aws data data-pipeline go golang lambda rest rest-api sqlite3 system-design terraform

Last synced: 09 May 2026

https://github.com/fgazzelloni/20240930-dwpwr

Data Wrangling Practice with R - 30 September Tutorial for R-Ladies Rome

data data-science data-structures data-wrangling

Last synced: 28 Jun 2026

https://github.com/yuvrajsaraogi/car-price-prediction-with-machine-learning

The price of a car depends on a lot of factors like the goodwill of the brand of the car, features of the car, horsepower and the mileage it gives and many more. Car price prediction is one of the major research areas in machine learning. So, if you want to learn how to train a car price prediction model then this project is for you.

car-price-prediction-with-machine-learning data data-science deep-learning deep-neural-networks engineer github learning machine-learning mini-project natural-language-processing prediction predictive-modeling project python3 sql

Last synced: 15 Apr 2026

https://github.com/kashifkhan7/cleaning-analysis_cli

Analyze sales data easily with our CLI app. Gain insights on revenue trends and visualize results using Python, Pandas, and Matplotlib. 🚀📊

conditional-statements css data datacleaning exception-handling exiftool html json matplotlib-pyplot metadata metadata-extraction pandas-python python sales-analysis seaborn-python speech-to-text transcription youtube

Last synced: 13 Apr 2026

https://github.com/vatshayan/pokemon-analysis

Visualization, Analysis & Predicting the accuracy of finding Pokemon power, attack & speed through Machine Learning

artificial-intelligence data data-analysis data-science data-visualization dataset machine-learning machine-learning-algorithms pokemon scikit-learn

Last synced: 30 May 2026

https://github.com/laguer/jupyt-nb

Ratios of mathematical and physical constants in cosmology and microphysics

analysis constants cosmology data dimensional julia mathematical micro notebook physical physics python ratios science

Last synced: 13 Apr 2026

https://github.com/lefuturiste/npm-api

Search for or get an npm package

api data npm php

Last synced: 14 May 2026

https://github.com/yeti-robotics/past-scouting-data

❄️ Scouting Data from Previous Events/Seasons ❄️

data first frc

Last synced: 06 Jan 2026

https://github.com/nafisalawalidris/nafisalawalidris

Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and Bitcoin converge.

artifical-intelligence bitcoin config data data-science developer github-config github-pages machine-learning

Last synced: 16 May 2026

https://github.com/woctezuma/recent-sales-data

Data available to estimate sales of Steam games during release week.

data sales steam

Last synced: 05 Feb 2026

https://github.com/rajkumarbestha/nsedataextractor

NSEDataExtractor

data python python3

Last synced: 26 Mar 2025

https://github.com/fehmitahsindemirkan/web-scrapper

Professional and high performance web scraping project.

data ecommerce emailsender fileexplorer logging python web webscraping

Last synced: 10 Jan 2026

https://github.com/boytchev/coursedataviz

Supplementary materials for "Data Visualization" course

data fmi su visualization

Last synced: 16 Mar 2025

https://github.com/bkestelman/dasy-ml

DaSy DataSynthesizer - Create synthetic data with desired statistical properties for machine learning research.

data data-science machine-learning

Last synced: 14 Jan 2026

https://github.com/meokullu/prefill

PreFill adds desired characters onto output values to increase their legibility.

alignment data data-analysis data-engineering data-science legibility

Last synced: 17 Jan 2026

https://github.com/rishitabansal9/adult-census-income-prediction

This is a project made for data analysis and income prediction using a random forest classifier with 91% accuracy.

data data-analysis data-science feature-engineering random-forest-classifier

Last synced: 25 Mar 2025

https://github.com/paezha/bsantiago

A data package with the results of a travel and well-being survey conducted in Santiago in 2016

data equity package r santiago survey travel well-being

Last synced: 18 Mar 2025

https://github.com/sadratehranian/data-collection-and-machine-learning

First, creates a model using logistic regression to predict whether the fire alarm of a smoke detector should sound. Second, predicts whether an electric drive in a production plant may be faulty.

data data-analysis data-science datacollection logistic-regression machine-learning ml nn

Last synced: 05 Jan 2026

https://github.com/vidushibhadana/covid19-data-exploration-using-sql

Deployed diverse SQL techniques to analyze COVID-19 data for an improved understanding of the pandemic's regression.

data database database-management sql

Last synced: 19 Aug 2025

https://github.com/Coko7/vegapull-records

Cards dataset for One Piece TCG

data one-piece one-piece-card-game one-piece-tcg tcg

Last synced: 28 Apr 2025

https://github.com/microsoftcloudessentials-learninghub/demosscenarios-techtalks

This repository showcases demonstrations and scenarios using Microsoft Cloud technologies. Please note that these demos are intended as a guide and are based on my personal experiences.

ai analytics azure copilot data data-science fabric m365 microsoft-general ml powerapps powerbi privatebot security sharepoint

Last synced: 14 Mar 2026

https://github.com/remidumas/rstats

RStats weblog

data ia r science stats

Last synced: 25 Mar 2025

https://github.com/quonverbat/ordner

A simple, customizable and cross-platform data tracker.

data datatracker javafx management

Last synced: 07 Jul 2025

https://github.com/bmcollier/contiguous

Provides COBOL-style contiguous data structures in Python

cobol contiguous data python

Last synced: 14 Jan 2026

https://github.com/ankitrai259/sales_insight_dashboard

Sales Insight: Using SQL for data cleaning and Power BI for building an interactive dashboard

dashboard data data-visualization datacleaning postgresql powerbi sql

Last synced: 17 Mar 2025

https://github.com/nagipragalathan/linkedin_backup_datas

This repository contains the backup data from my previous LinkedIn account. Unfortunately, my old LinkedIn account was compromised and subsequently blocked by LinkedIn. As a result, I created a new account, but that too got blocked for reasons unknown to me.

backup blocked data linkedin linkedin-account memory nagipragalathan recovery storage

Last synced: 18 Jan 2026

https://github.com/blueheron786/quranic-universal-library-mushaf-layouts

The Quranic Universal Library (QUL)'s Qur'an mushaf 15-line layouts (madini, uthmani)

data database layout mushaf quran sqlite uthmani uthmani-quran

Last synced: 13 Apr 2026

https://github.com/stupidcucumber/elephant-crawler

System for mining texts from websites.

data data-mining-python python

Last synced: 25 Apr 2026

https://github.com/henryssondaniel/teacup-service-report-mysql-java

Connect your Teacup report data to a MySQL database

data logs mysql reports teacup

Last synced: 13 Apr 2026

https://github.com/vishwas-chakilam/twitter-sentiment-analysis

Twitter Sentiment Analysis is a Python project that analyzes the sentiment of tweets based on a user-defined keyword. It uses Tweepy to fetch tweets from the Twitter API and TextBlob for sentiment analysis. The application features a user-friendly GUI with Tkinter, displaying tweet sentiment as positive, negative, or neutral.

api data data-science dataanalysis python3 textblob-sentiment-analysis tkinter tweepy-api

Last synced: 11 Mar 2025

https://github.com/karosi12/ng-data-share

Angular communication with input and output properties

angular communication data data-binding input output sharing typescript

Last synced: 16 Jan 2026

https://github.com/otoneko1102/roulette-base

A collection of roulette colors and numbers in JSON format. Use it when making a casino-style roulette.

casino casino-games data json require roulette

Last synced: 16 Mar 2025

https://github.com/luminati-io/LinkedIn-dataset-samples

Sample dataset of 1001 LinkedIn companies, extracted via Bright Data API, featuring essential data points for competitive analysis and market insights.

data database dataset linkedin linkedin-api linkedin-data linkedin-dataset linkedin-scraper sample web-scraping

Last synced: 09 Apr 2025

https://github.com/living-with-machines/zoonyper

Code to make it easy to import and process Zooniverse annotations and their metadata in Python/Jupyter Notebooks

crowdsourcing data data-processing data-science python zooniverse

Last synced: 04 Jul 2025

https://github.com/sakshamarora07/blinkit-sales-report-power-bi

This dashboard provides Blinkit with insights to optimize its grocery delivery operations and understand customer preferences. It evaluates sales trends, outlet performance, and item categories to identify key areas for improvement. The interactive visuals allow detailed exploration of sales distribution, customer ratings, and product popularity.

data data-science dataanalytics datavisualization excel powerbi sql

Last synced: 08 Jan 2026

https://github.com/grace-mengke-hu/redditpushshiftapi

This package collects Reddit datasets and organizes the data in a MongoDB database

collection data reddit

Last synced: 13 Jun 2025

https://github.com/samhollings/nhs_data_cleansing

A repo of reusable functions for cleansing data

cleansing data data-cleaning data-cleansing preprocessing pyspark python python3

Last synced: 05 Oct 2025

https://github.com/affan005-ai/tesla-stock-prediction

This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models

data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn

Last synced: 05 Oct 2025

https://github.com/quangandrei1003/france_air_pollution_pipeline

End-to-end air pollution data pipeline for French metropolitan cities using Airflow, Python, dbt, BigQuery.

airflow bigquery data data-analytics data-engineering data-modeling data-visualization dbt docker etl pandas python terraform

Last synced: 13 Apr 2026

https://github.com/rysteq/abstract-data-structures

This repository contains two programs written in C about the stack and queue ADTs

abstract-data-structures c data queue stack

Last synced: 06 Oct 2025

https://github.com/ryanve/i11

CSS named colors list

colors css data dataset

Last synced: 07 Oct 2025

https://github.com/sysread/skewer

A priority queue for Go implemented using a skew heap

binary data go heap min minqueue priority queue skew structure

Last synced: 26 Aug 2025

https://github.com/lexiortiz/advanced-data-analytics

Structured learning notes, code snippets, and key takeaways from the Google Advanced Data Analytics Professional Certificate. Serves as a personal reference for reinforcing concepts and as a resource for others on a similar learning journey.

data data-analysis data-engineering google python-3 sql

Last synced: 29 May 2026

https://github.com/adrianoleitedasilva/adrianoleitedasilva

My name is Adriano, I am 35 years old, with 18 of those years dedicated to the fields of Information Technology and Education.

adrianoleitedasilva automation ceo cio cto data data-science dev diretor github mobile professor python readme techlead web

Last synced: 10 May 2026

https://github.com/aiwithqasim/project_allocation_system

Project Allocation System (PAS) automates and simplifies the process of allocating projects to students. Teachers enter details when prompted and can run several operations, including adding, updating, searching, deleting, and displaying all projects.

algorithms-and-data-structures algorthims c-plus-plus data data-structures linked-list

Last synced: 08 Oct 2025

https://github.com/slavos1/covid_data

Use Public Health England daily data to show daily cases.

covid covid-19 covid19 data pandas python3

Last synced: 14 Apr 2026

https://github.com/burythehammer/foosbot-results

Foosball results for the OpenCredo foosbot

data foosball machine-learning python

Last synced: 13 Apr 2026