An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/yash-chauhan-dev/spark_cluster_docker

Set up a local Spark cluster, Hadoop (HDFS), Airflow, and PostgreSQL on Docker with ease, without any local installations

apache-spark data data-engineering data-engineering-pipeline deployment docker docker-compose hadoop hdfs local-development localhost pyspark python

Last synced: 04 May 2026

https://github.com/fallaciousreasoning/nz-mountains

A list of mountains in NZ, scraped from https://climbnz.org.nz

alpine climbing climbnz data json json-api maps mountaineering scraping

Last synced: 04 May 2026

https://github.com/damisparks/become_data_analyst

Are you new to data analysis? Here you will find simple notebooks to help you through your journey. These are personal projects that I am still working on.

data data-analysis data-visualization matplotlib numpy pandas-tutorial

Last synced: 04 May 2026

https://github.com/maxwelllzh/gis-tutorial-

Tutorials for Columbia University GIS Club

data python

Last synced: 04 May 2026

https://github.com/rabeal21/tea

Generate random TEA wallet addresses in bulk with this simple utility. Perfect for testing and exploring the TEA blockchain. 🌱💻

bucklescript bucklescript-tea chinese-translation cli data earlgrey educators hacking ios-automation ios-test ocaml peer-evaluations php red-team teachyourselfcs test-framework translation tui

Last synced: 04 May 2026

https://github.com/sjg/my-search-story

My Search Story is a demo application developed for the Data Portability API Workshop and the #AISprint2025 events. #BuildwithAI

data docker generative-ai google-cloud-platform google-cloud-run nodejs

Last synced: 04 May 2026

https://github.com/srevenant/data-science-alpine

A Docker container for data science, using Alpine Linux and Python 3

alpine data numpy pandas python3 science scipy xgboost

Last synced: 05 May 2026

https://github.com/a-poor/datatransform.jl

A package for defining (and performing) tabular-data transformations with JSON.

data data-science data-transformation etl feature-engineering json julia julia-package tabular-data

Last synced: 05 May 2026

https://github.com/rrwen/py-examples

Collection of python examples in each branch

beginner data download excel guide introduction links processing python reference spreadsheet url xls

Last synced: 05 May 2026

https://github.com/edjoukou/pizza-sales-report

A data analysis project using SQL with MySQL database

analysis data mysql powerbi visualization

Last synced: 05 May 2026

https://github.com/contawo/travel-journal

This is a travel journal application for storing all the places that you have visited. I built this project to learn React by doing; I learnt a lot from it and improved my React.js skills.

data learning-by-doing props reactjs

Last synced: 05 May 2026

https://github.com/chompfoods/stub-nodejs-server

Node.js server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery ingredients node node-js node-server nodejs nutrtion raw recipe-api recipes server server-stub stub stub-server

Last synced: 05 May 2026

https://github.com/rdmurphy/deno-quaff

A port of the quaff Node.js library to Deno.

archieml csv data deno json toml yaml

Last synced: 05 May 2026

https://github.com/chanchalsoorma/web-scraping

This repo aims to provide a straightforward, easy-to-use scraping code written in Python.

beautifulsoup beautifulsoup4 data python request selenium webscraping

Last synced: 05 May 2026

https://github.com/julienmalka/shiftgenerator

ShiftGenerator WeSki 2018

data data-science latex python

Last synced: 06 May 2026

https://github.com/shibbbbs/fastapi_project

A FastAPI application that reads financial data from an Excel file (capbudg.xls) and provides API endpoints to list available tables (sheet names), fetch row names from a selected table, and calculate the sum of numerical values from a specified row. The API is accessible via a web-based interactive documentation at /docs

data dataanalysis fastapi pandas python

Last synced: 06 May 2026
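
The three operations described for shibbbbs/fastapi_project (list tables, list row names, sum a row) can be sketched without FastAPI or pandas. This is a minimal illustration under stated assumptions, not the repository's code: the `TABLES` dictionary, the sheet and row names, and all three function names are hypothetical stand-ins for data the project reads from capbudg.xls.

```python
import math

# Hypothetical in-memory stand-in for sheets parsed from a spreadsheet:
# sheet name -> {row name: list of cells in that row}.
TABLES = {
    "ExampleSheet": {
        "Row A": ["Row A", 10, 20.5, None],
        "Row B": ["Row B", float("nan"), 3],
    }
}

def list_tables(tables):
    """Return the available table (sheet) names."""
    return list(tables)

def list_row_names(tables, table_name):
    """Return the row names of one table."""
    return list(tables[table_name])

def row_sum(tables, table_name, row_name):
    """Sum the numeric cells of a row, skipping labels, blanks and NaN."""
    total = 0.0
    for cell in tables[table_name][row_name]:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(cell, bool) or not isinstance(cell, (int, float)):
            continue
        if math.isnan(cell):
            continue
        total += cell
    return total

print(list_tables(TABLES))
print(list_row_names(TABLES, "ExampleSheet"))
print(row_sum(TABLES, "ExampleSheet", "Row A"))
```

In a web service, each function would typically sit behind its own endpoint; the filtering step matters because spreadsheet rows usually mix a text label with numeric cells.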

https://github.com/lexz-08/sharpdata

Easily manage DataGridViews or create one with the struct 'DataGridManager' provided.

csharp data datagridview ui user-interface windows windows-forms winforms

Last synced: 06 May 2026

https://github.com/parthds02/analyzing-student-success-with-data

Discover key factors influencing student performance through data analysis and visualization. Explore gender, parental education, sports, and ethnicity impacts.

data datascience jupyter-notebook kaggle python pythonlibraries

Last synced: 06 May 2026

https://github.com/ekoepplin/dbt-bigquery-core

How to get data into BigQuery (or DuckDB) and set up dbt tests for Soda Cloud monitoring

bigquery data data-quality dbt dlt duckdb gcp soda

Last synced: 06 May 2026

https://github.com/xljones/bugsnag-exporter

Easily export Bugsnag project, error, and event data from a single command-line call that automatically handles pagination and API backoffs

bash bugsnag cmd csv data error error-capture error-handling error-reporting event export go golang json project zsh

Last synced: 06 May 2026
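
The pagination and backoff handling mentioned for xljones/bugsnag-exporter can be illustrated in general terms. The sketch below is not that tool's implementation (the tool is tagged as Go) and does not call the Bugsnag API: `fetch_page`, `RateLimited`, and the cursor convention are hypothetical, and exponential backoff is one common strategy assumed here for illustration.

```python
import time

class RateLimited(Exception):
    """Hypothetical signal that the server asked the client to slow down."""

def fetch_all(fetch_page, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Collect every item from a cursor-paginated source.

    fetch_page(cursor) is assumed to return (items, next_cursor), with
    next_cursor set to None on the last page, and to raise RateLimited
    when the caller should back off.
    """
    items = []
    cursor = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page, next_cursor = fetch_page(cursor)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise
                # Exponential backoff: 1x, 2x, 4x, ... the base delay.
                sleep(base_delay * 2 ** attempt)
        items.extend(page)
        if next_cursor is None:
            return items
        cursor = next_cursor
```

Passing `sleep` as a parameter keeps the function testable: a test can record the requested delays instead of actually waiting.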

https://github.com/fabsdevx/file-format-converter-handout

Data Engineering project for learning purposes. Credits to itversity

csv csv-import data data-engineering database pandas python

Last synced: 06 May 2026

https://github.com/darrendavy12/azure-databricks-setup-guide-with-formula1-csv

Azure Databricks Setup Guide with Formula1 CSV - Azure Databricks, PySpark, Python, Data Lake Storage

apache azure cloud data databricks lake notebooks pyspark python spark storage

Last synced: 06 May 2026

https://github.com/ashleydavis/brisjs-web-scraping-talk

Code to accompany my talk on web scraping for the Brisbane JavaScript meeting in September 2018

cheerio data data-acquisition data-acquisiton electron headless-browsers javascript nightmare nightmarejs nodejs web-scraping

Last synced: 06 May 2026

https://github.com/juanpablodiaz/beertv

A Next.js full-stack app that displays funny beer TV ads

api-routes data next tailwindcss

Last synced: 07 May 2026

https://github.com/miozilla/pandas

pandas 🐼🐼: Python Library # Data Analysis # Dataframe

analysis data dataframe pandas python sqlite3

Last synced: 07 May 2026

https://github.com/shantanujpk/bigdatacloud

Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.

big-data data jupyter-notebook pipeline pyspark python spark sparksql

Last synced: 07 May 2026

https://github.com/hackersandslackers/hackers-jupyter-posts

🔴 📕 Our repository for Jupyter notebooks that serve as blog posts.

blog data data-engineering gatsbyjs jupyter jupyter-notebook python python3

Last synced: 07 May 2026

https://github.com/lab5e/loadabledata

Simple framework-agnostic wrapper around loadable data to help encapsulate and use state changes in a UI.

async data loadable state typescript ui

Last synced: 07 May 2026

https://github.com/bryanhe24/data_analysis_app

A full-stack web application that allows users to upload CSV datasets, analyze the data with statistical summaries and visualizations, and interact with an AI-powered assistant for querying the dataset.

ai data data-analysis data-visualization fullstack-development javascript math python reactjs

Last synced: 07 May 2026

https://github.com/hudson-newey/data-miner

A simple data miner that collects information from an API and stores it in a file

api api-client big-data bigdata data logger logging

Last synced: 10 Jun 2026

https://github.com/chardos/get-git-data

Access git repository data in node.

data git javascript node

Last synced: 07 May 2026

https://github.com/tjas/postgrad-ai-ddv-plotly

Jupyter Notebook to analyze the salaries of Federal District government public servants, using Python, Pandas and Plotly Express, to solve the proposed exercise in "Data Discovery and Visualization" discipline.

analysis analytics data data-analytics data-discovery data-science data-visualization graph graphs jupyter-notebook jupyter-notebooks pandas plotly plotly-express python

Last synced: 07 May 2026

https://github.com/safwan2003/randomforest_heart_disease_prediction

A machine learning project using Random Forest Classifier to predict heart disease. Includes data preprocessing (with binning), feature selection, and model evaluation.

binning data data-science datapipeline datapreprocessing datavisaulization deep-learning machine-learning python random-forest-classifier streamlit

Last synced: 07 May 2026

https://github.com/danyal-faheem/project-logs-analyzer

This repo contains scripts to analyze project logs and display some charts related to the data

data data-visualization matplotlib pandas python streamlit

Last synced: 07 May 2026

https://github.com/aidan-zamfir/the-iliad

Data analysis & relationship network for the characters of Homer's Iliad

data data-analysis dataframes networks networkx python selenium spacy webscraping

Last synced: 08 May 2026

https://github.com/zsvoboda/olympics

Self service analytics of 120 years of Olympics data

analytics dashboards data datavisualization dataviz olympics open-data open-datasets opendata reports

Last synced: 08 May 2026

https://github.com/abhash-rai/regression-car-price-prediction

This repository contains my first complete data science project, from web scraping for data through data preprocessing, cleaning, exploratory data analysis, model training, and deployment.

data data-science data-visualization eda exploratory-data-analysis machine-learning neural-network prediction prediction-model regression

Last synced: 08 May 2026

https://github.com/vanshuchaudhary/flightpriceanalysis-

The uploaded file is a Jupyter Notebook titled "Flight Analysis". It likely involves analyzing flight-related data, potentially exploring trends, patterns, or insights using data science techniques. The analysis might include data visualization, statistical analysis, or predictive modeling.

business-analytics data data-analysis data-visualization datainsights datascience matplotlib-pyplot python seaborn seaborn-plots seaborn-python sns statistical-analysis

Last synced: 08 May 2026

https://github.com/writetome51/public-data-container-interface

Just a TypeScript interface with 1 property: 'data'

container data interface typescript

Last synced: 15 May 2026

https://github.com/miniql/miniql-csv

A MiniQL query resolver that loads data from CSV files.

comma-separated-values csv data query query-language

Last synced: 08 May 2026

https://github.com/taquece/goals-per-match

Basic script to calculate average football goals per match from a .CSV file

beginner csv data football nodejs python sports-analytics

Last synced: 09 May 2026
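
The calculation described for taquece/goals-per-match can be sketched with Python's standard `csv` module. This is a minimal illustration, not the repository's script: the column names `home_goals` and `away_goals` and the sample rows are assumptions made for the example.

```python
import csv
import io

def average_goals_per_match(csv_file, home_col="home_goals", away_col="away_goals"):
    """Return the mean number of goals per match in an open CSV file.

    The column names are hypothetical; adjust them to the real header.
    """
    total_goals = 0
    matches = 0
    for row in csv.DictReader(csv_file):
        total_goals += int(row[home_col]) + int(row[away_col])
        matches += 1
    # Avoid dividing by zero when the file has a header but no matches.
    return total_goals / matches if matches else 0.0

# Made-up sample data standing in for a real results file.
SAMPLE = (
    "home_team,away_team,home_goals,away_goals\n"
    "A,B,2,1\n"
    "C,D,0,0\n"
    "E,F,3,2\n"
)

average = average_goals_per_match(io.StringIO(SAMPLE))
print(round(average, 2))  # prints 2.67
```

To run it on a real file, open the file with `open(path, newline="")` and pass the file object in place of the `io.StringIO` wrapper.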

https://github.com/chompfoods/sdk-typescript-angular

Angular TypeScript SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

angular api branded chomp data database food grocery ingredients nutrition raw recipe-api recipes sdk typescript

Last synced: 09 May 2026

https://github.com/basemax/okala-product-ids

A PHP script to fetch and save product IDs from Okala's online store API across multiple categories and store branches.

crawler crawler-okala crawler-php crawlers data database ids ir iran json okala okala-crawler php php-crawler product

Last synced: 09 May 2026

https://github.com/tupizz/python-data-manipulation

Data manipulation and visualization with Python 2.x

csv data pandas python

Last synced: 09 May 2026

https://github.com/flexthink/matricize

A convenience library to convert between pure Python objects and their vectorized representations

data machine-learning numpy python

Last synced: 09 May 2026

https://github.com/naitiknayak196/tech-layoffs-cleaning-sql-vs-python

This project cleans and analyzes a tech layoffs dataset using MySQL and Python (Pandas) to compare their efficiency in data processing. It provides business insights into workforce trends, industry stability, and economic impacts to support data-driven decision-making.

data datacleaning dataset jyputer-notebook layoffdata layoffs mysql python sql

Last synced: 09 May 2026

https://github.com/yashkp1234/movie-recommendation-engine

My project on analyzing a movie data set and creating a recommendation engine from that analysis.

analysis data notebook python recommendation-engine

Last synced: 04 May 2025

https://github.com/machinecyc/lotteryinsight

Uses a crawler to collect Taiwan Lotto data and saves it to a local MySQL server.

crawler data docker lottery mysql-database python3 taiwan

Last synced: 09 May 2026

https://github.com/baranasoftware/curricular-api

The design and implementation of a REST API for student and course data for a Higher Ed institution.

aws data data-pipeline go golang lambda rest rest-api sqlite3 system-design terraform

Last synced: 09 May 2026

https://github.com/thanh-wutan/chess-opening-comparator

Interactive web app using R to visualize and compare chess opening performance and popularity.

chess-openings data databases datavisualisation r

Last synced: 09 May 2026

https://github.com/khushi-sabarad/data_analysis

LinkedIn Learning capstone project

data data-engineering matplotlib pandas python

Last synced: 10 May 2026

https://github.com/adrianoleitedasilva/adrianoleitedasilva

My name is Adriano. I am 35 years old, with 18 of those years dedicated to the fields of Information Technology and Education.

adrianoleitedasilva automation ceo cio cto data data-science dev diretor github mobile professor python readme techlead web

Last synced: 10 May 2026

https://github.com/hemangsharma/assignment-2---classification-models

Assignment 2 - Classification Models repository contains project for 36106 Machine Learning Algorithms and Applications

data datascience-machinelearning machine-learning ml

Last synced: 10 Jun 2026

https://github.com/infinitode/crsd

A synthetic customer review sentiment dataset for sentiment analysis generated using different AI models.

ai data dataset datasets huggingface-datasets mit-license ml nlp open-source python sentiment sentiment-analysis sentiment-classification text-data

Last synced: 10 Jun 2026

https://github.com/gurpreet0022/airbnb-eda

EDA on Airbnb booking data to uncover valuable insights, trends, and patterns

data data-science dataanalytics insights jupyter-notebook matplotlib numy pandas projects python3 seaborn visualization

Last synced: 11 May 2026

https://github.com/sehaj003/boston-bruins-roster-planning-mysql-nosql

Repository for Data Management project, Boston Bruins Roster Planning using MySQL and NoSQL along with data analysis using Python

data data-management mongodb mysql project-repository python

Last synced: 11 May 2026

https://github.com/alimghmi/bdlc

Bloomberg API integration, handling data requests, processing, and SQL database insertion.

api-client bloomberg data data-processing financial-data oauth2 python sql-database transformation

Last synced: 10 Jun 2026

https://github.com/alpine418/datahandler

Data handler for PHP arrays.

data data-handler php73

Last synced: 10 Jun 2026

https://github.com/mateogiuffra/estrd2024s1

Practical assignments completed for the Data Structures (Estructura de Datos) course at the Universidad Nacional de Quilmes (UNQ)

c cpp data data-structures-and-algorithms eficiency functional-programming haskell unq

Last synced: 12 May 2026

https://github.com/cvinicius987/projetos-bigdata

Case studies involving Big Data and Data Engineering projects.

bigdata data data-engineering spark

Last synced: 13 May 2026

https://github.com/dev88jerry/cs304

Bishop's University - CS304 Data Structures

bishops bu data data-structures python structure university

Last synced: 11 Jun 2026

https://github.com/mitevpi/poli-parse

Political news scraping & NLP parsing from web pages.

data election javascript library module nlp npm package parse politics scrape sentiment

Last synced: 13 May 2026

https://github.com/flexiui-labs/flexi-grid

Flexi Grid is an advanced, lightweight, and customizable Angular 19 data grid component

angular data filter grid search select sort table

Last synced: 14 May 2026

https://github.com/poojaharihar03/wellness-cities-case-study

A case study in data analysis of city health centers

analytics data r rstudio

Last synced: 11 Jun 2026

https://github.com/lulloooo/article-fromfitto55tofittoeveryone

Analysis behind an article published in the EcoSprinter 2024 Annual edition, examining the EU "Fit for 55" packages from a different perspective 🔎

analysis data environment european-union

Last synced: 12 Jun 2026

https://github.com/iannil/one-data-studio

one-data-studio integrates a data governance and development platform, a cloud-native MLOps platform, and a large model application development platform. It connects the entire value chain from raw data governance to model training and deployment, and further to the construction of generative AI applications.

data llm model platform

Last synced: 12 Jun 2026

https://github.com/wiseql/wiseql

The wise data browser — run SQL recipes as small, observable, debuggable steps

data debugging duckdb oracle quality sql tui

Last synced: 13 Jun 2026

https://github.com/asjadnaqvi/stata-tidytuesday

A Stata package for fetching Tidy Tuesday metadata and files

ado data r stata tidytuesday

Last synced: 13 Jun 2026

https://github.com/neuro-mechatronics-interfaces/ros2_data_agent

Code for a multipurpose file explorer specializing in reading ROS2 topic data from '.bag' or '.db3' files

data python ros2

Last synced: 13 Jun 2026

https://github.com/bastianolea/plebiscitos_chile

Electoral results data from the constitutional plebiscites of 2022 and 2023

chile comunas data elecciones politica social

Last synced: 15 Jun 2026

https://github.com/isharescheme/participant-onboarding-portal

Standardized onboarding portal for data space participants.

data onboarding particpant space

Last synced: 15 Jun 2026

https://github.com/arch-fan/pokedata

Pokemon Data in CSV format for whatever you need!

csv data dataset pokemon

Last synced: 17 Jun 2026

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/dineshram0212/youtube-analysis

This YouTube Analysis Package provides tools for analyzing YouTube video data, including metrics on views, likes, comments, and engagement trends. Ideal for gaining insights into video performance and audience interaction patterns.

data data-visualization pandas python webscraping youtube-api-v3

Last synced: 19 Jun 2026

https://github.com/rylan12/apscores

A quick way to visualize how the AP score distributions have changed from year to year.

advanced-placement analysis ap-exam data scores

Last synced: 19 Jun 2026

https://github.com/bastianolea/mineduc_personal_academico

Data on academic staff in the higher education system, from 2008 to 2024.

chile data educacion meses tiempo

Last synced: 19 Jun 2026

https://github.com/artcc/coredatademo

Demo for CoreDataGenericModule implementation

core coredata coredata-model data encrypted encrypted-data encryption persist

Last synced: 19 Jun 2026

https://github.com/svetlanam/kbl-to-csv-s3

Keboola extractor that converts Excel files to CSV based on input mapping criteria and uploads them to an S3 bucket

data data-cleaning data-transformation etl keboola s3-bucket

Last synced: 20 Jun 2026

https://github.com/g-schumacher44/analyst_resource_hub

A collection of guidebooks, quickref, and resources for data analysis

analytics bigquery data lookerstudio machine-learning model python sql yaml-configuration

Last synced: 20 Jun 2026

https://github.com/petzi53/repairdata

Open Repair Alliance Datasets 2021

data open-data open-datasets r repair repair-cafe repairs

Last synced: 22 Jun 2026

https://github.com/mtnzorlu/quiz-content-builder

Structured JSON quiz data builder for developers

builder data education json vue

Last synced: 23 Jun 2026

https://github.com/hlan22/2025-03-18-data-validation

(no longer useful) DSCI 310 Lecture about Data validation and code testing! Made in tandem with:

data validation

Last synced: 23 Jun 2026

https://github.com/bastianolea/sinim_municipal_genero

Commune-level gender data from the Sistema Nacional de Información Municipal (National Municipal Information System)

chile comunas data genero laboral tiempo

Last synced: 23 Jun 2026