An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/ksm26/ml-ai-data-science-jobs-in-canada

Explore the latest machine learning, artificial intelligence, and data science job opportunities in Canada. Stay informed about Canadian tech job market trends and find your next career move.

ai-canada ai-careers canada canadian-tech-companies canadian-tech-job-market data data-analysis data-engineering data-science data-science-careers machine-learning prompt-engineering robotics

Last synced: 06 May 2026

https://github.com/poode/firebase-modeling

Extract the Firebase/Firestore entity model so the data can later be migrated to MongoDB or any other database

data database firebase firestore modeling schema

Last synced: 06 May 2026

https://github.com/rrwen/twitter2mongodb-cli

Command line tool for extracting Twitter data to MongoDB databases

api cli cmd command data database get interface line mdb media mongo mongod mongodb post social stream tool tweet twitter

Last synced: 06 May 2026

https://github.com/iv4n-ga6l/functional-dataprocessing-pipeline

A functional data processing pipeline that accepts an input file, allows specifying both input and output formats, applies specified transformations, and produces a resulting output file.

csv data datapreprocessing excel json pandas parquet pipeline python

Last synced: 06 May 2026

https://github.com/darrendavy12/azure-databricks-setup-guide-with-formula1-csv

Azure Databricks Setup Guide with Formula1 CSV - Azure Databricks, PySpark, Python, Data Lake Storage

apache azure cloud data databricks lake notebooks pyspark python spark storage

Last synced: 06 May 2026

https://github.com/juanpablodiaz/beertv

A Next.js full-stack app that displays funny beer TV ads

api-routes data next tailwindcss

Last synced: 07 May 2026

https://github.com/shantanujpk/bigdatacloud

Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.

big-data data jupyter-notebook pipeline pyspark python spark sparksql

Last synced: 07 May 2026

https://github.com/lab5e/loadabledata

Simple framework-agnostic wrapper around loadable data to help encapsulate and use state changes in a UI.

async data loadable state typescript ui

Last synced: 07 May 2026

https://github.com/bryanhe24/data_analysis_app

A full-stack web application that allows users to upload CSV datasets, analyze the data with statistical summaries and visualizations, and interact with an AI-powered assistant for querying the dataset.

ai data data-analysis data-visualization fullstack-development javascript math python reactjs

Last synced: 07 May 2026

https://github.com/hudson-newey/data-miner

A simple data miner that collects information from an API and stores it in a file

api api-client big-data bigdata data logger logging

Last synced: 10 Jun 2026

https://github.com/tjas/postgrad-ai-ddv-plotly

Jupyter Notebook that analyzes the salaries of Federal District government public servants, using Python, Pandas and Plotly Express, to solve the exercise proposed in the "Data Discovery and Visualization" course.

analysis analytics data data-analytics data-discovery data-science data-visualization graph graphs jupyter-notebook jupyter-notebooks pandas plotly plotly-express python

Last synced: 07 May 2026

https://github.com/abhash-rai/regression-car-price-prediction

This repository contains my first complete data science project, from web scraping for data through data preprocessing, cleaning, exploratory data analysis, model training and deployment.

data data-science data-visualization eda exploratory-data-analysis machine-learning neural-network prediction prediction-model regression

Last synced: 08 May 2026

https://github.com/vanshuchaudhary/flightpriceanalysis-

The uploaded file is a Jupyter Notebook titled "Flight Analysis". It likely involves analyzing flight-related data, potentially exploring trends, patterns, or insights using data science techniques. The analysis might include data visualization, statistical analysis, or predictive modeling.

business-analytics data data-analysis data-visualization datainsights datascience matplotlib-pyplot python seaborn seaborn-plots seaborn-python sns statistical-analysis

Last synced: 08 May 2026

https://github.com/taquece/goals-per-match

Basic script to calculate average football goals per match from a CSV file

beginner csv data football nodejs python sports-analytics

Last synced: 09 May 2026
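
The calculation described in the goals-per-match entry above can be sketched as follows. This is a minimal illustration, not the repository's actual script: the column names `home_goals` and `away_goals` and the sample rows are assumptions made for the example.

```python
import csv
import io


def average_goals_per_match(csv_file, home_col="home_goals", away_col="away_goals"):
    """Return the mean number of goals per match.

    The column names are hypothetical; adjust them to the real CSV header.
    """
    reader = csv.DictReader(csv_file)
    total_goals = 0
    matches = 0
    for row in reader:
        # Each row is one match; its goals are the sum of both sides.
        total_goals += int(row[home_col]) + int(row[away_col])
        matches += 1
    if matches == 0:
        raise ValueError("CSV contains no match rows")
    return total_goals / matches


# Illustrative in-memory CSV (made-up matches, not real results).
sample = io.StringIO(
    "home_team,away_team,home_goals,away_goals\n"
    "A,B,2,1\n"
    "C,D,0,0\n"
    "E,F,3,3\n"
)
print(average_goals_per_match(sample))  # 9 goals over 3 matches -> 3.0
```

To read from disk instead, pass an open file object, for example `open("matches.csv", newline="")`, where the file name is again only a placeholder.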

https://github.com/basemax/okala-product-ids

A PHP script to fetch and save product IDs from Okala's online store API across multiple categories and store branches.

crawler crawler-okala crawler-php crawlers data database ids ir iran json okala okala-crawler php php-crawler product

Last synced: 09 May 2026

https://github.com/abhroroy365/market_analysis

This project explores customer segmentation and market analysis in the context of online retail using an online retail dataset. By applying advanced analytics, we aim to uncover insights that can drive strategic decisions and enhance business performance.

clustering data data-analysis data-visualization kmeans-clustering machine-learning market-analysis python silhouette-analysis

Last synced: 09 May 2026

https://github.com/pawlo77/nos_snowflake

Network Operating Systems course for DS studies in Winter 2024/25

azure data data-science snowflake snowpark streamlit

Last synced: 09 May 2026

https://github.com/mohamedbilal1800/olympic_history_data_analysis

This project delves into the 120 Years of Olympic History: Athletes and Results dataset, analyzing athlete demographics, medal achievements, and country performances across the Summer and Winter Olympics from 1896 to 2016.

analysis data eda matplotlib-pyplot pandas python seaborn visulaization

Last synced: 09 May 2026

https://github.com/thanh-wutan/chess-opening-comparator

Interactive web app using R to visualize and compare chess opening performance and popularity.

chess-openings data databases datavisualisation r

Last synced: 09 May 2026

https://github.com/sebastianbrzustowicz/flight-quality-overview-microservice

Go + Docker. Microservice that uses parallel computation to convert raw vehicle flight data into an overview report with visualisation.

container control csv data docker drone flight go goroutines http microservice parallel-computing pdf quadcopter raport rms sse vehicle

Last synced: 10 May 2026

https://github.com/hemangsharma/assignment-2---classification-models

The Assignment 2 - Classification Models repository contains the project for 36106 Machine Learning Algorithms and Applications

data datascience-machinelearning machine-learning ml

Last synced: 10 Jun 2026

https://github.com/infinitode/crsd

A synthetic customer review sentiment dataset for sentiment analysis generated using different AI models.

ai data dataset datasets huggingface-datasets mit-license ml nlp open-source python sentiment sentiment-analysis sentiment-classification text-data

Last synced: 10 Jun 2026

https://github.com/afeiship/data-arary

Data array with some new methods.

array data data-structure js list

Last synced: 11 May 2026

https://github.com/vbhatsaccnt/retail-strategy-and-analytics-optimization-of-control-stores-for-sales-enhancement

In this project, we aim to optimize the performance of retail chain stores by establishing control stores based on their performance compared to selected trial stores. By leveraging data analytics and strategic insights, we seek to enhance sales revenue and drive growth within the retail chain.

customer-segmentation data data-science risk-analysis

Last synced: 13 May 2026

https://github.com/mitevpi/poli-parse

Political news scraping & NLP parsing from web pages.

data election javascript library module nlp npm package parse politics scrape sentiment

Last synced: 13 May 2026

https://github.com/iannil/one-data-studio

one-data-studio integrates a data governance and development platform, a cloud-native MLOps platform, and a large model application development platform. It connects the entire value chain from raw data governance to model training and deployment, and further to the construction of generative AI applications.

data llm model platform

Last synced: 12 Jun 2026

https://github.com/shashwat9kumar/trends_in_a_country_on_twitter

Finding trending topics in each country on Twitter and visualizing them in a WordCloud

data data-visualization trends tweepy twitter-api wordcloud

Last synced: 13 Jun 2026

https://github.com/asjadnaqvi/stata-tidytuesday

A Stata package for fetching Tidy Tuesday metadata and files

ado data r stata tidytuesday

Last synced: 13 Jun 2026

https://github.com/neuro-mechatronics-interfaces/ros2_data_agent

Code for a multipurpose file explorer specializing in reading ROS2 topic data from '.bag' or '.db3' files

data python ros2

Last synced: 13 Jun 2026

https://github.com/isharescheme/participant-onboarding-portal

Standardized onboarding portal for data space participants.

data onboarding particpant space

Last synced: 15 Jun 2026

https://github.com/lafkpages/minecraft-crafting-info

Scrapes https://www.minecraftcrafting.info for crafting recipes.

api crafting data minecraft

Last synced: 17 Jun 2026

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/mbagalman/lattice-doe

Python code to create experimental designs optimized to meet statistical power targets

abtesting data datascience designofexperiments experimentaldesign statistics

Last synced: 19 Jun 2026

https://github.com/rylan12/apscores

A quick way to visualize how the AP score distributions have changed from year to year.

advanced-placement analysis ap-exam data scores

Last synced: 19 Jun 2026

https://github.com/g-schumacher44/analyst_resource_hub

A collection of guidebooks, quickref, and resources for data analysis

analytics bigquery data lookerstudio machine-learning model python sql yaml-configuration

Last synced: 20 Jun 2026

https://github.com/petzi53/repairdata

Open Repair Alliance Datasets 2021

data open-data open-datasets r repair repair-cafe repairs

Last synced: 22 Jun 2026

https://github.com/mtnzorlu/quiz-content-builder

Structured JSON quiz data builder for developers

builder data education json vue

Last synced: 23 Jun 2026

https://github.com/weskal/vexus_pipeline

Automated pipeline for generating, ingesting, and validating realistic data, designed to simulate real-world workflows with scheduling, data quality checks, and version control.

airflow data pipeline python sqlserver workflow

Last synced: 20 Jan 2026

https://github.com/syedzaheerabbas/jamboree-education-linear-regression

Using data from Jamboree, this project explores the relationship between applicant profiles (GRE, TOEFL, GPA, etc.) and their chances of admission to Ivy League graduate programs. Linear regression, Ridge, and Lasso regression are employed to build predictive models and identify key factors.

data eda linear-regression python visualization

Last synced: 01 May 2026

https://github.com/dnut/associations

Python 3 library to identify high-dimensional statistical relationships in any data set.

analytics arch-linux association-rules data data-analysis data-mining data-science machine-learning python-modules

Last synced: 01 May 2026

https://github.com/tyriek-cloud/nyc-dca-etl

Created an ETL pipeline to merge two CSV files (converted to JSON) into a Parquet file using Azure Data Factory. The data was extracted from NYC Open Data (https://opendata.cityofnewyork.us/), and I created a Blob Container within an existing storage account.

azure azure-data-factory blob-storage data data-engineering etl-pipeline

Last synced: 21 Jan 2026

https://github.com/plnech/never2late

Never 2 Late - a reinterpretation of Everest Pipkin's 'i've never picked a protected flower'

dada dada-science data generative-art glitch-art installation nlp poetry spacy vector-similarity wallpaper

Last synced: 10 Jun 2025

https://github.com/bertrand31/one-billion-rows-challenge

🌪️ Pushing Scala to its limits to aggregate a billion rows' worth of data in 2.42 seconds

competitive-programming competitive-programming-contests data data-engineering data-processing performance scala

Last synced: 05 Sep 2025

https://github.com/mubashirsidiki/certifications_work

This repository contains my work, projects, and solutions from various professional certification programs.

analysis coursera data data-science google ibm john-hopkins machine-learning michigan udemy

Last synced: 01 Jul 2025

https://github.com/mikeschinkel/go-testdata-defaulter

Simple package for Go to set table-driven test data defaults so that tables in tests only need to include data that differs from the defaults.

data defaults package testing tests

Last synced: 13 Oct 2025

https://github.com/loosenthedark/going-for-gold

A fairer, more measured look at the Tokyo 2020 Olympic medal count. Countries are ranked in relative (per capita) instead of absolute medal-winning terms. Users can toggle between two different ranking breakdowns, search for countries, contact the site owner and enable dark mode. Mobile-first React application leveraging the REST Countries API as well as a local JSON Olympic dataset. EmailJS and React Context API integration with custom form validation and error handling.

api create-react-app css data es6 fetch-api frontend html5 interactive-front-end-development javascript mobile-first olympics react react-components react-context-api react-hooks react-router react-router-dom reactjs responsive-web-design

Last synced: 07 May 2026
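
The relative (per capita) versus absolute ranking described in the going-for-gold entry above can be sketched like this. It is a rough illustration rather than the app's own code (which is a React application): the `CountryResult` type, the function names, and the placeholder figures are all assumptions, and the numbers are not real Olympic data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountryResult:
    # Hypothetical record type for this sketch.
    name: str
    medals: int
    population: int


def medals_per_million(result):
    """Medals won per one million inhabitants."""
    return result.medals / result.population * 1_000_000


def rank_absolute(results):
    """Order countries by raw medal count, highest first."""
    return sorted(results, key=lambda r: r.medals, reverse=True)


def rank_per_capita(results):
    """Order countries by medals per million inhabitants, highest first."""
    return sorted(results, key=medals_per_million, reverse=True)


# Placeholder figures chosen only to show how the two orderings can differ.
results = [
    CountryResult("Country A", medals=100, population=300_000_000),
    CountryResult("Country B", medals=20, population=5_000_000),
    CountryResult("Country C", medals=3, population=60_000),
]

print([r.name for r in rank_absolute(results)])    # Country A, B, C
print([r.name for r in rank_per_capita(results)])  # Country C, B, A
```

Toggling between the two breakdowns then amounts to choosing which sort key to apply to the same list.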

https://github.com/bastianolea/mineduc_personal_academico

Academic staff data, from 2008 to 2024, for the higher education system.

chile data educacion meses tiempo

Last synced: 19 Jun 2026

https://github.com/tabarzin/dh

A collection of links to various resources on Digital Humanities

data digitalhumanities opensource

Last synced: 24 Jan 2026

https://github.com/mohammad-malik/covid-visualizations-d3

This project provides a dashboard with five different perspectives on the pandemic, from patient-infection relationships to regional trends and hierarchical distributions. This was developed as part of a project for the course Data Analysis and Visualization (DS3001).

covid-19 d3 d3-visualization d3js data data-analysis data-analytics data-science visualization

Last synced: 28 May 2026

https://github.com/linguini1/edueval

The BorealisAI Let's Solve It mentorship project: summarizing student feedback submissions on their professor into one cohesive paragraph for faculty consideration during performance reviews.

ai data data-analysis data-science machine-learning machinelearning nlp python pytorch sentiment-analysis

Last synced: 01 May 2026

https://github.com/jamiew/void-runners-analysis

basic data analysis for the Void Runners Genesis Fleet spaceships

analysis data nfts

Last synced: 29 Mar 2025

https://github.com/mominurr/fire-gas-leak-detection-system

A real-time fire prevention system integrating IoT sensors and computer vision to trigger evacuations.

ai computer-vision data datascience machine-learning ml python yolo

Last synced: 27 Jan 2026

https://github.com/gsinghjay/ywcc-307-003

Group Presentations

cloud data government

Last synced: 04 Feb 2026

https://github.com/jpcurada/exploralytics

A Python package for creating intermediate Plotly visualizations

data eda plotly python visualization

Last synced: 05 Feb 2026

https://github.com/rec/kson

🔑 Json with the rough edges removed 🔑

data json serialization

Last synced: 01 May 2026

https://github.com/miniql/miniql-inline

A MiniQL query resolver for inline data.

data query query-language

Last synced: 27 May 2026

https://github.com/yagoluiz/enem-analise-extracao

[PT-BR] Extraction and analysis of performance data for the Central-West (Centro-Oeste) region

analysis data extraction python3 r

Last synced: 17 Apr 2026

https://github.com/poissonconsulting/klexdatr

An R package of data from the Kootenay Lake Exploitation Study

cran data fish kootenay-lake rstats

Last synced: 16 Oct 2025

https://github.com/boratechlife/tensorflow-questions-datasets

A dataset of TensorFlow questions to help you practice machine learning and train models

data datapreprocessing datasets machinelearning modeltrain questions tensorflow

Last synced: 23 Mar 2025

https://github.com/fatihilhan42/nba-players-data-1950-to-2021

This project examines data on NBA players from 1950 to 2021. Each player's seasons, height, performance, average points, teams, and positions played were obtained from CSV files, and key tables and graphs were then created using data cleaning and data visualization techniques.

data data-analysis data-engineering data-science data-visualization

Last synced: 16 Oct 2025

https://github.com/psyteachr/psyteachrdata

Datasets for psyTeachR Books

data

Last synced: 23 Mar 2025

https://github.com/avestura/shell-dads

❓ Show a random tip from NIST DADS (https://xlinux.nist.gov/dads) every time you open your terminal

algorithms dads data data-structures ds nist

Last synced: 23 Oct 2025

https://github.com/smeltier/data-structures-c

This repository contains C language implementations of the main data structures covered in the Algorithms and Data Structures course. The implementations were developed as part of my hands-on learning process and include sequential lists, linked lists, and other fundamental structures.

algorithms algorithms-and-data-structures c c-language c-programming data data-structures data-structures-c structures-c

Last synced: 16 May 2025

https://github.com/ronknight/user-data-dashboard

📈 A data visualization tool for analyzing user data using an Excel-based data source.

dashboard data excel ga4 screenshot

Last synced: 17 Oct 2025

https://github.com/analyst-amitbisht/pizza-sales-report-

A guided project for practicing tools like SSMS and Power BI, as well as skills like data cleaning, data exploration, data analysis, and data visualization.

analytics data data-visualization powerbi sql-server

Last synced: 18 Oct 2025

https://github.com/mecha-cms/x.kick

URL redirection files.

data extension files link redirect tool tsv url

Last synced: 23 Mar 2025

https://github.com/jprando/mattkillua

Study of .NET Core

data dbcontext domain efcore netcore

Last synced: 23 Mar 2025

https://github.com/bastianolea/cut_comunas

Updated version of the unique territorial codes (CUT, códigos únicos territoriales) for the country's communes and regions.

chile comunas data estado

Last synced: 24 Jun 2026

https://github.com/ehvenga/data.driven.modeling

Repository to practice data-driven modelling

data data-modeling

Last synced: 23 Mar 2025

https://github.com/cracko298/planet-life-save-converter

Convert your Planet-Life Saves To and From Base64 & *.planet files.

base64 base64-decoding base64-encoding data python python-script python3 save-converter save-data save-files

Last synced: 15 Mar 2025

https://github.com/ngofilho/scripts-db

Repository containing several sample database scripts.

cache data database db mariadb mongodb mysql oracle redis sql-server

Last synced: 11 Apr 2026

https://github.com/sandygcabanes/etl-earthquake-data-from-usgs-google-cloud-composer-airflow

Airflow, Google Cloud Composer, GCS, BigQuery, Python. This automated pipeline pulls daily earthquake data from a trusted public source, stores it securely in the cloud, and organizes it into clean, searchable tables for analysis.

cloud composer dag data engineering etl etl-pipeline google json python

Last synced: 01 May 2026

https://github.com/sebastianbrzustowicz/github-data

Java + Spring Boot. Application for sending requests to GitHub API and collecting received data.

api ci data github json junit mapping parallel repository rest-api stream

Last synced: 01 May 2026

https://github.com/mateogiuffra/estrd2024s1

Coursework completed for the Data Structures (Estructura de Datos) course at the Universidad Nacional de Quilmes (UNQ)

c cpp data data-structures-and-algorithms eficiency functional-programming haskell unq

Last synced: 12 May 2026

https://github.com/kuanhungchen/spring-2019-data-structures

📦 Some programming assignments about basic data structures.

data data-structures

Last synced: 25 Feb 2025

https://github.com/fiddlydigital/anonimizer

A lib to replace and rehydrate sensitive data in text

anonimize anonymize data data-security prompt sanitize string string-manipulation text

Last synced: 15 Mar 2025
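
The replace-and-rehydrate idea in the anonimizer entry above can be sketched as follows. This is an illustration of the general technique only, not the library's API: the `anonymize` and `rehydrate` functions, the `<EMAIL_n>` token format, and the choice of email addresses as the sensitive data are all assumptions made for the example.

```python
import re

# Simplified email pattern; real-world detection would need to be stricter.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def anonymize(text):
    """Replace each distinct email with a placeholder token.

    Returns the masked text and a token -> original value mapping.
    """
    token_for_value = {}
    mapping = {}

    def _replace(match):
        value = match.group(0)
        if value not in token_for_value:
            token = f"<EMAIL_{len(token_for_value) + 1}>"
            token_for_value[value] = token
            mapping[token] = value
        # Repeated values reuse the same token.
        return token_for_value[value]

    return EMAIL_PATTERN.sub(_replace, text), mapping


def rehydrate(text, mapping):
    """Put the original values back in place of their tokens."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


original = "Contact alice@example.com or bob@example.org; alice@example.com replied."
masked, mapping = anonymize(original)
print(masked)  # Contact <EMAIL_1> or <EMAIL_2>; <EMAIL_1> replied.
print(rehydrate(masked, mapping) == original)  # True
```

Keeping the mapping outside the masked text means the masked version can be sent elsewhere (for example, in a prompt) and restored locally afterwards.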

https://github.com/sanad343/complete-data-analyst

Data analysis is the process of turning raw data into useful information for decision-making.

data data-visualization datamanipulation eda excel exploratory-data-analysis powerbi python-3 sql tableau

Last synced: 30 Jun 2025

https://github.com/barbosa89/vue-table

A classical data table component in VueJS and Bootstrap 4, optimized for Laravel applications.

bootstrap4 data datatable javascript laravel php table vuejs

Last synced: 11 Apr 2026

https://github.com/nyo16/megas_pinakas

Bigtable Elixir gRPC client

bigtable data elixir grpc

Last synced: 13 Jan 2026

https://github.com/kenjyco/libs

Easily install kenjyco libs

api cli command-line data helper kenjyco libs python

Last synced: 16 May 2026

https://github.com/sankooc/validatez

Object validation for Node.js

data validate

Last synced: 13 May 2026

https://github.com/gappeah/layoffs-exploratory-data-analysis

This project uses MySQL to perform data cleaning and exploratory data analysis (EDA) on a dataset detailing company layoffs. The primary goal is to process, clean, and explore the data to gain insights into trends and patterns related to layoffs across various sectors.

data dataanalysis eda mysql sql

Last synced: 12 Jul 2025