An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/ismailhakkii/digital_vault

This project can be used for securing data, similar to a real vault.

data digital security-data vault

Last synced: 25 Mar 2025

https://github.com/svetlanam/kbl-to-csv-s3

Keboola extractor, that converts excel to CSV based on input mapping criteria and upload to S3 bucket

data data-cleaning data-transformation etl keboola s3-bucket

Last synced: 20 Jun 2026

https://github.com/svetlanam/etl-transformation

ETL data cleaning and transformation for specific use case in own Keboola project

cleaning data etl keboola python rest-api transformation

Last synced: 20 Jun 2026

https://github.com/yuvrajsaraogi/car-price-prediction-with-machine-learning

The price of a car depends on a lot of factors like the goodwill of the brand of the car, features of the car, horsepower and the mileage it gives and many more. Car price prediction is one of the major research areas in machine learning. So, if you want to learn how to train a car price prediction model then this project is for you.

car-price-prediction-with-machine-learning data data-science deep-learning deep-neural-networks engineer github learning machine-learning mini-project natural-language-processing prediction predictive-modeling project python3 sql

Last synced: 15 Apr 2026

https://github.com/pawamoy/keycut-data

Keyboard shortcuts data stored in YAML files

data keyboard-shortcuts

Last synced: 12 Feb 2026

https://github.com/kashifkhan7/cleaning-analysis_cli

Analyze sales data easily with our CLI app. Gain insights on revenue trends and visualize results using Python, Pandas, and Matplotlib. 🚀📊

conditional-statements css data datacleaning exception-handling exiftool html json matplotlib-pyplot metadata metadata-extraction pandas-python python sales-analysis seaborn-python speech-to-text transcription youtube

Last synced: 13 Apr 2026

https://github.com/bishtrishu/super_store_sales_dashboard

This repository contains a comprehensive sales analysis dashboard for a Superstore, created using Power BI. The objective is to contribute to the success of a business by utilizing data analysis technique, specially focusing on time series analysis, to provide valuable insights and accurate sales forecasting.

analytics data data-science dataanalysis dataanalyst datacleaning datascience datavisualization-project excel microsoft-azure microsoft-excel powerbi report sql

Last synced: 28 Feb 2026

https://github.com/poojaharihar03/wellness-cities-case-study

A case study for dats analysis of city health centers

analytics data r rstudio

Last synced: 11 Jun 2026

https://github.com/jeswr/blog

My personal blog

ai blog data semantics solid web

Last synced: 13 Feb 2026

https://github.com/ekoepplin/dbt-bigquery-core

How to get data to BigQuery (or duckDB) and setup dbt tests for SODA cloud monitoring

bigquery data data-quality dbt dlt duckdb gcp soda

Last synced: 06 May 2026

https://github.com/sumaiyyaf/british-airline-dashboard

This Tableau dashboard visualizes British Airways customer reviews, showcasing key metrics like average ratings for service, entertainment, and seat comfort. It features interactive filters for exploring ratings by aircraft type, country, and traveler type, along with trend analysis over time.

analysis dashboard data tableau visualization

Last synced: 13 Feb 2026

https://github.com/ralzz/dibimbing_datascience

This project contains an Exploratory Data Analysis (EDA) of the Estonia Passenger List dataset. I handled missing values, removed duplicate data, and created basic visualizations to find insights.

data data-science eda google-colab kaggle pandas python

Last synced: 06 May 2026

https://github.com/j0a0m4/olympics

Final Project for Data Engineering Accelerated LATAM

data olympics spark

Last synced: 13 Feb 2026

https://github.com/newrelic-experimental/newrelic-java-atomikos

Gives status of Atomikos Data Sources since this information is unavailable via JMX

atomikos data instrumentation java nrlabs nrlabs-data nrlabs-java-verify nrlabs-odp observability-data

Last synced: 30 May 2026

https://github.com/isandyawan/simplelinearregression

A application to analyze data using simple linear regression. This application can make regression model from variable and give advice to user if the model break regression assumsion

data linear r regression rstudio shiny statistic

Last synced: 14 Oct 2025

https://github.com/corneliustanui/personal_quarto_website

This repo contains source files for my personal Quarto-based website.

data netlify programming quarto r rbind websites

Last synced: 02 Apr 2025

https://github.com/elimu-ai/ml-event-simulator

🤖 Simulation of learning events and assessment events

data learning-analytics machine-learning ml

Last synced: 28 Feb 2025

https://github.com/darrendavy12/azure-databricks-setup-guide-with-formula1-csv

Azure Databricks Setup Guide with Formula1 CSV - Azure Databricks, PySpark, Python, Data Lake Storage

apache azure cloud data databricks lake notebooks pyspark python spark storage

Last synced: 06 May 2026

https://github.com/odiegosilva1/flask-github-style

Página de login usando Jinja no Flask.

data flask jinja2-templates orm python

Last synced: 31 May 2026

https://github.com/sanand0/iss-location

Tracks the International Space Station position. A demo of how to use GitHub Actions to schedule commits weekly.

data

Last synced: 14 Feb 2026

https://github.com/e-kotov/albofr-data-archive

Tiger Mosquito Colonisation in France data

aedes-albopictus colonisation data france tiger-mosquito

Last synced: 23 May 2026

https://github.com/juanpablodiaz/beertv

A Next.js Full Stack app to displays funny Beer TV Ads

api-routes data next tailwindcss

Last synced: 07 May 2026

https://github.com/elijah-1994/pre-process-e-commerce-dataset

Importing, Cleaning, and Pre-Processing E-Commerce Data for Analysis Using MySQL.

analytics data dataanalytics datacleaning dataprocessing mysql mysql-database sql

Last synced: 11 Mar 2025

https://github.com/gusenov/open-data-scripts

Scripts to explore public datasets. Скрипты для работы с открытыми данными.

charts data data-visualisation data-visualization datavisualization highcharts kazakhstan open-data opendata qazaqstan

Last synced: 28 Feb 2026

https://github.com/rishikesh-jadhav/track_deep_learning

Data collected from the Udacity simulator comprising RGB images with steering and throttle annotations for each frame, specifically gathered for behavioral cloning purposes.

data datacollection udacity-self-driving-car

Last synced: 03 Jan 2026

https://github.com/ankitrai259/sales_insight_dashboard

Sales Insight: Using SQL for data cleaning and Power BI for making interactive dashboard

dashboard data data-visualization datacleaning postgresql powerbi sql

Last synced: 17 Mar 2025

https://github.com/bcongdon/nid-data

National Inventory of Dams Data

data datasette government-data

Last synced: 21 Apr 2026

https://github.com/madhuresh2011/genai-powered-data-analytics-by-tata

I recently participated in Tata iQ's job simulation on the Forage platform, and it was incredibly useful to understand what it might be like to be on a data analytics team in an AI transformation consulting role.

chatgpt data dataanalytics eda excel gemini generative-ai internships powerpoint presentation

Last synced: 14 Feb 2026

https://github.com/shantanujpk/bigdatacloud

Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.

big-data data jupyter-notebook pipeline pyspark python spark sparksql

Last synced: 07 May 2026

https://github.com/jerboaburrow/uk-counties-and-unitary-authorities-may-2023-geojson

UK "Counties" Extracted from Office for National Statistics data

data geojson maps uk

Last synced: 29 Mar 2025

https://github.com/itsmeyogesh22/Solved-8-Weeks-SQL-Challenge-Correct-Solutions

Included in Serious SQL Virtual apprenticeship program, this repository contains solutions for all eight different case studies crafted by Danny Ma. For more information please visit: https://8weeksqlchallenge.com/

8weeksqlchallenge data dataanalytics datawithdanny postgresql sql sqlserver-2022 t-sql

Last synced: 29 Aug 2025

https://github.com/companyakis/financial-data

Financial Data & Python

data finance python

Last synced: 29 Jun 2025

https://github.com/lijesh010/roadaccidentanalysisproject

This data analysis project was completed using MS Excel, and includes the creation of a dashboard.

data data-analytics data-exploration data-visualization msexcel

Last synced: 15 Feb 2026

https://github.com/tabarzin/dh

A collection of links to various resources on Digital Humanities

data digitalhumanities opensource

Last synced: 24 Jan 2026

https://github.com/nmelgar/marathons_data_viz

Data visualization project to analyze finishing times and other data.

csv csv-files data data-analysis data-insight data-visualization data-viz dataset tableau

Last synced: 15 Feb 2026

https://github.com/mochsyahrizal/jkfkjabar_studycase

First Data Analytics Study Case

data datanalytics studycase

Last synced: 15 Feb 2026

https://github.com/lab5e/loadabledata

Simple framework-agnostic wrapper around loadable data to help encapsulate and use state changes in a UI.

async data loadable state typescript ui

Last synced: 07 May 2026

https://github.com/nagar2nd/ml-regressionmodel---cardekho-price-prediction

This repository features a machine learning model for predicting used car prices using data from CarDekho.com. The project leverages exploratory data analysis and regression techniques to empower sellers and buyers with actionable insights in the Indian used car market.

analytics cleaning-data data linear-regression machine-learning matplotlib numpy pandas python seaborn

Last synced: 16 Apr 2026

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Jan 2026

https://github.com/pocketfullofdata/electric-vehicles-market-size-analysis

This project analyzes the growth, adoption trends, and future projections of the electric vehicle (EV) market. Using data analysis and visualization techniques, it examines key factors like sales trends, and consumer adoption to understand the evolving landscape of the EV industry.

analysis data jupyter-notebook matplotlib numpy python seaborn vscode

Last synced: 07 May 2026

https://github.com/chardos/get-git-data

Access git repository data in node.

data git javascript node

Last synced: 07 May 2026

https://github.com/safwan2003/randomforest_heart_disease_prediction

A machine learning project using Random Forest Classifier to predict heart disease. Includes data preprocessing (with binning), feature selection, and model evaluation.

binning data data-science datapipeline datapreprocessing datavisaulization deep-learning machine-learning python random-forest-classifier streamlit

Last synced: 07 May 2026

https://github.com/danyal-faheem/project-logs-analyzer

This repo contains scripts to analyze project logs and display some charts related to the data

data data-visualization matplotlib pandas python streamlit

Last synced: 07 May 2026

https://github.com/jigyasag18/iit-guhawati

Empower Sakhi is a data-driven platform that uses machine learning to identify women at risk of domestic violence in India. It offers confidential self-assessments, survivor stories, and emergency resources through a trauma-informed, privacy-focused web app. The project also provides NGOs with actionable insights via Power BI dashboard for support.

aiml data dataset datavisualization domestic-violence eda jupyter-notebook label-encoding machine-learning machine-learning-algorithms machine-learning-models machinelearning machinelearningprojects powerbi python python-app random-forest random-forest-classifier streamlit streamlit-webapp

Last synced: 08 May 2026

https://github.com/soenneker/soenneker.attributes.mapto

A C# attribute for generic data mapping translation

attributes columns csharp data datatables dotnet mapping mapto maptoattribute object

Last synced: 02 Mar 2026

https://github.com/frer0t/userverse

creating api for data analysis

data data-analytics spring-boot users

Last synced: 12 Apr 2026

https://github.com/abhash-rai/regression-car-price-prediction

This repository contains my first complete data science project from web scrapping for data to data preprocessing, cleaning, exploratory data analysis, model training and deployment.

data data-science data-visualization eda exploratory-data-analysis machine-learning neural-network prediction prediction-model regression

Last synced: 08 May 2026

https://github.com/soenneker/soenneker.data.zipcode

US ZIP code data from USPS, updated daily

code csharp data dotnet usps zip

Last synced: 02 Mar 2026

https://github.com/randomfractals/unfolded-map-snippets

Html, CSS, JavaScript, and Python 🐍 vscode snippets ✂️ extension for Unfolded Map 🗺️ and Data SDKs

code data extension map sdk snippets template unfolded vscode

Last synced: 08 May 2026

https://github.com/zevio/acl

ACL Anthology corpus sample

data dataset scholarly-articles

Last synced: 01 Mar 2026

https://github.com/mwiatrzyk/modelity

Data parsing and validation library for Python

data library model parsing python tool validation

Last synced: 18 Jan 2026

https://github.com/inzhenerka/scooters_data_generator

Generate data of scooter trips for analysis

data dbt generator

Last synced: 02 Jun 2026

https://github.com/miraclx/split-merge

Efficient, flexible data stream chunker and merger

chunk data efficient merge middleware nodejs pipeline split stream

Last synced: 07 May 2026

https://github.com/shubhamsoni98/excel-practice

Excel-Practice-Questions

analysis data excel formula raw-data xlsx

Last synced: 03 Mar 2026

https://github.com/afnanenayet/ds-a

Some interview prep I've been doing. This repo is reimplementations of algorithms and data structures in Python3

algorithms data interview prep python structures

Last synced: 05 Apr 2025

https://github.com/donghquinn/gopandas

gopandas

data go golang

Last synced: 14 Oct 2025

https://github.com/anuraganalog/onyx-data

BI Visualizations to the problems in website. All the Visualization can be found at the below link

data onyx public tableau viz

Last synced: 02 Apr 2026

https://github.com/wiseql/wiseql

The wise data browser — run SQL recipes as small, observable, debuggable steps

data debugging duckdb oracle quality sql tui

Last synced: 13 Jun 2026

https://github.com/erickpeirson/jhb-data

Data from the forthcoming paper: Quantitative Perspectives on Fifty Years of the Journal of the History of Biology

data geolocation history-of-biology named-entity-recognition topic-modeling

Last synced: 04 Mar 2026

https://github.com/thomasjewson/cci-data-science-textbook

This is a short, interactive textbook aimed at introducing data science to non-IT university undergraduates. Funded by Erasmus+.

data data-science learning python textbook

Last synced: 16 Apr 2026

https://github.com/fastpix/android-data-bitmovin

FastPix Video Data SDK to monitor and analyze video playback metrics within Bitmovin for android

analytics android-sdk bitmovin data fastpix metrics player sdk video

Last synced: 16 Apr 2026

https://github.com/jigyasag18/power-bi-dashboard-project

The Ecommerce Sales Analysis Dashboard project utilizes Power BI to provide detailed insights into ecommerce sales data, enabling stakeholders to track key performance metrics and uncover trends. This interactive dashboard allows users to explore the data in real-time, offering features such as drill-down capabilities, customizable filters.

dashboard data data-visualization datacleaning datanalysis datanalytics datapreprocessing powerbi visulaization

Last synced: 04 Mar 2026

https://github.com/vara-co/tech-certifications

These are the certifications that back-up some of my skills.

certificates certifications data data-analytics skills

Last synced: 07 Jan 2026

https://github.com/arjunrao87/world-countries-graphql-api

GraphQL API for retrieving information about countries of the world

countries data database geographic-data geography graphql world

Last synced: 10 May 2026

https://github.com/veivel/f1-sentiment-analysis

An entiment analysis project on tweets about Formula 1. To be reworked.

data f1 nlp-library nlp-machine-learning

Last synced: 04 Jul 2025

https://github.com/writetome51/page-load-access

A TypeScript/Javascript class that loads a batch (array) of data from a larger set too big to be loaded all at once.

batch class data javascript load loader typescript

Last synced: 16 May 2026

https://github.com/sehgal-vishal/world-population-

World Population Sql Analysis

data dataanalysis population sql

Last synced: 05 Mar 2026

https://github.com/udhaya2823/microsoft---classifying-cybersecurity-incidents-with-machine_learning

🚨Microsoft: Classifying Cybersecurity Incidents with Machine Learning🔐 This project leverages the power of Machine Learning to classify cybersecurity incidents 🚨, improving the efficiency of Security Operation Centers (SOCs) at Microsoft. We train a model to predict incident grades, helping analysts prioritize threats with precision🎯.

classification data feature-engineering iqr-method machine-learning matplotlib model-evaluation modelselection predictive-modeling python sklearn

Last synced: 17 Apr 2026

https://github.com/ayushverma135/dbms-labfile

Created for practical learning, this DBMS lab file offers hands-on exercises covering SQL queries, normalization, indexing, and more. With clear instructions and sample datasets, students gain invaluable experience in database design and management.

data dbms dbms-lab

Last synced: 04 Feb 2026

https://github.com/theanujsinha01/mcdonalds-customer-analysis

This project analyzes customer feedback data to understand what drives people to like or dislike McDonald’s. Using Python and data visualization tools in a Jupyter Notebook, we explore how different factors—such as taste, price, health, and visit frequency—affect customer satisfaction.

case-study data data-visualization dataanalysis

Last synced: 05 Sep 2025

https://github.com/jigyasag18/amazon-prime-power-bi-dashboard

The Amazon Prime Power BI Project is a centralized data storage system containing detailed information on movies and TV shows available on Amazon Prime Video, including metadata and analytics insights. It supports data-driven decision-making for content acquisition and viewer engagement strategies. This repo is optimized for querying & analysis.

dashboard data data-visualization dataanalysis dataanalytics datacleaning dataset powerbi powerbi-dashboards powerbi-report powerbi-visuals powerbidashboard

Last synced: 05 Mar 2026

https://github.com/writetome51/public-data-container-interface

Just a TypeScript interface with 1 property: 'data'

container data interface typescript

Last synced: 15 May 2026

https://github.com/taquece/goals-per-match

basic script to calculate average football goals per match from .CSV

beginner csv data football nodejs python sports-analytics

Last synced: 09 May 2026

https://github.com/publici/state-integrity-data

Data from a comprehensive assessment of state government accountability and transparency

data

Last synced: 04 Feb 2026

https://github.com/michael-ljn/cirp-lce-2025

Prospective Global Warming Potential of Australian Low-Emission Hydrogen in a Net-Zero Emission Context

data publication

Last synced: 06 Mar 2026

https://github.com/ashfaqalizardariofficial/databasehelper

A C# database helper library to connect with the database server and perform actions insert, update, delete, select data and select multiple data from the database.

ashfaq-ali-zardari ashfaq-ali-zardari-official data database delete helper insert ms-sql-server multiple select-data server sql-server update

Last synced: 02 Apr 2026

https://github.com/muhammed-fazal/student-success-and-early-intervention-analytics-system

To consolidate scattered student performance records into a unified Data Warehouse in SQL Server. Engineer an Interactive Power BI dashboards that visualize academic trends, identifying student performance and implement predictive analytics.

analysis analytics dashboard data data-analysis data-engineering data-science data-visualization database etl etl-pipeline power-bi powerbi python sql sql-server

Last synced: 29 May 2026

https://github.com/mtnzorlu/quiz-content-builder

Structured JSON quiz data builder for developers

builder data education json vue

Last synced: 23 Jun 2026

https://github.com/dineshdhamodharan24/data-analysis

probability Analysis to customers and bascis analysis

analysis data powerbi probability python visualization

Last synced: 23 Jun 2026

https://github.com/anand-sony/mttr-dashboard

Streamlit dashboard for MTTR analysis with shift-wise loss insights and machine-level downtime tracking.

analytics business-analytics dashboard data python statistical-analysis

Last synced: 30 May 2026