An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/iv4n-ga6l/functional-dataprocessing-pipeline

A functional data processing pipeline that accepts an input file, allows specifying both input and output formats, applies specified transformations, and produces a resulting output file.

csv data datapreprocessing excel json pandas parquet pipeline python

Last synced: 06 May 2026

https://github.com/ashleydavis/brisjs-web-scraping-talk

Code to accompany my talk on web scraping for the Brisbane JavaScript meeting in September 2018

cheerio data data-acquisition data-acquisiton electron headless-browsers javascript nightmare nightmarejs nodejs web-scraping

Last synced: 06 May 2026

https://github.com/shantanujpk/bigdatacloud

Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.

big-data data jupyter-notebook pipeline pyspark python spark sparksql

Last synced: 07 May 2026

https://github.com/hudson-newey/data-miner

A simple data miner that collects information from an API and stores it in a file

api api-client big-data bigdata data logger logging

Last synced: 10 Jun 2026

https://github.com/abhash-rai/regression-car-price-prediction

This repository contains my first complete data science project, from web scraping for data through data preprocessing, cleaning, exploratory data analysis, model training, and deployment.

data data-science data-visualization eda exploratory-data-analysis machine-learning neural-network prediction prediction-model regression

Last synced: 08 May 2026

https://github.com/vanshuchaudhary/flightpriceanalysis-

The uploaded file is a Jupyter Notebook titled "Flight Analysis". It likely involves analyzing flight-related data, potentially exploring trends, patterns, or insights using data science techniques. The analysis might include data visualization, statistical analysis, or predictive modeling.

business-analytics data data-analysis data-visualization datainsights datascience matplotlib-pyplot python seaborn seaborn-plots seaborn-python sns statistical-analysis

Last synced: 08 May 2026

https://github.com/miniql/miniql-csv

A MiniQL query resolver that loads data from CSV files.

comma-separated-values csv data query query-language

Last synced: 08 May 2026

https://github.com/basemax/okala-product-ids

A PHP script to fetch and save product IDs from Okala's online store API across multiple categories and store branches.

crawler crawler-okala crawler-php crawlers data database ids ir iran json okala okala-crawler php php-crawler product

Last synced: 09 May 2026

https://github.com/naitiknayak196/tech-layoffs-cleaning-sql-vs-python

This project cleans and analyzes a tech layoffs dataset using MySQL and Python (Pandas) to compare their efficiency in data processing. It provides business insights into workforce trends, industry stability, and economic impacts to support data-driven decision-making.

data datacleaning dataset jyputer-notebook layoffdata layoffs mysql python sql

Last synced: 09 May 2026

https://github.com/mohamedbilal1800/olympic_history_data_analysis

This project delves into the 120 Years of Olympic History: Athletes and Results dataset, analyzing athlete demographics, medal achievements, and country performances across the Summer and Winter Olympics from 1896 to 2016.

analysis data eda matplotlib-pyplot pandas python seaborn visulaization

Last synced: 09 May 2026

https://github.com/baranasoftware/curricular-api

The design and implementation of a REST API for student and course data for a Higher Ed institution.

aws data data-pipeline go golang lambda rest rest-api sqlite3 system-design terraform

Last synced: 09 May 2026

https://github.com/sebastianbrzustowicz/flight-quality-overview-microservice

Go + Docker. Microservice with parallel computations that converts raw vehicle flight data into an overview report with visualisation.

container control csv data docker drone flight go goroutines http microservice parallel-computing pdf quadcopter raport rms sse vehicle

Last synced: 10 May 2026

https://github.com/adrianoleitedasilva/adrianoleitedasilva

My name is Adriano. I am 35 years old and have dedicated 18 of those years to the fields of Information Technology and Education.

adrianoleitedasilva automation ceo cio cto data data-science dev diretor github mobile professor python readme techlead web

Last synced: 10 May 2026

https://github.com/infinitode/crsd

A synthetic customer review sentiment dataset for sentiment analysis generated using different AI models.

ai data dataset datasets huggingface-datasets mit-license ml nlp open-source python sentiment sentiment-analysis sentiment-classification text-data

Last synced: 10 Jun 2026

https://github.com/miniql/miniql-json

A MiniQL query resolver that loads data from JSON files.

data json query query-language

Last synced: 11 May 2026

https://github.com/sehaj003/boston-bruins-roster-planning-mysql-nosql

Repository for Data Management project, Boston Bruins Roster Planning using MySQL and NoSQL along with data analysis using Python

data data-management mongodb mysql project-repository python

Last synced: 11 May 2026

https://github.com/mateogiuffra/estrd2024s1

Practical assignments completed for the Data Structures (Estructura de Datos) course at the Universidad Nacional de Quilmes (UNQ)

c cpp data data-structures-and-algorithms eficiency functional-programming haskell unq

Last synced: 12 May 2026

https://github.com/dev88jerry/cs304

Bishop's University - CS304 Data Structures

bishops bu data data-structures python structure university

Last synced: 11 Jun 2026

https://github.com/flexiui-labs/flexi-grid

Flexi Grid is an advanced, lightweight, and customizable Angular 19 data grid component

angular data filter grid search select sort table

Last synced: 14 May 2026

https://github.com/word2vect/beijing-new-house-data-visualization

Beijing New House Data Visualization for Python Programming 2024 Fall Data Visualization Lab

data python visualization

Last synced: 13 Jun 2026

https://github.com/word2vect/beijing-pm2.5-data-process

Beijing PM2.5 Data Process for Python Programming 2024 Fall Data Visualization Lab 2

data python visualization

Last synced: 15 Jun 2026

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 15 Jun 2026

https://github.com/ayushman0511/data-analytics-project1

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics busine data data-anal data-enginee data-sci data-scien database datascien query reporting sql sql-query sql-server window-func

Last synced: 17 Jun 2026

https://github.com/dineshram0212/youtube-analysis

This YouTube Analysis Package provides tools for analyzing YouTube video data, including metrics on views, likes, comments, and engagement trends. Ideal for gaining insights into video performance and audience interaction patterns.

data data-visualization pandas python webscraping youtube-api-v3

Last synced: 19 Jun 2026

https://github.com/bastianolea/mineduc_personal_academico

Academic staff (Personal Académico) data for the higher education system, covering the years 2008 to 2024.

chile data educacion meses tiempo

Last synced: 19 Jun 2026

https://github.com/artcc/coredatademo

Demo for CoreDataGenericModule implementation

core coredata coredata-model data encrypted encrypted-data encryption persist

Last synced: 19 Jun 2026

https://github.com/svetlanam/kbl-to-csv-s3

Keboola extractor that converts Excel files to CSV based on input mapping criteria and uploads them to an S3 bucket

data data-cleaning data-transformation etl keboola s3-bucket

Last synced: 20 Jun 2026

https://github.com/mtnzorlu/quiz-content-builder

Structured JSON quiz data builder for developers

builder data education json vue

Last synced: 23 Jun 2026

https://github.com/dineshdhamodharan24/data-analysis

Probability analysis of customers and basic analysis

analysis data powerbi probability python visualization

Last synced: 23 Jun 2026

https://github.com/gunn/covid-19-scripts

Scripts for processing COVID-19 data - e.g. converting from absolute to per capita numbers, adding fine-grained data from more countries

covid-19 data geography typescript

Last synced: 17 May 2026

https://github.com/rrwen/py-examples

Collection of python examples in each branch

beginner data download excel guide introduction links processing python reference spreadsheet url xls

Last synced: 05 May 2026

https://github.com/ot-code/sql-sabor-y-tradicion

A SQL-driven project that integrates menu and order data to reveal insights on dish performance, customer preferences, and spending trends. It informs pricing strategies, menu adjustments, and targeted promotions, ultimately enhancing the overall customer experience and driving business growth.

analytical-queries data data-aggregation data-analysis database-design join-queries mysql order-analytics relational-databases restaurant-data sql sql-script

Last synced: 08 Apr 2025

https://github.com/noedemange/orderedheatmapanalysis

OrderedHeatMapAnalysis (OHMA) is a direct data analysis framework for simultaneously visualizing and analyzing the structure of complex datasets. An optimized seriation of rows and columns of the input data table is performed, resulting in a mapping of the whole dataset into an ordered heatmap.

analysis bi-seriation data dataanalysis heatmap r rstats seriation shiny shiny-apps

Last synced: 27 Feb 2025

https://github.com/abhijeetdasbakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce

airflow airflow-dags ci-cd-pipeline clustering dags data data-pipelines docker docker-compose docker-container dockerfile git great-expectations kafka mongodb pca-analysis postgresql pyspark t-sne umap-learn

Last synced: 04 Apr 2026

https://github.com/dilkushsingh/webscraping-with-selenium-and-beautifulsoup

Scraped a popular tech gadgets website using Selenium and BeautifulSoup, and performed data analysis on the scraped data.

beautifulsoup data datacleaning datagathering eda exploratory-data-analysis python selenium webscraping

Last synced: 24 Feb 2026

https://github.com/seldszar/piccha

Another tree data structure

data tree

Last synced: 16 Jul 2025

https://github.com/smaug6739/data-bit

This project is a module for converting a structured dataset into a number that can be stored in a database taking up little space.

bits data nodejs

Last synced: 14 May 2026

https://github.com/faster-games/dynamic-components

Dynamic Runtime Components for Unity3D

data framework unity3d

Last synced: 11 Apr 2026

https://github.com/rodgeraraujo/open-dataverse

OpenDataverse: an ETL application that filters and imports open data from https://dados.ifpb.edu.br/, saves it to a database, and exposes it via a REST API.

data dataset dataverse flask ifpb pandas python

Last synced: 05 May 2026

https://github.com/turner-kendall/turner-kendall

Turner Kendall - dev, ops, sec.

config data github-config go rust security

Last synced: 31 Oct 2025

https://github.com/shudhanshusaurabh001/super_market-data-analysis-using-python

This project focuses on analyzing supermarket sales data using Python. The goal is to extract meaningful insights from the dataset, such as sales trends, customer purchasing behavior, and product performance.

analysis csv data insights matplotlib numpy pandas project python seaborn

Last synced: 06 Apr 2026

https://github.com/robertoostenveld/dcn.dsc_62002071_01_114_v1

Simon task M/EEG data [Data set].

data datalad open-data

Last synced: 23 Jan 2026

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 24 Mar 2025

https://github.com/plnech/never2late

Never 2 Late - a reinterpretation of Everest Pipkin's 'i've never picked a protected flower'

dada dada-science data generative-art glitch-art installation nlp poetry spacy vector-similarity wallpaper

Last synced: 10 Jun 2025

https://github.com/nolanbconaway/rollercoaster-tycoon-data

Every roller coaster I have built in RCT2 for iPad

data roller-coaster-tycoon

Last synced: 24 Mar 2025

https://github.com/muthupillai1204/diwali_sales_analysis

The Diwali sales analysis reviews past data to identify trends, peak buying times, popular products, and customer demographics. It assesses sales volume, revenue growth, and promotional effectiveness, helping businesses optimize marketing and inventory for future seasons.

data datacleaning eda excel jupyter-notebook matlplotlib numpy pandas python seaborn visualization

Last synced: 05 May 2026

https://github.com/thais81/gamesbox

Another desktop app in JSE/Jswing with a hangman game and a tic-tac-toe game. This project was made at LDNR school with 4 friends

data database hangman-game jse tictactoe tictactoe-game

Last synced: 28 Jan 2026

https://github.com/rubyonworld/ldpath

This is a Ruby implementation of LDPath, a language for selecting values from linked data resources.

data ldpath resource ruby

Last synced: 12 Nov 2025

https://github.com/meizuflux/cion

Python minimal data validation library

data minimal python validation

Last synced: 28 May 2026

https://github.com/zoekelepiri/winedataprediction

A machine learning application in wine quality prediction

data descriptive-statistics machine-learning-algorithms

Last synced: 05 Jan 2026

https://github.com/remcostoeten/github-and-vercel-api-showcase-dashboard

Showcase of data that can be fetched from the GitHub and Vercel APIs, built entirely in vanilla JS.

api-rest da data express-js github-api nodejs vercel-api

Last synced: 07 Mar 2026

https://gitlab.com/hailstorm75/Common

A collection of extension libraries for various use-cases

common core cpp csharp data extensions libraries library math matrix

Last synced: 07 May 2025

https://github.com/nivasharmaa/genetrack

A Java program for analyzing DNA sequences and identifying individuals based on Short Tandem Repeats (STRs). Features profile database creation, STR analysis, individual identification, and relationship detection.

data data-processing dna-analysis file-io-in-java genetic-analysis java-oop

Last synced: 25 Aug 2025

https://github.com/wittyicon29/kritika-iit-b-2023

Selection task for the summer projects of Kritika IIT-B

data data-analysis data-science

Last synced: 15 Mar 2025

https://github.com/hemangsharma/dataanalysis

This repo contains analyses of NASDAQ data, including a dashboard and a time series forecast

analysis data data-analysis data-visualization python

Last synced: 10 Mar 2026

https://github.com/vishwas-chakilam/hr-dashboard

This project involves creating an interactive HR Dashboard using Power BI for visualization and MySQL for data cleaning and analysis. It provides insights into employee performance, attrition, salary distribution, and hiring trends.

dashboard data datac datacleaning datavisualization mysql powerbi

Last synced: 23 Mar 2025

https://github.com/rezapace/newbash

This project involves managing various application shortcuts and configurations primarily for a Linux environment. It includes scripts for creating .desktop entries for applications, managing system configurations, and handling application processes.

automation backup bash data dekstop linux newbash ohmyzsh script testing zsh

Last synced: 11 Apr 2026

https://github.com/r-mahesh45/india-news-headlines-analysis

Excited to share my latest project: India News Headlines Analysis (2001–2023). This Power BI report dives deep into 21 years of Indian headlines, uncovering: Trends that defined the nation, Key themes that shaped public discourse, Insights into the evolution of media coverage.

data data-science powerbi visualization

Last synced: 05 Jan 2026

https://github.com/dhimmel/thinklytics

Continuous Thinklab project exports and analytics

analytics data rephetio thinklab travis-ci

Last synced: 23 Mar 2025

https://github.com/merekat/hb-oil-assets

An analysis of asset performance in connection with shock-like surges in the oil price since Brent crude entered the market in 1986.

analyze asset data datajournalism oil python

Last synced: 16 Mar 2026

https://github.com/2022-04-11588/data-fakes

🔍 Generate realistic fake data for testing and development, enhancing your projects with simple, customizable data solutions.

data dataset developer-tools fake-content faker fakery groovy java mock phoenix python random ruby seeding struct swift-framework test-data testing

Last synced: 11 Apr 2026

https://github.com/miriswisdom/coral.bells

Guiding and Reassuring Safety, Holistically and Empathetically

civic community data engagement govhack open safety

Last synced: 28 Jan 2026

https://github.com/tjpalanca/pins

Data Pins

data pins

Last synced: 05 Jan 2026

https://github.com/equinor/fmu-sumo

Interaction with Sumo in the FMU context

analytics data fmu python subsurface sumo visualization

Last synced: 01 May 2025

https://github.com/soenneker/soenneker.dtos.idnamepair

A minimal Record type with an Id (string), Name (string), and maximum JSON compatibility

csharp data dotnet dto id name

Last synced: 12 Mar 2026

https://github.com/ksm26/ml-ai-data-science-jobs-in-canada

Explore the latest machine learning, artificial intelligence, and data science job opportunities in Canada. Stay informed about Canadian tech job market trends and find your next career move.

ai-canada ai-careers canada canadian-tech-companies canadian-tech-job-market data data-analysis data-engineering data-science data-science-careers machine-learning prompt-engineering robotics

Last synced: 06 May 2026

https://github.com/abirsaha111/ipl-2022-analysis

The IPL 2022 Analysis project is a data-driven exploration of the Indian Premier League (IPL) 2022 cricket tournament. The analysis focuses on utilizing Python programming and various libraries to analyze and visualize the performance of teams, players, and key metrics in the IPL 2022 season.

data dataana dataanalytics datavi matplotlib python

Last synced: 07 Jun 2026

https://github.com/fiddlydigital/anonimizer

A lib to replace and rehydrate sensitive data in text

anonimize anonymize data data-security prompt sanitize string string-manipulation text

Last synced: 15 Mar 2025

https://github.com/csoren66/financial-budget-analysis

Financial budget for 2021

analytics data python

Last synced: 03 Mar 2025

https://github.com/emanoelcampos/power-bi-fundamentals

Datacamp's Power BI Fundamentals Skill Track

data data-analyst data-analyst-power-bi datacamp power-bi powerbi

Last synced: 24 Jan 2026

https://github.com/robertoostenveld/dccn.dsc_3015055.00_583_v1

The FieldTrip-SimBio Pipeline for EEG Forward Solutions [Data set].

data datalad open-data

Last synced: 24 Jan 2026

https://github.com/woctezuma/hidden-gems-data

Data available to compute regional rankings of hidden gems.

data hidden-gems steam steam-reviews

Last synced: 06 Feb 2026

https://github.com/eugenedakin/des-encryption-decryption

Encrypt and Decrypt text in Xojo using DES - Written in Native Xojo Language - Cross Platform

data data-encryption-standard decryption des encryption standard xojo

Last synced: 24 Feb 2026

https://github.com/atharvapathak/twitter_sentiment_analysis_project

Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.

api bag-of-words bert cnn data gbm nltk rnn spacy twitter

Last synced: 28 Jan 2026

https://github.com/raphaellaude/usaschooldata

Cleaned and accessible school enrollment data for US schools

data duckdb duckdb-wasm education object-storage oss wasm

Last synced: 12 May 2026

https://github.com/aaronspindler/selfdrivingcar

Learning deep learning and making a self driving car in the process

car data deep deep-learning driving keras learning machine machine-learning python self self-driving-car

Last synced: 09 Apr 2026

https://github.com/jbn/vaquero

A Python library for iterative and interactive data wrangling at laptop-scale.

data data-analysis data-cleaning data-mining dirty-data elt etl etl-framework

Last synced: 10 Jun 2026

https://github.com/maxisoft/yahoo-finance-data-downloader

Automate downloading historical and recent stock data from Yahoo Finance.

data stock-market yahoo-finance

Last synced: 29 Jan 2026

https://github.com/spatialcurrent/go-counter

Simple library and command line program for generating frequency distributions.

big-data bigdata data

Last synced: 29 Jan 2026