An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/jigyasag18/bird-strikes-in-aviation-project

This project analyzes over a decade of U.S. bird strike data (2000–2011) to evaluate safety risks, damage trends, and cost implications in aviation. Using PostgreSQL for database management and Power BI for dashboard visualization, it uncovers critical insights into when, where, and how wildlife impacts aircraft. Key findings inform strategically.

bird-strike-prevention bird-strike-prevention-in-real-airport data data-analysis data-analysis-project data-visualisation data-visualization data-visualization-project data-visualizations database dataset dax-query postgresql postgresql-database powerbi powerbi-desktop powerbi-report powerbi-visuals sql sql-database

Last synced: 09 May 2026

https://github.com/roovedot/unet-cnn-for-road-segmentation

(In progress) U-Net architecture built with convolutional neural networks (CNNs) for road segmentation.

cnn cnn-for-visual-recognition cnn-pytorch computer-vision data data-engineering data-science unet unet-image-segmentation unet-pytorch

Last synced: 01 Jul 2025

https://github.com/tasosfotiadis/time-series-forecasting-for-bitcoin

This project forecasts Bitcoin’s daily closing price using time series models. Data from Jan 2021 to Mar 2022 is processed by converting timestamps, resampling, and handling missing values. LSTM and ARIMA models are evaluated on MAE, RMSE, and MAPE, with LSTM achieving better accuracy while ARIMA is faster in training and inference.

arima bitcoin data data-analysis data-science deep-learning forecasting jupyter-notebook neural-networks python time-series

Last synced: 06 May 2026
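
The three error metrics named in the time-series-forecasting-for-bitcoin entry above (MAE, RMSE, MAPE) can be sketched in plain Python as follows. This is a minimal illustration and not the repository's code; the function names and the sample prices are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average size of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more than MAE does."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if an actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical closing prices and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE are in the same units as the price, while MAPE is scale-free, which is one reason forecasting write-ups often report all three.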

https://github.com/purarue/HPI-personal

Personal HPI modules/scripts

data history lifelogging

Last synced: 30 Mar 2025

https://github.com/ashishsingh789/titanic_dataset_eda_and_visualization

This repository contains an exploratory data analysis (EDA) of the Titanic dataset. Key analyses include survival rates by gender, passenger class, age distribution, family size, and correlation heatmaps.

data data-science dataanalysis matplotlib numpy pandas pandas-dataframe python seborn visualisation

Last synced: 11 Apr 2026

https://github.com/abdullahashfaqvirk/earth-engine-data-scraper

A Python-based web scraper designed to extract and organize dataset metadata from the Google Earth Engine Datasets Catalog for research and analysis purposes.

beautifulsoup data data-science python requests scraper web-scraping

Last synced: 10 May 2026

https://github.com/rezapace/newbash

This project involves managing various application shortcuts and configurations primarily for a Linux environment. It includes scripts for creating .desktop entries for applications, managing system configurations, and handling application processes.

automation backup bash data dekstop linux newbash ohmyzsh script testing zsh

Last synced: 11 Apr 2026

https://github.com/vim89/flowforge

Let's be honest - most data pipeline frameworks treat types as suggestions. Config files are strings. Schemas are "validated" at runtime. Data quality is an afterthought. So let's do it differently.

archetype data data-contracts data-engineering data-pipelines data-quality data-science database dataengineering datapipeline etl etl-framework pipelines scala scalability spark spark-sql spark-streaming

Last synced: 14 Apr 2026

https://github.com/elissorokin/data-analyst-portfolio

This is a repository where I demonstrate my skills, share projects, and track my progress in data analysis and data science.

ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis

Last synced: 09 Apr 2026

https://github.com/boratechlife/tensorflow-questions-datasets

A set of TensorFlow question datasets to help you practice machine learning and train models.

data datapreprocessing datasets machinelearning modeltrain questions tensorflow

Last synced: 23 Mar 2025

https://github.com/asma-hachaichi/imdb-movies-rating-prediction

This project collects movie information from IMDb using web scraping, then uses this data to predict movie ratings. It combines gathering data from the internet with predictive modeling to estimate how well movies are liked.

beautifulsoup4 data data-science machine-learning movies movies-reviews prediction python scraping

Last synced: 31 Mar 2025

https://github.com/e22m4u/ts-data-schema

Data validation and type casting for TypeScript.

data schema typescript validation

Last synced: 05 Aug 2025

https://github.com/eharshit/end-to-end-vendor-insights

End-to-end analysis of vendor performance for wholesale/retail businesses, featuring data ingestion, cleaning, insights, and interactive Power BI dashboards.

analysis analysis-algorithms analytics dashboard data data-analysis datascience jupyter jupyter-notebook pandas powerbi powerbi-report retail wholesale

Last synced: 07 Oct 2025

https://github.com/prajjwol09/sql_retail_analysis_project

This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.

data data-analysis datacleaning dataexploration pgadmin4 sql

Last synced: 15 Feb 2026

https://github.com/haimonmon/j3mify

Convert jejemon words into formal words or sentences.

data jejemon nlp normalization python regex tagalog tokenization

Last synced: 12 Oct 2025

https://github.com/pythoncoderunicorn/startrek

a repo for Star Trek data from Technical Manuals

data klingon-language star-trek vulcan

Last synced: 07 Oct 2025

https://github.com/fiddlydigital/anonimizer

A lib to replace and rehydrate sensitive data in text

anonimize anonymize data data-security prompt sanitize string string-manipulation text

Last synced: 15 Mar 2025

https://github.com/psyteachr/psyteachrdata

Datasets for psyTeachR Books

data

Last synced: 23 Mar 2025

https://github.com/ohspc89/better_call_jin

A repository containing mentoring materials for a Ph.D. student in Neuroscience

data matlab spss-statistics visualization visualization-tools wrangling-data

Last synced: 08 Oct 2025

https://github.com/aiwithqasim/project_allocation_system

Project Allocation System (PAS) automates and simplifies the process of allocating projects to students. Teachers enter details when prompted for input and can run a number of operations, including adding, updating, searching, deleting, and displaying all projects.

algorithms-and-data-structures algorthims c-plus-plus data data-structures linked-list

Last synced: 08 Oct 2025

https://github.com/rahulthedevil/metric-converter

A simple utility package for converting between metric units such as meters, kilometers, grams, kilograms, liters, and more. A simple and powerful solution for unit conversion.

convert converter data fraction imperial length mass measurements metric metrics ratio system temperature unit unit-conversion unit-converter units uom utilities weight

Last synced: 08 Oct 2025

https://github.com/quantumudit/test-store-data-analysis

This repository showcases a web scraper with a pipeline structure for efficient data extraction and transformation from websites. The tool can be tailored to support data analysis and informed decision-making.

data data-visualization dataanalytics python python-webscraping webscraper webscraping-data

Last synced: 11 Apr 2026

https://github.com/mnz1365/saving-record-time-text

Saving the date in a text file with Python.

data python txt-files writefile

Last synced: 18 Jul 2025

https://github.com/adilsaid64/real-time-data-monitoring

Exploring what a real-time data drift monitoring solution could look like within MLOps

data datadrift grafana machine-learning mlops mlops-workflow prometheus python software-engineering

Last synced: 04 Aug 2025

https://github.com/danieljdufour/fast-b64

Quickly Convert between B64 and Binary Strings

b64 base64 base64-decoding base64-encoding binary bits compression data

Last synced: 08 Oct 2025
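
The conversion described in the fast-b64 entry above can be illustrated with Python's standard `base64` module. This is a sketch only and does not reflect the fast-b64 API (a separate library); it assumes "binary string" means text made of `0` and `1` characters, and the function names are hypothetical.

```python
import base64

def b64_to_bits(b64_text: str) -> str:
    """Decode Base64 text and render each byte as eight '0'/'1' characters."""
    raw = base64.b64decode(b64_text)
    return "".join(f"{byte:08b}" for byte in raw)

def bits_to_b64(bits: str) -> str:
    """Pack a string of '0'/'1' characters into bytes and encode as Base64."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return base64.b64encode(raw).decode("ascii")

# "QQ==" is the Base64 encoding of the single byte 0x41 ("A").
print(b64_to_bits("QQ=="))
print(bits_to_b64("01000001"))
```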

https://github.com/kenjyco/libs

Easily install kenjyco libs

api cli command-line data helper kenjyco libs python

Last synced: 16 May 2026

https://github.com/adamouization/python-machine-learning-data-science-notes

📙 Jupyter notebooks containing useful Python code and notes for general Machine Learning and Data Science projects.

data data-science data-visualization guide jupyter jupyter-notebook machine-learning matplotlib notes numpy pandas pandas-dataframe python seaborn

Last synced: 11 Apr 2026

https://github.com/s-babaeizadeh/next-mini-app

Next.js mini application

css data nextjs reactjs

Last synced: 11 Apr 2026

https://github.com/djdhairya/whatsapp-chat-analysis

WhatsApp chat analysis is a multidimensional process that delves into the content, structure, and dynamics of conversations within the platform. It provides valuable insights for personal reflection, organizational decision-making, and improving communication strategies.

data data-science dataanalytics datapreprocessing machine-learning ml

Last synced: 08 Oct 2025

https://github.com/dms-codes/scrape-kesaintblanc-id

Kesaintblanc Data Scraper: This Python script is designed to scrape product data from the Kesaintblanc website. It collects information about products, including product name, URL, price, image URLs, status, stock, and more. The scraped data is saved to a CSV file for further analysis.

data kesaintblanc python webscraper

Last synced: 27 May 2026

https://github.com/udofia2/crudwithdatabase

A simple Node.js app that connects to a database.

crud data databse

Last synced: 08 Oct 2025

https://github.com/gunjanmimo/d3-visualization

D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. It makes use of Scalable Vector Graphics, HTML5, and Cascading Style Sheets standards. It is the successor to the earlier Protovis framework

d3js data data-science data-visualization reactjs

Last synced: 29 Apr 2026

https://github.com/leevilaukka/alkometriikka

Tool to search Alko database and see some fun stats about different beverages

data gh-pages svelte typescript xlsx

Last synced: 18 May 2026

https://github.com/dansalahi/query-builder-experiment

Customized Query Builder for creating Rules and Groups

data data-structures jsonlogic query-builder reactjs typescript validation

Last synced: 11 Apr 2026

https://github.com/greedchikara/dsajs

Data Structures and Algorithms written in JavaScript

algorithms data structures

Last synced: 09 Apr 2026

https://github.com/neptun-software/neptun.data.generators

Send scraped data from neptun-scraper to ChatGPT to generate training data for NEPTUN.AI.

data generator

Last synced: 30 Jul 2025

https://github.com/guilyx/airplane-booking

Simple airline ticket reservation program.

algorithms data linked-list

Last synced: 25 Jun 2025

https://github.com/anarya22/e-commerce_analysis

E-Commerce_Analysis is a data analysis project performed on the Superstore_USA dataset. It explores various aspects of e-commerce performance, including sales trends, customer demographics, product categories, and regional performance. The analysis includes data cleaning, visualizations, and insights on factors influencing sales and profitability.

analysis analytics cleaning-data data

Last synced: 09 Oct 2025

https://github.com/remcostoeten/github-and-vercel-api-showcase-dashboard

Showcases data that can be fetched from the GitHub and Vercel APIs, built entirely in vanilla JS.

api-rest da data express-js github-api nodejs vercel-api

Last synced: 07 Mar 2026

https://github.com/maxisoft/yahoo-finance-data-downloader

Automate downloading historical and recent stock data from Yahoo Finance.

data stock-market yahoo-finance

Last synced: 29 Jan 2026

https://github.com/psyteachr/sdg-data

Data relevant to the UN Sustainable Development Goals

data

Last synced: 09 Oct 2025

https://github.com/kaijagahm/2023-10-20-stlzoo

Data Carpentry workshop, hosted at the St. Louis Zoo. Beta testing the new ecology data lesson.

data data-science ecology r rstudio

Last synced: 05 Feb 2026

https://github.com/g3th/fit_file_decoder

Decodes '*.fit' files and returns readable values.

bytes data decoder fit-file hex parsing

Last synced: 30 Jun 2025

https://github.com/ehvenga/data.driven.modeling

Repository to practice data driven modelling

data data-modeling

Last synced: 23 Mar 2025

https://github.com/gabrielcsapo/bluse

⚗️ blend and fuse data with ease

data normalize utility

Last synced: 15 Mar 2025

https://github.com/oliver021/helppad-net

Versatile .NET Toolkit: A Comprehensive Set of Miscellaneous Helpers, Classes, and Utilities

assert async checks cryptographic-algorithms data date dotnet fluent functional functional-programming hash helpers parallel pipe pipeline pointers review supports tasks

Last synced: 15 Jun 2026

https://github.com/nel-zi/insighthire_agency

Built a web scraping solution using BeautifulSoup to extract job listings from MyJobMag, cleaned the data, and loaded it into PostgreSQL with SQLAlchemy for better job data management.

data dataloading datatransformation sql webscraping

Last synced: 16 May 2025

https://github.com/nel-zi/nuga_bank

Developed an automated data exploration and cleaning pipeline for Nuga Bank to streamline data preparation, ensure consistent data quality, and normalize datasets into structured databases for efficient analysis and reporting.

data data-automation data-visualization datacleaning datatransformation etl-automation etl-pipeline

Last synced: 16 May 2025

https://github.com/ddeepanshu-997/support_vector_regression--svr-

In this repository I performed support vector regression on real-life data. I first applied data preprocessing techniques to filter out flaws in the data, then built the model, i.e. an SVM regression, to produce a machine learning regression model.

data data-science regression-analysis regression-models svm-model svm-regression

Last synced: 03 Aug 2025

https://github.com/r-mahesh45/india-news-headlines-analysis

Excited to share my latest project: India News Headlines Analysis (2001–2023). This Power BI report dives deep into 21 years of Indian headlines, uncovering: Trends that defined the nation, Key themes that shaped public discourse, Insights into the evolution of media coverage.

data data-science powerbi visualization

Last synced: 05 Jan 2026

https://github.com/taeefnajib/ibm-applied-data-science-capstone

This repository is for my IBM Applied Data Science Capstone Project. All the notebooks and other files are uploaded. If you benefit from this repository in any way, please feel free to "Star" it and follow me. Thanks.

advance capstone capstone-project data data-science ibm ibm-watson jupyter jupyter-notebook notebook notebook-jupyter project science spacex spacex-api

Last synced: 14 Mar 2025

https://github.com/lotfiferaga/instagram-reach-analysis

The Instagram Reach Analysis project aims to develop a Python-based tool to analyze the reach and engagement metrics of Instagram posts.

analytics data data-science datavisualization python

Last synced: 18 Jun 2026

https://github.com/knowcnu12/metamask-wallet-recovery-funds-phrase-data-seed-token

This repository provides tools and guidelines for securely recovering MetaMask Wallet funds using recovery phrases, seed data, and tokens. It ensures safe and reliable methods for recovering access to your wallet and managing your cryptocurrency assets.

bitcoin blockchain cryptocurrencies cryptocurrency data ethereum funds metamask metamask-bot metamask-desktop metamask-extension metamask-plugin metamask-snap metamask-wallet phrase recovery seed token wallet wallet-security

Last synced: 08 Mar 2026

https://github.com/jk-oster/laravel-collection-trend

Generate trends from collections. Easily generate charts or reports.

charts collections data laravel php reports trends

Last synced: 03 Aug 2025

https://github.com/sillyash/untappd-viz

A data visualisation page using public datasets and HTML/CSS/JS with D3.js.

beer beer-statistics data data-analysis data-visualization kaggle kaggle-dataset public-dataset school-project

Last synced: 18 May 2026

https://github.com/jacopodl/jcollections

Common data structures for the C language

c collections data data-structures jcollections

Last synced: 30 Jul 2025

https://github.com/alecxcode/table-parser

Python Table Parser (data extraction)

automation data extraction python robotic-process-automation

Last synced: 04 May 2026

https://github.com/redgoose-dev/baguni

A web program for storing and browsing images

data explorer file management upload

Last synced: 14 Apr 2026

https://github.com/theopenwebjp/theopenweb-data-loader

Package for loading data to local project

data downloader import javascript typings

Last synced: 10 Oct 2025

https://github.com/j-sephb-lt-n/joes_giant_toolbox

A large collection of general python functions and classes that I use in my daily work

ascii browser classifier data dataviz gcp mime nlp python regex search statistics supervised web-scraping

Last synced: 10 Oct 2025

https://github.com/abdullahashfaqvirk/Earth-Engine-Data-Scraper

A Python-based web scraper designed to extract and organize dataset metadata from the Google Earth Engine Datasets Catalog for research and analysis purposes.

beautifulsoup data data-science python requests scraper web-scraping

Last synced: 27 Sep 2025

https://github.com/dhimmel/adeptus

ADEPTUS -- differential gene expression signatures of disease

adeptus data differential-expression disease gene-expression genes rephetio

Last synced: 05 Jan 2026

https://github.com/bastianolea/minsal_suicidios

Cases of attempted suicide and completed suicide in Chile

chile comunas data genero salud tiempo

Last synced: 19 Jan 2026

https://github.com/loggdme/kyro

Collection of utilities and examples for creating efficient data pipelines in Go, with parallel queues, rate limiters, and much more.

data package

Last synced: 14 Jan 2026

https://github.com/chowington/bg-counter-tools

A set of tools that can pull data from Biogents BG-Counter smart mosquito traps and convert them into a Darwin Core compliant format.

bg-counter biogents darwin-core data internet-of-things mosquito-prevalence population-dynamics

Last synced: 10 Oct 2025

https://github.com/dhimmel/thinklytics

Continuous Thinklab project exports and analytics

analytics data rephetio thinklab travis-ci

Last synced: 23 Mar 2025

https://github.com/ikcede/hinge-data-ts-wrapper

Typescript wrapper for exported Hinge data

data hinge typescript

Last synced: 10 Oct 2025

https://github.com/sushmashreeps/python

This repository showcases a comprehensive Python project, demonstrating expertise in backend development, data analysis, and machine learning. Built with Python 3.x, the project utilizes popular libraries like Django, Flask, NumPy, pandas, and scikit-learn. The project features efficient data processing, robust API integration, and scalable archite

api data data-science dataanalysis datavisualization game gamedeveloment python

Last synced: 12 May 2026

https://github.com/fuzzt/location-analyzer

The Location Data Analyzer is a Spring Boot application that offers insights on location data, such as counting locations by type, calculating average ratings, and identifying the most reviewed and incomplete entries. It features a simple frontend (HTML, CSS, JavaScript) and is deployed on Render.

analysis api average css data deployment docker fetch-api frontend html javascript location maven ratings render restful-api reviews spring-boot techstack

Last synced: 11 Apr 2026

https://github.com/dumkydewilde/mcp-memory-layer

A template for building your own BI MCP with dbt, LLMs and multi-user corrections

bi data dbt llm mcp-server

Last synced: 13 Mar 2026

https://github.com/writetome51/pagination-page-info

Intended to help a separate Paginator class paginate data. Specifically, this class contains the properties `itemsPerPage` and `totalPages`, which will be used by other classes

batch data javascript paginate pagination typescript

Last synced: 09 May 2026
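
The relationship between `itemsPerPage` and `totalPages` mentioned in the pagination-page-info entry above can be sketched as below. The original package is TypeScript; this Python version is an illustration under assumptions, not its API. The class name `PageInfo` and the `total_items` field are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PageInfo:
    total_items: int      # hypothetical: size of the full data set
    items_per_page: int   # corresponds to `itemsPerPage` in the entry

    def __post_init__(self):
        if self.items_per_page < 1:
            raise ValueError("items_per_page must be at least 1")
        if self.total_items < 0:
            raise ValueError("total_items must not be negative")

    @property
    def total_pages(self) -> int:
        """Corresponds to `totalPages`: ceiling of total_items / items_per_page."""
        return (self.total_items + self.items_per_page - 1) // self.items_per_page

info = PageInfo(total_items=25, items_per_page=10)
print(info.total_pages)
```

Integer arithmetic is used for the ceiling so that very large item counts are not affected by floating-point rounding.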

https://github.com/jun-labs/jq

🧷 Let's practice jq.

data jq json json-data parse

Last synced: 27 Sep 2025

https://github.com/nukopian/shell-series

Extract columns from tabular text

automation data shell

Last synced: 11 Oct 2025

https://github.com/praxtube/dogg

CLI tool to log data manually

data data-logger log logger

Last synced: 10 Jun 2026

https://github.com/avestura/shell-dads

❓ Show a random tip from NIST DADS (https://xlinux.nist.gov/dads) every time you open your terminal

algorithms dads data data-structures ds nist

Last synced: 23 Oct 2025

https://github.com/ybelenko/openapi-data-mocker-interfaces

Package with OpenApiDataMocker interfaces.

data fake faker interface mock mocker oas oas3 openapi swagger

Last synced: 05 Jan 2026

https://github.com/carlosrs14/parallel-data-preprocessig-system

A parallel data preprocessing system using threads and synchronization mechanisms (barrier, busy-waiting, condition variables) to clean and prepare data for AI training.

barrier-method c condition-variable data operative-systems parallel-computing posix preprocessing synchronization threads

Last synced: 24 Jul 2025

https://github.com/snimmagadda1/luigi-etl-example

🔍 Example of an ETL pipeline using Spotify's Luigi

data luigi luigi-pipeline python spotify

Last synced: 30 Mar 2025

https://github.com/theipster/property-data

Tooling to track real estate / property market events, analyse trends and generate insights.

data property real-estate

Last synced: 24 Jan 2026

https://github.com/tks18/xl-pq-handler

A Pythonic Power Query (.pq) File Manager for Excel & Power BI Automation

analytics automation data excel power-query powerbi python xlwings

Last synced: 20 Jan 2026

https://github.com/abirsaha111/ipl-2022-analysis

The IPL 2022 Analysis project is a data-driven exploration of the Indian Premier League (IPL) 2022 cricket tournament. The analysis focuses on utilizing Python programming and various libraries to analyze and visualize the performance of teams, players, and key metrics in the IPL 2022 season.

data dataana dataanalytics datavi matplotlib python

Last synced: 07 Jun 2026

https://github.com/team-hydrogen/2025-adc-data

All files relating to the computation of the data provided

data jupyter-notebook nasa-app-development-challenge

Last synced: 11 Apr 2025

https://github.com/seldszar/piccha

Another tree data structure

data tree

Last synced: 16 Jul 2025

https://github.com/terracrow/tml

Easy to use data manipulation package using YAML.

data database db node npm tml yml

Last synced: 26 Feb 2025

https://github.com/nisanth2004/springboot-kafka-real-world-project-wikimedia

A project about Wikimedia that uses Apache Kafka to stream and process Wikimedia data.

async broker communication data java kafka message real-time real-time-analytics springboot wikimedia

Last synced: 14 May 2026

https://github.com/jigyasag18/aircraft-data-management

This repository offers a comprehensive simulation of global military air deployments involving 10 countries, aircraft models, mission types, and strategic zones. It analyzes air power distribution, mission intent (offensive, defensive, support), and geopolitical positioning. The project provides structured insights into regional & zone level threat

aircraft-data aircraft-performance data data-analysis data-visualization database database-management dataset datavisualisation mysql powerbi powerbi-report powerbi-visuals sql

Last synced: 04 Feb 2026