An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/lablnet/alibaba_scraper

This is a robust web scraper that extracts data from the Alibaba website. It's multi-threaded and utilizes Playwright to efficiently scrape data from the website. This script is capable of scraping the entire Alibaba site, which would take approximately 4-6 months to complete.

alibaba data ecom mit-license open-source products scraper

Last synced: 15 Mar 2025

https://github.com/project-renard/test-data

Files for testing

data

Last synced: 27 Feb 2026

https://github.com/abhash-rai/regression-car-price-prediction

This repository contains my first complete data science project, from web scraping for data through data preprocessing, cleaning, exploratory data analysis, model training, and deployment.

data data-science data-visualization eda exploratory-data-analysis machine-learning neural-network prediction prediction-model regression

Last synced: 08 May 2026

https://github.com/vianneymi/amplifai

Amplifai is a package that allows you to transform your raw unstructured text into structured data in a few lines of code.

data data-mining extraction langchain llm pydantic

Last synced: 27 Feb 2026

https://github.com/pawamoy/keycut-data

Keyboard shortcuts data stored in YAML files

data keyboard-shortcuts

Last synced: 12 Feb 2026

https://github.com/foundationallm/.github

A platform accelerating delivery of secure, trustworthy enterprise copilots.

agent ai data enterprise generative-ai large-language-model llm ml tool

Last synced: 12 Feb 2026

https://github.com/artcc/coredatademo

Demo for CoreDataGenericModule implementation

core coredata coredata-model data encrypted encrypted-data encryption persist

Last synced: 19 Jun 2026

https://github.com/lckylke/vizweb

Web application for data visualization:)

data expressjs nextjs web

Last synced: 08 May 2026

https://github.com/j0a0m4/olympics

Final Project for Data Engineering Accelerated LATAM

data olympics spark

Last synced: 13 Feb 2026

https://github.com/matthewgferrari/covid-contextualizer

A Coronavirus Contextualizer for the USA

data react visualization

Last synced: 26 Jun 2026

https://github.com/gsinghjay/ywcc-307-003

Group Presentations

cloud data government

Last synced: 04 Feb 2026

https://github.com/infinitode/pywebscrapr

An open-source Python web scraping tool. Supports both image scraping and text scraping.

data data-collection data-science open-source pip scraping web-scraper

Last synced: 14 Feb 2026

https://github.com/sanand0/iss-location

Tracks the International Space Station position. A demo of how to use GitHub Actions to schedule commits weekly.

data

Last synced: 14 Feb 2026

https://github.com/sakan811/show-leaving-soon-tracker-website

This is a Vue.js application that displays shows that are leaving each platform soon, featuring a countdown timer for each title based on the user's local timezone.

data hbo hbomax netflix shows streaming tv-shows vue vuejs web webapp website

Last synced: 18 Mar 2025

https://github.com/molinsagustin/cinedata

CineData: group practical assignment for the Data Engineering I course at Universidad Argentina de la Empresa. It consisted of developing a relational database in Microsoft SQL Server Management Studio using the Agile Scrum methodology, which was applied from requirements gathering through final implementation.

agile data data-modeling database diagram entity-relationship-diagram microsoft-sql-server relational-databases relational-model scrum scrum-agile sql sqlserver

Last synced: 28 Feb 2026

https://github.com/g-schumacher44/analyst_resource_hub

A collection of guidebooks, quickref, and resources for data analysis

analytics bigquery data lookerstudio machine-learning model python sql yaml-configuration

Last synced: 20 Jun 2026

https://github.com/chaewonkong/kaggle-competitions

Kaggle competitions and lessons

ai data kaggle-competition ml

Last synced: 15 Mar 2025

https://github.com/writetome51/public-data-container-interface

Just a TypeScript interface with 1 property: 'data'

container data interface typescript

Last synced: 15 May 2026

https://github.com/madhuresh2011/genai-powered-data-analytics-by-tata

I recently participated in Tata iQ's job simulation on the Forage platform, and it was incredibly useful to understand what it might be like to be on a data analytics team in an AI transformation consulting role.

chatgpt data dataanalytics eda excel gemini generative-ai internships powerpoint presentation

Last synced: 14 Feb 2026

https://github.com/natarizkie2/neurochain-airdrop-bot

🍋 — A smart bot designed to complete data tasks like true/false selections automatically, with multi-account support for extra convenience.

airdrop automated bot data multi-account natarizkie neurochain nodejs web3

Last synced: 10 Jun 2026

https://github.com/miniql/miniql-inline

A MiniQL query resolver for inline data.

data query query-language

Last synced: 27 May 2026

https://github.com/rezapace/newbash

This project involves managing various application shortcuts and configurations primarily for a Linux environment. It includes scripts for creating .desktop entries for applications, managing system configurations, and handling application processes.

automation backup bash data dekstop linux newbash ohmyzsh script testing zsh

Last synced: 11 Apr 2026

https://github.com/boratechlife/tensorflow-questions-datasets

TensorFlow question datasets to help you practice machine learning and train models

data datapreprocessing datasets machinelearning modeltrain questions tensorflow

Last synced: 23 Mar 2025

https://github.com/justinyahin/wpdf

Create, filter, sort, and display user data on your WordPress site.

data filtering wordpress

Last synced: 18 Apr 2026

https://github.com/basemax/okala-product-ids

A PHP script to fetch and save product IDs from Okala's online store API across multiple categories and store branches.

crawler crawler-okala crawler-php crawlers data database ids ir iran json okala okala-crawler php php-crawler product

Last synced: 09 May 2026

https://github.com/sanchittechnogeek/overscripted-analysis

Geolocation and user language extraction analysis from Mozilla Overscripted dataset

analysis data data-analysis mozilla

Last synced: 23 Mar 2025

https://github.com/psyteachr/psyteachrdata

Datasets for psyTeachR Books

data

Last synced: 23 Mar 2025

https://github.com/checco9811/data-engineering-bootcamp-homework

Homework solutions for DataExpert.io data engineering bootcamp

apache-spark data data-engineering sql

Last synced: 14 Mar 2025

https://github.com/cityofnewyork/nyco-wp-open-data-transients

Interface for saving Open Data endpoints as WordPress Transients. Maintained by @NYCOpportunity

civic-tech composer data nycopportunity open-data plugin transients wordpress

Last synced: 10 Apr 2026

https://github.com/badranalyst/covid-deaths-dashboard-with-tableau

This project showcases an interactive dashboard developed in Tableau to visualize COVID-19 deaths data. It provides insights into trends, geographical distributions, and key metrics related to mortality during the pandemic. The dashboard aims to enhance understanding of the data, supporting public health analysis and decision-making.

covid-19 dashboard data data-analysis data-visualization dataset tableau tableau-dashboards visualization

Last synced: 02 Mar 2026

https://github.com/dansalahi/query-builder-experiment

Customized Query Builder for creating Rules and Groups

data data-structures jsonlogic query-builder reactjs typescript validation

Last synced: 11 Apr 2026

https://github.com/anuppm9917/data-processing-and-csv-to-json-using-python-project

This project guides you through processing data from CSV to JSON format using Python. You'll learn to cleanse, validate, and transform data with pandas, numpy, csv, and json libraries, ensuring it's ready for POS system integration. This will help improve data integrity and streamline integration.

csv-files data data-analysis data-cleaning data-collection data-transformation data-validation python3 transformation

Last synced: 16 Apr 2026
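
The cleanse, validate, and transform steps described for the CSV-to-JSON project above can be sketched roughly as follows. This is a minimal illustration using only Python's standard `csv` and `json` modules, not the repository's own code (which names pandas and numpy); the column names `sku`, `name`, and `price` and the function `csv_to_json` are hypothetical.

```python
import csv
import io
import json


def csv_to_json(csv_text: str, required: tuple[str, ...]) -> str:
    """Convert CSV text to a JSON array, skipping rows that fail validation."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Cleanse: strip surrounding whitespace; ignore overflow columns (key None).
        cleaned = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        # Validate: drop rows where any required field is empty.
        if any(not cleaned.get(field) for field in required):
            continue
        # Transform: convert the hypothetical "price" column to a number.
        try:
            cleaned["price"] = float(cleaned["price"])
        except (KeyError, ValueError):
            continue
        records.append(cleaned)
    return json.dumps(records, indent=2)


# Hypothetical sample: one valid row, one missing a name, one with a bad price.
sample = "sku,name,price\nA1, Widget ,9.50\nA2,,3.00\nA3,Gadget,n/a\n"
print(csv_to_json(sample, required=("sku", "name")))
```

Rows that fail validation are dropped silently here for brevity; a real pipeline would more likely log or collect them for review.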

https://github.com/nel-zi/nuga_bank

Developed an automated data exploration and cleaning pipeline for Nuga Bank to streamline data preparation, ensure consistent data quality, and normalize datasets into structured databases for efficient analysis and reporting.

data data-automation data-visualization datacleaning datatransformation etl-automation etl-pipeline

Last synced: 16 May 2025

https://github.com/r-mahesh45/india-news-headlines-analysis

Excited to share my latest project: India News Headlines Analysis (2001–2023). This Power BI report dives deep into 21 years of Indian headlines, uncovering: Trends that defined the nation, Key themes that shaped public discourse, Insights into the evolution of media coverage.

data data-science powerbi visualization

Last synced: 05 Jan 2026

https://github.com/nagar2nd/financial-analysis-power-bi

This project analyzes financial and credit card usage data using Power BI and DAX, focusing on customer behavior, credit risk, and financial performance. It includes insights on spending trends, delinquency rates, churn indicators, and satisfaction scores to drive better financial management and customer retention strategies.

analysis data dax dax-functions dax-query excel powerbi

Last synced: 03 Mar 2026

https://github.com/dhimmel/adeptus

ADEPTUS -- differential gene expression signatures of disease

adeptus data differential-expression disease gene-expression genes rephetio

Last synced: 05 Jan 2026

https://github.com/metapsy-project/data-depression-anxiety-transdiagnostic

Database of transdiagnostic treatment of depression and anxiety

data

Last synced: 01 Apr 2026

https://github.com/cmda-tt/course-25-26

🎓 tech track · 2025-2026 · curriculum and syllabus 📊

d3 data datavis functional javascript programming research svelte visualization

Last synced: 20 Jan 2026

https://github.com/halyusa16/mysql-employee-analysis

This project focuses on analyzing employee data through querying, performing table joins to connect related information, aggregating salary statistics, and using subqueries to extract meaningful insights.

data data-analytics data-exploration database mysql self-project sql

Last synced: 20 Jan 2026
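
The query patterns named in the employee-analysis entry above (a table join, an aggregate over salaries, and a subquery) might look like the sketch below. The project itself targets MySQL; this example swaps in Python's built-in `sqlite3` module so it runs without a database server, and the `departments` and `employees` tables, their columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT,
        salary REAL,
        department_id INTEGER REFERENCES departments(id)
    );
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES
        (1, 'Ana', 90000, 1),
        (2, 'Ben', 70000, 1),
        (3, 'Cy', 50000, 2);
    """
)

# Join + aggregation: average salary per department.
avg_by_department = conn.execute(
    """
    SELECT d.name, AVG(e.salary)
    FROM employees AS e
    JOIN departments AS d ON d.id = e.department_id
    GROUP BY d.name
    ORDER BY d.name
    """
).fetchall()

# Subquery: employees earning more than the overall average salary.
above_average = conn.execute(
    """
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
    """
).fetchall()

print(avg_by_department)
print(above_average)
```

The SQL in both queries is plain enough that it should carry over to MySQL largely unchanged.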

https://github.com/bonnevoyager/quick-storage

Simple key/value storage module with persistency.

browser data fs indexeddb javascript key-value nodejs persistence quick server storage

Last synced: 16 Apr 2026

https://github.com/ashakoen/bls-data-extract

This repository contains scripts and a database schema to set up and manage a local SQLite database for storing and querying the Average Price data from the U.S. Bureau of Labor Statistics. It includes tools for downloading the latest data from the BLS website and fetching Consumer Price Index (CPI) data via the BLS API.

data government sqlite us

Last synced: 01 Apr 2026

https://github.com/2022-04-11588/data-fakes

🔍 Generate realistic fake data for testing and development, enhancing your projects with simple, customizable data solutions.

data dataset developer-tools fake-content faker fakery groovy java mock phoenix python random ruby seeding struct swift-framework test-data testing

Last synced: 11 Apr 2026

https://github.com/suryadev99/stream_processing_website_click_data

Stream Processing of website click data using Kafka and monitored and visualised using Prometheus and Grafana

clickdata data dataengineering docker flink-kafka flink-metrics flink-stream-processing git grafana kafka kafka-streams kafka-topic prometheus psql python

Last synced: 10 Mar 2026

https://github.com/gkannan-codes/habitableexos

With Earth’s habitability under strain, we ask: which known exoplanets could humans live on? Using NASA’s Exoplanet Archive, we score planets 0–1 (1 ≈ Earth) from five Earth-normalized features to rank top candidates.

data html kaggle matplotlib-pyplot numpy pandas plotly python seaborn visualization

Last synced: 11 Apr 2026
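
The entry above does not say how its 0–1 score is computed, so the following is only one plausible sketch. It assumes each feature is expressed as a ratio to Earth's value (Earth = 1.0), scores each feature with the similarity term used by the Earth Similarity Index, and combines the five terms with an unweighted geometric mean. The feature names, the sample planet, and the functions `similarity` and `habitability_score` are hypothetical.

```python
import math


def similarity(ratio: float) -> float:
    """Map an Earth-normalized value (Earth = 1.0) to a score in [0, 1]."""
    if ratio < 0:
        raise ValueError("Earth-normalized features must be non-negative")
    return 1.0 - abs(ratio - 1.0) / (ratio + 1.0)


def habitability_score(features: dict[str, float]) -> float:
    """Combine per-feature similarities with an unweighted geometric mean."""
    scores = [similarity(value) for value in features.values()]
    return math.prod(scores) ** (1.0 / len(scores))


# Hypothetical feature set; every value is a ratio to Earth's value.
planets = {
    "Earth": {"radius": 1.0, "mass": 1.0, "flux": 1.0, "temperature": 1.0, "period": 1.0},
    "Hypothetical-b": {"radius": 1.5, "mass": 3.0, "flux": 0.8, "temperature": 0.9, "period": 0.5},
}

# Rank candidates from most to least Earth-like.
ranked = sorted(planets, key=lambda name: habitability_score(planets[name]), reverse=True)
print(ranked)
```

A geometric mean is used so that one very un-Earth-like feature pulls the whole score down, which an arithmetic mean would partly hide.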

https://github.com/jameshenderson12/data-lists

This repository contains lists of useful data that can be used in a variety of projects.

countries data list names scottish text

Last synced: 05 Mar 2026

https://github.com/sehgal-vishal/world-population-

World Population SQL Analysis

data dataanalysis population sql

Last synced: 05 Mar 2026

https://github.com/amethyst-php/collection

As simple as its name: this package allows you to create collections of other models.

amethyst amethyst-package api collection data laravel

Last synced: 17 Apr 2026

https://github.com/smeltier/data-structures-c

This repository contains C language implementations of the main data structures covered in the Algorithms and Data Structures course. The implementations were developed as part of my hands-on learning process and include sequential lists, linked lists, and other fundamental structures.

algorithms algorithms-and-data-structures c c-language c-programming data data-structures data-structures-c structures-c

Last synced: 16 May 2025

https://github.com/scjoaoantonio/trab_datascience

This project aims to analyze posts from the Bluesky social network. The interactive application was built with Streamlit and supports data collection and visualization, as well as advanced analyses such as engagement prediction, topic modeling, and sentiment analysis.

bluesky data data-science streamlit

Last synced: 09 May 2026

https://github.com/michael-ljn/cirp-lce-2025

Prospective Global Warming Potential of Australian Low-Emission Hydrogen in a Net-Zero Emission Context

data publication

Last synced: 06 Mar 2026

https://github.com/mecha-cms/x.time

Creates page time data if it does not exist.

data date extension page time

Last synced: 23 Mar 2025

https://github.com/jprando/mattkillua

A study of .NET Core

data dbcontext domain efcore netcore

Last synced: 23 Mar 2025

https://github.com/amethyst-php/attendance

Indicate the attendance/absence of an employee in a defined office with a range of dates

amethyst amethyst-package api attendance data laravel

Last synced: 17 Apr 2026

https://github.com/khushi-sabarad/data_analysis

LinkedIn Learning capstone project

data data-engineering matplotlib pandas python

Last synced: 10 May 2026

https://github.com/peterhellberg/bugsnag-data

Dump Bugsnag data using the Data access API

bugsnag data go

Last synced: 22 Jun 2026

https://github.com/ehvenga/data.driven.modeling

Repository to practice data-driven modelling

data data-modeling

Last synced: 23 Mar 2025

https://github.com/klima7/social-insight

Web application in Flask to analyse and visualize Facebook data.

analysis data facebook flask insights python social web

Last synced: 18 Apr 2026

https://github.com/opdev1004/crumbdbjs

A JSON file-based database in JavaScript

data data-storage data-store database database-management nodejs

Last synced: 18 Apr 2026

https://github.com/bbfh-dev/protox

Go library for (de-)serializing custom protocols

binary data format go library parsing protocol reader writer

Last synced: 01 Jul 2025

https://github.com/mipacd/holochatstats

A VTuber chat log (and general) analytics platform

data flask hololive postgresql python visualization vtuber youtube

Last synced: 05 Apr 2026

https://github.com/codbex/codbex-hestia-data-sample

Sample data for codbex-hestia

data module sample

Last synced: 05 Apr 2026

https://github.com/cracko298/planet-life-save-converter

Convert your Planet-Life Saves To and From Base64 & *.planet files.

base64 base64-decoding base64-encoding data python python-script python3 save-converter save-data save-files

Last synced: 15 Mar 2025
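
A Base64 round trip of the kind the save converter above describes can be sketched with Python's standard `base64` module. The layout of `*.planet` files is not described in the listing, so this example treats a save as opaque bytes; the function names and the sample bytes are hypothetical.

```python
import base64


def save_to_base64(raw: bytes) -> str:
    """Encode raw save-file bytes as an ASCII Base64 string."""
    return base64.b64encode(raw).decode("ascii")


def base64_to_save(text: str) -> bytes:
    """Decode a Base64 string back to the original bytes."""
    # validate=True rejects characters outside the Base64 alphabet.
    return base64.b64decode(text, validate=True)


# Made-up bytes standing in for the contents of a save file.
original = b"\x00\x01planet-save\xff"
encoded = save_to_base64(original)
print(encoded)
print(base64_to_save(encoded) == original)
```

In practice the bytes would come from reading the save file in binary mode (`open(path, "rb")`) and the decoded bytes would be written back the same way.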

https://github.com/phelipe-sempreboni/certificates

Tutorial intended for information about my licenses and certificates acquired over time.

certificate certificates certification course data database datascience licences license-management marketing marketing-analytics python sql

Last synced: 16 May 2026

https://github.com/notthestallion/pca__3d-and-from-scratch__principal-component-analysis

In this project, I implement Principal Component Analysis (PCA) from scratch on an ecological footprint consumption database for countries, and in three dimensions using a movie database. The goal of this project is to gain a deeper understanding of PCA and to demonstrate its capabilities in exploring complex datasets.

data data-science database pca pca-analysis principal-component-analysis principal-component-analysis-pca principle-component-analysis

Last synced: 10 May 2026

https://github.com/omers/sre-devops-tools

Tools and useful sources for SRE and DevOps

awsome awsome-list data devops monitoring sre tools

Last synced: 20 Apr 2026

https://github.com/karo23361/toy-store-kpi-power-bi

PowerBI Portfolio Project

csv data data-visualization powerbi

Last synced: 03 Feb 2026

https://github.com/brightway-lca/bw_io

IO tools for Brightway LCA framework

bw3 data life-cycle-assessment python

Last synced: 10 Jun 2026

https://github.com/afeiship/data-arary

Data array with some new methods.

array data data-structure js list

Last synced: 11 May 2026

https://github.com/sehaj003/boston-bruins-roster-planning-mysql-nosql

Repository for Data Management project, Boston Bruins Roster Planning using MySQL and NoSQL along with data analysis using Python

data data-management mongodb mysql project-repository python

Last synced: 11 May 2026

https://github.com/schluppeck/2024-abdsa-notes

some notes related to DS's presentation

abdsa data python rstats science

Last synced: 21 Apr 2026

https://github.com/schijioke-uche/data-analysis-with-python-an-spss-model

With this Python notebook algorithm, you can use SPSS Model notebook to build machine learning pipelines that you can use to iterate rapidly during the model building process in data analysis. Whether you're trying to find the right algorithm or experimenting with different ways of preparing your data, you can create reproducible research that's easily understood by any member of your team with Hypothesis definition.

anova cp4a cp4d cp4i cp4s data ibm ibm-cloud jeffrey-chijioke-uche jeffrey-solomon-chijioke-uche openshift python python3 redhat t-test

Last synced: 22 Apr 2026

https://github.com/rbcavi/factorio-mod-data

The modpack data for factorio-viewer

data factorio factorio-data factorio-mod-data

Last synced: 23 Apr 2026

https://github.com/howwohmm/fetchgram

era-adjusted Instagram content intelligence — scrape any public profile, OCR every image, measure what actually works. free, local, no API keys.

analytics cli content-strategy data instagram ocr python scraper

Last synced: 06 Jun 2026

https://github.com/hruth-vik/sales-analysis-report

SalesScope is a powerful sales analytics dashboard that extracts insights, reveals trends, and drives strategy from raw data.

analytics data powerbi-report powerbi-visuals python

Last synced: 24 Apr 2026

https://github.com/ybelenko/openapi-data-mocker-interfaces

Package with OpenApiDataMocker interfaces.

data fake faker interface mock mocker oas oas3 openapi swagger

Last synced: 05 Jan 2026

https://github.com/soenneker/soenneker.dtos.idnamepair

A minimal Record type with an Id (string), Name (string), and maximum JSON compatibility

csharp data dotnet dto id name

Last synced: 12 Mar 2026

https://github.com/purarue/HPI-personal

Personal HPI modules/scripts

data history lifelogging

Last synced: 30 Mar 2025

https://github.com/rubix982/product-quality-classification

This is an implementation for the CIKM AnalytiCup 2017, around the topic of "Product Title Quality". The goal is to take SKUs and rank its title's clarity and conciseness. Referenced papers are attached to this repository. And as such, the aim is to craft ensemble models that either try to replicate results or find new methods for classification.

data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp

Last synced: 25 Apr 2026

https://github.com/simonbolivarpy/vault-decode-py

Simple tools for decoding crypto data from wallet extensions such as MetaMask, Ronin, TrustWallet, TronLink (old), etc.

data decode decrypt metamask passwords python ronin salt tronlink trustwallet vault

Last synced: 15 Mar 2025

https://github.com/jigyasag18/multiple-disease-detection-app

This repository contains the implementation of a Multiple Disease Detection System, which employs advanced machine learning techniques for early detection and prediction of prevalent diseases, including diabetes, heart disease, and Parkinson's disease. The system utilizes a variety of patient health metrics such as demographics and medical history.

data datapreprocessing machine-learning machine-learning-algorithms machinelearningmodel prediction python streamlit streamlit-webapp

Last synced: 07 Jun 2026

https://github.com/sebastian-diaz-berdecia/analisis-popularidad-de-series-y-generos-de-series

SQL queries for analyzing the popularity of series and series genres in the NetflixDB database.

business-analytics bussiness-intelligence data data-analysis database mysql mysql-database sql

Last synced: 12 May 2026

https://github.com/luminati-io/seleniumbase-with-proxy

SeleniumBase with authenticated proxies to bypass restrictions, enhance web scraping, and manage rotating proxies for better data extraction.

data data-collection proxy-server python residential-proxy selenium seleniumwire web-scraping

Last synced: 27 Apr 2026

https://github.com/demkeys/lazydatatransfer

Lazy method to transfer up to 64 KB of data over the network using UDP

data data-trans network python transfer udp

Last synced: 07 Jun 2026
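
The roughly 64 KB ceiling mentioned in the entry above follows from UDP's 16-bit length field: over IPv4, a single datagram can carry at most 65,507 bytes of payload. The sketch below, which is not the repository's code, shows a single-datagram transfer over the loopback interface using Python's standard `socket` module; `send_datagram` and `loopback_roundtrip` are hypothetical helper names.

```python
import socket

# Largest UDP payload over IPv4: 65,535 minus 8 (UDP header) minus 20 (IP header).
MAX_UDP_PAYLOAD = 65507


def send_datagram(payload: bytes, address: tuple[str, int]) -> None:
    """Send one UDP datagram, rejecting payloads that cannot fit."""
    if len(payload) > MAX_UDP_PAYLOAD:
        raise ValueError(f"payload exceeds {MAX_UDP_PAYLOAD} bytes")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(payload, address)


def loopback_roundtrip(payload: bytes) -> bytes:
    """Send a payload to a local receiver and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)  # UDP has no delivery guarantee, so do not wait forever
        send_datagram(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(MAX_UDP_PAYLOAD)
        return data


print(len(loopback_roundtrip(b"x" * 1024)))
```

Some operating systems cap the default datagram size well below the protocol maximum, and large datagrams are fragmented at the IP layer, so payloads near the limit may fail or be lost on real networks.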