An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/priyanka7411/customer-flight-prediction-app-mlflow

A comprehensive project predicting flight prices and customer satisfaction using machine learning models, deployed through interactive Streamlit apps.

classification customer-satisfaction data data-cleaning data-visualization feature-engineering flight-price-prediction machine-learning mlflow python regression streamlit

Last synced: 12 May 2026

https://github.com/realabbas/instagram-user-meta-data

Instagram User Meta Data 📷 can be fetched using this script in an easy to use JSON Object for displaying Instagram Cards.

data instagram javascript metadata nodejs profile user xray

Last synced: 10 May 2026

https://github.com/tushar2704/interview-quest

Interview-Quest is comprehensive collection of interview questions and answers that can help you prepare for technical interviews. Whether you're a seasoned developer looking to brush up on your skills or a job seeker preparing for your next big opportunity, this repository aims to provide valuable resources to enhance your interview readiness.

artificial-intelligence data data-science interview interview-questions machine-learning

Last synced: 23 Jan 2026

https://github.com/mheadd/SamDotNet

:office: A C# wrapper for the SAM.gov API.

api business client data gov-api government

Last synced: 30 Apr 2025

https://github.com/12joan/not-analytics

don't be creepy.

data metrics privacy

Last synced: 30 Apr 2025

https://github.com/avto-dev/data-migrations-laravel

Package for database data migrations

data database laravel migrations package

Last synced: 12 Jul 2025

https://github.com/davorg/data-tree

Perl library for handling trees

data perl tree

Last synced: 02 Apr 2025

https://github.com/katerynazakharova/common-ml

Creating this lib for ML tasks, because I'm bored of copy-pasting the same functions for different projects.

data data-processing deep-learning lib machi

Last synced: 26 Mar 2025

https://github.com/alpheustangs/jder

A standardized structure for JSON responses

api data error json response specification structure

Last synced: 26 Mar 2025

https://github.com/williamzebrowski/assistant-api

OpenAI Assistant API integrated with Elasticsearch, Logstash & Kibana

ai chatapp chatgpt conversational-ai data elasticsearch kibana llm-inference llms openai rag

Last synced: 16 Feb 2026

https://github.com/soulyma/web_crawler

A focused web crawler to extract and structure Arabic content from web pages. Designed for researchers, data analysts, and developers working on Arabic language datasets.

beautifulsoup4 crawler csv data json python structured-data

Last synced: 15 May 2026

https://github.com/eddybrando/peru-year-names

Directory of Peru's official year names

data json peru

Last synced: 23 Jul 2025

https://github.com/dhimmel/erc

Processing human Evolutionary Rate Covariation data

data erc evolution evolutionary-rate-covariation genes hetionet human rephetio

Last synced: 23 Jul 2025

https://github.com/cyberoctane29/cyclistic-bike-share--analyzing-rider-behavior

Analyzed Cyclistic's bike-share data to uncover usage differences between casual riders and annual members. Utilized SQL and MySQL for data processing, R for visualisation, and Kaggle for collaboration. Insights will guide marketing strategies to convert casual riders into annual members.

data dataanalysis dataanalytics database rlanguage rmarkdown spreadsheet sql

Last synced: 22 May 2026

https://github.com/jbdesbas/custom-scripts

Custom SQL functions or scripts

data database sql

Last synced: 23 Feb 2025

https://github.com/tupizz/data-processing-pipeline-aws

This project is a serverless application built with the Serverless Framework, TypeScript, and AWS services. It provides an enrichment service that processes contact information and enriches it with additional data.

aws data pipeline serverless typescript

Last synced: 13 May 2026

https://github.com/phatdev12/diem-thi-tuyen-sinh-10-da-nang

Danh sách điểm thi tuyển sinh 10 Đà Nẵng 2023-2024

data data-science dataanalytics dataset json

Last synced: 28 Jun 2025

https://github.com/qetdr/names-genders

Surnames, genders, and gender probabilities data extraction script and dataset

data python

Last synced: 01 May 2026

https://github.com/tbrowder/classfactory

Provides tools to create a data collection with classes to manipulate the persistent data.

class data persistent raku

Last synced: 04 Apr 2025

https://github.com/hoangsonww/fred-banking-data-analysis

💸 AI-powered banking data explorer that combines FRED API insights with vector search, regression analysis, and interactive chat via OpenAI, Claude, and Gemini. Built with TypeScript, React, and Express for seamless full-stack performance.

anthropic chartjs claude-ai data data-analysis data-analytics data-science data-visualization fred fred-api gemini google-generative-ai logistic-regression multiple-regression openai pinecone react regression typescript vector-database

Last synced: 09 Apr 2025

https://github.com/sarincr/basics-of-julia-programming-language

Julia is a high-level, high-performance, dynamic programming language. While it is a general purpose language and can be used to write any application, many of its features are well-suited for high-performance numerical analysis and computational science.

data data-analysis data-mining data-science data-visualization dataanalysis dataanalytics datascience julia julia-language julia-library julia-package julialang machine-learning

Last synced: 19 May 2026

https://github.com/ybelenko/openapi-data-mocker-server-middleware

PSR-15 HTTP Server Middleware to create mock responses from OpenAPI Schemas(OAS 3.0).

data fake faker middleware mock mocker oas oas3 openapi psr-15 swagger

Last synced: 15 Jun 2025

https://github.com/stdlib-js/array-base-count-same-value

Count the number of elements that are equal to a given value in an array.

array count countif data javascript node node-js nodejs same stdlib structure sum summation total types

Last synced: 21 Apr 2026

https://github.com/mvuorre/psyarxivdb

Datasette serving PsyArXiv preprint metadata

data datasette open-science preprints psyarxiv

Last synced: 14 May 2026

https://github.com/raigu/ordered-lists-sync

Library for synchronizing ordered data with the minimum of insert and delete operations. Suitable for lage data sets in isolated environments

data lists ordering sync syncrhonization update

Last synced: 12 Jan 2026

https://github.com/tillahoffmann/idxhound

🐶 Track indices across one or more numpy selections.

data numpy scientific-computing

Last synced: 14 May 2026

https://github.com/m-muecke/isocountry

R package containing ISO codes for countries and currencies

country-codes currency-codes data iso-3166-1 iso-4217 r r-package

Last synced: 20 Mar 2025

https://github.com/real-veersandhu/cia-country-comparison

Data analysis system on the CIA World Factbook

data

Last synced: 25 Feb 2025

https://github.com/michellepellon/jobx

A modern, powerful job scraper for LinkedIn, Indeed and beyond.

compensation data data-analysis indeed indeed-scraping jobs jobsearch linkedin linkedin-scraper

Last synced: 17 Jan 2026

https://github.com/MikeBairdRocks/Fluky

[floo-kee]: obtained by chance rather than skill.

data framework mock netcore netstandard nuget random vscode

Last synced: 02 Apr 2025

https://github.com/lunastev/wson-rust

WSON data serialization parser

data parser serialization

Last synced: 07 Apr 2025

https://github.com/2kabhishek/pokemon-stats

Gotta stat 'em all 🖲🐭

d3 data emoji pokemon rollup statistics

Last synced: 14 May 2026

https://github.com/stdlib-js/array-base-filled4d-by

Create a filled four-dimensional nested array according to a provided callback function.

alloc allocate array callback data fill filled foreach generic javascript map matrix multidimensional node node-js nodejs stdlib strided structure types

Last synced: 07 Sep 2025

https://github.com/sandravizz/global_inequality_story

Dataviz Project about Global Inequality

data data-visualization inequality

Last synced: 03 Jul 2025

https://github.com/kevinsames/spark-fuse

spark-fuse is an open-source toolkit for PySpark — providing utilities, connectors, and tools to fuse your data workflows together.

data databricks fabric pyspark python spark

Last synced: 08 May 2026

https://github.com/thomd/git-scrape-hacker-news

scrape hacker news metadata for data analysis

data data-science git-scraping hacker-news

Last synced: 16 Sep 2025

https://github.com/stdlib-js/array-base-any-by-right

Test whether at least one element in an array passes a test implemented by a predicate function, while iterating from right to left.

any array data generic javascript node node-js nodejs predicate some stdlib structure test types validate

Last synced: 14 Apr 2025

https://github.com/jonsafari/toy-data

Embeddable submodule of parallel/monolingual text data, for use in testing code and sanity checks

data language-data machine-translation nlp sanity-checks toy-data

Last synced: 06 Nov 2025

https://github.com/epogrebnyak/business-conditions-digest-2017

Replicate illustration from Business Conditions Digest

data economics

Last synced: 22 Mar 2025

https://github.com/sermetpekin/perse

Perse is an experimental Python package that combines some of the most widely-used functionalities from the powerhouse libraries Pandas, Polars, and DuckDB into a single, unified DataFrame object. The goal of Perse is to provide a streamlined and efficient interface, leveraging the strengths of these libraries to create a versatile data handling.

data data-science data-structures duckdb pandas polars

Last synced: 09 May 2026

https://github.com/millengustavo/salarios-data-science

Aplicativo Streamlit de exploração dos dados da Pesquisa de mercado de Data Science feita pelo Data Hackers

brasil brazil ciencia-de-dados data data-science heroku salarios salary

Last synced: 07 Oct 2025

https://github.com/aruneshbasak/python-dsa-problems-geeksforgeeks-160-days

I will upload my daily Python DSA problems solved on GeeksforGeeks and post it here!

algorithms-and-data-structures and data data-structures dsa python python3 structure

Last synced: 08 May 2025

https://github.com/qeeqbox/data-security

Safeguarding your personal information (How your info is protected)

data data-security infosecsimplified qeeqbox security

Last synced: 19 Mar 2026

https://github.com/qeeqbox/data-lifecycle-management

Data Lifecycle Management (DLM) is a policy-based model for managing data in an organization

data data-lifecycle-management infosecsimplified lifecycle management qeeqbox

Last synced: 07 Mar 2026

https://github.com/kerlossony/nested-formdata

Nested-FormData is a Function designed to handle nested form data structures in a simplified and efficient way. It helps in managing complex form data, making it easier to work with forms that require hierarchical data

data forms javascript nested-structures nextjs reactjs typescript

Last synced: 08 Mar 2026

https://github.com/luminati-io/crunchbase-dataset-samples

A sample of 1001 Crunchbase companies with key data points, extracted using the Bright Data API.

crunchbase crunchbase-api crunchbase-scraper data database datasets webscraper-api webscraping

Last synced: 17 Mar 2025

https://github.com/sixarm/sixarm_ruby_fab

SixArm.com → Ruby → Fab gem to fabricate sample data for testing

data fabrication factory fake gem mock ruby

Last synced: 24 Jul 2025

https://github.com/jimbrig/jimstaskviews

CRAN Task Views and Shiny App https://jimstaskviews.jimbrig.com

cran data docs rstats shiny-app submodules task-views

Last synced: 06 Mar 2026

https://github.com/cont-limno/lagosus-reservoir

Data module classifying lakes as natural lakes or reservoirs in the conterminous U.S.

data module

Last synced: 17 Jan 2026

https://github.com/DataHerb/dataherb-flora

DataHerb Flora: The core of DataHerb

data data-mining data-science datascience dataset datasets

Last synced: 08 May 2025

https://github.com/fjc0k/vue-merge-data

Intelligently merge data for Vue render functions.

data merge-data render-functions vue

Last synced: 17 May 2026

https://github.com/mikebairdrocks/fluky

[floo-kee]: obtained by chance rather than skill.

data framework mock netcore netstandard nuget random vscode

Last synced: 17 May 2026

https://github.com/ate329/nsl-kdd-feature-extractor

Python-based tool designed to process network traffic packets and extract features compliant with the NSL-KDD dataset format.

cyber-security cybersecurity data data-science extractor feature-extraction machine-learning network-analysis nsl-kdd nsl-kdd-dataset

Last synced: 30 Oct 2025

https://github.com/snegovoy98/data-storage

This is test version of data storage

data of storage test version

Last synced: 19 Jul 2025

https://github.com/prioritizr/prioritizrdata

Conservation planning data sets

data r spatial-data

Last synced: 19 Jul 2025

https://github.com/inzhenerka/scooters_data_uploader

Загрузка данных в PostgreSQL в рамках курса по dbt от Инженерка.Тех

data dbt postgresql

Last synced: 04 May 2026

https://github.com/bytraembedded/Laptop-Price-Prediction-with-Machine-Learning

The Laptop Price Prediction with Machine Learning project provides a system to predict the price of laptops based on various features such as processor type, RAM size, storage capacity, and more/

airflow data data-science data-visualization fastapi heroku-deployment machine-learning-algorithms matplotlib-pyplot numpy pandas python reactjs seaborn

Last synced: 30 Dec 2025

https://github.com/redodo/shipper

Hide encrypted data in files.

audio data images python steganography

Last synced: 26 Mar 2025

https://github.com/am-i-groot/summer-intern-iitguwahati-spml

Developed an automated Water Quality Monitoring System (WQMS) at IIT Guwahati, using the pH-W218 sensor and K-Means Clustering to assess water potability. The project enhances water quality evaluation through machine learning-based classification.

algorithm data data-visualization kmeans-clustering machine-learning python report sensor signal-processing

Last synced: 17 May 2026

https://github.com/muhammad-fiaz/ason

ASON: Adaptive Structured Object Notation - Python library for dynamic data serialization, providing flexibility and simplicity.

adaptive-structure-object-notation api ason cli client data file file-format file-sharing file-upload json json-data json-parser open-source opensource parser parsing python python3

Last synced: 02 Feb 2026

https://github.com/desilinguist/hanukkah-of-data-2022

My solutions to Hanukkah of Data 2022

2022 data hanukkah pandas python

Last synced: 17 May 2026

https://github.com/saboye/web-scraping-with-python

A web scraping project using Python's "Requests" and "BeautifulSoup" libraries to extract structured data from one or more websites. This project involves sending HTTP requests to the target website(s), retrieving the HTML content of the website(s), and parsing this content to extract the desired data in a usable format.

beautifulsoup csv data data-harvesting data-mining python request web webscraping

Last synced: 18 Jul 2025

https://github.com/giscience/measures-rest-oshdb-docker

Scripts for starting measures for geospatial datasets in docker container, using the OSHDB

data dggs docker geospatial mesure openstreetmap rest

Last synced: 18 Apr 2026

https://github.com/akhi07rx/f1-statistics-dashboard

A comprehensive command-line tool for analyzing Formula 1 race data using the FastF1 library.

akhi07rx cli cli-tools data f1 f1-score f1cli f1dashboard f1stats fastf1 formula1 opensource race race-analytics

Last synced: 23 May 2026

https://github.com/velocitatem/cellviz

Cellular Automata inspired by live-data visualization, designed to handle multidimensional and high-throughput data efficiently.

cellular-automata conways-game-of-life data economics

Last synced: 29 Jul 2025

https://github.com/lakecountryhuntclub/dnr-map-data-model

Data Model for the 2023 DNR Pheasant Stocking Property Data

data data-model documentation excel gis hunting mapping powerquery vba

Last synced: 29 Jul 2025

https://github.com/vtalks/youtube_data_api3

A python3 library to interact with Youtube Data API.

api client data library python python3 youtube

Last synced: 09 Apr 2026

https://github.com/charliecm/meteorite-landings

Data visualization of meteorite landings on Earth.

astronomy d3 data data-visualization mapbox space visualization

Last synced: 18 Apr 2026

https://github.com/fairspec/fairspec-application

Fairspec Application is a visual tool for managing and validating tabular and structured data

ckan csv data dataset excel fair fairspec json ods polars python quality schema sqlite table typescript validation zenodo

Last synced: 23 May 2026

https://github.com/joeyism/py-cifar10

This library was created to allow an easy usage of CIFAR 10 DATA. This is a wrapper around the instructions givn on the CIFAR 10 site

cifar cifar-10 cifar10 data machine-learning machinelearning

Last synced: 30 Jul 2025

https://github.com/asuozzo/medicare-data-analysis

An analysis of Medicare Part D data in Vermont

data python

Last synced: 04 May 2026

https://github.com/mouneshgouda/learn_dsa

This repository explores fundamental data structures and their implementations. Learn how to organize and manipulate data efficiently for various programming tasks. (Feel free to add your specific focus areas here, e.g., algorithms, interview prep)

c data queue sorting-algorithms stack structured-data

Last synced: 30 Jul 2025

https://github.com/visenger/prada

Profiling Datasets

cleaning data dataset profiling

Last synced: 24 Aug 2025

https://github.com/gappeah/london-housing-price-dashboard

This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.

data data-analysis data-visualization excel visual

Last synced: 31 Jul 2025

https://github.com/derrickbaruga7/python-data-analysis

This project analyzes ORU’s off-season sewer usage using Python, with `pandas` for data handling, histograms and line plots for exploration, and a `scipy`-based model for prediction. Pearson’s correlation and visualizations help reveal key trends and relationships.

analytics data data-science visualization

Last synced: 31 Jul 2025

https://github.com/dannyben/datamix

DSL for manipulating tabular data

csv data data-analysis data-engineering gem ruby tabular-data

Last synced: 31 Jul 2025

https://github.com/flowsynx/plugin-postgresql

FlowSynx plugin to interfaces with PostgreSQL for CRUD operations. Supports JSONB, full-text search, and advanced query features.

data database flowsynx postgresql postgresql-database sql

Last synced: 09 May 2026

https://github.com/chandraprakash-bathula/keywords_prediction-machine-learning-integration

Keywords Prediction Model Built the Model By: Data Cleaning Removing Stopwords Constructing Word2vec Advancing to TF-IDF Weighted Word2vec.

algori artifici data machine-learning tf-idf weighted-word2vec word2vec

Last synced: 08 Nov 2025

https://github.com/ajsalemo/python-pandas-datalib

Testing and experimenting with some simple Pandas functionality using Flask to serve the parsed data.

csv data flask json pandas pandas-dataframe pandas-series python tabular tabular-data terminal

Last synced: 09 Apr 2026

https://github.com/tonykipkemboi/ens_subgraph_data

Query On-Chain Data from Subgraphs by The Graph Protocol using Python

data subgraphs thegraphprotocol web3

Last synced: 17 Sep 2025

https://github.com/stephaniehicks/flowsorted.blood.wgbs.blueprint

A Bioconductor ExperimentHub data package for flow sorted purified whole blood cell types measured using DNA methylation on WGBS platform from BLUEPRINT

bioconductor bioconductor-package bisulfite-sequencing blood data dna-methylation flowsort wgbs

Last synced: 25 Sep 2025

https://github.com/v6ntage/sql-sales_data-analytics-project

This repository contains a SQL scripts demonstration analytical techniques.

analytics business-analytics data data-analysis database query sql sql-server

Last synced: 12 Apr 2026

https://github.com/chalk-ai/roadmap

Chalk public roadmap

chalk data data-science mlops pipeline python

Last synced: 17 Jan 2026

https://github.com/simranjeet97/leetcode_practice

Practicing the Leet Code Codes for Competitive Programming

algorithms amazon coding competitive-programming data data-structures facebook google leetcode python

Last synced: 03 Aug 2025

https://github.com/theryston/db-mycro

A node module with a json database that saves data in a specific directory, similar to sqlite, but in JSON

base crud data database db db-mycro javascript json jsondatabase nodejs nosql typescript

Last synced: 09 Apr 2026

https://github.com/undistraction/grid-model

A small API for creating a grid and accessing the positions of the cells, rows and columns within it.

2d calculations cells data grid layout model

Last synced: 04 Aug 2025

https://github.com/tiaanduplessis/country-currency-data

Data about currencies of countries

countries currencies data symbols

Last synced: 08 Aug 2025