An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/allanotieno254/powerbi-dax-filter-context

This repository contains a Power BI project that explores **DAX Filter Context**, a crucial concept in DAX calculations. The project focuses on **Bank Loan Analysis**, demonstrating how different filter contexts affect DAX formulas.

business-intelligence data data-analysis dax dax-functions powerbi powerbi-visuals visualization

Last synced: 08 Jan 2026

https://github.com/tupizz/python-data-manipulation

Data manipulation and visualization with Python 2.x

csv data pandas python

Last synced: 09 May 2026

https://github.com/abhroroy365/market_analysis

This project explores customer segmentation and market analysis in the context of online retail using an online retail dataset. By applying advanced analytics, we aim to uncover insights that can drive strategic decisions and enhance business performance.

clustering data data-analysis data-visualization kmeans-clustering machine-learning market-analysis python silhouette-analysis

Last synced: 09 May 2026

https://github.com/fatihemres/pinch

File reader app with SwiftUI. Using data and models.

data models swift swiftui

Last synced: 17 May 2026

https://github.com/blackroad-os-inc/blackroad-portal

BlackRoad Portal — unified search routing to 30+ BlackRoad services.

blackroad cloudflare-workers data search

Last synced: 04 Apr 2026

https://github.com/scx567888/scx-data

✨ SCX Data

data java scx

Last synced: 05 Apr 2025

https://github.com/smac-group/smacdata

Data sets used in various packages.

data r

Last synced: 02 Apr 2025

https://github.com/thicclatka/tetration

New file format for tensors

cli data fileformat mmap tensors

Last synced: 26 May 2026

https://github.com/luminati-io/Google-Maps-dataset-samples

A sample dataset of over 1000 Google Maps businesses, extracted using the Bright Data API, ideal for competitor analysis, location-based marketing, and market strategies.

api data dataset google-maps maps web-scraping

Last synced: 09 Apr 2025

https://github.com/posixpascal/apple_appstore_search

📊 get public App Store data of your app in a ruby hash — that's it.

appstore data gem ios ruby

Last synced: 16 Mar 2025

https://github.com/nmsud/formdata

🗃️ Data from the NMSUD Form submissions

api data json unification-day

Last synced: 16 May 2026

https://github.com/grace-mengke-hu/redditpushshiftapi

This package collects Reddit datasets and organizes the data in a MongoDB database.

collection data reddit

Last synced: 13 Jun 2025

https://github.com/sahraiidle/email-spam-detector

Email/SMS spam detector with a Flask UI/API, tuned ML models (TF‑IDF + SVM/LogReg/NB), and a ready-to-run web form plus JSON endpoint for predictions.

data machine-learning numpy pandas python randomforest scikit-learn spam-classifier spam-detection svm

Last synced: 24 Jan 2026

https://github.com/robertoostenveld/dccn.dsc_3015055.00_583_v1

The FieldTrip-SimBio Pipeline for EEG Forward Solutions [Data set].

data datalad open-data

Last synced: 24 Jan 2026

https://github.com/woctezuma/hidden-gems-data

Data available to compute regional rankings of hidden gems.

data hidden-gems steam steam-reviews

Last synced: 06 Feb 2026

https://github.com/eugenedakin/des-encryption-decryption

Encrypt and Decrypt text in Xojo using DES - Written in Native Xojo Language - Cross Platform

data data-encryption-standard decryption des encryption standard xojo

Last synced: 24 Feb 2026

https://github.com/flexthink/matricize

A convenience library to convert between pure Python objects and their vectorized representations

data machine-learning numpy python

Last synced: 09 May 2026

https://github.com/atharvapathak/twitter_sentiment_analysis_project

Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.

api bag-of-words bert cnn data gbm nltk rnn spacy twitter

Last synced: 28 Jan 2026

https://github.com/pdoup/enegry

Time-Series dataset combining multiple sources to explain the broader Greek energy market

data dataset day-ahead-auction energy-markets exploratory-data-analysis forecasting futures-market greek-energy-market renewable-energy time-series-data weather-data

Last synced: 07 May 2025

https://github.com/quangandrei1003/france_air_pollution_pipeline

End-to-end air pollution data pipeline for French metropolitan cities using Airflow, Python, dbt, BigQuery.

airflow bigquery data data-analytics data-engineering data-modeling data-visualization dbt docker etl pandas python terraform

Last synced: 13 Apr 2026

https://github.com/white-gecko/lineage-dump

RDF dump of the device information from the lineage wiki

data dataset lineageos rdf

Last synced: 28 May 2026

https://github.com/dynamiatools/module-importer

DynamiaTools extension for importing data from Excel files

data dynamia excel import java zk

Last synced: 06 Feb 2026

https://github.com/etmendz/mendz.data.oracle

Provides a generic Mendz.Data-aware context for ADO.Net-compatible access to Oracle databases.

ado-net context data database datasettings mendz oracle

Last synced: 13 Apr 2026

https://github.com/naitiknayak196/tech-layoffs-cleaning-sql-vs-python

This project cleans and analyzes a tech layoffs dataset using MySQL and Python (Pandas) to compare their efficiency in data processing. It provides business insights into workforce trends, industry stability, and economic impacts to support data-driven decision-making.

data datacleaning dataset jyputer-notebook layoffdata layoffs mysql python sql

Last synced: 09 May 2026

https://github.com/maxisoft/yahoo-finance-data-downloader

Automate downloading historical and recent stock data from Yahoo Finance.

data stock-market yahoo-finance

Last synced: 29 Jan 2026

https://github.com/ismailhakkii/digital_vault

This project can be used for securing data, similar to a real vault.

data digital security-data vault

Last synced: 25 Mar 2025

https://github.com/istinnew/etl-pipeline-ganz-project

End-to-end ETL pipeline project for collecting, transforming, and loading data into a cloud-based database using Python, MySQL, and Google Cloud Analytics

cloud cloud-engineering cloud-services data data-science dataanalytics database database-schema googlecloud mysql mysql-database python python-lambda

Last synced: 20 Apr 2026

https://github.com/zhukovanan/stepik_

Completed tasks from various data science and computer science courses on Stepik

data statistical-learning statistics stepik-course

Last synced: 21 Apr 2026

https://github.com/schluppeck/2024-abdsa-notes

some notes related to DS's presentation

abdsa data python rstats science

Last synced: 21 Apr 2026

https://github.com/mozzo1000/web-analytics

Website analysis tools and data

analysis analytics data website

Last synced: 21 Apr 2026

https://github.com/vishwas-chakilam/movies-review-scraping-analysis

A project for collecting, cleaning, and analyzing movie data. Includes scripts for web scraping (deprecated) and using the OMDb API to fetch movie details. Analyze and visualize data with Python and Power BI to uncover insights and trends in movie ratings and genres.

data dataanalysis datacleaning datavisualization matplotlib-python numpy-library pandas python webscraping

Last synced: 21 Apr 2026

https://github.com/stefen-taime/llm-rag-mtl-public-hospital

This project develops a Retrieve-Augment-Generate (RAG) model that answers questions using public data from Google reviews of hospitals in Montreal.

data google-reviews hopital hospital hub ia llm montreal open-source quebec rag

Last synced: 21 Apr 2026

https://github.com/schijioke-uche/data-analysis-with-python-an-spss-model

This Python notebook uses an SPSS Model notebook to build machine learning pipelines that let you iterate rapidly during model building in data analysis. Whether you're trying to find the right algorithm or experimenting with different ways of preparing your data, you can create reproducible research, with a hypothesis definition, that any member of your team can easily understand.

anova cp4a cp4d cp4i cp4s data ibm ibm-cloud jeffrey-chijioke-uche jeffrey-solomon-chijioke-uche openshift python python3 redhat t-test

Last synced: 22 Apr 2026

https://github.com/grimen/python-humanizer

A human/developer friendly value humanizer - for Python.

data debug debugging format formatting humanize humanizer log logging print printing value

Last synced: 05 Jun 2026

https://github.com/howwohmm/fetchgram

era-adjusted Instagram content intelligence — scrape any public profile, OCR every image, measure what actually works. free, local, no API keys.

analytics cli content-strategy data instagram ocr python scraper

Last synced: 06 Jun 2026

https://github.com/hruth-vik/sales-analysis-report

SalesScope is a powerful sales analytics dashboard that extracts insights, reveals trends, and drives strategy from raw data.

analytics data powerbi-report powerbi-visuals python

Last synced: 24 Apr 2026

https://github.com/mlkav/tri-hita-karana

Project Tri Hita Karana - Future Knowledge G20 Bali. DTS Kominfo x Binar Academy.

bali data data-science g20 science

Last synced: 06 Jun 2026

https://github.com/shwetajanwekar/prediction-with-regression

Prediction with regression for the salary_hike and delivery time datasets

data data-science datset exploratory-data-analysis matplotlib pandas plot prediction r2-score seaborn sns

Last synced: 25 Apr 2026

https://github.com/denisecase/620-mod6-web-scraping

Notes on how to get started scraping content from the web

beautifulsoup4 data mining python

Last synced: 11 Apr 2025

https://github.com/saikatharryc/motionchart-d3js

A dynamic motion chart built with D3.js.

chart d3js data data-science

Last synced: 23 Dec 2025

https://github.com/elimu-ai/ml-event-simulator

🤖 Simulation of learning events and assessment events

data learning-analytics machine-learning ml

Last synced: 28 Feb 2025

https://github.com/demkeys/lazydatatransfer

Lazy method to transfer up to 64 KB of data over the network using UDP

data data-trans network python transfer udp

Last synced: 07 Jun 2026

https://github.com/o-rumiantsev/exchange

Data Exchange System (Prototype)

chat css data exchange system websocket

Last synced: 27 Apr 2026

https://github.com/oguzhanfatihkucuk/data-analytics-project-kafka-spark

The data in this project was collected in a database using Apache Kafka and processed with Apache Spark Streaming. The project aims to create a forecasting model and analyze sales forecasts per customer.

big-data data data-visualization hadoop kafka ml mlpipeline plt pyhton spark

Last synced: 28 Apr 2026

https://github.com/n-ce/localstorage-data-interchange-manager

Implementation of local storage data interchange using map data structure.

data export import javascript js-maps json localstorage

Last synced: 28 Apr 2026

https://github.com/kitpymes/netcore-serialize-data

The goal is to protect secret data by encrypting and serializing .json files and converting them into .dat files.

csharp data decrypt encrypt json net netcore2 serialize

Last synced: 29 Apr 2026

https://github.com/mtalhaofc/nutrition_system

A simple AI-powered web app built using Streamlit that provides personalized weekly meal plans and nutrition recommendations based on user demographics, health goals, and nutritional preferences.

cosine-similarity data data-science food machine-learning model nutrition pandas python streamlit

Last synced: 29 Apr 2026

https://github.com/stdlib-js/array-struct-factory

Return a constructor for creating arrays having a fixed-width composite data type.

array composite data factory javascript node node-js nodejs stdlib struct structure typed typed-array types

Last synced: 29 Apr 2026

https://github.com/shoaib1522/data-aggregator-tool-in-python

Illustrations of the techniques used in the "Data Aggregation Tool" scenario for a Data Science Engineer, as written up in a document (PDF)

data data-science dataaggregation lists python-script python3 sets-python tuples

Last synced: 29 Apr 2026

https://github.com/diegoperea20/pytorch-vs-tensorflow

Tests the differences between the PyTorch and TensorFlow libraries across prediction and classification applications; each performs better depending on the problem or dataset it is assigned.

classification data images prediction pytorch tensorflow

Last synced: 29 Apr 2026

https://github.com/istinnew/eniac_ab_insight

Dive into a comprehensive analysis aimed at boosting iPhone 13 sales by optimizing the Click-Through Rate (CTR) of the “SHOP NOW” button. Compare different button designs and determine the most effective strategy for increasing engagement.

ab-testing data data-analysis data-engineering data-science data-visualization google googlecolab libraries python testing testing-tools visual-studio-code

Last synced: 29 Apr 2026

https://github.com/dxtaner/graphql_events

Graphql-Events

data events graphql

Last synced: 29 Apr 2026

https://github.com/gvatsal60/ds-on-kaggle

A collection of data science projects, experiments, and insights from Kaggle competitions and datasets

data data-science data-visualization numpy pandas python3

Last synced: 29 Apr 2026

https://github.com/patrickdavies100/pipeline38

An application to automate the creation and execution of SQL queries.

data pandas-dataframe pipeline postgresql psycopg2 sqlalchemy

Last synced: 30 Apr 2026

https://github.com/ddeepanshu-997/datascience-e-commerce-shopping-details-

In this project I apply data preprocessing techniques to clean the dataset using libraries, derive insights and analyses to find out the hot picks in shopping, and use data visualization libraries to reveal trends and other aspects, as a small contribution to the field of data science.

cleaning-data data data-science data-visualization dataframe datapreprocessing dataset libraries matplotlib-pyplot numpy pandas plots python visualization

Last synced: 30 Apr 2026

https://github.com/dhimmel/hgnc

Extracting human gene families from HGNC

data gene-families genes hgnc hugo human

Last synced: 01 May 2026

https://github.com/lut-ful/ibm-capstone-project-stack-overflow-job-survey

IBM Data Analyst Professional Certificate program final project.

cognos data data-analytics looker power-bi python sql statics

Last synced: 01 May 2026

https://github.com/acovaci/orbit

ORBIT: an Open source Rust-based implementation of a data Build Tool, inspired by DBT

cargo clap-rs data data-warehouse dbt rust rust-lang tokio-rs

Last synced: 16 Mar 2025

https://github.com/linguini1/edueval

The BorealisAI Let's Solve It mentorship project: summarizing student feedback submissions on their professor into one cohesive paragraph for faculty consideration during performance reviews.

ai data data-analysis data-science machine-learning machinelearning nlp python pytorch sentiment-analysis

Last synced: 01 May 2026

https://github.com/eshitakundu/disease-outbreak-predictor

Disease Outbreak Predictor: A Streamlit-based web application for predicting diabetes, heart disease, and Parkinson's disease using machine learning models.

data data-science disease-prediction healthcare-application jupyter-notebook machinelearning ml notebook prediction python streamlit streamlit-webapp

Last synced: 01 May 2026

https://github.com/matthewgferrari/covid-contextualizer

A Coronavirus Contextualizer for the USA

data react visualization

Last synced: 26 Jun 2026

https://github.com/0xhericles/spamdetector

📧 A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 02 May 2026

https://github.com/jneidel/animal-names

Dataset of 100 common animal names

animals data dataset json names opendata

Last synced: 25 Mar 2025

https://github.com/radekbednarik/covid-czech-data-api

Library to make it easy to work with REST API of official Czech Covid data.

api covid-19 data deno library typescript

Last synced: 02 May 2026

https://github.com/tn3w/moviedb-json

A JSON library with 981,530 films.

data database db json movie movie-database movies

Last synced: 03 May 2026

https://github.com/v-mayya/quantitative-analysis-data-dashboard

Quantitative survey data analysis using R

data data-analysis data-visualization flourish r

Last synced: 01 Apr 2025

https://github.com/qrailibs/dataflow

✨ Data processing in Node.js made multithreaded and type-safe.

data dataprocessing multithread node

Last synced: 04 May 2026

https://github.com/maxwelllzh/gis-tutorial-

Tutorials for Columbia University GIS Club

data python

Last synced: 04 May 2026

https://github.com/keminghe/osu

Unofficial and publicly-available NPM data-package about The Ohio State University.

college data majors ohio-state organizations public students university unofficial

Last synced: 06 Jan 2026

https://github.com/jdanielgoh/cobertura-campanias

In a democracy, is there room for every voice? A project to visualize the INE's radio and TV monitoring of the 2024 presidential candidacies

d3js data datavisualization vue

Last synced: 09 Jun 2026

https://github.com/bkataru/spotigo

AI-powered local music intelligence platform with a task runner server core that retrieves and backs up Spotify account data to one or more storage targets at set intervals

ai backup cron data go intelligence local-llm music ollama rag runner spotify task-runner tool-calling

Last synced: 16 Jan 2026

https://github.com/srevenant/data-science-alpine

A Docker container for data science, using Alpine Linux and Python 3

alpine data numpy pandas python3 science scipy xgboost

Last synced: 05 May 2026

https://github.com/edjoukou/pizza-sales-report

A data analysis project using SQL with MySQL database

analysis data mysql powerbi visualization

Last synced: 05 May 2026

https://github.com/munas-git/codm-review-analysis-and-predictions

Sentiment analysis on Call of Duty Mobile Google Play Store user reviews with ML model to classify new reviews.

data flask machine-learning python sentiment-analysis

Last synced: 05 May 2026

https://github.com/rdmurphy/deno-quaff

A port of the quaff Node.js library to Deno.

archieml csv data deno json toml yaml

Last synced: 05 May 2026

https://github.com/chanchalsoorma/web-scraping

This repo aims to provide a straightforward, easy-to-use scraping code written in Python.

beautifulsoup beautifulsoup4 data python request selenium webscraping

Last synced: 05 May 2026

https://github.com/sohomm/predict-insurance-charges

A predictive model to estimate insurance charges based on a client's attributes, such as age and health factors. It offers a practical application of ML in business, enabling more accurate pricing models and helping companies manage risk while delivering personalized pricing strategies to clients.

administration algorithm bot data decision-trees download easy finance github java machine-learning management model neural-network nlp prediction project science trading university

Last synced: 05 May 2026

https://github.com/donmaruko/python-eda-toolkit

CLI-run EDA with 30 commands covering text-related functions, statistical calculations, data visualization, and data manipulation.

data data-analysis data-science data-visualization matplotlib pandas scipy seaborn statistical-analysis statistics wordcloud

Last synced: 06 May 2026

https://github.com/ksm26/ml-ai-data-science-jobs-in-canada

Explore the latest machine learning, artificial intelligence, and data science job opportunities in Canada. Stay informed about Canadian tech job market trends and find your next career move.

ai-canada ai-careers canada canadian-tech-companies canadian-tech-job-market data data-analysis data-engineering data-science data-science-careers machine-learning prompt-engineering robotics

Last synced: 06 May 2026

https://github.com/parthds02/analyzing-student-success-with-data

Discover key factors influencing student performance through data analysis and visualization. Explore gender, parental education, sports, and ethnicity impacts.

data datascience jupyter-notebook kaggle python pythonlibraries

Last synced: 06 May 2026