An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/varun-khorgade/sentimentscope-e-commerce-review-analyzer

Analyzed customer reviews and purchase data to extract sentiment and behavioral insights. Built SQL-based ETL for data preparation and visualized results using Python and Power BI dashboards for actionable business decisions.

analytics customer-beheviour dashboard data data-visualization dataextraction natural-language-processing nlp pandas powerbi python sentiment-analysis sql textblob

Last synced: 17 Apr 2026

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/kamal-singh22/ai-driven-emotional-sentiments-analysis

This project leverages machine learning to analyze and classify the emotional sentiment of textual data. The goal is to accurately identify and categorize emotions, aiding applications in customer feedback analysis, social media sentiment analysis, and mental health monitoring.

analysis artificial-intelligence data emotion nlp-machine-learning python sentiment-analysis streamlit text-classification

Last synced: 14 Apr 2026

https://github.com/danielgiljam/orbit-utils

A collection of utility packages for Orbit.js.

data inference orbit orbitjs schema synchronization type typescript validation zod

Last synced: 01 May 2026

https://github.com/stdlib-js/datasets-cdc-nchs-us-births-1969-1988

US birth data from 1969 to 1988, as provided by the Centers for Disease Control and Prevention's National Center for Health Statistics.

america babies births data dataset datasets javascript node node-js nodejs stdlib time-series timeseries united-states us usa

Last synced: 19 Apr 2025

https://github.com/athari22/house_sales_in_king_count_usa

This project performs a data analysis for a Real Estate Investment Trust that would like to start investing in residential real estate.

analysis data data-science data-visualization ibm ibm-watson linearregression machine-learning matplotlib numpy pandas sklearn-library

Last synced: 01 May 2026

https://github.com/FAIMS/OpenDataPresentation

Brian Ballsun-Stanton's presentation

context data presentation

Last synced: 03 Apr 2025

https://github.com/dennyglee/open-covid19-public

A collaboration between SCRI and Databricks on the analysis of open COVID-19 datasets.

covid-19 data data-analytics data-engineering data-science nlp

Last synced: 22 Jun 2025

https://github.com/lagden/injection

Inject data into file

data file inject nodejs

Last synced: 24 Apr 2026

https://github.com/lmuffato/project-restaurant-orders-trybe

Restaurant orders project - Trybe assessment project for Block 36: Data Structures I: Arrays, Hashmaps, and Sets

array array-set csv data data-analysis hashmap python set trybe trybe-projects

Last synced: 13 Sep 2025

https://github.com/0xleif/onionstash

Store Onions 🧅

data swift

Last synced: 05 Apr 2025

https://github.com/gcoronelc/ucv_gdi-1_202302-b2

Data and Information Management Workshop I with Gustavo Coronel.

data data-science data-structures database databases online oracle query relational-databases security sql sql-server

Last synced: 19 May 2026

https://github.com/snegovoy98/data-storage

This is a test version of a data storage.

data of storage test version

Last synced: 19 Jul 2025

https://github.com/tobinchilongo/oop-school-library

This project consists of a Ruby script for the school library app. I implemented encapsulation and inheritance with Ruby by creating classes to represent students and teachers in the school.

data database gemfile input-output preserve rspec-testing rubocop unit-test

Last synced: 02 May 2026

https://github.com/bhpcv252/dda-binapprox-on-fits

Using the binapprox algorithm to efficiently estimate the median of each pixel from a set of astronomy images in FITS files.

astronomy data median python

Last synced: 22 Mar 2025
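
The binapprox idea mentioned in the entry above can be sketched roughly as follows. This is a minimal illustration on a plain Python list, not the repository's code: the function name `binapprox_median` and the `n_bins` parameter are assumptions made here, and a per-pixel version over FITS images would apply the same steps to each pixel's stack of values. The scheme relies on the fact that the median always lies within one standard deviation of the mean, so only that interval needs to be binned.

```python
import math


def binapprox_median(values, n_bins=1000):
    """Approximate the median by binning values within one std of the mean.

    Hypothetical helper for illustration; the result is accurate to
    roughly one bin width (2 * std / n_bins).
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")

    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return mean  # all values are identical

    lower = mean - std
    upper = mean + std
    width = 2 * std / n_bins

    below = 0              # number of values smaller than mean - std
    bins = [0] * n_bins    # histogram over [mean - std, mean + std]
    for v in values:
        if v < lower:
            below += 1
        elif v <= upper:
            idx = int((v - lower) / width)
            bins[min(idx, n_bins - 1)] += 1  # clamp the right edge

    # Walk the bins until the running count reaches the middle rank.
    target = (n + 1) / 2
    count = below
    for i, c in enumerate(bins):
        count += c
        if count >= target:
            return lower + (i + 0.5) * width  # midpoint of that bin
    return upper  # safety fallback for floating-point edge cases


approx = binapprox_median([1, 2, 3, 4, 100])
```

Because only counts are stored, the data can be streamed once for the mean and standard deviation and once for the histogram, which is what makes the approach attractive for large image stacks.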

https://github.com/eugenedakin/steganography-pictures

Add and remove a picture-in-a-picture with steganography

compare data steganography steganography-tools xojo

Last synced: 12 Feb 2026

https://github.com/patelabhi574/hotel_reservation_analysis

Analyzes data collected by a hotel to help the owner predict which segments generate the most profit, and to identify the booking patterns and trends seen over past years at different times of the year, as well as website price setting at peak times according to the availability index.

data data-visualization datamodeling looker-studio powerbi reporting sql-query sql-server

Last synced: 19 Feb 2026

https://github.com/jensz12/uhc

Datapack for Minecraft 1.13+ UHC

data minecraft pack

Last synced: 21 Sep 2025

https://github.com/aleklukanen/chapterhousedb-example-app

An example application using the ChapterhouseDB processing engine

arrow data database event golang parquet processing stream

Last synced: 18 Apr 2026

https://github.com/prioritizr/prioritizrdata

Conservation planning data sets

data r spatial-data

Last synced: 19 Jul 2025

https://github.com/lunastev/reflectlm

ReflectLM is a self-reflective, language-structure-only AI model that learns exclusively through interaction. It starts with zero factual knowledge but can engage in dialogue, evaluate its own responses, and remember conversations for future learning.

ai data language-model llm model open-source ts web

Last synced: 22 Jun 2025

https://github.com/maxnowack/elastic-sync

Connector to sync MongoDB documents into an Elasticsearch index

data elasticsearch mongodb sync

Last synced: 20 Jan 2026

https://github.com/gbburleigh/quick-seeders

Generate realistic test data quickly with Quick-Seeders, a Python library offering a wide range of data types and schema definitions. Control data variance, probabilities, and output formats, including SQL. Simplify your data seeding process and improve testing efficiency.

data dataset faker generator python seeder sql test

Last synced: 03 Apr 2025

https://github.com/bytraembedded/Laptop-Price-Prediction-with-Machine-Learning

The Laptop Price Prediction with Machine Learning project provides a system to predict the price of laptops based on various features such as processor type, RAM size, storage capacity, and more.

airflow data data-science data-visualization fastapi heroku-deployment machine-learning-algorithms matplotlib-pyplot numpy pandas python reactjs seaborn

Last synced: 30 Dec 2025

https://github.com/mustika-putri-m/-tableu-laporan-data-karyawan-growian

I am currently pursuing a data analysis certification at GROWIA, where I've learned to use tools such as Python, SQL, Google BigQuery, Google Data Studio, Advanced Microsoft Excel, and Tableau. This course has enhanced my ability to analyze data using KPIs and business metrics, enabling me to solve business problems more effectively.

data data-visualization tableau

Last synced: 17 Feb 2026

https://github.com/ferhatgec/tuc

TinyUrl CLI, generate short link/s from terminal.

data little python3 request script

Last synced: 18 Feb 2026

https://github.com/amyflo/cs448b

Exploring r/LoveLetters

d3-visualization d3js data react reactjs visualization

Last synced: 18 May 2026

https://github.com/clabe45/kaz

Minimalistic local storage cli

cli data minimalistic storage utility

Last synced: 17 Jul 2025

https://github.com/cont-limno/lagosus-reservoir

Data module classifying lakes as natural lakes or reservoirs in the conterminous U.S.

data module

Last synced: 17 Jan 2026

https://github.com/pythongiant/data-analytics-wolfram-alpha

A data analysis program using Wolfram Alpha

analytics api data wolfram-alpha

Last synced: 04 Apr 2025

https://github.com/public-health-scotland/waiting_times_clinical_prioritisation

This repository contains the Reproducible Analytical Pipeline (RAP) to produce the quarterly statistics on clinical prioritisation, part of the Stage of Treatment (SoT) publication.

data healthcare nhs public-health scotland shiny shiny-app treatment waiting-time

Last synced: 26 Jul 2025

https://github.com/jbdesbas/custom-scripts

Custom SQL functions or scripts

data database sql

Last synced: 28 Jun 2026

https://github.com/stdlib-js/ndarray-empty

Create an uninitialized ndarray having a specified shape and data type.

data empty javascript matrix ndarray node node-js nodejs stdlib structure types vector

Last synced: 14 May 2025

https://github.com/am-i-groot/summer-intern-iitguwahati-spml

Developed an automated Water Quality Monitoring System (WQMS) at IIT Guwahati, using the pH-W218 sensor and K-Means Clustering to assess water potability. The project enhances water quality evaluation through machine learning-based classification.

algorithm data data-visualization kmeans-clustering machine-learning python report sensor signal-processing

Last synced: 17 May 2026

https://github.com/inzhenerka/scooters_data_uploader

Loading data into PostgreSQL as part of the dbt course from Инженерка.Тех (Inzhenerka.Tech)

data dbt postgresql

Last synced: 04 May 2026

https://github.com/labgua/ilmeteo

Data acquisition from the SHT71 sensor and real-time transmission over the network

acquisition data humidity humidity-sensor iot raspberry-pi real-time realtime rpi sht71 temperatura temperature temperature-sensor umidita web

Last synced: 24 Apr 2026

https://github.com/jen-uis/loan-status-prediction

This repository contains project materials for the Winter STAT 206 class, University of California, Riverside, A. Gary Anderson School of Management.

data data-analysis data-analytics data-cleaning data-visualization descriptive-analytics julia julia-language jupyter-notebook predictive-analytics predictive-modeling team-collaboration

Last synced: 02 Jan 2026

https://github.com/harmonydata/harmony_examples

Example Jupyter notebook and R scripts using Harmony in real research problems

data data-harmonisation data-harmonization harmonisation psychology python r research

Last synced: 11 Jul 2025

https://github.com/lisakey/convert-csv-to-sav

We used Python 🐍 to convert a CSV file into a SAV file with all the modifications needed to open it in IBM SPSS and analyse our data.

analysis chardet convert csv data databases ibm os pandas pyreadstat python sav spss sys transformations

Last synced: 08 May 2026

https://github.com/shuklayash02/complete_data_analysis_project

A full data analysis project in which sales data is taken through the ask, prepare, process, analyze, share, and act phases of the data analysis process

data data-visualization dataanalysis database datacleaning powerbi sql

Last synced: 16 Jul 2025

https://github.com/iosdec/adstorage

Automatic Data Storage - iOS

data ios objective-c public storage xcode

Last synced: 21 Mar 2025

https://github.com/junkwaxdata/cardlists

Sports Card set lists in easily consumable JSON Format for databases, apps, websites, and more!

baseball baseball-cards baseball-data bowman data dataset datasets donruss fleer json json-schema panini topps upper-deck

Last synced: 13 Mar 2025

https://github.com/viveknathani/maketest

A command line tool to generate test data. 📊

command-line data golang testing-tools

Last synced: 08 Jun 2026

https://github.com/plurid/deserve

Own Your Data · Control The Code

data owner

Last synced: 16 Jul 2025

https://github.com/DataHerb/dataherb-flora

DataHerb Flora: The core of DataHerb

data data-mining data-science datascience dataset datasets

Last synced: 08 May 2025

https://github.com/jvrck/australianpayphones

Get Australian payphone data in GeoJSON format.

australia data geojson geojson-data scraper

Last synced: 04 Apr 2025

https://github.com/cliffano/volothamp

Random D&D stuffs my son and I dabble with

data dungeons-and-dragons info little-godzilla

Last synced: 06 Apr 2025

https://github.com/sambacha/yearn-finance-data

data repo for proposed YIP-DATA

cryptocurrency data erc20 ethereum exchange yearn yip yyip

Last synced: 18 May 2026

https://github.com/stefanpietrusky/facts

Repository for the article in the online magazine Data Science Collective.

ai arxiv-papers beautifulsoup data flask-application gensim llama matplotlib ollama plotly pyldavis python selenium webdriver

Last synced: 09 May 2026

https://github.com/umbaji/yodi

This is the official repository for Yodi, the speech recognition model for 8 words in Ewè. The yodi package is also useful for rapid inference on speech data, especially on the mini_speech datasets.

data data-visualization keras python3 speech-recognition tensorflow

Last synced: 12 Jan 2026

https://github.com/kingsley-ezenwaka/app-profile-data-analysis

A Python data analysis project that aims to propose an app profile based on analysis of a Google Play Store dataset.

analysis data jupyter-notebook matplotlib pandas python seaborn

Last synced: 29 Apr 2026

https://github.com/agustinmusanti/sqlchallenge-2

This repository contains my solutions to a SQL challenge using MySQL, centered around a fictional retail company called TechMarket. The challenge covers various SQL tasks such as data retrieval, manipulation, and analysis, simulating real-world scenarios within a retail business environment.

challenge data mysql

Last synced: 03 Apr 2025

https://github.com/dineshpinto/geist-finance-subgraph

Subgraph for the Geist Finance protocol on the Fantom blockchain.

assemblyscript blockchain data fantom graphql typescript

Last synced: 17 May 2026

https://github.com/josephbarbierdarnal/cieri-analytics.com

CIERI Analytics is the applied research department of the non-profit organization CIERI.

analysis behavior data identity research

Last synced: 12 Jan 2026

https://github.com/alireza29675/goudi

GOUDI is a multi-layer data visualization application, inspired by mind maps and some other thinking and describing methods.

analysis data goudi visualization

Last synced: 11 Jul 2025

https://github.com/Greatwoman23/Market-Basket-Analysis

Unlock the power of data-driven sales optimization with Market Basket Analysis. Explore frequent itemsets and association rules to strategically enhance product placement, design targeted promotions, and adapt to seasonal trends. Elevate your business strategy with insights tailored for boosting sales and engaging customers effectively.

analysis analytics analytics-product data data-science jupyter medium-articles notebook-jupyter python

Last synced: 04 May 2025
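
For readers unfamiliar with the terms in the entry above, frequent itemsets and association rules can be illustrated with a small sketch. It is restricted to item pairs and uses made-up baskets; the function name `pair_rules`, the thresholds, and the sample data are assumptions for illustration and are not taken from the repository. Support is the share of baskets containing both items, and confidence of a rule X -> Y is the share of baskets with X that also contain Y.

```python
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Hypothetical helper: a brute-force count over pairs, not a full
    Apriori or FP-Growth implementation.
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for basket in (set(t) for t in transactions):
        for item in basket:
            item_count[item] = item_count.get(item, 0) + 1
        # Sort so each pair has one canonical key.
        for pair in combinations(sorted(basket), 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), both in pair_count.items():
        support = both / n
        if support < min_support:
            continue  # the pair is not frequent enough
        for x, y in ((a, b), (b, a)):
            confidence = both / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# Made-up example baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
rules = pair_rules(baskets)
```

With these sample baskets, every basket containing butter also contains bread, so the rule butter -> bread has confidence 1.0 at support 0.5.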

https://github.com/canelmas/data-producer

Fake data producer for Kafka, console and http endpoints

data fake-content fake-data fakerjs kafka kafka-producer

Last synced: 05 Apr 2025

https://github.com/tomasfarias/pipeline

A simple data pipeline done as a challenge project

challenge data python

Last synced: 29 Mar 2025

https://github.com/priyanshubiswas-tech/aws-etl-pipeline-on-cloud-using-glue-athena-lambda-and-redshift

Serverless ETL pipeline on AWS using Glue, Lambda, Athena, and Redshift — automates data ingestion, transformation, and analytics with scalable, event-driven architecture.

athena aws aws-glue data data-engineering etl etl-pipeline lambda redshift

Last synced: 02 May 2026

https://github.com/davidgamero/gatech-covid-chart

Line chart showing COVID19 cases per day at Georgia Tech

covid covid19 data gatech

Last synced: 28 Oct 2025

https://github.com/utkarshverma439/simple-sms-spam-detector

Built a Python text classification model for spam detection in SMS. Explored data, preprocessed text, utilized TF-IDF, trained a classifier, and addressed visualization challenges, yielding practical insights.

data data-science data-visualization spam-detection

Last synced: 20 Jun 2025
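
The TF-IDF step named in the entry above can be sketched in a few lines. This is the plain textbook weighting (term frequency times the log of inverse document frequency) with whitespace tokenization; the function name `tfidf` and the sample messages are assumptions for illustration, and library implementations typically add smoothing and normalization that this sketch omits.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in tf.items()}
        )
    return vectors


# Made-up messages: "now" occurs in both, so its weight drops to zero.
vectors = tfidf(["win cash now", "see you now"])
```

Terms shared by every message carry no weight, while terms specific to one message (here "win" and "cash") are emphasized, which is the property a downstream spam classifier can exploit.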

https://github.com/nitsc/spell-from-threebodytrilogy

Implemented the process of extrapolating from Gaia stellar data to 3D visualizations, to three-views, to three-view signals, to three-view audio of signals, and even their inversions. This project proves the feasibility of Luoji's (Logic's) “spell” from the “The Three-Body Problem” trilogy.

3d 3d-graphics astronomy astronomy-astrophysics audio audio-processing data data-science data-visualization gaia graph information-technology information-visualization numpy python python-3 python3 signal signal-processing visiualization

Last synced: 02 May 2026

https://github.com/yasir13001/moonai_api

This MoonAI API service, built with FastAPI, calculates and provides detailed Moon and Sun astronomical data based on user input such as date, latitude, longitude, elevation, and timezone.

ai almanac api astro-ai astronomy data data-science fastapi fastapi-api gemini groq-api hilal-detection html islamic-calenda llama llm-integration moon python

Last synced: 20 Jun 2025

https://github.com/makosai/covid19datachart

A basic chart for checking corona data. Written in a single HTML file for convenience. Grab the single file and run it anywhere. Or visit the webpage.

chart chartjs corona coronavirus coronavirus-analysis covid-19 covid-2019 covid19 covid19-data data data-analysis datasets

Last synced: 23 Feb 2026

https://github.com/incubrain/awesome-maharashtra-data

A collection of datasets specific to Maharashtra, India. WIP

ai artificial-intelligence data data-analysis data-science datasets maharashtra marathi

Last synced: 23 May 2026

https://github.com/yoursrijit/data-structure-with-java

A data structure is a named location that can be used to store and organize data. An algorithm is a collection of steps to solve a particular problem. Learning data structures and algorithms allows us to write efficient and optimized computer programs.

data datastructures dsa-algorithm java linked-list

Last synced: 13 Mar 2025

https://github.com/pbinkley/tweets-libraries-covid19

A twarc harvest of tweets related to libraries during the COVID-19 outbreak, starting 2020-03-02

data social

Last synced: 06 Mar 2026

https://github.com/divithraju/divith-raju-data-mining

This project focuses on customer segmentation using data mining techniques, specifically K-Means clustering, to classify customers into distinct groups based on their purchasing behaviors. The goal is to analyze customer data and segment them into clusters for targeted marketing strategies and better customer relationship management.

algorthims analytics apache business client connector data dataarchitecture database dataengineering datamining datascience hadoop k-means-clustering mysql project project-repository pyspark python3 spark

Last synced: 06 Mar 2026
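
The K-Means clustering mentioned in the entry above alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. A minimal pure-Python sketch follows; the function name `kmeans`, the explicit starting centroids, and the two-feature customer data are assumptions for illustration, whereas the repository itself lists PySpark among its tags.

```python
def kmeans(points, centroids, iterations=10):
    """Run Lloyd's algorithm from the given starting centroids.

    Hypothetical helper; returns (final_centroids, labels).
    """
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        # Index of the centroid with the smallest squared distance to p.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        return dists.index(min(dists))

    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)

        updated = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                updated.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                updated.append(old)  # keep an empty cluster's centroid

        if updated == centroids:
            break  # assignments have stabilized
        centroids = updated

    labels = [nearest(p) for p in points]
    return centroids, labels


# Made-up customers described by two hypothetical features.
customers = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centers, labels = kmeans(customers, [(0, 0), (10, 10)])
```

Passing the starting centroids explicitly keeps the sketch deterministic; practical implementations usually pick them randomly or with a seeding scheme and may rerun the algorithm several times.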

https://github.com/priyanka7411/customer-flight-prediction-app-mlflow

A comprehensive project predicting flight prices and customer satisfaction using machine learning models, deployed through interactive Streamlit apps.

classification customer-satisfaction data data-cleaning data-visualization feature-engineering flight-price-prediction machine-learning mlflow python regression streamlit

Last synced: 12 May 2026

https://github.com/denko5/sales-analysis

A complete SQL-based sales analysis project covering Africa, showcasing data cleaning, exploratory analysis, insights, and lessons learned. The project highlights sales trends, regional performances, and marketing effectiveness across multiple platforms.

africa data data-analysis data-science exploratory-data-analysis insights kenya sales sql

Last synced: 24 Jan 2026

https://github.com/sksubhadeep/nashville-housing-data-cleaning-project-using-sql

SQL Data Cleaning Project on Nashville Housing Dataset

data datacleaning sql

Last synced: 19 Mar 2026

https://github.com/habedi/adbis-2023-paper

This repository hosts the code and data used for the experiments reported in the paper titled "Diversification of Top-k Geosocial Queries", published in ADBIS 2023

artifacts conference-paper data experiments graphs java research-paper

Last synced: 19 May 2026

https://github.com/anuraganalog/365-data-science

A repository containing lecture notes, exercises, and solutions

365 data exercises ipynb lecture notes pdfs python python3 science solutions sql

Last synced: 15 May 2026

https://github.com/owengombas/genyus

🐍 Lyrics analysis with genius.com, Python and Jupyter Notebooks

api data data-science genius jupyter-notebook lyrics python statistics

Last synced: 20 May 2026

https://github.com/tushar2704/interview-quest

Interview-Quest is a comprehensive collection of interview questions and answers that can help you prepare for technical interviews. Whether you're a seasoned developer looking to brush up on your skills or a job seeker preparing for your next big opportunity, this repository aims to provide valuable resources to enhance your interview readiness.

artificial-intelligence data data-science interview interview-questions machine-learning

Last synced: 23 Jan 2026

https://github.com/chompfoods/sdk-typescript-fetch

Fetch TypeScript SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database fetch food grocery ingredients nutrition raw recipe-api recipes sdk typescript

Last synced: 03 May 2026

https://github.com/cainmi/easy-pull-from-repository

A repository to pull code and files from. It may be used to store page data links, code, etc.; mainly used for Python for now.

data html javascript python schema

Last synced: 04 Apr 2025

https://github.com/cpanse/tartare

Raw file collection recorded on Thermo Fisher Scientific mass spectrometers for extended unit testing

bioconductor blob data r unittesting

Last synced: 03 Apr 2025

https://github.com/aikuyun/flinkx

Some modifications to flinkx

data flink

Last synced: 04 Apr 2025

https://github.com/sstendahl/giscan

Simple tool to read and analyze existing GISAXS data

cbf data diffraction diffraction-analysis gisans gisaxs physics reflectivity scattering xray

Last synced: 30 Jun 2026