An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/mustafaozvardar/selenium-eksisozluk

This project is a simple web scraper built with Python using Selenium. It extracts and prints the content of popular entries from a specific EksiSozluk page.

data python selenium selenium-python

Last synced: 29 Apr 2026

https://github.com/mevlutcelik/turkey-cities-data

📍 City data package for the cities of Türkiye: includes license plate code, coordinates (lat/lon), population (2024 ADNKS), and geographic region.

cities coordinates data json nufus plaka turkey turkiye typescript

Last synced: 10 Mar 2026

https://github.com/ashleydavis/brisjs-web-scraping-talk

Code to accompany my talk on web scraping for the Brisbane JavaScript meeting in September 2018

cheerio data data-acquisition data-acquisiton electron headless-browsers javascript nightmare nightmarejs nodejs web-scraping

Last synced: 06 May 2026

https://github.com/erickpeirson/jhb-data

Data from the forthcoming paper: Quantitative Perspectives on Fifty Years of the Journal of the History of Biology

data geolocation history-of-biology named-entity-recognition topic-modeling

Last synced: 04 Mar 2026

https://github.com/0xnu/data-analyst-training

The repository contains training materials for data analysts.

data data-analysis data-analyst

Last synced: 25 Aug 2025

https://github.com/franckalbinet/maris-crawlers

Automated data harvesting of MARIS data sources

automation data marine-radioactivity

Last synced: 25 Aug 2025

https://github.com/luminati-io/google-maps-dataset-samples

A sample dataset of over 1000 Google Maps businesses, extracted using the Bright Data API, ideal for competitor analysis, location-based marketing, and market strategies.

api data dataset google-maps maps web-scraping

Last synced: 03 Jan 2026

https://github.com/shantanujpk/bigdatacloud

Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.

big-data data jupyter-notebook pipeline pyspark python spark sparksql

Last synced: 07 May 2026

https://github.com/living-with-machines/zoonyper

Code to make it easy to import and process Zooniverse annotations and their metadata in Python/Jupyter Notebooks

crowdsourcing data data-processing data-science python zooniverse

Last synced: 04 Jul 2025

https://github.com/ksimicevic/discord-message-analyzer

Analyzing Discord messages in a Jupyter notebook

analysis data discord messages

Last synced: 16 Apr 2026

https://github.com/arjunrao87/world-countries-graphql-api

GraphQL API for retrieving information about countries of the world

countries data database geographic-data geography graphql world

Last synced: 10 May 2026

https://github.com/anuragagarwal96/hospital-mortality-rate-sql-analysis

In this project, I took a hospital dataset from Kaggle, analysed it, and predicted the mortality rate of patients admitted to hospitals. I used a combination of SQL, Tableau, and Microsoft Excel.

data data-visualization dataanalysis dataanalysisusingsql excel msexcel mssqlserver sql tableau tableau-public

Last synced: 09 Mar 2026

https://github.com/nmsud/formdata

🗃️ Data from the NMSUD Form submissions

api data json unification-day

Last synced: 16 May 2026

https://github.com/zeptosec/bpscrapper

Shows history of oil prices

data data-visualization database nodejs scraper

Last synced: 13 Apr 2026

https://github.com/prishabhanot/facial_recognition_pca

A face recognition system using Principal Component Analysis (PCA) for dimensionality reduction and a Support Vector Machine (SVM) classifier for classification. PCA extracts essential features (eigenfaces) from facial images, significantly reducing computational complexity while retaining critical information for accurate recognition.

data eigenfaces facial-recognition pca python reducing-computational-complexity reducing-data-dimensions svm-classifier

Last synced: 01 Mar 2025

https://github.com/stdlib-js/ndarray-vector-uint32

Create an unsigned 32-bit integer vector (i.e., a one-dimensional ndarray).

constructor ctor data javascript ndarray node node-js nodejs stdlib structure types uint32 vec vector

Last synced: 25 Apr 2026

https://github.com/jameshenderson12/data-lists

This repository contains lists of useful data that can be used in a variety of projects.

countries data list names scottish text

Last synced: 05 Mar 2026

https://github.com/sehgal-vishal/world-population-

World Population SQL Analysis

data dataanalysis population sql

Last synced: 05 Mar 2026

https://github.com/stdlib-js/wasm-base-dtype2wasm

Return the WebAssembly data type associated with a provided array data type value.

array base data dtype javascript node node-js nodejs stdlib type types util utilities utility utils wasm webassembly

Last synced: 09 May 2026

https://github.com/posixpascal/apple_appstore_search

📊 get public App Store data of your app in a ruby hash — that's it.

appstore data gem ios ruby

Last synced: 16 Mar 2025

https://github.com/greedchikara/dsajs

Data Structures and Algorithms written in JavaScript

algorithms data structures

Last synced: 09 Apr 2026

https://github.com/luminati-io/ZoomInfo-dataset-samples

A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.

b2b business companies data data-extraction database dataset datasets web-scraping zoominfo

Last synced: 09 Apr 2025

https://github.com/paulrosset/cyclone

Network data consumption monitoring

data monitoring network networking

Last synced: 23 Aug 2025

https://github.com/naveenk-ds/redbus_web_screaping.app.py

🚌 Red Bus Project overview: the Red Bus Project is a web scraping and visualization tool built with Selenium to extract bus information from the RedBus website. It stores the data in a MySQL database and provides an interactive visualization interface using Streamlit. The goal is to deliver insights into bus schedules, prices, ratings, etc.

data data-science database-management pandas pyhton selenium-webdriver sql

Last synced: 11 Apr 2026

https://github.com/braiso-22/ejercicio-seguro-medico

An introductory exercise in working with data to make predictions

data data-science dataset ia insurance jupyter-notebook ml python python3

Last synced: 24 Apr 2026

https://github.com/jigyasag18/amazon-prime-power-bi-dashboard

The Amazon Prime Power BI Project is a centralized data storage system containing detailed information on movies and TV shows available on Amazon Prime Video, including metadata and analytics insights. It supports data-driven decision-making for content acquisition and viewer engagement strategies. This repo is optimized for querying & analysis.

dashboard data data-visualization dataanalysis dataanalytics datacleaning dataset powerbi powerbi-dashboards powerbi-report powerbi-visuals powerbidashboard

Last synced: 05 Mar 2026

https://github.com/kaungkhantkyaw1997/mock-schema-generator

A tool for generating mock data and implementations based on schema definitions. Ideal for testing and development.

data generator mock schema testing

Last synced: 05 Mar 2026

https://github.com/jwszolek/accelerated-data-generator

Ultra-fast random data generator. It can generate almost 1M rows in around a second.

bash csv data data-generator generator shell

Last synced: 02 Apr 2026

https://github.com/ahmad-ali-rafique/wine-quality-dataset

Comprehensive analysis and modeling of the Wine Quality dataset, including exploratory data analysis (EDA), data preprocessing, model training, and performance evaluation using MSE and RMSE.

analytics data datacleaning decision-tree-regression exploratory-data-analysis gradient-boosting-regressor linear-regression machine-learning mean-square-error model

Last synced: 21 Aug 2025

https://github.com/boratechlife/tensorflow-questions-datasets

TensorFlow question datasets to help you practice machine learning and train models

data datapreprocessing datasets machinelearning modeltrain questions tensorflow

Last synced: 23 Mar 2025

https://github.com/remidumas/rstats

RStats weblog

data ia r science stats

Last synced: 25 Mar 2025

https://github.com/giscience/measures-rest-oshdb-app

A frontend for providing measures for geospatial datasets, using the OSHDB

data dggs geospatial measure openstreetmap rest

Last synced: 20 Apr 2026

https://github.com/khushi-sabarad/data_analysis

LinkedIn Learning capstone project

data data-engineering matplotlib pandas python

Last synced: 10 May 2026

https://github.com/andypicke/ev_station_explorer

Shiny App to visualize electric-vehicle charging station data

data electric-vehicles r shiny-apps visualization

Last synced: 29 Jul 2025

https://github.com/beeracs/llama

Run Llama models in your web browser using JavaScript and WebAssembly. Explore light and dark modes easily. 🌐🐱👤

ai data fine-tuning framework gpt langchain large-language-models llama3 llamaindex llm lora machine-learning nlp peft qlora qwen rlhf vllm

Last synced: 10 May 2026

https://github.com/danielrosehill/value-factors-data-vis

Streamlit app containing visualisations of the Global Value Factors Database (GVFD) released by the IFVI in 2024

data data-visualization sustainability sustainability-data

Last synced: 29 Jul 2025

https://github.com/shadeglare/genum

ESNext tools for processing data in a LINQ-like manner

data linq processing typescript

Last synced: 13 Apr 2026

https://github.com/rishitabansal9/adult-census-income-prediction

A data analysis and income prediction project using a random forest classifier, with a reported accuracy of 91%.

data data-analysis data-science feature-engineering random-forest-classifier

Last synced: 25 Mar 2025

https://github.com/juangesino/research-project

Course files for Research Project @ University of Amsterdam

data data-science economics stata

Last synced: 02 Jan 2026

https://github.com/foreteternelle/pokemonstudiodataapi

The GitHub repository of the Pokémon Studio Data API

api data fangame

Last synced: 02 Apr 2026

https://github.com/bkestelman/dasy-ml

DaSy DataSynthesizer - Create synthetic data with desired statistical properties for machine learning research.

data data-science machine-learning

Last synced: 14 Jan 2026

https://github.com/rachelresende/projeto-finan-as

This repository relates to a course on data analysis for finance that I took on Udemy in 2025.

analytics data financas finance finance-management

Last synced: 19 Aug 2025

https://github.com/roovedot/unet-cnn-for-road-segmentation

(In Progress) Unet architecture with CNNs (Convolutional Neural Networks) aimed at Road Segmentation

cnn cnn-for-visual-recognition cnn-pytorch computer-vision data data-engineering data-science unet unet-image-segmentation unet-pytorch

Last synced: 01 Jul 2025

https://github.com/rugwiroparfait/alx_sql

This repo is where I save my queries and learning materials from the ALX Data Science program

anaconda data data-analysis jupyter-notebook sql

Last synced: 19 Aug 2025

https://github.com/h4fide/politicalcompassbot

This Python project allows you to take a quiz and find out where you fit on the political compass. Give it a try and see where you stand!

bot data greedy-algorithms politics python python3 sql telegram

Last synced: 19 Aug 2025

https://github.com/woctezuma/recent-sales-data

Data available to estimate sales of Steam games during release week.

data sales steam

Last synced: 05 Feb 2026

https://github.com/dahmansphi/analysis_from_start_to_end

The Big Bang of Data Science: Analysis from the Start to the End [Book Two]

analysis data data-analytics data-mining data-science hypothesis-testing jamovi machine-learning

Last synced: 08 Jan 2026

https://github.com/bala-1409/sales-forecasting-datascience-project

Develop a data science project using historical sales data to build a regression model that accurately predicts future sales. Preprocess the dataset, conduct exploratory analysis, select relevant features, and employ regression algorithms for model development. Evaluate model performance, optimize hyperparameters, and provide actionable insights.

data data-analysis data-science data-visualization datacleaning exploratory-data-analysis machine-learning-algorithms modelfitting prediction predictive-analytics predictive-modeling python3 regression-models salesforecast supervised-learning

Last synced: 26 Apr 2026

https://github.com/fcoagz/rate-reader-epv

pyDolarVenezuela API utilities, image processing (EnParaleloVzla) to extract currency exchange rates from specific platforms, validating content against expected patterns

data finance json processing-images pydolarvenezuela

Last synced: 14 Jun 2025

https://github.com/amethyst-php/attendance

Indicate the attendance/absence of an employee in a defined office with a range of dates

amethyst amethyst-package api attendance data laravel

Last synced: 17 Apr 2026

https://github.com/kashifkhan7/cleaning-analysis_cli

Analyze sales data easily with our CLI app. Gain insights on revenue trends and visualize results using Python, Pandas, and Matplotlib. 🚀📊

conditional-statements css data datacleaning exception-handling exiftool html json matplotlib-pyplot metadata metadata-extraction pandas-python python sales-analysis seaborn-python speech-to-text transcription youtube

Last synced: 13 Apr 2026

https://github.com/miraclx/split-merge

Efficient, flexible data stream chunker and merger

chunk data efficient merge middleware nodejs pipeline split stream

Last synced: 07 May 2026

https://github.com/cljoly/data

📊 Data sets to populate some parts of my website (mostly https://cj.rs/open-source/).

data open-source sqlite wip

Last synced: 03 May 2026

https://github.com/shadmanshaikh/data-analysis-and-ml-work

All of my work in data analysis and machine learning

analytics artificial-intelligence data machine-learning

Last synced: 05 Jul 2025

https://github.com/rorylshanks/devdb-client

This is the repository for the official command line client for DevDB (https://devdb.cloud)

cloud data database-management development

Last synced: 29 May 2026

https://github.com/pdoup/enegry

Time-Series dataset combining multiple sources to explain the broader Greek energy market

data dataset day-ahead-auction energy-markets exploratory-data-analysis forecasting futures-market greek-energy-market renewable-energy time-series-data weather-data

Last synced: 07 May 2025

https://github.com/musamairshad/dsa-python

This repository contains all the material related to Data Structures and Algorithms implemented in Python.

algorithms data datastructures efficiency python searching-algorithms sorting-algorithms

Last synced: 25 Mar 2025

https://github.com/desoga10/nety-form

In this tutorial, I show you how to send data from a form to the Netlify dashboard. I also show you how to create a form using Materialize.

contact-form css css3 data form forms html html5 materialize materialize-css materializecss-framework netlify

Last synced: 03 Jan 2026

https://github.com/smac-group/smacdata

Data sets used in various packages.

data r

Last synced: 02 Apr 2025

https://github.com/vishwas-chakilam/twitter-sentiment-analysis

Twitter Sentiment Analysis is a Python project that analyzes the sentiment of tweets based on a user-defined keyword. It uses Tweepy to fetch tweets from the Twitter API and TextBlob for sentiment analysis. The application features a user-friendly GUI with Tkinter, displaying tweet sentiment as positive, negative, or neutral.

api data data-science dataanalysis python3 textblob-sentiment-analysis tkinter tweepy-api

Last synced: 11 Mar 2025

https://github.com/karosi12/ng-data-share

Angular communication with input and output properties

angular communication data data-binding input output sharing typescript

Last synced: 16 Jan 2026

https://github.com/bmcollier/contiguous

Provides COBOL-style contiguous data structures in Python

cobol contiguous data python

Last synced: 14 Jan 2026

https://github.com/Coko7/vegapull-records

Cards dataset for One Piece TCG

data one-piece one-piece-card-game one-piece-tcg tcg

Last synced: 28 Apr 2025

https://github.com/q-aware-labs/bias-insights

Bias detection project for the Chicago Face Database (CFD)

ai chicago-data-portal data data-science llm statistical-analysis

Last synced: 21 Jan 2026

https://github.com/meokullu/prefill

PreFill adds desired characters onto output values to increase their legibility.

alignment data data-analysis data-engineering data-science legibility

Last synced: 17 Jan 2026

https://github.com/soenneker/soenneker.constants.data

A set of commonly used constants related to various types of data

constants csharp data dotnet

Last synced: 12 Mar 2026

https://github.com/isaacmaffeis/imad-2023

Model Identification and Data Analysis (IMAD) | University course

data data-analysis data-science model model-identification

Last synced: 09 May 2026

https://github.com/2kabhishek/pybank

Data Analysis for the silliest Bank 💰🏦

csv data data-science learning pandas python topic1 topic2

Last synced: 12 May 2026

https://github.com/rajkumarbestha/nsedataextractor

NSEDataExtractor

data python python3

Last synced: 26 Mar 2025

https://github.com/boytchev/coursedataviz

Supplementary materials for "Data Visualization" course

data fmi su visualization

Last synced: 16 Mar 2025

https://github.com/sadratehranian/data-collection-and-machine-learning

First, creates a logistic regression model to predict whether a smoke detector's fire alarm should sound. Second, predicts whether an electric drive in a production plant may be faulty.

data data-analysis data-science datacollection logistic-regression machine-learning ml nn

Last synced: 05 Jan 2026

https://github.com/roshaka/samplr

Samplr is a Python decorator for selecting a subset of items from a list, with options for customisation and informative console printouts.

data data-analysis data-engineering decorators list python sampling

Last synced: 14 Jan 2026

https://github.com/austinhartzheim/career-fair-backend

Backend for ECS Career Fair app

data django python

Last synced: 13 Apr 2026

https://github.com/stupidcucumber/elephant-crawler

System for mining texts from websites.

data data-mining-python python

Last synced: 25 Apr 2026

https://github.com/krakozaure/pyzzy

Set of packages to simplify development in Python

configuration data formats json library logging logs python3 toml utils yaml

Last synced: 14 Jan 2026

https://github.com/albanecoiffe/jo2024_visualization

Streamlit dashboard on the Paris 2024 Olympic Games.

data streamlit visualization

Last synced: 30 Apr 2026