An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/mevlutcelik/turkey-cities-data

📍 City data package for the cities of Türkiye: includes license plate code, coordinates (lat/lon), population (2024 ADNKS), and geographic region information.

cities coordinates data json nufus plaka turkey turkiye typescript

Last synced: 10 Mar 2026

https://github.com/pranjaldhamane/social-media-sentiment-analysis

This project aims to analyze sentiment in Twitter data to understand attitudes towards specific topics or entities. It seeks to uncover positive and negative sentiment patterns, detect potential cyberbullying or hate speech, and provide insights into Twitter's overall sentiment landscape.

data dataanalysis logistic-regression nlp-machine-learning python sentiment-analysis twitter

Last synced: 18 Apr 2026

https://github.com/stdlib-js/ndarray-vector-int8

Create a signed 8-bit integer vector (i.e., a one-dimensional ndarray).

constructor ctor data int8 javascript ndarray node node-js nodejs stdlib structure types vec vector

Last synced: 24 Apr 2026

https://github.com/rubyonworld/ldpath

This is a Ruby implementation of LDPath, a language for selecting values from linked data resources.

data ldpath resource ruby

Last synced: 12 Nov 2025

https://github.com/bertrand31/one-billion-rows-challenge

🌪️ Pushing Scala to its limits to aggregate a billion rows' worth of data in 2.42 seconds

competitive-programming competitive-programming-contests data data-engineering data-processing performance scala

Last synced: 05 Sep 2025

https://github.com/plnech/never2late

Never 2 Late - a reinterpretation of Everest Pipkin's 'i've never picked a protected flower'

dada dada-science data generative-art glitch-art installation nlp poetry spacy vector-similarity wallpaper

Last synced: 10 Jun 2025

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 24 Mar 2025

https://github.com/smaug6739/data-bit

This project is a module for converting a structured dataset into a number that can be stored in a database taking up little space.

bits data nodejs

Last synced: 14 May 2026
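
The idea of storing a structured record as a single compact number can be illustrated with bit packing. This is a minimal sketch under stated assumptions, not the data-bit module's API: the field names, the bit widths, and the `pack`/`unpack` helpers are all hypothetical, and the sketch is in Python rather than Node.js.

```python
# Hypothetical field layout as (name, bit width) pairs; not taken from data-bit.
FIELDS = [("is_admin", 1), ("theme", 2), ("level", 5)]


def pack(record: dict) -> int:
    """Pack the fields of `record` into one integer, first field in the lowest bits."""
    value = 0
    shift = 0
    for name, width in FIELDS:
        field = record[name]
        # Reject values that would overflow into the next field's bits.
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        value |= field << shift
        shift += width
    return value


def unpack(value: int) -> dict:
    """Recover the record from an integer produced by `pack`."""
    record = {}
    shift = 0
    for name, width in FIELDS:
        mask = (1 << width) - 1
        record[name] = (value >> shift) & mask
        shift += width
    return record
```

With this layout, a three-field record fits in 8 bits, so it could be stored in a single small integer column.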

https://github.com/gustavonav/daily-youtube-extraction

Project that completes the creation of an environment for extracting, storing, and processing YouTube data

airflow data minio python3 spark

Last synced: 21 Feb 2026

https://github.com/0xHericles/SpamDetector

:email: A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 24 Mar 2025

https://github.com/0xHericles/ufcg-geojson

GeoJSON file containing the blocks and buildings of the Federal University of Campina Grande.

data data-visualization geojson map open-source ufcg university

Last synced: 24 Mar 2025

https://github.com/beriberikix/senml-zephyr

A codec for encoding and decoding Sensor Measurement Lists (SenML) for Zephyr

codec data iot senml sensor zephyr-rtos

Last synced: 24 Mar 2025

https://github.com/keanteng/kaggledata

📊Data Source For Program Testing

data dataset excel

Last synced: 24 Mar 2025

https://github.com/v-mayya/quantitative-analysis-data-dashboard

Quantitative survey data analysis using R

data data-analysis data-visualization flourish r

Last synced: 01 Apr 2025

https://github.com/ludwing-mj/manipulacion_ej

Exercise used in section eight of the manual to illustrate the tools the tidyverse provides for data manipulation.

data manipulate-data package r

Last synced: 01 Apr 2025

https://github.com/dhruvil-26/powerbi-projects

This repository contains Power BI projects showcasing data analysis and interactive dashboards. Each project includes detailed visualizations and insights on diverse topics such as loan analysis, sales performance, and customer behavior.

customer-behavior-analysis data data-analysis interactive-dashboards loan-analysis powerbi sales-performance visualization

Last synced: 04 Feb 2026

https://github.com/alextanhongpin/node-github-api

:page_with_curl: sample github api queries with nodejs for scraping purposes

data github-api nodejs

Last synced: 06 May 2026

https://github.com/steveanik/kestra

Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.

data data-engineering data-integration data-pipeline data-quality elt etl low-code orchestration pipelines scheduler workflow workflow-engine

Last synced: 06 Jan 2026

https://github.com/thingston/extractor

Collection of PHP classes to extract data from HTML pages.

data html php

Last synced: 14 Jan 2026

https://github.com/ournet/videos-data

Ournet videos data module

data ournet video videos

Last synced: 04 Apr 2025

https://github.com/powersyang/visualization

Data visualization templates

data templates visualization

Last synced: 24 Mar 2025

https://github.com/bkataru/spotigo

AI-powered local music intelligence platform with a task runner server core to retrieve and backup spotify account data to storage(s) at set periodic intervals

ai backup cron data go intelligence local-llm music ollama rag runner spotify task-runner tool-calling

Last synced: 16 Jan 2026

https://github.com/4strium/data-analysis-france

🔍 Script allowing the analysis and recovery of precise data on French cities.

cities csv data france python research

Last synced: 01 Apr 2025

https://github.com/denisecase/dc-mailer

Send an email using Python

alerts data email python streaming

Last synced: 11 Apr 2025

https://github.com/sajjad425/missingvalue

This repository provides a guide on handling missing values in Python, covering identification methods, imputation techniques (mean, median, mode, fill, interpolation), advanced methods (KNN, multiple imputation), and best practices. It includes practical examples for both numerical and categorical data.

data data-analysis-python data-science missing-value-handling missing-value-imputation

Last synced: 04 Apr 2025
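
The simpler imputation techniques named in the entry above (mean, median, mode, interpolation) can be sketched with the Python standard library alone. This is an illustration under assumptions, not code from the repository: missing values are represented as `None`, and the `impute` and `interpolate` helpers are hypothetical names.

```python
from statistics import mean, median, mode


def impute(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]


def interpolate(values):
    """Linearly interpolate runs of None that lie between two observed values.

    Leading and trailing None entries are left untouched, since there is
    no second neighbor to interpolate from.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result
```

Mode imputation is the one strategy here that also applies to categorical data, for example `impute(["a", None, "b", "a"], "mode")`.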

https://github.com/frer0t/userverse

creating api for data analysis

data data-analytics spring-boot users

Last synced: 12 Apr 2026

https://github.com/davorg/cookingvinyl

Web site with info about Cooking Vinyl records

cooking-vinyl data hacktoberfest music perl

Last synced: 02 Apr 2025

https://github.com/dahsie/machine_learning_from_scratch

This project aims to implement some basic machine learning techniques (e.g. MinMaxScaler, StandardScaler, TF-IDF, PCA, Logistic Regression, LDA, KNN, Naive Bayes Classifier) using only Python, NumPy, and pandas. This will enable me to hone my data scientist skills.

classification clustering data data-processing datascience machienlearning nlp nltk numpy pandas python regression

Last synced: 04 May 2026

https://github.com/veivel/f1-sentiment-analysis

A sentiment analysis project on tweets about Formula 1. To be reworked.

data f1 nlp-library nlp-machine-learning

Last synced: 04 Jul 2025

https://github.com/dsietz/rust-daas

An example of implementing the DaaS pattern using Rust

archconf daas data kafka rust rust-lang

Last synced: 05 Sep 2025

https://github.com/rorylshanks/devdb-client

This is the repository for the official command line client for DevDB (https://devdb.cloud)

cloud data database-management development

Last synced: 29 May 2026

https://github.com/codegouvfr/codegouvfr-data

🧢 Data for code.gouv.fr

bluehats codegouvfr data

Last synced: 05 Mar 2026

https://github.com/raghavendranhp/youtube_data_harvesting

The "YouTube Data Analyzer" is a versatile tool for businesses and content creators, enabling them to gather, analyze, and harness valuable insights from multiple YouTube channels. With streamlined data collection, storage in MongoDB, migration to SQL, and a user-friendly Streamlit interface, it empowers users to make data-driven decisions

apiintegration data datacollection eda googleapi googleapiclient matplotlib mongodb mysql mysqlconnector numpy oops pandas pymongo python pythonoops sql sqlalchemy streamlit youtube-api

Last synced: 13 Apr 2026

https://github.com/living-with-machines/zoonyper

Code to make it easy to import and process Zooniverse annotations and their metadata in Python/Jupyter Notebooks

crowdsourcing data data-processing data-science python zooniverse

Last synced: 04 Jul 2025

https://github.com/musamairshad/dsa-python

This repository contains all the material related to Data Structures and Algorithms implemented in Python.

algorithms data datastructures efficiency python searching-algorithms sorting-algorithms

Last synced: 25 Mar 2025

https://github.com/nmsud/formdata

🗃️ Data from the NMSUD Form submissions

api data json unification-day

Last synced: 16 May 2026

https://github.com/stdlib-js/ndarray-vector-uint32

Create an unsigned 32-bit integer vector (i.e., a one-dimensional ndarray).

constructor ctor data javascript ndarray node node-js nodejs stdlib structure types uint32 vec vector

Last synced: 25 Apr 2026

https://github.com/luminati-io/LinkedIn-dataset-samples

Sample dataset of 1001 LinkedIn companies, extracted via Bright Data API, featuring essential data points for competitive analysis and market insights.

data database dataset linkedin linkedin-api linkedin-data linkedin-dataset linkedin-scraper sample web-scraping

Last synced: 09 Apr 2025

https://github.com/smac-group/smacdata

Data sets used in various packages.

data r

Last synced: 02 Apr 2025

https://github.com/vishwas-chakilam/twitter-sentiment-analysis

Twitter Sentiment Analysis is a Python project that analyzes the sentiment of tweets based on a user-defined keyword. It uses Tweepy to fetch tweets from the Twitter API and TextBlob for sentiment analysis. The application features a user-friendly GUI with Tkinter, displaying tweet sentiment as positive, negative, or neutral.

api data data-science dataanalysis python3 textblob-sentiment-analysis tkinter tweepy-api

Last synced: 11 Mar 2025

https://github.com/scx567888/scx-data

✨ SCX Data

data java scx

Last synced: 05 Apr 2025

https://github.com/nagipragalathan/linkedin_backup_datas

This repository contains the backup data from my previous LinkedIn account. Unfortunately, my old LinkedIn account was compromised and subsequently blocked by LinkedIn. As a result, I created a new account, but that too got blocked for reasons unknown to me.

backup blocked data linkedin linkedin-account memory nagipragalathan recovery storage

Last synced: 18 Jan 2026

https://github.com/afolabi022/getting-and-cleaning-data-course-project

"Tidy Dataset Creation for Human Activity Recognition": This repository contains the code and files for cleaning and transforming the Human Activity Recognition Using Smartphones dataset into a tidy format. The project demonstrates data wrangling skills in R, including merging datasets

data data-science datacleaning r

Last synced: 25 Mar 2025

https://github.com/shadeglare/genum

The ES Next tools to process data in a LINQ manner

data linq processing typescript

Last synced: 13 Apr 2026

https://github.com/q-aware-labs/bias-insights

Bias detection project for the Chicago Face Database (CFD)

ai chicago-data-portal data data-science llm statistical-analysis

Last synced: 21 Jan 2026

https://github.com/buffdelta/basketball_ref_webscraper

Python package to make webscraping from basketball-reference easy

basketball data python python-library webscraping

Last synced: 14 Jan 2026

https://github.com/dahmansphi/analysis_from_start_to_end

The Big Bang of Data Science- Analysis from the Start to The End- [Book Two]

analysis data data-analytics data-mining data-science hypothesis-testing jamovi machine-learning

Last synced: 08 Jan 2026

https://github.com/2kabhishek/pybank

Data Analysis for the silliest Bank 💰🏦

csv data data-science learning pandas python topic1 topic2

Last synced: 12 May 2026

https://github.com/newrelic-experimental/newrelic-java-atomikos

Gives status of Atomikos Data Sources since this information is unavailable via JMX

atomikos data instrumentation java nrlabs nrlabs-data nrlabs-java-verify nrlabs-odp observability-data

Last synced: 30 May 2026

https://github.com/yeti-robotics/past-scouting-data

❄️ Scouting Data from Previous Events/Seasons ❄️

data first frc

Last synced: 06 Jan 2026

https://github.com/boytchev/coursedataviz

Supplementary materials for "Data Visualization" course

data fmi su visualization

Last synced: 16 Mar 2025

https://github.com/roshaka/samplr

Samplr is a Python decorator for selecting a subset of items from a list, with options for customisation and informative console printouts.

data data-analysis data-engineering decorators list python sampling

Last synced: 14 Jan 2026

https://github.com/romaintailhurat/dagster-playground

Playing with Dagster 🐙

data pipelines python3

Last synced: 14 Jun 2025

https://github.com/albanecoiffe/jo2024_visualization

Streamlit dashboard on the Paris 2024 Olympic Games.

data streamlit visualization

Last synced: 30 Apr 2026

https://github.com/affan005-ai/tesla-stock-prediction

This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models

data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn

Last synced: 05 Oct 2025

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Jan 2026

https://github.com/abdullahashfaqvirk/earth-engine-data-scraper

A Python-based web scraper designed to extract and organize dataset metadata from the Google Earth Engine Datasets Catalog for research and analysis purposes.

beautifulsoup data data-science python requests scraper web-scraping

Last synced: 10 May 2026

https://github.com/pythoncoderunicorn/startrek

a repo for Star Trek data from Technical Manuals

data klingon-language star-trek vulcan

Last synced: 07 Oct 2025

https://github.com/rahulthedevil/metric-converter

A simple utility package for converting between metric units such as meters, kilometers, grams, kilograms, liters, and more. A simple and powerful solution for unit conversion.

convert converter data fraction imperial length mass measurements metric metrics ratio system temperature unit unit-conversion unit-converter units uom utilities weight

Last synced: 08 Oct 2025

https://github.com/anarya22/e-commerce_analysis

E-Commerce_Analysis is a data analysis project performed on the Superstore_USA dataset. It explores various aspects of e-commerce performance, including sales trends, customer demographics, product categories, and regional performance. The analysis includes data cleaning, visualizations, and insights on factors influencing sales and profitability.

analysis analytics cleaning-data data

Last synced: 09 Oct 2025

https://github.com/kaijagahm/2023-10-20-stlzoo

Data Carpentry workshop, hosted at the St. Louis Zoo. Beta testing the new ecology data lesson.

data data-science ecology r rstudio

Last synced: 05 Feb 2026

https://github.com/quetz-al/quetzal-openapi-client

Autogenerated Python client for the Quetzal API

client data data-science openapi-client openapi3 python quetzal

Last synced: 10 Oct 2025

https://github.com/loaiwalid07/automation_data_overviwe

This is a Streamlit app that gives an overview of a dataset you upload

automation data data-analysis data-exploration data-science data-transformation data-visualization

Last synced: 19 May 2026

https://github.com/j-sephb-lt-n/joes_giant_toolbox

A large collection of general python functions and classes that I use in my daily work

ascii browser classifier data dataviz gcp mime nlp python regex search statistics supervised web-scraping

Last synced: 10 Oct 2025

https://github.com/azkarmoulana/winter-of-data-2019

:snowflake: :snowman: Winter of Data is coming..... :wolf:

data data-science machine-learning mathematics

Last synced: 05 Feb 2026

https://github.com/myavuzokumus/simplemodelcomparison

This application allows users to upload datasets, handle missing data, and compare different imputation strategies.

algorithm data data-science machine-learning preprocessing streamlit

Last synced: 21 Jan 2026

https://github.com/writetome51/pagination-page-info

Intended to help a separate Paginator class paginate data. Specifically, this class contains the properties `itemsPerPage` and `totalPages`, which will be used by other classes

batch data javascript paginate pagination typescript

Last synced: 09 May 2026
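
The relationship between the `itemsPerPage` and `totalPages` properties mentioned in the entry above can be sketched as follows. This is a Python illustration under assumptions, not the package's TypeScript API: the `total_pages` and `page_items` helpers are hypothetical, and pages are assumed to be numbered from 1.

```python
def total_pages(total_items: int, items_per_page: int) -> int:
    """Number of pages needed to show `total_items`, rounding up."""
    if items_per_page < 1:
        raise ValueError("items_per_page must be at least 1")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (total_items + items_per_page - 1) // items_per_page


def page_items(items: list, page: int, items_per_page: int) -> list:
    """Return the items on a 1-based `page`; the last page may be shorter."""
    if not 1 <= page <= max(1, total_pages(len(items), items_per_page)):
        raise ValueError(f"page {page} is out of range")
    start = (page - 1) * items_per_page
    return items[start:start + items_per_page]
```

For 45 items at 10 per page this gives 5 pages, with the fifth page holding the remaining 5 items.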

https://github.com/nukopian/shell-series

Extract columns from tabular text

automation data shell

Last synced: 11 Oct 2025

https://github.com/mr-chang95/udacity-starbucks-challenge

Data Science Project for Udacity's Data Scientist Program. Using Python in Jupyter Notebook.

data data-science data-visualization numpy pandas sklearn

Last synced: 14 Apr 2026

https://github.com/sebhoss/countries-and-cities

dolt database for countries and their cities

cities countries data database dolt

Last synced: 11 Oct 2025

https://github.com/equinor/sumo-wrapper-python

Thin python wrapper to interact with Sumo API

analytics data fmu python subsurface sumo

Last synced: 19 Jan 2026

https://github.com/drzax/light-up-brisbane

Where, what and why various public places in Brisbane are lit up.

brisbane data git-scraping

Last synced: 19 Jan 2026

https://github.com/jhpoelen/bees

Content-based iDigBio prototype

biodiversity data ecololgical informatics provenance

Last synced: 18 Mar 2026

https://github.com/digital-media/cv_data

Datasets used for courses/tutorials at the Digital Media Department

computer-vision data image-processing images

Last synced: 14 Oct 2025

https://github.com/brandonzylstra/essence

🧘🏼‍♂️ Relaxed Rails Modeling & Migrations

active-record data database gem hcl modeling rails ruby ruby-on-rails yaml

Last synced: 14 Apr 2026

https://github.com/datamine/yelp-date

Does being on a date impact the score on a yelp review? Let's find out!

data ipython ipython-notebook pandas python python-2 yelp yelp-reviews

Last synced: 14 Apr 2026

https://github.com/fatihilhan42/nba-players-data-1950-to-2021

This project examines data on NBA players from 1950 to 2021. Player seasons, heights, performance, scoring averages, teams, and positions were loaded from CSV files, then cleaned and visualized to produce key tables and graphs.

data data-analysis data-engineering data-science data-visualization

Last synced: 16 Oct 2025