An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/clagiordano/marketplaces-data-export

Library that shares a common interface and provides adapters for online marketplace services

adapter amazon api clagiordano data ebay ebay-api export marketplaces mws mws-api rest soap

Last synced: 22 Mar 2025

https://github.com/afnanenayet/academic-pinetable

A revamp of the Dartmouth academic timetable. Designed to be intuitive and make searching for classes much easier.

dartmouth data design dev python scraping ui web

Last synced: 11 Jan 2026

https://github.com/kwame-mintah/ml-data-copy-to-aws-s3

Automatically copy new data to an AWS S3 bucket for Machine Learning.

aws aws-actions aws-s3 data

Last synced: 14 May 2026

https://github.com/luminati-io/zoominfo-dataset-samples

A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.

b2b business companies data data-extraction database dataset datasets web-scraping zoominfo

Last synced: 17 Mar 2025

https://github.com/robsteranium/user2022-ldf-talk

Slides from my useR! 2022 talk about the Linked-Data Frames package

data data-frame linked-data r rdf

Last synced: 19 Apr 2025

https://github.com/toofancodes/h1b-dashboard-insights

An interactive Tableau dashboard that visualizes H1B visa data from the USCIS Employer Data Hub, offering insights into application trends, top employers, and geographic distributions. Showcases advanced data visualization, analytics, and business intelligence skills.

analysis analytics business-intelligence dashboard data data-visualization h1b h1b-visa interactive-data tableau

Last synced: 20 Jan 2026

https://github.com/eslamdyab21/apara-data-gui

Custom application for Apara's data wrangling scripts. Technologies used: Qt Designer and PyQt5 for the GUI; pandas and NumPy for the data work.

csv data data-analysis data-wrangling gui pandas pyqt5-desktop-application qt5-gui

Last synced: 17 May 2026

https://github.com/sajjadanwar0/booking.com-scraping

Scraping booking.com using Selenium and Beautiful Soup

crawler data python scraping selenium

Last synced: 18 Oct 2025

https://github.com/tomasfarias/louis

Yet another challenge project

challenge data python

Last synced: 29 Mar 2025

https://github.com/jigyasag18/fake-news-prediction-project

The Fake News Prediction App repository offers a machine learning project that classifies news articles as fake or real. It uses a dataset of 20,000 articles and employs methods such as TF-IDF vectorization and the Porter stemming algorithm, achieving around 97% classification accuracy with a logistic regression model.

data datapreprocessing logistic-regression machine-learning machine-learning-algorithms numpy pandas prediction stemming vectorization

Last synced: 08 Jun 2026
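
To make the TF-IDF step mentioned in the entry above concrete, here is a minimal standard-library sketch. It is not taken from the repository: the function name `tf_idf`, the toy documents, and the plain `tf * log(N / df)` weighting are assumptions for illustration, and libraries such as scikit-learn use smoothed variants of this formula.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (docs are token lists)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Term frequency times inverse document frequency (unsmoothed).
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    "breaking news shocking claim".split(),
    "official news report".split(),
]
for row in tf_idf(docs):
    print(row)
```

A term that occurs in every document (here "news") gets weight zero, which is why TF-IDF features emphasize words that distinguish one article from another.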

https://github.com/tkxwaweru/python_data_manipulation

Manipulating the MASSIVE dataset using python

data dataanalysis excel python

Last synced: 11 Jan 2026

https://github.com/jefking/copyblobs

Copies all files in a container to another container, in another storage account.

aci arm azcopy azure blob container copy data file files from instant move one-time simple storage sync template to transfer

Last synced: 27 Mar 2025

https://github.com/pcpp94/elexon_pipeline_gb_demand

Guidelines and code snippets for extracting and processing Elexon gross demand data on Databricks. Provides half-hourly GB demand at sectoral (Domestic, Non-domestic), GSP-area granularity, settlement demand, and embedded generation. Supports non-commodity cost calculations for CfD, RO, and FiT.

data electricity elexon gb octopusenergy power powerdata pypsa uk

Last synced: 12 Jul 2025

https://github.com/phtrempe/l2a

A small project showing an example of applied machine learning in Python 3, using the Keras library with its TensorFlow backend to train a neural network that learns to add two integers.

applied data data-science deep-learning keras machine-learning neural-network tensorboard tensorflow

Last synced: 05 May 2026
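
The idea of a model learning addition can be sketched without Keras. The following is a hypothetical plain-Python stand-in, not the repository's code: a single linear unit `y = w1*a + w2*b + bias` fitted by batch gradient descent. The function name `train_adder`, the training grid, and the hyperparameters are all assumptions.

```python
def train_adder(steps=5000, lr=0.1):
    """Fit y = w1*a + w2*b + bias to the target a + b with batch gradient descent."""
    # Training pairs: digits 0..9 scaled to [0, 0.9] so a fixed learning rate stays stable.
    data = [(a / 10, b / 10) for a in range(10) for b in range(10)]
    w1, w2, bias = 0.0, 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        g1 = g2 = gb = 0.0
        for a, b in data:
            err = (w1 * a + w2 * b + bias) - (a + b)
            # Gradient of the mean squared error with respect to each parameter.
            g1 += 2 * err * a / n
            g2 += 2 * err * b / n
            gb += 2 * err / n
        w1 -= lr * g1
        w2 -= lr * g2
        bias -= lr * gb
    return w1, w2, bias


w1, w2, bias = train_adder()
print(w1, w2, bias)
```

Under these assumptions the weights approach 1, 1 and the bias approaches 0, so the fitted model computes a sum. A real neural network adds hidden layers and nonlinearities, which this sketch deliberately omits.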

https://github.com/pyfig/s21_data-science-bootcamp

School21 Bootcamp Data Science

data data-science numpy pandas python school21

Last synced: 26 Jun 2025

https://github.com/realbxnnie/accountservice

A simple DataStoreService wrapper with session backup and session locking.

data lua luau roblox

Last synced: 29 Jul 2025

https://github.com/johndelatto/-universities-to-pursue-a-master-s-degree-in-machine-learning

Best Master's Programs in Machine Learning (ML) for 2021. These are the best universities to pursue a master's degree in machine learning, with research rankings in AI and machine learning.

ai api data education project school

Last synced: 17 Jun 2025

https://github.com/stdlib-js/array-base-assert-any-has-property

Test whether at least one element in a provided array has a specified property, either own or inherited.

any array assert data generic has javascript node node-js nodejs prop property stdlib structure test types validate

Last synced: 07 May 2025
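
The check described in the entry above can be illustrated with a Python analogue. This is not the stdlib-js API: the name `any_has_property` and the example classes are hypothetical, and Python's `hasattr` stands in for JavaScript's own-or-inherited property lookup.

```python
class Base:
    kind = "base"  # class-level attribute, inherited by instances of subclasses


class Point(Base):
    def __init__(self, x):
        self.x = x  # instance-level ("own") attribute


def any_has_property(items, name):
    """Return True if at least one item exposes attribute `name`."""
    # hasattr finds instance attributes as well as those defined on the class chain.
    return any(hasattr(item, name) for item in items)


items = [object(), Point(1)]
print(any_has_property(items, "x"))     # own attribute on Point
print(any_has_property(items, "kind"))  # inherited from Base
print(any_has_property(items, "y"))     # defined nowhere
```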

https://github.com/echang1802/normandy

Normandy is a Python framework for data pipelines whose main objective is to standardize your team's code and provide a data-treatment methodology flexible enough for your team's needs.

analytics business-intelligence data dataengineering datascience etl pipeline

Last synced: 11 Mar 2026

https://github.com/danielrosehill/ghg-ebitda-correlations

Streamlit data visualisation examining correlation between emissions & profitability

data sustainability sustainability-data

Last synced: 14 Mar 2025

https://github.com/darshjasani/claims-analysis

This repository contains a comprehensive analysis of claims data, detailing the workflow from data preprocessing to model evaluation. The goal of this analysis is to build predictive models to improve claims prediction and management.

analysis data linear machine-learning python

Last synced: 16 May 2026

https://github.com/ahmedkhaled404/data-cleaning-and-eda-layoffs-mysql

This project involves cleaning a dataset containing information about layoffs from companies around the world.

data data-analysis data-cleaning data-preprocessing datacleaning eda exploratory-data-analysis mysql sql

Last synced: 08 Jun 2026

https://github.com/ethenkem/pygraphsurvey

A Python-based web app that provides graphical analysis of data collected from surveys. The system has its own built-in form filling: an admin can set questions and send a link for the forms to be filled in, and the system then provides analysis of the collected data. Form features include selection options, range values, file inputs, etc.

data

Last synced: 12 Jan 2026

https://github.com/gui-sitton/carsells

In this project I am an analyst on the Crankshaft List. Hundreds of free vehicle advertisements are published on the site every day. I need to study the data collected over the last few years and determine which factors influence the price of a vehicle.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 20 May 2026

https://github.com/avijeetpandey/quizzez

Implementation of the quizzez application in Kotlin

android data kotlin viewmodel

Last synced: 20 May 2026

https://github.com/himanshub16/lekhpal

Monitor and catalog Twitter feed matching your desired keywords

analytics data data-catalog data-filtering mongodb twitter twitter-streaming-api

Last synced: 14 May 2026

https://github.com/koder77/l1vmgodata

l1vmgodata - a simple database for exchanging data between programs

base data database go simple

Last synced: 20 May 2026

https://github.com/akesling/csvb

Have CSV? Use CSVB!

analytics csv data database

Last synced: 02 Feb 2026

https://github.com/ramonrsv/f1_data

Provides consolidated access to various sources of Formula 1 information and data, including event schedules, session results, timing and telemetry data, as well as historical information about drivers, constructors, circuits, etc.

data f1 rust

Last synced: 07 Apr 2026

https://github.com/basis-company/data-player.js

In-memory data layer for fast access to plain normalized data

collection data model traversal

Last synced: 25 Feb 2025

https://github.com/szc126/metadata-nnd-vocalo-twitter

Collects tweets about newly posted Vocaloid-related videos ("new VOCALOID/UTAU videos" tweet collection)

data nico-nico-douga niconico vocaloid

Last synced: 20 May 2026

https://github.com/raruto/cockpit-sample-data

Sample data installer addon for Cockpit CMS

addon cockpit data sample

Last synced: 17 Mar 2025

https://github.com/axafrance/azureml-to-openshift-talk

Scale your AI development: from AzureML in dev to OpenShift in prod in one click

ai axa azureml data learn ml openshift raise-the-bar talk

Last synced: 16 Feb 2026

https://github.com/stdlib-js/array-base-any-has-property

Test whether at least one element in a provided array has a specified property, either own or inherited.

any array assert data generic has javascript node node-js nodejs prop property stdlib structure test types validate

Last synced: 20 May 2026

https://github.com/shubhamsoni98/analysis-with-sql

This project focuses on creating and managing a database for a music record company to perform various analyses on bands, albums, and songs. Using SQL, the goal is to create a structured relational database with relevant tables, insert necessary data, and perform queries that provide insights into the relationships between bands, albums, and songs.

analys analysis data data-science database dbms mysql mysqlworkbench project query schema sql

Last synced: 03 Jan 2026

https://github.com/pooja-manjunatha/nyc_parking_violations_dbt

This project uses dbt to transform NYC parking violations data through a layered architecture: Bronze (raw ingested data), Silver (cleaned and enriched data), and Gold (aggregated tables for analytics). Using DuckDB as the warehouse backend, it ensures data quality with tests and documentation. The project enables reliable analysis of parking violations.

data data-analysis data-engineering dbt duckdb python sql

Last synced: 14 May 2026

https://github.com/giosil/export-as

A convenience library for exporting data in different formats.

data data-export export exporter java

Last synced: 26 Jul 2025

https://github.com/azaz9026/loan_approval_prediction

Welcome to the Loan Approval Prediction repository! This project aims to build a predictive model that can determine whether a loan application should be approved or denied based on various features. Purpose: The goal of this repository is to develop a machine learning model that can accurately predict loan approval decisio

data data-analysis data-visualization eda machine-learning numpy pandas python statistics

Last synced: 06 Apr 2026

https://github.com/kenanbek/youtube-data

YouTube stats data over YouTube Data API v3 using Python.

data python youtube youtube-api

Last synced: 13 May 2026

https://github.com/lorinczakos/sql-projects

A collection of SQL scripts that I wrote, and that were approved, during the GoIT Romania Data Analyst course

bigquery cte data data-analysis dbeaver marketing-analytics postgresql project-repository sql vscode

Last synced: 16 May 2026

https://github.com/kunalkumar2001/coffee_sales_project_using_excel_power-bi_and_sql

Coffee Shop Sales Dashboard built using Power BI for visualization and SQL for data extraction and transformation. The project dives deep into sales performance, providing actionable insights for data-driven decisions.

analytics data dataanalytics mssql powerbi sql

Last synced: 26 Jun 2025

https://github.com/the-tech-idea/beep.winform.sample

Application for managing your different data sources. Still in alpha; please be patient.

application data data-science database dataset integeration mysql nosql oracle postgres sqlite sqlserver workflow-engine workflows

Last synced: 08 Jul 2025

https://github.com/apigear-io/template-cpp14

C++14 technology template

conan cpp cpp14 data library

Last synced: 18 Feb 2026

https://github.com/lisakey/lisakey

I am passionate about Python 🐍 and SQL 🗃️ for data analysis 📊, and I actively develop projects in these languages.

analysis analyst data dataanalysis dataanalyst java python sql

Last synced: 02 May 2026

https://github.com/preranarao03/madhav_e-commerce_dashboard

This repository features the Madhav_E-Commerce_Dashboard built with Power BI. It provides interactive visualizations for analyzing e-commerce sales performance, product categories, customer segments, and geographic data, aiding in data-driven business decisions.

analysis data powerbi

Last synced: 30 Jan 2026

https://github.com/jszafran/personal-aws-data-lake

Personal cloud-based (AWS) data lake for experimenting with cloud services.

aws cloud data data-engineering dataengineering datalake etl terraform

Last synced: 20 May 2026

https://github.com/gustavonav/youtubeextractorflask

Application for extracting and processing YouTube data.

data full-stack mysql pipelines python web

Last synced: 14 Jun 2025

https://github.com/valyaevgeorgiy/r_basic

Working with the basics of the R environment, and thereby learning a new programming language directly tied to data analysis and to building graphs and charts.

coding data data-analysis r rstudio

Last synced: 12 Dec 2025

https://github.com/redinfinitypro/scientificsharp

Rating: (5/10) The code is a Windows Forms application for a basic scientific calculator, allowing users to perform mathematical operations such as addition, subtraction, multiplication, division, trigonometric functions, and logarithms.

componentmodel cryptography data drawing forms generic linq system tasks text

Last synced: 06 Apr 2025

https://github.com/nxank4/an-augment

A Python library for advanced and novel data augmentation, combining traditional techniques like cropping and blurring with state-of-the-art generative AI methods such as style transfer, image inpainting, and latent space interpolation. It boosts data diversity for robust machine learning applications.

computer-vision data data-augmentation data-augmentation-strategies data-augmentation-techniques generative-ai image image-processing synthetic-data

Last synced: 10 Mar 2026

https://github.com/alex0x4b/akutils

High-level Python library for recurring data manipulation (Pandas, Python data structure, API, file manipulation, etc.).

data dataframe pandas python

Last synced: 08 Mar 2026

https://github.com/dolanmiu/mclaren-task

A front-end assessment task for McLaren

angular data observable observables rxjs

Last synced: 16 May 2026

https://github.com/skygenesisenterprise/api-service

The Official Sky Genesis Enterprise API Service Ecosystem

api-service client cryptography data dns docker javascript nextjs service stalwart typescript websocket

Last synced: 31 Dec 2025

https://github.com/anzerr/storage.ts

Util to store data used in a service

data nodejs storage typescript util

Last synced: 20 May 2026

https://github.com/miroslavvidovic/distribution-graphs

Creating ASCII graphical histograms in the terminal with https://github.com/philovivero/distribution

ascii data graph histogram python terminal

Last synced: 24 Apr 2026

https://github.com/stdlib-js/array-base-banded-filled2d-by

Create a filled two-dimensional banded nested array according to a provided callback function.

alloc allocate array callback data fill filled foreach generic javascript map matrix multidimensional node node-js nodejs stdlib strided structure types

Last synced: 19 May 2026

https://github.com/ericmaddox/nyc-crime-analytics

Analyzes and visualizes crime data from the NYC Police Department using interactive maps and heatmaps, leveraging the NYC Open Data API.

crime-analysis crimedata data datavisualization esri folium heatmap nycopendata python python3 rtcc

Last synced: 24 Jun 2025

https://github.com/aliasgarsogiawala/dashboards

Power BI dashboards. Each folder contains a .pbix file and a PDF file explaining the dashboard.

analysis dashboards data data-visualization powerbi

Last synced: 12 Feb 2026

https://github.com/charlieroth/exoexplo

Exploring NASA Exoplanet Archive Data

data exoplanets julia nasa

Last synced: 03 Apr 2025

https://github.com/harrisonwelch/pythondatascience

Repo of code from the LinkedIn lesson "Python: Data Analysis"

data data-science matplotlib notes numpy python tutorial

Last synced: 12 Apr 2026

https://github.com/dhi13man/rca_ace

RCA Ace is designed for organizations seeking to enhance their understanding and utilization of insights derived from Root Cause Analyses (RCAs).

analytics data enterprise open-source python python3 rca

Last synced: 10 Sep 2025

https://github.com/maximkrouk/storage

Lightweight framework for storing data (beta)

cache data keychain memmory storage swift swift5-1 userdefaults

Last synced: 02 Jul 2026

https://github.com/zeh237/superstore-data-analytics

A Flask-based data analytics project built on the Superstore dataset using Flask, pandas, SQL, and Python

analytics data data-analysis data-science data-visualization flask python superstore

Last synced: 04 May 2025

https://github.com/arthurcfranklin/acervo-musical

This project creates a relational database to help a DJ organize and catalog their music collection. The goal is to provide an efficient system for storing and managing information about singers, bands, songs, and their remixed versions.

data database mysql mysql-database sql

Last synced: 22 Mar 2025

https://github.com/kinshukjainn/dclue-v1

Dsainone is a highly optimized Data Structures and Algorithms (DSA) library designed to provide efficient implementations of graph algorithms, trees, hashing, and linked lists while maintaining exceptional memory efficiency. The library is designed to be as fast and optimized as possible

data dsa-algorithm python

Last synced: 20 May 2026

https://github.com/UznetDev/Smoking-Prediction

This project focuses on analyzing the "Smoking" dataset and building a predictive model for smoking status based on various health metrics. The goal is to identify factors influencing smoking behavior and develop a reliable model for prediction.

ai classification data data-science kaggle-competition machine-learning ml roc-auc sklearn smoking

Last synced: 28 Mar 2025

https://github.com/mapi-developer/dapo

Simple, zero-dependency tabular data manipulation and analysis for Python.

dapo data python

Last synced: 06 Mar 2026

https://github.com/piyushkumar2025/analytical-sql-project-exploring-trends-segmentation-kpis

A complete SQL analytics project using a simulated data warehouse. It analyzes sales, customer, and product data with CTEs, joins, window functions, subqueries, and views to deliver insights on trends, segmentation, and KPIs, showing how SQL enables data-driven decisions without BI tools.

advanced-sql analytics business-intelligence data data-science-projects datascience joins kpi mysql query sql window-functions-in-sql

Last synced: 02 Jul 2025

https://github.com/ericgio/history-of-jazz

Data and visualizations based on Ted Gioia's "The History of Jazz"

data jazz

Last synced: 28 Mar 2025

https://github.com/ournet/places-data

Ournet places data module

data ournet places storage

Last synced: 04 Apr 2025

https://github.com/xmen3em/kaggle-competitions

This collection contains various projects and notebooks developed to tackle a range of Kaggle competitions, showcasing different machine learning techniques, data preprocessing methods, and model optimizations.

data data-science data-visualization deep-learning deployment ensemble-learning machine-learning-algorithms python streamlit

Last synced: 09 Apr 2026

https://github.com/ivanshero400/kutub-al-salaf-database

The largest open-source library of heritage Islamic books | 7,878 books | 40 categories | Source: Kizanah library | Direct download from Python in one line

arabic books-database data hadith islamic-books islamic-heritage kizanah open-source python sqlite

Last synced: 02 Jul 2026