An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/rajlabmssm/echodata

echoverse module: Example data.

data echoverse fine-mapping genomics gwas qtl

Last synced: 17 Jan 2026

https://github.com/tkxwaweru/python_data_manipulation

Manipulating the MASSIVE dataset using Python

data dataanalysis excel python

Last synced: 11 Jan 2026

https://github.com/pcpp94/elexon_pipeline_gb_demand

Guidelines and code snippets for extracting and processing Elexon gross demand data on Databricks. Provides half-hourly GB demand at sectoral (Domestic, Non-domestic), GSP-area granularity, settlement demand, and embedded generation. Supports non-commodity cost calculations for CfD, RO, and FiT.

data electricity elexon gb octopusenergy power powerdata pypsa uk

Last synced: 12 Jul 2025

https://github.com/phtrempe/l2a

A small project demonstrating applied machine learning in Python 3: it uses the Keras library with its TensorFlow backend to train a neural network that learns to add two integers.

applied data data-science deep-learning keras machine-learning neural-network tensorboard tensorflow

Last synced: 05 May 2026

https://github.com/echang1802/normandy

Normandy is a Python framework for data pipelines whose main objective is to standardize your team's code and provide a data treatment methodology flexible enough for your team's needs.

analytics business-intelligence data dataengineering datascience etl pipeline

Last synced: 11 Mar 2026

https://github.com/indhra/cats-ijcnn-data-2004

CATS IJCNN Data 2004 Competition of Artificial Time Series

2004 artificial cats data ijcnn time-series

Last synced: 22 Mar 2025

https://github.com/ahmedkhaled404/data-cleaning-and-eda-layoffs-mysql

This project involves cleaning a dataset containing information about layoffs from companies around the world.

data data-analysis data-cleaning data-preprocessing datacleaning eda exploratory-data-analysis mysql sql

Last synced: 08 Jun 2026

https://github.com/himanshub16/lekhpal

Monitor and catalog Twitter feed matching your desired keywords

analytics data data-catalog data-filtering mongodb twitter twitter-streaming-api

Last synced: 14 May 2026

https://github.com/ioboi/obloc-data

Scrape guest counter of O'BLOC 🧗‍♀️

data scraping

Last synced: 04 Nov 2025

https://github.com/noraui/noraui-datas-webservices

noraui-datas-webservices is a RESTdataProvider for NoraUi

data noraui rest-api service spring-boot-2 spring-boot-actuator

Last synced: 17 Mar 2025

https://github.com/axafrance/azureml-to-openshift-talk

Scale your AI development: from dev on AzureML to prod on OpenShift in one click

ai axa azureml data learn ml openshift raise-the-bar talk

Last synced: 16 Feb 2026

https://github.com/azaz9026/loan_approval_prediction

Welcome to the Loan Approval Prediction repository! This project aims to build a predictive model that can determine whether a loan application should be approved or denied based on various features. Purpose: The goal of this repository is to develop a machine learning model that can accurately predict loan approval decisio

data data-analysis data-visualization eda machine-learning numpy pandas python statistics

Last synced: 06 Apr 2026

https://github.com/bfontaine/datatools

:triangular_ruler: Some scripts I use to work with data

data ruby script

Last synced: 23 Jul 2025

https://github.com/lisakey/lisakey

I am passionate about Python 🐍 and SQL 🗃️ for data analysis 📊, and I actively develop projects in these languages.

analysis analyst data dataanalysis dataanalyst java python sql

Last synced: 02 May 2026

https://github.com/omari-kd/recommendation-system-analysis-and-modelling

This project aims to develop a recommendation system that leverages historical user data to provide tailored recommendations across different domains, such as product recommendations, content suggestions and service optimisation.

data data-science data-science-in-r machine-learning-algorithms recommendation-system

Last synced: 08 Jan 2026

https://github.com/j-hagedorn/locals

:globe_with_meridians: A collection of tidied, neighborhood-level public datasets

address-dataset census-data census-tract data neighborhood social-sciences

Last synced: 03 Feb 2026

https://github.com/nxank4/an-augment

A Python library for advanced and novel data augmentation, combining traditional techniques like cropping and blurring with state-of-the-art generative AI methods such as style transfer, image inpainting, and latent space interpolation. It boosts data diversity for robust machine learning applications.

computer-vision data data-augmentation data-augmentation-strategies data-augmentation-techniques generative-ai image image-processing synthetic-data

Last synced: 10 Mar 2026

https://github.com/ims94/ballerina-tsv-querying

An example Ballerina project that queries TSV data using Ballerina's language-integrated queries

ballerina ballerina-lang data olympics query sql

Last synced: 03 Feb 2026

https://github.com/zeh237/superstore-data-analytics

A Flask-based data analytics project built on the Superstore dataset using Flask, pandas, SQL, and Python

analytics data data-analysis data-science data-visualization flask python superstore

Last synced: 04 May 2025

https://github.com/vedantwalia/google-data-analytics-capstone-case-study

This is a repository of my work on data analysis as a part of the Google Data Analytics Capstone

bigquery data data-viz datavisualization-project divvy-bikes google googledataanalytics sql tableau tableau-public

Last synced: 02 Jan 2026

https://github.com/mapi-developer/dapo

Simple, zero-dependency tabular data manipulation and analysis for Python.

dapo data python

Last synced: 06 Mar 2026

https://github.com/peternaydenov/data-pool

Data layer for node apps and single page applications

cache data store

Last synced: 29 Apr 2025

https://github.com/interzoid/typescript-examples

Provides TypeScript examples for consuming several of the Cloud APIs available from Interzoid, including company name matching, individual name matching, weather, page performance, email validation, currency rates/FOREX, and global telephone information.

angular api cloud data database matching nodejs quality typescript

Last synced: 12 Jan 2026

https://github.com/interzoid/php-examples

Provides PHP examples for consuming several of the Cloud APIs available from Interzoid, including company name matching, individual name matching, weather, page performance, email validation, currency rates/FOREX, and global telephone information.

api cloud data database php quality

Last synced: 12 Jan 2026

https://github.com/cody-scott/arclint

A flexible tool to validate and improve your data in ArcGIS using regex and other methods

arcgis arcgispro data lint regex validation

Last synced: 14 May 2025

https://github.com/ssiarhei115/cv-dbase-analysis

Analysis of the HeadHunter CV database

analysis cv data data-science resume

Last synced: 09 Apr 2025

https://github.com/rickstaa/ai-compute-visualizer

A Streamlit-based web application to visualize GPU inventory and AI capabilities on the Livepeer network.

ai data livepeer streamlit

Last synced: 28 Jun 2025

https://github.com/rrwen/poster-gisci-osmol

Conference poster and short paper titled "Outlier Detection in OpenStreetMap Data using the RandomForest Algorithm and Variable Contributions" for the GIScience Conference in 2016

2016 algorithm conference contribution data detection forest gis giscience learn machine open openstreetmap osm outlier paper poster random short variable

Last synced: 03 Apr 2025

https://github.com/rrwen/geohoods-to

Geospatial dataset of 1000+ aggregated variables for neighbourhoods in Toronto, ON, CA

csv data dataset geo geojson gis neighborhood neighborhoods neighbourhood neighbourhoods open open-data toronto toronto-open-data

Last synced: 25 Jun 2025

https://github.com/codehard8/web-scrapping

This repository provides a web scraping project built with BeautifulSoup, along with related files

beutifulsoup data houses-for-sale python3 requests-library-python webscraping

Last synced: 01 Jul 2025

https://github.com/jonprice99/regional-election-analysis

An analysis of election results in Allegheny County using Pandas and other Python libraries to better understand the voting habits, practices, and preferences of regional voters.

data data-visualization election-analysis election-data pandas python

Last synced: 05 May 2026

https://github.com/analyticslover/salifort-motors-turnover-project

The Salifort Motors H.R. Project serves as the capstone for the Google Advanced Analytics Program on Coursera. The project presents a business scenario and a problem within that context: employee turnover. It uses essential techniques such as EDA and data modeling to analyze and predict employee turnover rates in the company.

data data-analysis datamodeling eda machine-learning pandas python sklearn

Last synced: 10 Apr 2026

https://github.com/entitizer/data-js

Entitizer data module

data entitizer storage

Last synced: 25 Jan 2026

https://github.com/abshek7/big-data

A repository documenting theory and practical notes from learning big data computing.

big-data data data-engineering mapreduce pyspark

Last synced: 15 Jun 2025

https://github.com/ahmad-mtr/prjkt_exam_schedule_test

I hate scrolling through a list of 300+ courses in my uni exam schedule, so I'm creating this. This is a test, btw :)

data strings-manipulation

Last synced: 11 Apr 2025

https://github.com/The-Tech-Idea/Beep.winform.Sample

Application for managing your different data sources. Still in alpha; please be patient.

application data data-science database dataset integeration mysql nosql oracle postgres sqlite sqlserver workflow-engine workflows

Last synced: 04 Nov 2025

https://github.com/karensaraimoralesmontiel/8-week-sql-challenge

Case study solutions for the 8-Week-SQL-Challenge.

data database sql

Last synced: 02 Jan 2026

https://github.com/aiwithqasim/p1_explore-weather-trends

In this project, I'll analyze local and global temperature data and compare the temperature trends where I live to overall global temperature trends. I will also use SQL queries to extract data from the given database and visualize average temperatures to draw out the findings.

data dataanalyst database datavisualization nanodegree udacity

Last synced: 22 May 2026

https://github.com/rickyarians/practical-statistic-car-emission

Practical Statistics Project - Car Emissions in Canada - 2022

data data-science dataanalysis r rmarkdown rpubs statistics

Last synced: 22 May 2026

https://github.com/hivesolutions/crossline

Simple event piping and storing infrastructure

counter data opencv warehouse

Last synced: 15 May 2026

https://github.com/iamyourdre/naive-bayes-classifier-js

Naive Bayes classifier developed with MySQL, ExpressJS, and NodeJS by @iamyourdre.

backend data data-science expressjs javascript mysql naive-bayes naive-bayes-algorithm naive-bayes-classifier nodejs

Last synced: 08 Apr 2026

https://github.com/iyashwantsaini/tweetify_

Twitter data collection and analysis tool

collection data twitter twitter-sentiment-analysis

Last synced: 08 Mar 2026

https://github.com/GAMELEIRA/studies-database

This repository aims to hold any and all scripts for learning and practicing SQL and NoSQL database management. The project consolidates the main fundamentals and principles, along with practice exercises and project development.

data database mongodb mssql mysql nosql sql

Last synced: 03 May 2025

https://github.com/mobinx/easymeet-js

EasyMeetjs is a robust and versatile TypeScript library that provides a solid foundation for building WebRTC-based applications. It simplifies the complexities of WebRTC, enabling developers to easily incorporate real-time communication features into their projects. From simple audio/video calling to real-time peer-to-peer file transfer, everything

data meeting react realtime screensharing streaming-video webrtc zoom

Last synced: 03 Jan 2026

https://github.com/merrill007/sql-data-warehouse-project

The Data Warehouse and Analytics Project is a comprehensive initiative designed to demonstrate the end-to-end process of building a modern data warehouse and deriving actionable insights through SQL-based analytics.

architecture business-intelligence crm data data-analysis database database-management datawarehouse erp etl etl-pipeline model sql sqlserver

Last synced: 22 Mar 2025

https://github.com/dcmox/moxymapper

Data mapping made easy

data json mapper

Last synced: 15 May 2026

https://github.com/engineeringmadness/gaming-ai-analytics

Using Databricks to analyze game reviews from Steam web store

data databricks llama pyspark semantic-layer

Last synced: 15 May 2026

https://github.com/prernarohra/todo-webapp

Simple Todo App for practice.

axios css data fastapi html json python typescript

Last synced: 06 Apr 2026

https://github.com/richelbilderbeek/heyahmama

Data about the Flemish/Dutch band K3

band data k3 package r r-lang r-language

Last synced: 22 May 2026

https://github.com/rrwen/twitter2return

Module for extracting Twitter data using option objects

access api data extract geo get location media oauth object option post rest return sample social stream token tweet twitter

Last synced: 03 Apr 2025

https://github.com/theanujsinha01/data-analytics-portal-

Data Analytics Portal: a web-based data analytics tool built with Streamlit, Pandas, and Plotly. Supports CSV and Excel uploads (up to 200MB) for data exploration. Features include statistical summaries, group-by aggregation, and frequency counts. Integrates interactive charts (bar, pie, line, scatter) for visual insights. This tool is live now.

analytics data portal

Last synced: 28 Apr 2026

https://github.com/kirkalyn13/xyz-books-pipeline

XYZ Books Pipeline to check and update incoming ISBNs from newly added books from the CRUD UI, and record new data to a CSV file.

api csv data go http rabbitmq

Last synced: 05 Mar 2025

https://github.com/shailu2004/azure_big_data_project

This project demonstrates a comprehensive Azure Data Engineering workflow using multiple Azure resources to process and analyze an e-commerce dataset. The dataset consists of 8 files containing details about customers, payments, orders, and other key information

ai azure cloud data data-engineering

Last synced: 08 Jul 2025

https://github.com/skygenesisenterprise/aether-calendar

Aether Calendar is a lightweight, open-source client built for privacy, speed, and seamless integration within the Aether Office ecosystem

applications calendar capacitorjs data javascript linux macos nextjs typescript windows

Last synced: 12 Apr 2026

https://github.com/realbxnnie/accountservice

A simple DataStoreService wrapper with session backup and session locking.

data lua luau roblox

Last synced: 29 Jul 2025

https://github.com/skygenesisenterprise/api-service

The Official Sky Genesis Enterprise API Service Ecosystem

api-service client cryptography data dns docker javascript nextjs service stalwart typescript websocket

Last synced: 31 Dec 2025

https://github.com/shubhamsoni98/analysis-with-sql

This project focuses on creating and managing a database for a music record company to perform various analyses on bands, albums, and songs. Using SQL, the goal is to create a structured relational database with relevant tables, insert necessary data, and perform queries that provide insights into the relationships between bands, albums, and songs.

analys analysis data data-science database dbms mysql mysqlworkbench project query schema sql

Last synced: 03 Jan 2026

https://github.com/kenanbek/youtube-data

Retrieve YouTube stats data through the YouTube Data API v3 using Python.

data python youtube youtube-api

Last synced: 13 May 2026

https://github.com/ressuman/next-blog-1-project

Next.js with TypeScript: Fetching Data and Setting Up Routes. This project demonstrates my first experience with Next.js using TypeScript. It involves fetching posts from the JSON Placeholder dummy API, setting up pages, and linking routes.

api-rest data html-css-javascript jsx nextjs14 routing typescript

Last synced: 15 May 2026

https://github.com/alex0x4b/akutils

High-level Python library for recurring data manipulation (Pandas, Python data structure, API, file manipulation, etc.).

data dataframe pandas python

Last synced: 08 Mar 2026

https://github.com/jun-labs/json-handling

🔍 JSON data handling example.

data gson jackson json json-object

Last synced: 15 May 2026

https://github.com/aliasgarsogiawala/dashboards

Power BI dashboards; each folder contains a .pbix file and a PDF file explaining the dashboard

analysis dashboards data data-visualization powerbi

Last synced: 12 Feb 2026

https://github.com/chompfoods/stub-jaxrs-jersey

JAX-RS Jersey server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery ingredients jax-rs jersey nutrition raw recipe-api recipes server server-stub stub stub-server

Last synced: 02 May 2026

https://github.com/jacopodl/jcollections

Common data structures for the C language

c collections data data-structures jcollections

Last synced: 30 Jul 2025

https://github.com/jigyasag18/credit-card-fraud-detection-using-machine-learning

This repository presents a credit card fraud detection system utilizing a Logistic Regression model trained on a dataset of 284,807 transactions with significant class imbalance. After employing under-sampling for balance, the model achieves a test accuracy of around 93.40%, showcasing the effectiveness of ML in identifying fraudulent transactions.

credit-card-fraud creditcardfrauddetection data dataset logistic-regression logisticregression machine-learning machine-learning-algorithms mlproject mlprojects

Last synced: 02 Sep 2025

https://github.com/thesfinox/fit-the-data

Data analysis using Wolfram Mathematica

analysis data data-analysis lab mathematica wolfram wolfram-mathematica

Last synced: 24 Jan 2026

https://github.com/patrikcze/meshtatic_data

Meshtastic Data Transfer - Trying some silly things, like transferring files over a LoRa network.

data meshtastic meshtastic-python

Last synced: 03 Feb 2026

https://github.com/krescruz/pegaso-data

Utilities for analyzing data from the Pegaso invoice certification provider

cfdi-mexico data pac sat-gob

Last synced: 29 Apr 2026

https://github.com/ressuman/csv-writer-project

CSV Writer with TypeScript. This project demonstrates my implementation of a CSV writer using plain TypeScript and JavaScript, without relying on any frameworks.

data javascript typescript

Last synced: 15 May 2026

https://github.com/neptun-software/neptun.data.generators

Send scraped data from neptun-scraper to ChatGPT to generate training data for NEPTUN.AI.

data generator

Last synced: 30 Jul 2025

https://github.com/pyrustic/litedao

Intuitive interaction with SQLite database

auto-init dao data database database-access library lightweight pyrustic python sql sqlite

Last synced: 09 May 2026

https://github.com/svenruppert/_data_for_demos

Data used for demos

data datasets images ruppert sven

Last synced: 25 Jan 2026

https://github.com/ahmad-ali-rafique/random-forest-classifier-modeling

Detailed exploration of random forest classifiers, including data cleaning, model building, and performance evaluation on various datasets.

classification classification-models data dataanalytics datamodel dataset model-checking models random-forest random-forest-classifier

Last synced: 01 Jun 2026

https://github.com/ahmad-ali-rafique/random-forest-regressor-modeling

Detailed exploration of random forest regressors, including data cleaning, model building, and performance evaluation on various datasets.

data dataanalytics datacleaning evaluation-metrics modeling random-forest random-forest-regression regression regression-analysis

Last synced: 05 Mar 2025

https://github.com/ahmad-ali-rafique/electricity-consumption-analysis-household-dataset

This repository contains analysis and predictive modeling of household electricity consumption using Python. It includes data cleaning, exploratory data analysis (EDA), time series forecasting (ARIMA, SARIMA, LSTM), and model evaluation to optimize energy usage.

arima-forecasting artificial-intelligence artificial-neural-networks data data-science dataanalytics datacleaning evaluation-metrics exploratory-data-analysis long-short-term-memory lstmmodel modeling time-series timeseries-forecasting

Last synced: 23 Jun 2025

https://github.com/dms-codes/www.usu.ac.ididdirektori

Faculty and Docent Data Retrieval Script: faculty_and_docent_data_retrieval.py is a Python script for retrieving faculty and docent data from a university website using Selenium. It includes functions to extract faculty names and docent profiles, as well as a multithreading approach to fetch data for multiple faculty-docent pairs.

data python scrape

Last synced: 26 May 2026

https://github.com/kenjyco/mongo-helper

Helper funcs and tools for working with MongoDB

aggregation-pipeline data database kenjyco mongo mongodb python

Last synced: 28 Jan 2026

https://github.com/bala-1409/sales-forecasting-datascience-project

Develop a data science project using historical sales data to build a regression model that accurately predicts future sales. Preprocess the dataset, conduct exploratory analysis, select relevant features, and employ regression algorithms for model development. Evaluate model performance, optimize hyperparameters, and provide actionable insights.

data data-analysis data-science data-visualization datacleaning exploratory-data-analysis machine-learning-algorithms modelfitting prediction predictive-analytics predictive-modeling python3 regression-models salesforecast supervised-learning

Last synced: 26 Apr 2026

https://github.com/bala-1409/loan-classification-data-science-projects

This project uses machine learning algorithms to predict the classification of loan status. The dataset is loaded and transformed using SQL to obtain a clean dataset with valid information.

data data-analysis datacleaning datascience datavisualization exploratory-data-analysis loan machine-learning machine-learning-algorithms modelfitting sql supervised-learning visualization

Last synced: 22 Mar 2025