An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/chandraprakash-bathula/keywords_prediction-machine-learning-integration

Keywords prediction model, built in four steps: data cleaning, removing stopwords, constructing Word2vec, and advancing to TF-IDF weighted Word2vec.

algori artifici data machine-learning tf-idf weighted-word2vec word2vec

Last synced: 08 Nov 2025
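
The last step named in the entry above, TF-IDF weighted Word2vec, can be sketched as follows. This is a minimal illustration and not the repository's code: the function name `tfidf_weighted_vector`, the toy corpus, and the two-dimensional toy embeddings are all assumptions, and the smoothed IDF formula used here is one common variant.

```python
import math
from collections import Counter


def tfidf_weighted_vector(doc, corpus, embeddings):
    """Average the word vectors of `doc`, weighting each word by TF-IDF.

    doc        -- list of tokens (hypothetical input format)
    corpus     -- list of token lists, used for document frequencies
    embeddings -- dict mapping token -> list of floats (toy stand-in for Word2vec)
    """
    n_docs = len(corpus)
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, count in Counter(doc).items():
        if word not in embeddings:
            continue  # skip out-of-vocabulary tokens
        tf = count / len(doc)
        df = sum(1 for d in corpus if word in d)
        # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weight = tf * idf
        total = [t + weight * v for t, v in zip(total, embeddings[word])]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # all zeros when no token has an embedding
    return [t / weight_sum for t in total]


# Toy data, purely illustrative.
corpus = [["data", "cleaning"], ["data", "model"]]
embeddings = {"data": [1.0, 0.0], "cleaning": [0.0, 1.0], "model": [1.0, 1.0]}
vector = tfidf_weighted_vector(corpus[0], corpus, embeddings)
```

Because "cleaning" appears in fewer documents than "data", it receives the larger weight and pulls the document vector toward its own embedding.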

https://github.com/fredhutch/gdscnsoilsites

Homepage for BioDIGS Project. Learn about the project and download data.

biodigs data metagenomics student-research

Last synced: 25 Mar 2025

https://github.com/bredalis/numpy

✨ Library to work with arrays ✨

arrays data matrix numpy numpy-arrays numpy-library python

Last synced: 06 May 2026

https://github.com/snimmagadda1/stack-exchange-dump-to-mysql

Batch pipeline to import Stack Exchange XML data dumps to relational DB

batch data mysql spring-batch stackoverflow

Last synced: 30 Mar 2025

https://github.com/tpgillam/teafiles.jl

Tea file support for Julia

data julia time-series

Last synced: 03 Oct 2025

https://github.com/lmuffato/project-ting-trybe

Ting project - Trybe assessment project for Block 37: Data Structures II: Lists, Queues, and Stacks

data data-analysis python queue read-file stack trybe trybe-projects

Last synced: 12 Jun 2025

https://github.com/lmuffato/project-job-insights-trybe

Job insights project - Trybe assessment project for Block 32: Introduction to Python

data data-science data-transformation filter python

Last synced: 12 Jun 2025

https://github.com/azrunguraya/kabyle-corpus-dataset

In the field of natural language processing, access to diverse, well-annotated datasets is essential for developing high-performing models. This project aims to fill that specific gap for Taqbaylit, a Berber language spoken mainly in Kabylia.

ber berber berber-dataset corpus data dataset ia kabyle kabyle-art kb machine-learning nlp nlp-machine-learning python taqbaylit text words

Last synced: 31 Jul 2025

https://github.com/gbv/cocoda-mappings

concordances, mappings and conversion scripts to create JSKOS mappings

coli-conc data jskos

Last synced: 28 Oct 2025

https://github.com/stdlib-js/ndarray-base-empty

Create an uninitialized ndarray having a specified shape and data type.

base data empty javascript matrix ndarray node node-js nodejs stdlib structure types vector

Last synced: 19 Feb 2026

https://github.com/edugmenes/azure-data-engineering

This repository contains my first end-to-end Data Engineering project, built using Microsoft Azure Cloud and Azure Databricks with PySpark.

azure cloud data data-engineering data-lakehouse data-structures databricks delta-lake etl-pipelines lakehouse lakehouse-architectures medallion-architecture microsoft-azure pyspark spark

Last synced: 29 Jan 2026

https://github.com/eugenedakin/caesarcipher

Native Xojo code for the Caesar Cipher algorithm with an example program

caesar-cipher data decryption encryption xojo

Last synced: 07 Jan 2026
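
The Caesar cipher named in the entry above shifts each letter by a fixed number of positions in the alphabet. The sketch below is a Python illustration only, not the repository's Xojo code; the function name `caesar_shift` is hypothetical, and leaving non-letters unchanged is an assumed convention.

```python
def caesar_shift(text, shift):
    """Shift ASCII letters by `shift` positions, wrapping around the alphabet.

    Non-letter characters are passed through unchanged (an assumed convention).
    """
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            # Modulo 26 wraps past 'z'/'Z'; Python's % is non-negative here,
            # so negative shifts (decryption) work as well.
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)


encrypted = caesar_shift("Hello, World!", 3)   # "Khoor, Zruog!"
decrypted = caesar_shift(encrypted, -3)        # "Hello, World!"
```

Decryption is the same operation with the shift negated, so one function covers both directions.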

https://github.com/bastianolea/campamentos_chile

Data from the 2024 national registry of campamentos (informal settlements), from the Ministry of Housing and Urbanism (Ministerio de Vivienda y Urbanismo)

chile comunas data pobreza social

Last synced: 24 Aug 2025

https://github.com/ngambip/priscilla

About my work and experience

accounting analytics data finance-management

Last synced: 03 Feb 2026

https://github.com/viisix/corecat

Core repository of DanceCats project.

data lightweight python3

Last synced: 25 May 2026

https://github.com/cleanzr/restaurant

Restaurant data set for entity resolution

data linkage

Last synced: 11 Mar 2026

https://github.com/gorhkdwj/da_portfolio

Kim Jae Chun's DA_Portfolio

data data-analysis python sql

Last synced: 20 Feb 2026

https://github.com/grycap/cdmi-client-go

A basic Go library to perform CDMI core operations

cdmi cloud data go

Last synced: 21 Jan 2026

https://github.com/fastbolt/excel-writer

Excel-Writer component

data excel excel-export

Last synced: 14 Apr 2025

https://github.com/avahoffman/dataplay

🤸‍♂️ Load data to play with

data data-package r r-package rstats

Last synced: 25 Mar 2025

https://github.com/bolajiolayinka/graph-api-automation

An End to End Automation from Facebook Business to Data Visualization of Campaigns

data data-science

Last synced: 07 May 2025

https://github.com/melinteflxrin/softserve-bigdata-project

End-to-end data warehousing project integrating APIs, ETL workflows, and PostgreSQL for analytics and reporting.

analytics api bigdata data datawarehousing externalapi pipeline postgres postgresql python warehouse

Last synced: 26 Jan 2026

https://github.com/tether/tether-schema

Custom protocol buffer schema for data validation

data protocol schema validation

Last synced: 09 Apr 2025

https://github.com/francescodisalesgithub/data-for-developers

Simple SQL database of problems and solutions found on Stack Overflow, in documentation, or via ChatGPT

chatgpt data database developer hacker hacking knowledge solutions sql targets

Last synced: 22 Mar 2025

https://github.com/cainmi/data-page-project

A repository to pull code and files from; may be used to store page data, links, code, etc. Mainly used for Python for now.

data html javascript python schema

Last synced: 21 Oct 2025

https://github.com/desininja/data-engineer-interview-questions

This repository contains all the Data Engineer Interview Questions asked by interviewers.

data data-engineer-interview-questions

Last synced: 31 Mar 2025

https://github.com/devsujay19/knowledgebase

My knowledge base built with NextJS 14, Tailwind CSS 3 and Aceternity UI.

data knowledge-base nextjs nextjs-typescript nextjs14 react server-side-rendering tailwindcss vercel

Last synced: 10 Apr 2026

https://github.com/stdlib-js/ndarray-base-to-reversed

Return a new ndarray where the order of elements of an input ndarray is reversed along each dimension.

base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure to-reversed types vector view

Last synced: 12 Apr 2026

https://github.com/agavitalis/sample-c-codes

A collection of small projects I carried out on Arduino as an electronic engineering student, despite falling in love with website development.

ageteller atm binary data gpcalculator logging

Last synced: 09 Apr 2025

https://github.com/devlive-community/mockaroo

A lightweight HTTP mock server for quickly building mock data APIs, suited to front-end and back-end development and API testing scenarios.

data mock

Last synced: 08 Jul 2025

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/dalikewara/typego

typego provides custom types that can be used to construct information (such as success data, error data, etc.)

custom data golang helper type typego

Last synced: 09 Apr 2025

https://github.com/yasenstar/powerbi_tutorial

Based on the "PowerBI Tutorial" book, provides step-by-step video demos for learning and mastering the Power BI tool

analytics data microsoft powerbi tutorial visualization

Last synced: 07 Jan 2026

https://github.com/geo-y20/uber-rides-data-analysis

This project aims to analyze Uber ride data to understand various aspects of ride usage, such as the distribution of rides across different categories, purposes, months, days, and times.

dashboard dashboard-templates data data-analysis data-analysis-python data-analytics data-visualization pandas powerbi python recommendation-system rides uber

Last synced: 13 Apr 2026

https://github.com/stdlib-js/array-base-to-accessor-array

Convert an array-like object to a minimal array-like object supporting the accessor protocol.

accessor accessors array array-like convert data javascript node node-js nodejs object protocol stdlib structure types wrap wrapper

Last synced: 04 Jan 2026

https://github.com/jaldekoa/fdicapi

A Python wrapper to easily retrieve data from the BankFind Suite official API from FDIC in pandas format.

api api-wrapper banking data finance pandas python united-states

Last synced: 07 Jan 2026

https://github.com/husna-poyraz/titanic-machine-learning

Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.

data data-analysis data-science data-visualization deep-learning machine-learning missing-data outlier-detection python titanic

Last synced: 10 May 2026

https://github.com/camara94/introduction-to-data-engineering

Describe the different entities that form a modern data ecosystem. Describe and differentiate between the role and responsibilities of Data Engineers, Data Scientists, Data Analysts, Business Analysts, and Business Intelligence Analysts. Explain what Data Engineering is. List the tasks that need to be performed in a typical data engineering lifecycle. Describe what a day in the life of a Data Engineer looks like.

business-analytics business-intelligence data dataingestion dataintegration datascience machinelearning python statistical-analysis

Last synced: 09 Apr 2025

https://github.com/flowsynx/plugin-postgresql

FlowSynx plugin that interfaces with PostgreSQL for CRUD operations. Supports JSONB, full-text search, and advanced query features.

data database flowsynx postgresql postgresql-database sql

Last synced: 09 May 2026

https://github.com/dannyben/datamix

DSL for manipulating tabular data

csv data data-analysis data-engineering gem ruby tabular-data

Last synced: 31 Jul 2025

https://github.com/stdlib-js/array-zero-to-like

Generate a linearly spaced numeric array whose elements increment by 1 starting from zero and having the same length and data type as a provided input array.

array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector

Last synced: 07 Jan 2026

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 11 Feb 2026

https://github.com/varbrad/mindb

🗄 🔍 ⚡️ Schema-less document-oriented collection model data-store for Node & Browsers.

browser data datastore db document javascript json-schema mongo mongodb nodejs nosql query schema

Last synced: 13 Apr 2026

https://github.com/kuro337/scalamono

Scala Monorepo Tooling for Kafka, Opensearch, Spark, Redpanda, Hadoop - and Lang Reference.

data database duckdb hadoop kafka redpanda sdala spark

Last synced: 13 Apr 2026

https://github.com/nodef/infoods

Kit for International Network of Food Data Systems (INFOODS).

component data food identifier infoods international network systems tagnames

Last synced: 11 Mar 2026

https://github.com/goncaloperes/datavisualization

Here I will share some of my data visualizations using a variety of datasets, technologies and tools.

d3js data dataset datavisualization dataviz ggplot matplotlib rawgraphs seaborn tableau visualization yellowbrick

Last synced: 04 Feb 2026

https://github.com/derrickbaruga7/python-data-analysis

This project analyzes ORU’s off-season sewer usage using Python, with `pandas` for data handling, histograms and line plots for exploration, and a `scipy`-based model for prediction. Pearson’s correlation and visualizations help reveal key trends and relationships.

analytics data data-science visualization

Last synced: 31 Jul 2025

https://github.com/neelravi/data-management

A data management plan for computational chemists/physicists and material scientists for a FAIR storage of raw data

data dmp fair management workflows

Last synced: 16 Jan 2026

https://github.com/tylerben/data-spring

Easily generate a dummy dataset based on a provided config

data data-spring datagenerator fake-data generator javascript typescript

Last synced: 27 May 2026

https://github.com/gappeah/london-housing-price-dashboard

This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.

data data-analysis data-visualization excel visual

Last synced: 31 Jul 2025

https://github.com/stdlib-js/array-zero-to

Generate a linearly spaced numeric array whose elements increment by 1 starting from zero.

array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector

Last synced: 08 Jan 2026

https://github.com/luminati-io/Crunchbase-dataset-samples

A sample of 1001 Crunchbase companies with key data points, extracted using the Bright Data API.

crunchbase crunchbase-api crunchbase-scraper data database datasets webscraper-api webscraping

Last synced: 09 Apr 2025

https://github.com/luminati-io/Twitter-X-dataset-samples

A sample dataset of over 1000 Twitter (X) posts, extracted using the Bright Data API, ideal for trend discovery, brand monitoring, and competitive insights.

api data dataset twitter twitter-api twitter-scraper web-scraping x

Last synced: 09 Apr 2025

https://github.com/visenger/prada

Profiling Datasets

cleaning data dataset profiling

Last synced: 24 Aug 2025

https://github.com/isaac-lal/english-arabic-dictionary

This is a dictionary website that implements a search feature which allows input for a word in either English or Arabic and returns the alternative translation.

data db javascript react web-development

Last synced: 09 Apr 2026

https://github.com/castdrian/kdapi

A TypeScript library that scrapes K-pop idol and group information from online sources to create comprehensive JSON datasets.

api data kpop scraper typescript

Last synced: 15 May 2025

https://github.com/stdlib-js/datasets-herndon-venus-semidiameters

Fifteen observations of the vertical semidiameter of Venus, made by Lieutenant Herndon, with the meridian circle at Washington, in the year 1846.

astronomy data dataset datasets grubbs herndon javascript node node-js nodejs outlier outliers sample statistics stats stdlib venus

Last synced: 09 Oct 2025

https://github.com/stdlib-js/array-one-to-like

Generate a linearly spaced numeric array whose elements increment by 1 starting from one and having the same length and data type as a provided input array.

array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector

Last synced: 20 Feb 2026

https://github.com/undistraction/grid-model

A small API for creating a grid and accessing the positions of the cells, rows and columns within it.

2d calculations cells data grid layout model

Last synced: 04 Aug 2025

https://github.com/ilejuxepwaduzd/structured-data-extractor

🛠️ Extract structured data from messy texts using Chain-of-Thought prompting to improve processing of customer support and technical issues.

cdp chrome-fetcher data document-extraction ecommerce golang-library headless metadata-extraction ocr open-source pdf pdf-converter pdf-extractor ruby scraper shopify spider structured-data

Last synced: 10 Apr 2026

https://github.com/theryston/db-mycro

A node module with a json database that saves data in a specific directory, similar to sqlite, but in JSON

base crud data database db db-mycro javascript json jsondatabase nodejs nosql typescript

Last synced: 09 Apr 2026

https://github.com/aranfononi/h4x0r-news-section-17-project

A SwiftUI-powered app that displays top stories from Hacker News. Users can open articles directly within the app, utilizing SwiftUI’s NavigationLink and custom WebView integration.

app-development data data-binding data-binding-library ios swift swiftui xcode

Last synced: 18 May 2026

https://github.com/rishabh-agarwal/datastructuremachineproblem

Data Structure MP - Clemson University (Language C)

273 alogrithms clemson data ece structure university

Last synced: 26 Oct 2025

https://github.com/simranjeet97/leetcode_practice

Practicing LeetCode problems for competitive programming

algorithms amazon coding competitive-programming data data-structures facebook google leetcode python

Last synced: 03 Aug 2025

https://github.com/mouneshgouda/learn_dsa

This repository explores fundamental data structures and their implementations. Learn how to organize and manipulate data efficiently for various programming tasks. (Feel free to add your specific focus areas here, e.g., algorithms, interview prep)

c data queue sorting-algorithms stack structured-data

Last synced: 30 Jul 2025

https://github.com/qeeqbox/data-states

Data states refer to structured and unstructured data divided into three categories (At Rest, In Use, and In Transit)

data data-state infosecsimplified qeeqbox

Last synced: 10 Mar 2026

https://github.com/stdlib-js/array-base-to-deduped

Copy elements to a new generic array after removing consecutive duplicated values.

array compress copy data dedupe deduplicate deduplication duplicate generic javascript node node-js nodejs stdlib structure types uniq unique

Last synced: 14 Jun 2025
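
The operation described in the entry above, copying elements while dropping consecutive duplicates, differs from a full "unique" pass because a value may reappear later in the output. The sketch below shows that behavior in Python; it is not the stdlib-js implementation, and the function name `dedupe_consecutive` is hypothetical.

```python
from itertools import groupby


def dedupe_consecutive(values):
    """Return a new list with each run of equal adjacent values collapsed to one."""
    # groupby yields one (key, group) pair per run of equal adjacent elements.
    return [key for key, _ in groupby(values)]


deduped = dedupe_consecutive([1, 1, 2, 2, 1, 3, 3])  # [1, 2, 1, 3]
```

The input list is left untouched, and the second `1` survives because it is not adjacent to the first run.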

https://github.com/neelravi/fairtool

A CLI tool for FAIR processing of computational materials science data.

computational data data-analytics fair management materials physics python science

Last synced: 14 Jan 2026

https://github.com/qedsoftware/afsisdb-demos

AfSIS DB Demos

agriculture data soil

Last synced: 27 Oct 2025

https://github.com/izaaccoding36/dados-dinamicos

This repository presents a website built with an API for creating charts, reporting on social media usage on a global scale

api data redes-sociais social-media website

Last synced: 26 Mar 2025

https://github.com/exoticknight/juhe

simple way to analyze complex data in one chain call

aggregation aggregator analysis data statistic typescript

Last synced: 21 May 2026

https://github.com/pradeep221b/turbofan_predictive_maintenance

An R project for predicting turbofan engine RUL using {targets} and {tidymodels}.

data data-science-portfolio machine-learning nasa preditive-maintaince r rstats targets-pipeline tidymodels

Last synced: 04 Oct 2025

https://github.com/bredalis/scikitlearn

🤖 Library to create ML models 🤖

data ia learning-python librery ml python

Last synced: 30 May 2026

https://github.com/rremple/intervalidus

For all your interval-based data needs.

data intervals

Last synced: 21 Feb 2026

https://github.com/bilalmehrban/data-log-monitor

A simple yet elegant desktop C# application based on a 3-tier architecture, designed for viewing logs stored in the database by NLog or other logging frameworks.

csharp data desktop-app logging

Last synced: 14 Mar 2025

https://github.com/bastianolea/palestina

Viewer for figures on the massacre that Israel is carrying out in Palestine and the Gaza Strip

app data meses palestina politica shiny social tiempo

Last synced: 06 Jul 2025

https://github.com/zediculz/block

Block is a data structure/collection that uses blockchain principles to manage data.

algorithm data structure

Last synced: 05 Oct 2025

https://github.com/vagnerbellacosa/029_analisededadoscompythonpandas

This lab introduces the Pandas library, an open-source Python library for data analysis. It gives Python the ability to work with spreadsheet-like data, allowing data to be loaded, manipulated, and combined quickly, among other functions.

data digital-innovation-one dio jupiter-notebook labs ms-excel panda python

Last synced: 14 May 2026

https://github.com/jmcanterafonseca/leaflet-context-information

A Leaflet plugin + infrastructure for getting access to Context Information (i.e. data) exposed through FIWARE NGSIv2

context data fiware information leaflet map open visualization web

Last synced: 21 Apr 2026

https://github.com/ddeutils/ddedocs

📖 Data Developer & Engineer Documents and Hands-On

blogs data data-engineering documents hands-on

Last synced: 08 Aug 2025

https://github.com/danielrosehill/monetised-ghg-emissions

Calculating monetised GHG emissions for various companies based upon disclosure data

data sustainability sustainability-data

Last synced: 07 Sep 2025

https://github.com/asuozzo/medicare-data-analysis

An analysis of Medicare Part D data in Vermont

data python

Last synced: 04 May 2026

https://github.com/programmer-rd-ai/library-management-system-oraclesql

The Library Management System project, part of the CI6320 Advanced Data Modelling coursework, features comprehensive SQL scripts utilizing OracleSQL to facilitate efficient data modeling and management.

adm advanced ci6320 cw data icw library management modelling oracle oraclesql report sql system

Last synced: 29 Oct 2025

https://github.com/programmer-rd-ai/moviedatascraper

Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!

beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web

Last synced: 01 Mar 2025

https://github.com/garcane/income-prediction-ml

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 08 Apr 2026