data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-27 00:07:33 UTC
https://github.com/bijx/firestore-data-fetcher
A simple Python script to fetch documents from a Firebase Firestore collection and save them to a local `.json` file.
automation data database downloader exporter fetcher firebase firestore open-source script
Last synced: 12 Apr 2026
https://github.com/toransahu/metoffice
Data visualisation - MetOffice
data metoffice uk visualization weather
Last synced: 25 Mar 2025
https://github.com/lane-romuald/iot-irrigation-data-collection-system
An IoT-based data collection system using the ESP32 microcontroller programmed with Arduino to monitor environmental conditions for smart irrigation. The system measures soil moisture, temperature, air temperature, humidity, and rain probability. Data is stored locally on an SD card and uploaded to the ThingSpeak platform.
arduino cloud data data-collection esp32 openweather openweathermap thingspeak wi-fi
Last synced: 12 Apr 2026
https://github.com/eugenedakin/caesarcipher
Native Xojo code for the Caesar Cipher algorithm with an example program
caesar-cipher data decryption encryption xojo
Last synced: 07 Jan 2026
https://github.com/cleanzr/restaurant
Restaurant data set for entity resolution
Last synced: 11 Mar 2026
https://github.com/fiskeben/meetjescraper
HTTP proxy for Meet je stad project
api data go iot meetjestad proxy scraper weather
Last synced: 29 May 2026
https://github.com/codeforafrica/ckanext-followy
[ARCHIVED] A CKAN extension to show the datasets a user is following.
ckan ckan-extension ckanext-followy data dataset followy-extension open-data
Last synced: 16 Mar 2025
https://github.com/vvipjain/bike-sales-dashboard
Bike Sales Dashboard
dashboards data data-analysis data-cleaning data-normalisation data-visualization excel pivot-chart pivot-tables
Last synced: 04 Feb 2026
https://github.com/e-panourgia/data-science-projects
Data Science Projects
annotations augmentation data data-preprocessing-and-cleaning hyperparameter-tuning llm logistic-regression nlp random-forest-classifier xboost-classifier
Last synced: 09 Apr 2025
https://github.com/avahoffman/dataplay
🤸‍♂️ Load data to play with
data data-package r r-package rstats
Last synced: 25 Mar 2025
https://github.com/bolajiolayinka/graph-api-automation
An End to End Automation from Facebook Business to Data Visualization of Campaigns
Last synced: 07 May 2025
https://github.com/melinteflxrin/softserve-bigdata-project
End-to-end data warehousing project integrating APIs, ETL workflows, and PostgreSQL for analytics and reporting.
analytics api bigdata data datawarehousing externalapi pipeline postgres postgresql python warehouse
Last synced: 26 Jan 2026
https://github.com/thiagopanini/datadelivery
An open source Terraform module that provides a complete infrastructure toolkit so that users can begin exploring Analytics services on AWS.
analytics athena aws catalog crawler data datamesh glue s3 terraform
Last synced: 29 Nov 2025
https://github.com/tether/tether-schema
Custom protocol buffer schema for data validation
data protocol schema validation
Last synced: 09 Apr 2025
https://github.com/whitehathackerpr/data-visualization-tool
This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations
data data-analysis data-visualization python python3
Last synced: 05 Sep 2025
https://github.com/xpotify/scraper
Scraper designed for Xpotify's client to gather information from websites🌟
axios cheerio data javascript scraper webscraper
Last synced: 07 Jul 2025
https://github.com/cainmi/data-page-project
A repository to pull code and files from; it may also be used to store page data links, code, etc. Mainly used for Python for now.
data html javascript python schema
Last synced: 21 Oct 2025
https://github.com/desininja/data-engineer-interview-questions
This repository contains all the Data Engineer Interview Questions asked by interviewers.
data data-engineer-interview-questions
Last synced: 31 Mar 2025
https://github.com/bredalis/datastructure
📚 Data Structures in Python
algorithms data data-structure python
Last synced: 12 Apr 2026
https://github.com/stdlib-js/ndarray-base-to-reversed
Return a new ndarray where the order of elements of an input ndarray is reversed along each dimension.
base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure to-reversed types vector view
Last synced: 12 Apr 2026
https://github.com/stdlib-js/array-float32
Float32Array.
array data float float32 float32array ieee754 javascript node node-js nodejs single single-precision stdlib structure typed typed-array types
Last synced: 14 Jan 2026
https://github.com/agavitalis/sample-c-codes
A collection of small projects I carried out on Arduino as an electronic engineering student, despite falling in love with website development.
ageteller atm binary data gpcalculator logging
Last synced: 09 Apr 2025
https://github.com/devlive-community/mockaroo
A lightweight HTTP mock server for quickly building mock data APIs, suited to front-end and back-end development and API testing.
Last synced: 08 Jul 2025
https://github.com/himel-sarder/web-scraping-it-jobs-dataset
This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.
data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping
Last synced: 22 Feb 2026
https://github.com/jigyasag18/gold-price-prediction-project-using-machine-learning
This repository contains a machine learning project focused on predicting gold prices (GLD) using historical stock market data, including indicators such as SPX, USO, SLV, and EUR/USD. The project implements a Random Forest Regressor for accurate price forecasting, complete with data visualization, correlation analysis, and model evaluation metrics
data dataset jupyter-notebook jupyter-notebooks machine-learning machinelearing machinelearningalgorithms machinelearningmodel machinelearningprojects matplotlib mlproject numpy pandas randomforestregressor seaborn
Last synced: 23 Jul 2025
https://github.com/jaldekoa/fdicapi
A Python wrapper to easily retrieve data from the BankFind Suite official API from FDIC in pandas format.
api api-wrapper banking data finance pandas python united-states
Last synced: 07 Jan 2026
https://github.com/san089/black-friday-sales-analysis
This project gives insight into a few statistics related to Black Friday sales.
custom data dataanalysis insights sales statistics
Last synced: 13 Jul 2025
https://github.com/nikhilash45/live_ipl_report
This repository hosts the source code for an interactive IPL (Indian Premier League) Dashboard built using PowerBI. The dashboard provides real-time updates on ongoing matches, including live scores, batting and bowling statistics for both teams, and the points table.
analysts cleaning-data cricket-data dashboard data data-analysis data-visualization dax powerbi
Last synced: 19 Mar 2026
https://github.com/danreynolds/data_batcher
Data batcher batches and de-dupes data fetched in the same task of the event loop.
batching data flutter hacktoberfest
Last synced: 19 May 2026
https://github.com/luminati-io/Crunchbase-dataset-samples
A sample of 1001 Crunchbase companies with key data points, extracted using the Bright Data API.
crunchbase crunchbase-api crunchbase-scraper data database datasets webscraper-api webscraping
Last synced: 09 Apr 2025
https://github.com/cintia0528/data_cleaning_and_analytics-python
Evaluate if aggressive discounting benefits Eniac long-term, considering differing views on customer acquisition and brand positioning. Focus on data cleaning for informed decision-making.
colab-notebook data data-analysis datacleaning dataquality jupyter-notebook matplotlib pandas python seaborn
Last synced: 08 Jan 2026
https://github.com/marabesi/d3-visualization
Different visualizations using data and d3.js
charts css d3js data html js json timeline-chart visualization
Last synced: 01 May 2026
https://github.com/rayyan9477/dep
data data-science machine-learning python visualization web-scraping
Last synced: 08 May 2026
https://github.com/neelravi/fairtool
A CLI tool for FAIR processing of computational materials science data.
computational data data-analytics fair management materials physics python science
Last synced: 14 Jan 2026
https://github.com/izaaccoding36/dados-dinamicos
This repository presents a website built with an API for creating charts, reporting on social media usage on a global scale.
api data redes-sociais social-media website
Last synced: 26 Mar 2025
https://github.com/nafisalawalidris/buybuy-e-commerce-company
The BuyBuy E-commerce Company repository is a comprehensive hub for the company's e-commerce platform. It includes source code, documentation, and data analysis insights, providing a data-driven approach to improve customer experience, drive revenue, and inform decision-making.
buybuy cleaning-data company customer-experience data data-analysis decision-making documentation e-commerce excel insights postgresql repository revenue source-code sql
Last synced: 16 Mar 2025
https://github.com/vagnerbellacosa/029_analisededadoscompythonpandas
This lab introduces Pandas, an open source Python library for data analysis. It gives Python the ability to work with spreadsheet-like data, letting you quickly load, manipulate, and combine data, among other functions.
data digital-innovation-one dio jupiter-notebook labs ms-excel panda python
Last synced: 14 May 2026
https://github.com/gagolews/clustering-results-v1
A framework for benchmarking clustering algorithms – Benchmark results (for version 1 of the Suite)
benchmark benchmark-datasets clustering data dataset datasets machine-learning
Last synced: 16 Mar 2025
https://github.com/robertopatino1/oscars2023_data_analysis
A deep data science analysis involving tweets regarding the upcoming Academy Awards
data data-analysis-python data-science data-visualization html jupyter-notebook lda-model machine-learning python trends tweepy twitter
Last synced: 24 Apr 2026
https://github.com/programmer-rd-ai/library-management-system-oraclesql
The Library Management System project, part of the CI6320 Advanced Data Modelling coursework, features comprehensive SQL scripts utilizing OracleSQL to facilitate efficient data modeling and management.
adm advanced ci6320 cw data icw library management modelling oracle oraclesql report sql system
Last synced: 29 Oct 2025
https://github.com/programmer-rd-ai/moviedatascraper
Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!
beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web
Last synced: 01 Mar 2025
https://github.com/sandipbera35/blogapp.spring.boot
A proof-of-concept blog application in Java Spring Boot and Spring Data JPA, with MySQL and MinIO object storage. It integrates with a JWT authservice project (written in Go).
data java jpa jpa-entity-manager jpa-hibernate mysql mysql-server postman postmanapi spring-boot
Last synced: 13 Apr 2026
https://github.com/aiwithqasim/competitive-programming
I will add all the material I have worked through, or will work through in the future, to sharpen my programming skills and become a competitive programmer.
c-plus-plus code data java programming structured-data
Last synced: 20 May 2026
https://github.com/vincentlaucsb/csv-data
A curated repository of real and fake CSV data for use in testing suites
Last synced: 08 Mar 2026
https://github.com/s-raza/csvio
Wrapper for conveniently processing CSV files
csv data file processing wrapper
Last synced: 14 Jan 2026
https://github.com/avto-dev/static-references-data
Data for static references
Last synced: 05 Oct 2025
https://github.com/dixslyf/nbparts
Unpack a Jupyter notebook into its sources, outputs and metadata.
data haskell jupyter jupyter-notebook nix nix-flake
Last synced: 05 Oct 2025
https://github.com/helins/ex.clj
Java exceptions as clojure data
clojure data exception java java-exceptions
Last synced: 12 Dec 2025
https://github.com/jahilldev/immutable-parsejs
Parse a JS object or array/map into an Immutable collection. Makes use of ImmutableJs List, and Record primitives.
data immutablejs javascript json nodejs parse typescript
Last synced: 13 Apr 2026
https://github.com/hyperversal-blocks/averveil
Averveil is OpenSea for Data.
blockchain data golang iot privacy zero-knowledge zkp
Last synced: 14 Jan 2026
https://github.com/joocer/data_expectations
Are your data meeting your expectations?
data data-engineering data-quality data-science data-unit-tests observability pipelines quality validation
Last synced: 07 Oct 2025
https://github.com/ahmad-ali-rafique/comment-generation-tool
This repository hosts a Jupyter Notebook-based Comment Generation Tool exploring advanced NLP techniques for automated, contextually relevant comment generation from input data. Ideal for developers and researchers in NLP and automated text generation.
ai aitools artificial-intelligence content-based-recommendation data datascience jupyter-notebook machine-learning
Last synced: 07 Oct 2025
https://github.com/tushar2704/insurance-cross-sell
This project harnesses the power of cutting-edge technologies including H2O AutoML, MLflow, FastAPI, and Streamlit to enhance cross-selling campaigns and boost efficiency.
data datascience h20automl machine-learning mlflow python streamlit-tushar2704
Last synced: 08 Oct 2025
https://github.com/nikoshet/rust-dms-cdc-operator
The rust-dms-cdc-operator is a Rust-based utility for comparing the state of a list of tables in an Amazon RDS database with data stored in Parquet files on Amazon S3, particularly useful for change data capture (CDC) scenarios.
aws cdc data dms parquet pgdatadiff polars postgres rds rust s3 validation
Last synced: 18 Jan 2026
https://github.com/mewmix/drivehound
magic file signatures + python drive recovery magic
data disk file-signatures harddrive python recovery recovery-tool
Last synced: 08 Oct 2025
https://github.com/varun-khorgade/sentimentscope-e-commerce-review-analyzer
Analyzed customer reviews and purchase data to extract sentiment and behavioral insights. Built SQL-based ETL for data preparation and visualized results using Python and Power BI dashboards for actionable business decisions.
analytics customer-beheviour dashboard data data-visualization dataextraction natural-language-processing nlp pandas powerbi python sentiment-analysis sql textblob
Last synced: 17 Apr 2026
https://github.com/definetlynotai/vulnscan_data
Logicytics VulnScan Module's Training Data and old model archive
ai data logicytics ml models pytorch sensitive-files text-processing tfidf-text-analysis training-data
Last synced: 11 Oct 2025
https://github.com/strata/data
Tools to help you read data from a range of different data providers.
Last synced: 27 Jan 2026
https://github.com/stdlib-js/array-base-assert-is-real-floating-point-data-type
Test if an input value is a supported array real-valued floating-point data type.
array assert base check data dtype is javascript node node-js nodejs stdlib test types util utilities utility utils valid validate
Last synced: 12 Oct 2025
https://github.com/mccarthy-m-g/alda
An R data package for the book "Applied longitudinal data analysis: Modeling change and event occurrence" by Singer and Willett (2003).
data growth-curves longitudinal-data mixed-models nonlinear-mixed-models r r-package structural-equation-modeling survival-analysis time-to-event
Last synced: 19 Jan 2026
https://github.com/saroshfarhan/kaggle-playground-s4e12
Kaggle competition first attempt
analytics data data-analysis-python data-science
Last synced: 12 Oct 2025
https://github.com/rohancyberops/r-language
R Language Projects directory. This repository contains various projects, scripts, and experiments developed using R, a powerful statistical computing and data visualization language.
caret cran data dplyr ggplot2 rlanguage rstudio shiny tidyverse
Last synced: 12 Oct 2025
https://github.com/iamgmujtaba/github-python-daily-trending
This repository provides an automated, daily-updated list of the top trending Python repositories on GitHub. Using a GitHub Actions workflow, it scrapes data from GitHub's trending page, sorts the results by total stars, and generates a clean, well-structured README file
data data-scraping github-actions tranding tranding-bot
Last synced: 13 Oct 2025
https://github.com/connectaman/deepseek-ocr-multigpu-infer
Efficient multi-GPU OCR inference framework leveraging parallel processes for accelerated token throughput and faster batch processing. Designed for scalable, high-performance optical character recognition workloads using PyTorch. Supports dynamic GPU assignment, optimized resource utilization, and easy integration for large-scale image datasets.
agentic-extraction data deepseek document-parser extraction extractor gpu image-parser llm multigpu nvidia ocr parallel-computing parser pdf-parser vlm
Last synced: 22 Jan 2026
https://github.com/geocollections/turvas
Database of peat geology
data data-visualization database estonia geology mineral-resources peat
Last synced: 05 Feb 2026
https://github.com/athul64/powerbi
Financial Reports Dashboard. This repository showcases a Financial Reporting Dashboard that visualizes key financial metrics and performance insights. The dashboard contains Monthly and Annual reports, allowing users to switch between the two views to analyze data at different intervals.
data data-an data-visualization dax dax-expression powerbi
Last synced: 23 Feb 2026
https://github.com/lisakey/datacamp-data-analyst-python-sql-projects
Several projects completed during my Data Analyst 📊 training on the DataCamp platform with Python 🐍 and SQL 🗃️. Each project addresses real-world challenges using modern analytical tools and techniques.
analysis cleaning-data data dataanalysis dataanalyst matplotlib pandas python seaborn sql transformation visuali
Last synced: 19 Apr 2026
https://github.com/lahcenezzara/whatsapp-scraping-python
WhatsApp Scraping Python
automation data python scraping selenium whatsapp
Last synced: 05 Feb 2026
https://github.com/stdlib-js/ndarray-base-dtypes2signatures
Transform a list of array argument data types into a list of signatures.
api array base data dtype dtypes interface javascript multidimensional ndarray node node-js nodejs sig signatures stdlib types utilities utility utils
Last synced: 14 Apr 2026
https://github.com/morphaxthedeveloper/yokatlas-dataset-2025
Detailed YÖK Atlas data on universities, departments, scores, etc.
data database liste scrape universite veri yok-atlas yok-atlas-api yok-atlas-data yokatlas yokatlas-crawler yokatlas-data
Last synced: 14 Oct 2025
https://github.com/ibilalkayy/covid-tracking-app
This repository contains the code of a COVID tracking app that shows COVID-19 data on Google Maps.
Last synced: 14 Oct 2025
https://github.com/nnavales/desafios-data-engineer
In this project we tackle common challenges in the Data Engineer role using modern technologies.
data data-engineering database dataengineering docker minio scrapping spark
Last synced: 01 Jun 2026
https://github.com/kledenai/jsonweaver
A powerful and easy-to-use library for transforming JSON data into popular formats such as CSV, XML, Markdown tables, YAML, and JSONLines (NDJSON).
csv data data-transform format json jsonlines jsonweaver markdown markdown-tables xml yaml
Last synced: 24 Feb 2026
https://github.com/sanskaryo/ultimate-dsa-repo
One Stop Solution for DSA Learning and Resources
data data-structures-and-algorithms dsa hacktoberfest hacktoberfest-accepted hacktoberfest2025
Last synced: 15 Oct 2025
https://github.com/nxank4/loclean
⚡️ The All-in-One Local AI Data Cleaning Library. No GPU or API keys required.
automated-cleaning data data-cleaning data-engineering data-preprocessing data-science data-wrangling etl llm normalization open-source polars privacy-preserving python semantic-analysis slm structured-data
Last synced: 22 Jan 2026
https://github.com/potreic/etl-fashion-trend-analysis
✨ Automate fashion trend analysis with Apache Airflow! Extract data from X & Pinterest, transform into insights, and load into PostgreSQL. Predict seasonal styles & visualize trends. 💃📊
airflow airflow-dags data data-engineering etl etl-automation etl-pipeline fashion-trends
Last synced: 27 Jan 2026
https://github.com/gematik/poc-isik-patient-merge
The repository contains a proof of concept (POC). The POC demonstrates how a FHIR subscription can be used to notify about merges that have occurred within the ISIK context.
Last synced: 19 Oct 2025
https://github.com/divithraju/divith-aju-hadoop-pyspark-pipeline
This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.
apache-hadoop-framework apache-spark bigdata client data database dataengineering dataingestionframework datapreprocessing documentation ecommerce-platform hdfs pipeline project project-repository pyspark python3 software-engineering
Last synced: 27 Jan 2026
https://github.com/marcelo-earth/H5N8-Data
🔢🦠 Confirmed cases of H5N8 in humans - Feel free to open Pull Requests with new data.
csv data h5n8 h5n8-cases h5n8-virus russia
Last synced: 20 Oct 2025