data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-07-02 00:07:45 UTC
https://github.com/a3r0id/lightshot-data-miner
A random idea I had a while back to make a data miner for Lightshot. I never released it, but after a friend sent me a post about Lightshot's transparency I figured it was a good time to do so. I've included some output from a run made before creating the repo. I am not responsible for the imagery or its contents.
brute-force bruteforce data dataset face-recognition image-processing lightshot mining scraper scraping text-recognition
Last synced: 19 Oct 2025
https://github.com/assem-elqersh/creativa-data-science-bootcamp
Jupyter notebooks from the Creativa Data Science Bootcamp, covering key data science concepts and practices across multiple sessions, from data preprocessing to model building and time series analysis.
data data-science eda exploratory-data-analysis machine-learning pandas time-series-analysis xgboost xgboost-classifier
Last synced: 03 May 2026
https://github.com/askaniy/celestialocationsmaker
Tool for making Celestia location files
celestia data geology locations mapping planetary-science space
Last synced: 14 Mar 2025
https://github.com/stdlib-js/array-uint16
Uint16Array.
array data int integer javascript node node-js nodejs short stdlib structure typed typed-array types uint uint16 uint16array unsigned
Last synced: 22 Apr 2025
https://github.com/yashmistry-24/ytcomment-iq
YTComment-IQ is a web app for analyzing and visualizing YouTube comments, offering insights through sentiment analysis, topic modeling, and interactive charts.
analysis comments data dataanalysis dataanalytics deep-learning machine-learning nlp python streamlit training visualization webapp youtube
Last synced: 15 Feb 2026
https://github.com/sdhutchins/jxn-open-data-api
Access Jackson, MS open government data using a Python API wrapper.
api data jackson jxn mississippi open-gov
Last synced: 08 Apr 2025
https://github.com/stdlib-js/array-ones
Create an array filled with ones and having a specified length.
array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector
Last synced: 09 Apr 2025
https://github.com/parimala24-ds/datascientistmlinterviewprep24
DATA SCIENTIST ML INTERVIEW PREP24
data decisiontree interviewquestions linear-regression logistic machine-learning matplotlib numpy pandas python seaborn sklearn
Last synced: 12 Apr 2025
https://github.com/mskian/tamil-words
Tamil word collections with English meanings - API and SQL data.
api data javascript json json-api mysql pdo php sql tamil tamil-language tamil-sms tamilwords translate translator
Last synced: 14 Apr 2026
https://github.com/mawburn/across-a-thousand-dead-worlds-data
Across a Thousand Dead Worlds Data
Last synced: 21 Apr 2026
https://github.com/jongirard/unique_names_generator
A Unique Names Generator built in Elixir
data data-generator elixir elixir-lang fake-data name-generator phoenix seed
Last synced: 21 Oct 2025
https://github.com/gonzalezlrjesus/covid-19API
Converts the data provided by the Johns Hopkins University Center from CSV format to JSON format, covering confirmed, deceased, and recovered COVID-19 cases by country.
api api-rest api-server coronavirus covid-19 data go golang json
Last synced: 06 May 2025
https://github.com/sapienzanlp/exploring-srl
Repository for the paper "Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities"
acl acl2023 conllu data dataset natural-language-processing nlp semantic-role-labeling srl
Last synced: 31 Jan 2026
https://github.com/tusharnankani/analysis-2.0
An Exhaustive WhatsApp Chat Data Analysis 2.0
analysis data data-science plots trends visualization
Last synced: 31 Mar 2025
https://github.com/yashika-malhotra/cardioflex-treadmill-analysis-using-descriptive-statistics-probability
Descriptive analysis and visualization of CardioFlex Treadmill data to provide insights and recommendations for growing their user base.
colab-notebook data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn
Last synced: 12 Apr 2026
https://github.com/infinitode/pwlds
A public dataset of over 10 million passwords, with assigned strength levels.
ai classes classification cyber-security data dataset ml open-source password passwords synthetic-data
Last synced: 22 Feb 2026
https://github.com/aravind-selvam/bikeshare-company-analysis
Capstone project for the Google Data Analytics Professional Certificate program, analyzing a bike-sharing company.
analytics business-analytics business-intelligence data data-analysis data-visualization dataanalytics google-data-analytics postgresql sql sql-server
Last synced: 22 Apr 2026
https://github.com/speakeasy-sdks/fivetran-python-sdk
Python SDK for accessing Fivetran API.
api connector data fivetran fivetran-connector python sdk
Last synced: 01 Jul 2025
https://github.com/amazingtest/data4test
Test data construction generator; you can get useful data here for software testing.
data test-automation testdata testdatabuilder testing testing-tools
Last synced: 16 Jan 2026
https://github.com/swarchal/morar
Processing phenotypic screening data
biology data data-analysis drug-discovery hts phenotypic
Last synced: 19 Jun 2025
https://github.com/cobluestars/dataherd-raika
"Dataherd-Raika is a library designed to simulate large-scale user behavior datasets. It takes a single user event (like a click or keyword input) and, by applying simple probability distributions and custom variables, expands it into a vast dataset."
big-data data data-generation data-generator data-science front-end javascript machine-learning npm-package simulator statistics typescript user-behavior user-experience
Last synced: 02 Jan 2026
https://github.com/alexandregazagnes/ghisa
ghisa (GitHub Import Statistic Analyzer) is a free and open-source app and Python package that helps you analyze the import statistics of your GitHub repositories.
analytics data dependencies git github github-api import package pypi python skills tool
Last synced: 27 Jun 2025
https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform
This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.
bigquery data data-analysis data-modeling live-streaming sql
Last synced: 23 Jun 2025
https://github.com/vulcalien/vulcdataformat
Simple data storage system for Java.
data data-storage java serialization
Last synced: 25 Feb 2025
https://github.com/e-kotov/mapineqr
Access Mapineq inequality indicators via API
data demogrpahy r rstats socio-economic-indicators
Last synced: 06 Apr 2025
https://github.com/tobinchilongo/oop-school-library
This project consists of a Ruby script for the school library app. I implemented encapsulation and inheritance in Ruby by creating classes to represent students and teachers in the school.
data database gemfile input-output preserve rspec-testing rubocop unit-test
Last synced: 02 May 2026
https://github.com/maxnowack/elastic-sync
Connector to sync MongoDB documents into an Elasticsearch index
data elasticsearch mongodb sync
Last synced: 20 Jan 2026
https://github.com/stefanpietrusky/facts
Repository for the article in the online magazine Data Science Collective.
ai arxiv-papers beautifulsoup data flask-application gensim llama matplotlib ollama plotly pyldavis python selenium webdriver
Last synced: 09 May 2026
https://github.com/davidgamero/gatech-covid-chart
Line chart showing COVID19 cases per day at Georgia Tech
Last synced: 28 Oct 2025
https://github.com/tushar2704/interview-quest
Interview-Quest is a comprehensive collection of interview questions and answers that can help you prepare for technical interviews. Whether you're a seasoned developer looking to brush up on your skills or a job seeker preparing for your next big opportunity, this repository aims to provide valuable resources to enhance your interview readiness.
artificial-intelligence data data-science interview interview-questions machine-learning
Last synced: 23 Jan 2026
https://github.com/dhimmel/erc
Processing human Evolutionary Rate Covariation data
data erc evolution evolutionary-rate-covariation genes hetionet human rephetio
Last synced: 23 Jul 2025
https://github.com/tbrowder/classfactory
Provides tools to create a data collection with classes to manipulate the persistent data.
Last synced: 04 Apr 2025
https://github.com/jebin1999/livestock-production-monitoring-
Livestock production Monitoring
data datascience livestock livestock-monitor r shiny shiny-apps shiny-r shinydashboard
Last synced: 05 Nov 2025
https://github.com/real-veersandhu/cia-country-comparison
Data analysis system on the CIA World Factbook
Last synced: 25 Feb 2025
https://github.com/thomd/git-scrape-hacker-news
Scrape Hacker News metadata for data analysis
data data-science git-scraping hacker-news
Last synced: 16 Sep 2025
https://github.com/qeeqbox/data-lifecycle-management
Data Lifecycle Management (DLM) is a policy-based model for managing data in an organization
data data-lifecycle-management infosecsimplified lifecycle management qeeqbox
Last synced: 07 Mar 2026
https://github.com/discindo/natochak
Analysis of bicycle accidents in Macedonia using Rmarkdown and ggplot2
Last synced: 19 Feb 2026
https://github.com/lakecountryhuntclub/dnr-map-data-model
Data Model for the 2023 DNR Pheasant Stocking Property Data
data data-model documentation excel gis hunting mapping powerquery vba
Last synced: 29 Jul 2025
https://github.com/gappeah/london-housing-price-dashboard
This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.
data data-analysis data-visualization excel visual
Last synced: 31 Jul 2025
https://github.com/dannyben/datamix
DSL for manipulating tabular data
csv data data-analysis data-engineering gem ruby tabular-data
Last synced: 31 Jul 2025
https://github.com/elhariri78/case-study-a-better-smoker-detector
Case Study-A better Smoker Detector
data dataframe evaluation kaggle matplotlib-pyplot numpy pandas pandas-dataframe pandas-python python3 seaborn sklearn
Last synced: 07 Apr 2026
https://github.com/tonykipkemboi/ens_subgraph_data
Query On-Chain Data from Subgraphs by The Graph Protocol using Python
data subgraphs thegraphprotocol web3
Last synced: 17 Sep 2025
https://github.com/undistraction/grid-model
A small API for creating a grid and accessing the positions of the cells, rows and columns within it.
2d calculations cells data grid layout model
Last synced: 04 Aug 2025
https://github.com/woctezuma/download-steam-screenshots-data
Data consisting of Steam screenshots.
Last synced: 19 Feb 2026
https://github.com/vikjam/ui-policy
Unemployment policy at the state level
data government government-data
Last synced: 13 Feb 2026
https://github.com/qeeqbox/data-classification
Data classification defines and categorizes data according to its type, sensitivity, and value
classification data data-classification infosecsimplified qeeqbox
Last synced: 09 Mar 2026
https://github.com/davidteather/scrape-crossfit-gyms
Scrapes crossfit gym data
cross-fit crossfit data data-scraping python python-requests python3 scraping
Last synced: 13 Aug 2025
https://github.com/stdlib-js/array-one-to-like
Generate a linearly spaced numeric array whose elements increment by 1 starting from one and having the same length and data type as a provided input array.
array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector
Last synced: 20 Feb 2026
https://github.com/pradeep221b/turbofan_predictive_maintenance
An R project for predicting turbofan engine RUL using {targets} and {tidymodels}.
data data-science-portfolio machine-learning nasa preditive-maintaince r rstats targets-pipeline tidymodels
Last synced: 04 Oct 2025
https://github.com/freddy03h/immutable-data-structure
Normalize and Merge your application's data store using Immutable.JS objects
Last synced: 05 Oct 2025
https://github.com/vincentneo/sgtidetimings
Scraped SG NEA tide timings table into machine-readable JSON files!
data github-actions github-pages gov html-tables-to-json javascript json nodejs sg singapore singapore-data-analysis tide webscraping
Last synced: 10 Apr 2026
https://github.com/carlotta94c/sql4datascientistsdemo
Demo material for Microsoft Reactor session "Getting Started with Databases: SQL and Data Visualizations"
analysis data r sqlite tidyverse visualisation
Last synced: 18 Apr 2026
https://github.com/labwhatever/leetcode
Collection of LeetCode questions to ace the coding interview!
data data-structures-and-algorithms dsa leetcode-cpp leetcode-solutions structure structure-learning
Last synced: 22 Aug 2025
https://github.com/aymane-maghouti/mobile-data-hive-insights
This project demonstrates the process of extracting data from a MySQL database, transferring it using Apache Sqoop, storing it in a Hive data warehouse (the data is actually stored in the Hadoop Distributed File System (HDFS)), and analyzing it using Hive Query Language (HiveQL, a language close to SQL). The data is then visualized in Power BI.
apache-sqoop data data-integration data-visualization hadoop-hdfs hivedb hiveql powerbi
Last synced: 09 Mar 2026
https://github.com/stdlib-js/array-filled-by
Create a filled array according to a provided callback function.
array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector
Last synced: 09 Mar 2026
https://github.com/tatey/list_of_baby_names
A list of baby names given to tiny humans in Ruby
Last synced: 11 Nov 2025
https://github.com/marcelo-earth/h5n8-data
🔢🦠 Confirmed cases of H5N8 in humans - Feel free to open Pull Requests with new data.
csv data h5n8 h5n8-cases h5n8-virus russia
Last synced: 19 Jan 2026
https://github.com/jackokring/www
Generic www flask server with phinka module
compression data flask phinka python
Last synced: 16 Jan 2026
https://github.com/snimmagadda1/stack-exchange-dump-to-mysql
Batch pipeline to import Stack Exchange XML data dumps to relational DB
batch data mysql spring-batch stackoverflow
Last synced: 30 Mar 2025
https://github.com/desmondsanctity/abeona-kafka
A demo showing how to integrate Upstash's serverless Kafka into a Node.js microservice. Presented at Berlin Buzzwords 2024.
berlin-buzzwords data event-driven kafka microservice serverless streaming upstash-kafka
Last synced: 15 May 2025
https://github.com/bredalis/exceptions
Examples of exceptions 🚫
algotithms coding data exceptions language-programing python
Last synced: 04 Mar 2025
https://github.com/husna-poyraz/titanic-machine-learning
Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.
data data-analysis data-science data-visualization deep-learning machine-learning missing-data outlier-detection python titanic
Last synced: 10 May 2026
https://github.com/stdlib-js/array-one-to
Generate a linearly spaced numeric array whose elements increment by 1 starting from one.
array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector
Last synced: 26 Feb 2026
https://github.com/neelravi/data-management
A data management plan for computational chemists/physicists and material scientists for a FAIR storage of raw data
data dmp fair management workflows
Last synced: 16 Jan 2026
https://github.com/azeemmirza/structures
Structures Applied
data data-structures javascript typescript
Last synced: 14 Feb 2026
https://github.com/codenoid/webtoons.com-database
A Webtoons.com database, collected by Hofesh Bot (scraper)
Last synced: 28 Mar 2025
https://github.com/castdrian/kdapi
A TypeScript library that scrapes K-pop idol and group information from online sources to create comprehensive JSON datasets.
api data kpop scraper typescript
Last synced: 15 May 2025
https://github.com/stdlib-js/datasets-herndon-venus-semidiameters
Fifteen observations of the vertical semidiameter of Venus, made by Lieutenant Herndon, with the meridian circle at Washington, in the year 1846.
astronomy data dataset datasets grubbs herndon javascript node node-js nodejs outlier outliers sample statistics stats stdlib venus
Last synced: 09 Oct 2025
https://github.com/nouman6093/advanced-statistical-models
In this repository I will upload everything I have learned about advanced statistical models in data science. There are over 42 statistical models, each of which works on algorithms, and there are over 32 algorithms. Each library has its own way of writing such statistical models. After learning, I will try to upload as many statistical models as possibl
data data-analysis data-science data-visualization
Last synced: 11 Jun 2026
https://github.com/ilejuxepwaduzd/structured-data-extractor
🛠️ Extract structured data from messy texts using Chain-of-Thought prompting to improve processing of customer support and technical issues.
cdp chrome-fetcher data document-extraction ecommerce golang-library headless metadata-extraction ocr open-source pdf pdf-converter pdf-extractor ruby scraper shopify spider structured-data
Last synced: 10 Apr 2026
https://github.com/makepath/medaprep
medaprep is a data preparation and feature engineering toolkit for geospatial applications.
data data-science datacleaning eda exploratory-data-analysis xarray
Last synced: 29 Jun 2025
https://github.com/ndohvich/ndohvich
I am a big fan of data analysis with Python
anaconda arduino data github jypyter keras machine-learning machine-learning-algorithms numpy pandas python scikit-learn sql tensorflow visual-studio-code visualization-dashboard
Last synced: 11 Apr 2026
https://github.com/mews-labs/dataframe-memory
This tool aims to provide a simple way to save memory when using pandas DataFrames.
data data-science memory-usage pandas-dataframe python3
Last synced: 22 May 2026
https://github.com/brianali-codes/github-searcher
A website for API experimentation that uses the GitHub API to search for users and some of their (public) information
Last synced: 21 May 2026
https://github.com/vatshayan/list-of-animals-data-classification-
Classification and visualization of the List of Animals dataset using a machine learning algorithm
animal-behavior animal-data animals artificial-intelligence classification data data-analysis data-mining data-science data-visualization dataset jupyter-notebook machine-learning python supervised-learning
Last synced: 17 May 2026
https://github.com/connectomicslab/cmtklib-data
Datalad dataset that stores all data resources of the cmtklib module of Connectome Mapper 3 (https://github.com/connectomicslab/connectomemapper3).
brain data parcellation resources software
Last synced: 16 Jan 2026
https://github.com/yukti-09/extracting-data-from-twitter
Data From Twitter!
data data-mining extracting-data timeline tweepy tweets twitter
Last synced: 11 Oct 2025
https://github.com/mrbisquit/weathercollector
Open-Source weather station data collector
collector customisable data modular opensource weather weather-forecast weather-station
Last synced: 16 Jan 2026
https://github.com/mitevpi/vue-d3-bar-chart
Reusable, reactive, animated bar chart using D3 + Vue.js. Written in idiomatic Vue, rather than D3 syntax.
d3 data data-visualization frontend interactive svg vue web
Last synced: 18 May 2026
https://github.com/ucd-cws/nitrates-cv
california centralvalley data frep groundwater model nitrates
Last synced: 16 Jan 2026
https://github.com/astrid-project/cb-manager
APIs to interact with the Context Broker's database. Through a REST Interface, it exposes data and events stored in the internal storage system in a structured way. It provides uniform access to the capabilities of monitoring agents.
agent beats control data ebpf elasticsearch log logstash management programmability security
Last synced: 30 Jun 2025
https://github.com/mohsinali08000/myportfolio
I’m Mohsin Ali, a passionate software engineer with over 2 years of experience in developing robust software solutions. Currently transitioning into the field of data science.
Last synced: 22 Apr 2026
https://github.com/kingabzpro/makefile-actions
GitHub Actions and MakeFile tutorial and project for beginners.
actions analytics automation data data-science makefile
Last synced: 18 Apr 2026
https://github.com/ahmadjamil888/facial-recognition-ai-model
A facial recognition AI model powered by a CNN and trained on thousands of images.
ai cnn data data-science facial facial-recognition recognition
Last synced: 30 Jun 2025
https://github.com/emnetdegafe/allesoverfilm-backend
AllesOverFilm-backend is part of the AllesOverFilm mobile app development project and contains the database structure, server query scripts, and Sequelize-cli database structures.
backend data data-model express postgresql sequelize-cli
Last synced: 11 Apr 2026
https://github.com/spine-tools/metreload
Python application for downloading meteorological reanalysis data
Last synced: 01 Jul 2025
https://github.com/cosmos-loops/cosmos-dapper
Cosmos.Dapper is part of Cosmos.Data, an inline project of COSMOS LOOPS PROGRAMME. This repository provides a package of StackExchange.Dapper to improve development efficiency.
dapper data mysql mysqlconnector oracle postgresql sql-query sqlite sqlkata sqlserver
Last synced: 11 Apr 2026
https://github.com/cintia0528/data_analytics_and_visualization-sql_tableau
Evaluate Magist as a strategic partner for Eniac's Brazilian expansion. Use SQL to analyze growth, tech accessory sales potential, delivery times, and customer satisfaction in Magist's database.
data dataanalysis datavisualization sql strategy tableau
Last synced: 31 Mar 2025
https://github.com/tsvikas/covid-19-israel-data
Unofficial GitHub repository with the data published by the Israel Ministry of Health regarding the coronavirus disease
coronavirus-disease covid-19 csv daily-reports data health israel
Last synced: 05 Jan 2026