data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum,
- Last updated: 2026-07-02 00:07:45 UTC
- JSON Representation
https://github.com/shreshthvashisht/instgram-user-analytics
SQL Fundamentals
data data-analysis data-science mysql social-network-analysis
Last synced: 09 Jun 2026
https://github.com/abhibisht89/data-visualization
data matplotlib pandas ploty python visualization
Last synced: 06 May 2026
https://github.com/ksm26/ml-ai-data-science-jobs-in-canada
Explore the latest machine learning, artificial intelligence, and data science job opportunities in Canada. Stay informed about Canadian tech job market trends and find your next career move.
ai-canada ai-careers canada canadian-tech-companies canadian-tech-job-market data data-analysis data-engineering data-science data-science-careers machine-learning prompt-engineering robotics
Last synced: 06 May 2026
https://github.com/parthds02/analyzing-student-success-with-data
Discover key factors influencing student performance through data analysis and visualization. Explore gender, parental education, sports, and ethnicity impacts.
data datascience jupyter-notebook kaggle python pythonlibraries
Last synced: 06 May 2026
https://github.com/tadiusfrank2001/pythonprojects
Compilation of Some Fun Introduction to Python Lab Coding Projects introducing the foundamentals of data science, databases, and pythonlibraries
data data-science databases gamedesign python pythonlibrarires sorting-algorithms sqlite string-manipulation
Last synced: 06 May 2026
https://github.com/jbn/vaquero
A Python library for iterative and interactive data wrangling at laptop-scale.
data data-analysis data-cleaning data-mining dirty-data elt etl etl-framework
Last synced: 10 Jun 2026
https://github.com/ralzz/dibimbing_datascience
This project contains an Exploratory Data Analysis (EDA) of the Estonia Passenger List dataset. I handled missing values, removed duplicate data, and created basic visualizations to find insights.
data data-science eda google-colab kaggle pandas python
Last synced: 06 May 2026
https://github.com/fabsdevx/file-format-converter-handout
Data Engineering project for learning purposes. Credits to itversity
csv csv-import data data-engineering database pandas python
Last synced: 06 May 2026
https://github.com/ashleydavis/brisjs-web-scraping-talk
Code to accompany my talk on web scraping for the Brisbane JavaScript meeting in September 2018
cheerio data data-acquisition data-acquisiton electron headless-browsers javascript nightmare nightmarejs nodejs web-scraping
Last synced: 06 May 2026
https://github.com/dakshdeephere/bank_eda-practice
EDA analysis of Bank.csv dataset
analysis data data-visualization dataanalysis matplotlib numpy pandas python3 seaborn
Last synced: 07 May 2026
https://github.com/shantanujpk/bigdatacloud
Exploration of PySpark for data processing and interview prep — demonstrates handling corrupted records, applying transformations/actions, and building efficient data pipelines with practical examples.
big-data data jupyter-notebook pipeline pyspark python spark sparksql
Last synced: 07 May 2026
https://github.com/whis99/data_analysis_journey
A repositories of my data analysis projects.
data data-analysis data-analysis-python data-visualization dataset jupyter-notebook matplotlib python visualization
Last synced: 07 May 2026
https://github.com/lab5e/loadabledata
Simple framework-agnostic wrapper around loadable data to help encapsulate and use state changes in a UI.
async data loadable state typescript ui
Last synced: 07 May 2026
https://github.com/pocketfullofdata/electric-vehicles-market-size-analysis
This project analyzes the growth, adoption trends, and future projections of the electric vehicle (EV) market. Using data analysis and visualization techniques, it examines key factors like sales trends, and consumer adoption to understand the evolving landscape of the EV industry.
analysis data jupyter-notebook matplotlib numpy python seaborn vscode
Last synced: 07 May 2026
https://github.com/hudson-newey/data-miner
A simple data miner that collects information from an API and stores it in a file
api api-client big-data bigdata data logger logging
Last synced: 10 Jun 2026
https://github.com/chardos/get-git-data
Access git repository data in node.
Last synced: 07 May 2026
https://github.com/tjas/postgrad-ai-ddv-plotly
Jupyter Notebook to analyze the salaries of Federal District government public servants, using Python, Pandas and Plotly Express, to solve the proposed exercise in "Data Discovery and Visualization" discipline.
analysis analytics data data-analytics data-discovery data-science data-visualization graph graphs jupyter-notebook jupyter-notebooks pandas plotly plotly-express python
Last synced: 07 May 2026
https://github.com/jigyasag18/iit-guhawati
Empower Sakhi is a data-driven platform that uses machine learning to identify women at risk of domestic violence in India. It offers confidential self-assessments, survivor stories, and emergency resources through a trauma-informed, privacy-focused web app. The project also provides NGOs with actionable insights via Power BI dashboard for support.
aiml data dataset datavisualization domestic-violence eda jupyter-notebook label-encoding machine-learning machine-learning-algorithms machine-learning-models machinelearning machinelearningprojects powerbi python python-app random-forest random-forest-classifier streamlit streamlit-webapp
Last synced: 08 May 2026
https://github.com/kemalcalak/python
computer-vision data data-science fastapi image-processing jupyter-notebook machine-learning python
Last synced: 08 May 2026
https://github.com/aidan-zamfir/the-iliad
Data analysis & relationship network for the characters of Homers Iliad
data data-analysis dataframes networks networkx python selenium spacy webscraping
Last synced: 08 May 2026
https://github.com/blackhatdevx/leetcode
LeetCode Solutions by Jash Gro
algorithm algorithms dart data datastructures datastructures-algorithms dsa java javascript leetcode leetcode-java leetcode-python leetcode-solutions neetcode
Last synced: 08 May 2026
https://github.com/writetome51/page-load-access
A TypeScript/Javascript class that loads a batch (array) of data from a larger set too big to be loaded all at once.
batch class data javascript load loader typescript
Last synced: 16 May 2026
https://github.com/writetome51/public-data-container-interface
Just a TypeScript interface with 1 property: 'data'
container data interface typescript
Last synced: 15 May 2026
https://github.com/miniql/miniql-csv
A MiniQL query resolver that loads data from CSV files.
comma-separated-values csv data query query-language
Last synced: 08 May 2026
https://github.com/taquece/goals-per-match
basic script to calculate average football goals per match from .CSV
beginner csv data football nodejs python sports-analytics
Last synced: 09 May 2026
https://github.com/basemax/okala-product-ids
A PHP script to fetch and save product IDs from Okala's online store API across multiple categories and store branches.
crawler crawler-okala crawler-php crawlers data database ids ir iran json okala okala-crawler php php-crawler product
Last synced: 09 May 2026
https://github.com/caiorss/julia-box-docker
Docker that provides a development environment for Julia language, Octave, Python, R (Rlang) with a Jupyter Notebook; Jupyter QtConsole and so on.
data datascience deveops docker julia jupyter octave python rlang scientific
Last synced: 09 May 2026
https://github.com/pawlo77/nos_snowflake
Network Operating Systems course for DS studies in Winter 2024/25
azure data data-science snowflake snowpark streamlit
Last synced: 09 May 2026
https://github.com/naitiknayak196/tech-layoffs-cleaning-sql-vs-python
This project cleans and analyzes a tech layoffs dataset using MySQL and Python (Pandas) to compare their efficiency in data processing. It provides business insights into workforce trends, industry stability, and economic impacts to support data-driven decision-making.
data datacleaning dataset jyputer-notebook layoffdata layoffs mysql python sql
Last synced: 09 May 2026
https://github.com/mohamedbilal1800/olympic_history_data_analysis
This project delves into the 120 Years of Olympic History: Athletes and Results dataset, analyzing athlete demographics, medal achievements, and country performances across the Summer and Winter Olympics from 1896 to 2016.
analysis data eda matplotlib-pyplot pandas python seaborn visulaization
Last synced: 09 May 2026
https://github.com/xiaomingx/10000-public-apis-and-data
Public APIs are interfaces that allow developers to access various services, features, or data from external systems or platforms.
api-ecosystem api-integration data developer-friendly-apis open-api-access public-api-tools third-party-services
Last synced: 30 Jul 2025
https://github.com/yashkp1234/movie-recommendation-engine
My project on analyzing the movie data set, and creating a recommendation engine using that analysis.
analysis data notebook python recommendation-engine
Last synced: 04 May 2025
https://github.com/baranasoftware/curricular-api
The design and implementation of a REST API for student and course data for a Higher Ed institution.
aws data data-pipeline go golang lambda rest rest-api sqlite3 system-design terraform
Last synced: 09 May 2026
https://github.com/scjoaoantonio/trab_datascience
Este projeto tem como objetivo analisar os posts da rede social Bluesky. A aplicação interativa foi desenvolvida utilizando Streamlit e permite a coleta e visualização de dados, além de oferecer análises avançadas como previsão de engajamento, modelagem de tópicos e análise de sentimentos.
bluesky data data-science streamlit
Last synced: 09 May 2026
https://github.com/thanh-wutan/chess-opening-comparator
Interactive web app using R to visualize and compare chess opening performance and popularity.
chess-openings data databases datavisualisation r
Last synced: 09 May 2026
https://github.com/beeracs/llama
Run Llama models in your web browser using JavaScript and WebAssembly. Explore light and dark modes easily. 🌐🐱👤
ai data fine-tuning framework gpt langchain large-language-models llama3 llamaindex llm lora machine-learning nlp peft qlora qwen rlhf vllm
Last synced: 10 May 2026
https://github.com/aneeshmurali-n/nlp-emotion-classification-in-text
Develop machine learning models to classify emotions in text samples.
bag-of-words data emotion-classification feature-extraction machine-learning naive-bayes natural-language-processing nlp nltk preprocessing python scikit-learn svm text-classification tf-idf tokenizer vectorizer
Last synced: 10 May 2026
https://github.com/adrianoleitedasilva/adrianoleitedasilva
Me chamo Adriano, tenho 35 anos de idade, sendo 18 anos dedicados as áreas de Tecnologia da Informação e Educação.
adrianoleitedasilva automation ceo cio cto data data-science dev diretor github mobile professor python readme techlead web
Last synced: 10 May 2026
https://github.com/notthestallion/pca__3d-and-from-scratch__principal-component-analysis
In this project, I will be implementing Principal Component Analysis (PCA) from scratch on an ecological footprint consummation database for countries and a three-dimensional scale using a movie database. The goal of this project is to gain a deeper understanding of PCA and to demonstrate its capabilities in exploring complex datasets.
data data-science database pca pca-analysis principal-component-analysis principal-component-analysis-pca principle-component-analysis
Last synced: 10 May 2026
https://github.com/infinitode/crsd
A synthetic customer review sentiment dataset for sentiment analysis generated using different AI models.
ai data dataset datasets huggingface-datasets mit-license ml nlp open-source python sentiment sentiment-analysis sentiment-classification text-data
Last synced: 10 Jun 2026
https://github.com/miniql/miniql-json
A MiniQL query resolver that loads data from JSON files.
data json query query-language
Last synced: 11 May 2026
https://github.com/sehaj003/boston-bruins-roster-planning-mysql-nosql
Repository for Data Management project, Boston Bruins Roster Planning using MySQL and NoSQL along with data analysis using Python
data data-management mongodb mysql project-repository python
Last synced: 11 May 2026
https://github.com/amethyst-php/tax
amethyst amethyst-package api data laravel tax
Last synced: 11 May 2026
https://github.com/sakan811/honkai-star-rail-a-few-fun-insights-with-data-analysis
The project gives insights that delve into the Honkai Star Rail's character's stats of all available characters as of the given date.
data data-analysis data-science data-visualization docker flask game honkai honkai-star-rail honkai-starrail seaborn webscraping webscraping-data webscraping-selenium
Last synced: 10 Jun 2026
https://github.com/sebastian-diaz-berdecia/analisis-popularidad-de-series-y-generos-de-series
Consultas SQL para el análisis de la popularidad de series y géneros series de la base de datos NetflixDB.
business-analytics bussiness-intelligence data data-analysis database mysql mysql-database sql
Last synced: 12 May 2026
https://github.com/mateogiuffra/estrd2024s1
trabajos prácticos realizados en la materia Estructura de Datos de la Universidad Nacional de Quilmes (UNQ)
c cpp data data-structures-and-algorithms eficiency functional-programming haskell unq
Last synced: 12 May 2026
https://github.com/vbhatsaccnt/retail-strategy-and-analytics-optimization-of-control-stores-for-sales-enhancement
In this project, we aim to optimize the performance of retail chain stores by establishing control stores based on their performance compared to selected trial stores. By leveraging data analytics and strategic insights, we seek to enhance sales revenue and drive growth within the retail chain.
customer-segmentation data data-science risk-analysis
Last synced: 13 May 2026
https://github.com/miniql/notebook-example
An example of MiniQL in a JavaScript Notebook
comma-separated-values csv data data-analysis data-science graphql javascript notebook query query-language
Last synced: 13 May 2026
https://github.com/cvinicius987/projetos-bigdata
Estudos de caso envolvendo projetos de BigData e Engenharia de Dados.
bigdata data data-engineering spark
Last synced: 13 May 2026
https://github.com/dev88jerry/cs304
Bishop's University - CS304 Data Structures
bishops bu data data-structures python structure university
Last synced: 11 Jun 2026
https://github.com/andygol/andygol.github.io
Andrii Holovin – Product & Project Manager Geospatial Expert / OpenStreetMap Consultant / DevOps practitioner
consultant data data-structures devops experience floss gis mapping navigation openstreetmap personal-site personal-website
Last synced: 13 May 2026
https://github.com/m0nica/datalogues-refresh
:bar_chart: Programming blog focused on data with an emphasis on exploration in Python.
data jekyll python technical-writing
Last synced: 14 May 2026
https://github.com/poojaharihar03/wellness-cities-case-study
A case study for dats analysis of city health centers
Last synced: 11 Jun 2026
https://github.com/iannil/one-data-studio
one-data-studio integrates a data governance and development platform, a cloud-native MLOps platform, and a large model application development platform. It connects the entire value chain from raw data governance to model training and deployment, and further to the construction of generative AI applications.
Last synced: 12 Jun 2026
https://github.com/jurooravec/knwldg
Datasets, scrapers, pipelines
companies crawler data dataset non-profit-organizations scraper scrapy
Last synced: 13 Jun 2026
https://github.com/word2vect/beijing-new-house-data-visualization
Beijing New House Data Visualization for Python Programming 2024 Fall Data Visualization Lab
Last synced: 13 Jun 2026
https://github.com/cannt39t/wylsacom-analysis-reflinks-datamining
data data-analysis data-mining python3 sql
Last synced: 13 Jun 2026
https://github.com/bastianolea/plebiscitos_chile
Datos de resultados electorales de los plebiscitos constitucionales de 2022 y 2023
chile comunas data elecciones politica social
Last synced: 15 Jun 2026
https://github.com/isharescheme/participant-onboarding-portal
Standardized onboarding portal for data space participants.
data onboarding particpant space
Last synced: 15 Jun 2026
https://github.com/spiraldb/spiraldb-nemo-curator
SpiralDB connectors for NVIDIA NeMo Curator
computer-vision data data-curation data-prep data-preparation data-processing data-quality datacuration datarecipes deduplication fast-data-processing multimodal multimodal-ai nvidia-nemo physical-ai python spiral vortex
Last synced: 15 Jun 2026
https://github.com/ayushman0511/data-analytics-project1
This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as changes over time, cumulative, performance, data segmentation, part-to-whole analysis.
analytics busine data data-anal data-enginee data-sci data-scien database datascien query reporting sql sql-query sql-server window-func
Last synced: 17 Jun 2026
https://github.com/ibttf/bayborhood
Interactive map to find the ideal neighborhood in San Francisco based on data.
data data-analysis data-visualization gis mapbox react
Last synced: 18 Jun 2026
https://github.com/dushansenadheera/web_scraper
web scraper using Python along with BeautifulSoup and Selenium
beautifulsoup data python selenium web-scraping
Last synced: 19 Jun 2026
https://github.com/mbagalman/lattice-doe
Python code to create experimental designs optimized to meet statistical power targets
abtesting data datascience designofexperiments experimentaldesign statistics
Last synced: 19 Jun 2026
https://github.com/rylan12/apscores
A quick way to visualize how the AP score distributions have changed from year to year.
advanced-placement analysis ap-exam data scores
Last synced: 19 Jun 2026
https://github.com/artcc/coredatademo
Demo for CoreDataGenericModule implementation
core coredata coredata-model data encrypted encrypted-data encryption persist
Last synced: 19 Jun 2026
https://github.com/svetlanam/kbl-to-csv-s3
Keboola extractor, that converts excel to CSV based on input mapping criteria and upload to S3 bucket
data data-cleaning data-transformation etl keboola s3-bucket
Last synced: 20 Jun 2026
https://github.com/g-schumacher44/analyst_resource_hub
A collection of guidebooks, quickref, and resources for data analysis
analytics bigquery data lookerstudio machine-learning model python sql yaml-configuration
Last synced: 20 Jun 2026
https://github.com/supunlakmal/coronavirus-covid-19-status
Covid 19 cases and death count for each country in a json file.
coronavirus count country covid-19 covid-data covid19 data data-science data-visualization geographical geographical-information-system json
Last synced: 21 Jun 2026
https://github.com/petzi53/repairdata
Open Repair Alliance Datasets 2021
data open-data open-datasets r repair repair-cafe repairs
Last synced: 22 Jun 2026
https://github.com/peterhellberg/bugsnag-data
Dump Bugsnag data using the Data access API
Last synced: 22 Jun 2026
https://github.com/dineshdhamodharan24/data-analysis
probability Analysis to customers and bascis analysis
analysis data powerbi probability python visualization
Last synced: 23 Jun 2026
https://github.com/anburocky3/cbse-schools-data
Fetch CBSE Schools in seconds and use it for your data projects
cbse data data-analysis data-science grabber nextjs
Last synced: 24 Jun 2026
https://github.com/chrisabruce/scrapling-rs
Adaptive web scraping, built in Rust. A high-performance port of Python Scrapling.
ai ai-scraping automation crawler crawling crawling-rust data data-extraction mcp mcp-server playwright rust-lang scraping selectors stealth web-scraper web-scraping web-scraping-rust webscraping xpath
Last synced: 26 Jun 2026
https://github.com/souza-vitor/stock-market
codecademy data data-analysis data-mining data-science sql sqlite
Last synced: 26 Jun 2026
https://github.com/rodrigojunqueiradev/curso-python-3-do-basico-ao-avancado
Curso de Python 3 do básico ao avançado - com projetos reais
data data-analysis data-science python python-3 python-library python-script python3
Last synced: 27 Jun 2026
https://github.com/fgazzelloni/20240930-dwpwr
Data Wrangling Practice with R - 30 September Tutorial for R-Ladies Rome
data data-science data-structures data-wrangling
Last synced: 28 Jun 2026
https://github.com/nitheshgoutham/sentinel-2-data-processing-for-pichavaram-mangrove-forest-using-cnn
Image Processing using CNN
cnn cnn-classification cnn-keras data deep-learning matplotlib ploty python seaborn-python visualization
Last synced: 29 Jun 2026
https://github.com/operto/data-engineering
Local Env for Data Engineering
apache apache-superset big-data data data-engineering data-lake data-science duckdb exploration parquet query-engine superset visualization
Last synced: 29 Jun 2026
https://github.com/davorg/towerbridge
When is Tower Bridge lifting?
data hacktoberfest london perl web-scraping
Last synced: 29 Jun 2026
https://github.com/bhojpur/dlm
The Bhojpur DLM is a software-as-a-service product used for Data Lifecycle Management based on Bhojpur.NET Platform for data delivery.
Last synced: 19 Feb 2026
https://github.com/madhuresh2011/daily-sql-from-hackerrank
Welcome to my SQL Series, where I tackle SQL problems from HackerRank on a daily basis.
data dataanalysis database question-answering sql
Last synced: 19 Jan 2026
https://github.com/mubashirsidiki/olympics-data-enigeering
Worked with Azure Data Factory, Databricks, Data Lake Storage, and Synapse Analytics to build an ETL pipeline for processing and analyzing Olympic Games data from Kaggle.
analytics azure big-data data dataengineering devops pipeline
Last synced: 02 May 2026
https://github.com/jesuscc1993/data-cleaner-extension
Clears browser data in a single click.
application-data chrome chrome-extension data
Last synced: 02 May 2026
https://github.com/adadalshabab/data-engineering-gcp-project
An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.
bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi
Last synced: 19 Jan 2026
https://github.com/tyriek-cloud/nyc-dca-etl
Created an ETL pipeline to merge two CSV files (converted to JSON) into a parquet file using Azure Data Factory, The data was extracted from NYC Open Data: https://opendata.cityofnewyork.us/ and I created a Blob Container within an existing storage account.
azure azure-data-factory blob-storage data data-engineering etl-pipeline
Last synced: 21 Jan 2026
https://github.com/luminati-io/httpx-web-scraping
Web scraping using HTTPX in Python, covering setup, advanced features, comparisons with Requests, and more.
beautifulsoup data html httpx python web-scraper web-scraping
Last synced: 13 Oct 2025
https://github.com/bertrand31/one-billion-rows-challenge
🌪️ Pushing Scala to its limits to aggregate a billion rows' worth of data in 2.42 seconds
competitive-programming competitive-programming-contests data data-engineering data-processing performance scala
Last synced: 05 Sep 2025
https://github.com/parablelab/parable
Work in progress...
data data-management data-platform data-validation database pipelines
Last synced: 28 May 2026