data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum,
- Last updated: 2026-07-03 00:07:49 UTC
- JSON Representation
https://github.com/hyperversal-blocks/averveil
Averveil is OpenSea for Data.
blockchain data golang iot privacy zero-knowledge zkp
Last synced: 14 Jan 2026
https://github.com/tushar2704/insurance-cross-sell
This project harnesses the power of cutting-edge technologies including H2O AutoML, MLflow, FastAPI, and Streamlit to enhance cross-selling campaigns and boost efficiency.
data datascience h20automl machine-learning mlflow python streamlit-tushar2704
Last synced: 08 Oct 2025
https://github.com/nikoshet/rust-dms-cdc-operator
The rust-dms-cdc-operator is a Rust-based utility for comparing the state of a list of tables in an Amazon RDS database with data stored in Parquet files on Amazon S3, particularly useful for change data capture (CDC) scenarios.
aws cdc data dms parquet pgdatadiff polars postgres rds rust s3 validation
Last synced: 18 Jan 2026
https://github.com/ryanjoy0000/yt-notifier
Youtube Notifier (Telegram Bot) - A real time data processing pipeline
data go kafka-streams real-time telegram-api youtube-api
Last synced: 14 Jan 2026
https://github.com/mewmix/drivehound
magic file signatures + python drive recovery magic
data disk file-signatures harddrive python recovery recovery-tool
Last synced: 08 Oct 2025
https://github.com/jakakokosar/bioinformatics-serverfiles
Knowledge base for Orange3-bioinformatics add-on
bioinformatics data dictybase gene genesets go homologene markergenes ncbi serverfiles
Last synced: 16 Apr 2026
https://github.com/varun-khorgade/sentimentscope-e-commerce-review-analyzer
Analyzed customer reviews and purchase data to extract sentiment and behavioral insights. Built SQL-based ETL for data preparation and visualized results using Python and Power BI dashboards for actionable business decisions.
analytics customer-beheviour dashboard data data-visualization dataextraction natural-language-processing nlp pandas powerbi python sentiment-analysis sql textblob
Last synced: 17 Apr 2026
https://github.com/kamal-singh22/ai-driven-emotional-sentiments-analysis
This project leverages machine learning to analyze and classify the emotional sentiment of textual data. The goal is to accurately identify and categorize emotions, aiding applications in customer feedback analysis, social media sentiment analysis, and mental health monitoring.
analysis artificial-intelligence data emotion nlp-machine-learning python sentiment-analysis streamlit text-classification
Last synced: 14 Apr 2026
https://github.com/DefinetlyNotAI/VulnScan_Data
Logicytics VulnScan Module's Training Data and old model archive
ai data logicytics ml models pytorch sensitive-files text-processing tfidf-text-analysis training-data
Last synced: 17 Aug 2025
https://github.com/dylanhogg/cloud-products
A package for getting cloud products and product descriptions from a cloud provider website.
aws cloud-products crawler data text-processing
Last synced: 05 Oct 2025
https://github.com/east-empire-trading-company/eetc-data-client
Client library for retrieving data managed by EETC Data Hub.
client-library data data-science finance library python
Last synced: 31 May 2026
https://github.com/alexandregazagnes/rica-analysis
This repository contains the code to download, analyse, and modelize the RICA dataset from the french ministry of agriculture.
analysis argiculture business data data-analysis data-analytics food python
Last synced: 29 Apr 2026
https://github.com/famarks/grafarg
Grafarg is an interactive data analytics and graphical data visualization application. Grafarg being a progressive fork of Grafana 7.5.17 continues to be available under open source Apache 2.0 License
analytics charts data data-analysis data-science data-visualization grafana grafarg graph
Last synced: 19 Jan 2026
https://github.com/davorg/dmp
Data Munging with Perl
book data hacktoberfest munging perl
Last synced: 21 Jan 2026
https://github.com/garcane/income-prediction-ml
This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.
data data-science machine-learning ml numpy pandas python random-forest scikit-learn
Last synced: 08 Apr 2026
https://github.com/saroshfarhan/kaggle-playground-s4e12
Kaggle competition first attempt
analytics data data-analysis-python data-science
Last synced: 12 Oct 2025
https://github.com/rohancyberops/r-language
R Language Projects directory. This repository contains various projects, scripts, and experiments developed using R, a powerful statistical computing and data visualization language.
caret cran data dplyr ggplot2 rlanguage rstudio shiny tidyverse
Last synced: 12 Oct 2025
https://github.com/rishabh-agarwal/datastructuremachineproblem
Data Structure MP - Clemson University (Language C)
273 alogrithms clemson data ece structure university
Last synced: 26 Oct 2025
https://github.com/anobaka/insidecollector
这是一个介于Excel和纯记录工具之间的软件,您可以自由创建各种列表,然后将其以各种规则关联起来,并且可以创建自定义视图帮助您更好地理解数据。
collection data excel-like list list-manager table
Last synced: 19 Jan 2026
https://github.com/iamgmujtaba/github-python-daily-trending
This repository provides an automated, daily-updated list of the top trending Python repositories on GitHub. Using a GitHub Actions workflow, it scrapes data from GitHub's trending page, sorts the results by total stars, and generates a clean, well-structured README file
data data-scraping github-actions tranding tranding-bot
Last synced: 13 Oct 2025
https://github.com/Lemniscate-world/StratAI
This project analyzes financial assets using a Hidden Markov Model (HMM) to identify different market regimes and patterns. The analysis includes calculating daily returns, rolling volatility, and volume changes, and visualizing the hidden states identified by the HMM.
ai assets data data-science data-visualization finance financial-analysis fintech hmm-model hmmlearn machine-learning trading
Last synced: 13 Oct 2025
https://github.com/tberey/social-stocks
A Graphical Data and Analysis Tool
data data-analysis data-science data-stream data-visualization database javascript mysql mysql-database node nodejs rest rest-api social-stocks stock-market stocks ticker-data tickers trends typescript
Last synced: 21 Jan 2026
https://github.com/connectaman/deepseek-ocr-multigpu-infer
Efficient multi-GPU OCR inference framework leveraging parallel processes for accelerated token throughput and faster batch processing. Designed for scalable, high-performance optical character recognition workloads using PyTorch. Supports dynamic GPU assignment, optimized resource utilization, and easy integration for large-scale image datasets.
agentic-extraction data deepseek document-parser extraction extractor gpu image-parser llm multigpu nvidia ocr parallel-computing parser pdf-parser vlm
Last synced: 22 Jan 2026
https://github.com/player29879/neum-ai
Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.
ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors
Last synced: 18 Apr 2026
https://github.com/ompreetham/dcn-network-traffic-anomaly-detection
Data Communication Networks - Network Traffic Anomaly Detection
anomaly anomaly-detection communication data dcn keras learning machine machine-learning network pandas presentation project python scikit-learn tensorflow traffic
Last synced: 08 Apr 2026
https://github.com/lisakey/datacamp-data-analyst-python-sql-projects
Several projects completed during my Data Analyst 📊 training on the DataCamp platform with Python 🐍 and SQL 🗃️. Each project addresses real-world challenges using modern analytical tools and techniques.
analysis cleaning-data data dataanalysis dataanalyst matplotlib pandas python seaborn sql transformation visuali
Last synced: 19 Apr 2026
https://github.com/lahcenezzara/whatsapp-scraping-python
WhatsApp Scraping Python
automation data python scraping selenium whatsapp
Last synced: 05 Feb 2026
https://github.com/vvipjain/hockey-tournament-analysis
Hockey Tournament Analysis
beautifulsoup data data-analysis data-visualization databases pandas pandas-dataframe powerbi python python-library python-script requests-library-python sql sql-server sqlalchemy
Last synced: 27 Jan 2026
https://github.com/stdlib-js/ndarray-base-dtypes2signatures
Transform a list of array argument data types into a list of signatures.
api array base data dtype dtypes interface javascript multidimensional ndarray node node-js nodejs sig signatures stdlib types utilities utility utils
Last synced: 14 Apr 2026
https://github.com/soenneker/soenneker.quark.table
A native Blazor table component.
blazor blazorlibrary csharp data dotnet html quark quarktable table tables
Last synced: 13 Aug 2025
https://github.com/yeshunit/walmart-product-customer-sales-sql-analysis
This project aims to explore the Walmart Sales data to understand top performing branches and products, sales trend of of different products, customer behaviour. The aims is to study how sales strategies can be improved and optimized. The dataset was obtained from the Kaggle
data database mysql sql walmart
Last synced: 24 Feb 2026
https://github.com/helosantosdesousa/analise-previsao-de-rotatividade-ml
Projeto final do Bootcamp Data Girls 2025 que analisa a rotatividade de funcionários usando Machine Learning. Com base no dataset IBM HR Analytics Attrition, o projeto identifica os principais fatores de risco e cria modelos preditivos (SVC e Random Forest) com até 89% de acurácia para antecipar saídas e apoiar decisões estratégicas de RH.
analise-de-dados analise-exploratoria bootcamp ciencia-de-dados colab-notebook dados data data-analysis data-science dataanalytics dataframe eda machine-learning machine-learning-algorithms pandas python random-forest svc
Last synced: 16 Apr 2026
https://github.com/r12habh/datacamp.com-micro_projects
data data-analysis data-science datascience python python3
Last synced: 23 May 2026
https://github.com/kledenai/jsonweaver
A powerful and easy-to-use library for transforming JSON data into popular formats such as CSV, XML, Markdown tables, YAML, and JSONLines (NDJSON).
csv data data-transform format json jsonlines jsonweaver markdown markdown-tables xml yaml
Last synced: 24 Feb 2026
https://github.com/sanskaryo/ultimate-dsa-repo
One Stop Solution for DSA Learning and Resources
data data-structures-and-algorithms dsa hacktoberfest hacktoberfest-accepted hacktoberfest2025
Last synced: 15 Oct 2025
https://github.com/akv3sic/cryptocurrency-charts
Cryptocurrency API data visualizations 📈 with Matplolib.
cryptocurrency data data-visualization matplotlib python
Last synced: 16 Oct 2025
https://github.com/bishtrishu/pizza_sales_data_analysis_sql
This project is a comprehensive data analysis of pizza sales, aimed at uncovering key insights and trends to inform business decisions. Using a combination of SQL, Python, and data visualization tools, the project analyzes sales data to understand customer preferences, peak sales periods, and the most popular pizza types.
cloud data data-analysis data-science data-visualization dataanalytics database mysql oracle-database
Last synced: 14 Apr 2026
https://github.com/mscbuild/analysis
🎢 This collection of data analysis projects demonstrates techniques for extracting, transforming, analyzing, and visualizing data. Data Analytics Projects for Beginners 📈 ⚡
anallysis analysis chart csv dashboard data data-science data-science-projects excel google html5 mashine-learning portfolio pyton
Last synced: 19 Oct 2025
https://github.com/gematik/poc-isik-patient-merge
The repository contains a proof of concept (POC). The POC demonstrates how a FHIR subscription can be used to inform about happened merges within the ISIK context.
Last synced: 19 Oct 2025
https://github.com/semibran/img-data
Easily read from and write to ImageData instances
Last synced: 11 Aug 2025
https://github.com/divithraju/divith-aju-hadoop-pyspark-pipeline
This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.
apache-hadoop-framework apache-spark bigdata client data database dataengineering dataingestionframework datapreprocessing documentation ecommerce-platform hdfs pipeline project project-repository pyspark python3 software-engineering
Last synced: 27 Jan 2026
https://github.com/waqaszafar9/cricket-managment-database
cricket website data base mangment project
cricket-data cricket-dataset data data-structures database database-management database-management-system database-schema oracle oracle-database sql sql-query
Last synced: 10 Aug 2025
https://github.com/sksubhadeep/airbnb-dashboard-tableau
Airbnb Dashboard Using Tableau
airbnb data tableau tableau-public visualization
Last synced: 19 Mar 2026
https://github.com/mohamedmaher-dev/mena
Middle East and North Africa country data utilities for Dart/Flutter.
api arabic-localization capitals country-codes country-data country-flags currencies dart data flutter internationalization localization mena mena-countries mena-region middle-east north-africa offline-data package
Last synced: 21 Feb 2026
https://github.com/rubenhortas/python_examples
Examples of Python code and DSA (data structures and algorithms).
algorithm algorithms data dsa examples python python-3 python3 samples snippets structures
Last synced: 03 Oct 2025
https://github.com/ddeutils/ddedocs
📖 Data Developer & Engineer Documents and Hands-On
blogs data data-engineering documents hands-on
Last synced: 08 Aug 2025
https://github.com/fiddlydigital/fastmap
A simple 2D map that is optimized for speed.
Last synced: 23 Oct 2025
https://github.com/cisagov/cyhy-feeds
Tools to create and retrieve Cyber Hygiene (CyHy) data extracts
Last synced: 23 Oct 2025
https://github.com/rodekruis/510-data-catalog
The Project is CKAN based Data Catalog Portal for 510
Last synced: 23 Jan 2026
https://github.com/atymri/linqsimulator
LINQ Simulator is an interactive C# console application designed to let you experiment with LINQ queries in real time.
console csharp data data-analysis linq query sql
Last synced: 23 Oct 2025
https://github.com/purarue/git_doc_history
copy/track file history in git, with python bindings to traverse and extract history/files/lines at some date
Last synced: 17 May 2026
https://github.com/mustika-putri-m/analysis-of-sales-transactions-in-an-online-shop---london
Crucial Question 1. How was the sales trend over the months? 2. What are the most frequently purchased products? 3. How many products does the customer purchase in each transaction? 4. What are the most profitable segment customers? 5. Based on your findings, what strategy could you recommend to the business to gain more profit?
data data-analysis-python data-analytics data-visualization ecommerce
Last synced: 24 Oct 2025
https://github.com/garcane/Income-Prediction-ML
This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.
data data-science machine-learning ml numpy pandas python random-forest scikit-learn
Last synced: 24 Oct 2025
https://github.com/sourceduty/data_hardware
🖥️ Comparing various hardware configurations needed for different data sizes, from personal laptops to mainframes.
calculation computer-hardware computer-science computers data data-calculation data-hardware data-processing data-project hardware hardware-configuration hardware-requirements hardware-science math process-programming programming python
Last synced: 08 Aug 2025
https://github.com/tiaanduplessis/country-currency-data
Data about currencies of countries
countries currencies data symbols
Last synced: 08 Aug 2025
https://github.com/priyanshubiswas-tech/pwc-power-bi-task-1-2
Power BI dashboards analyzing Phonenow's call center performance and customer retention. Task 1 focuses on KPIs like satisfaction rating, call count, and agent efficiency. Task 2 analyzes retention trends and customer behavior to enhance loyalty. Built using Power BI, DAX, and Excel.
dashboard data data-analysis dax-measures excel powerbi powerbidashboard
Last synced: 23 Jan 2026
https://github.com/doziestar/datavinci
DataVinci enables you to visualize data from various sources, generate insights, analyze data with AI models, and receive real-time updates on anomalies
Last synced: 23 Jan 2026
https://github.com/ayushverma135/sas-health-metrics-analysis-bmi-categorization-and-gender-insights
Using SAS, this project processes Excel data on individual statistics and health metrics. It calculates BMI, categorizes health status, and visualizes distributions through pie charts.
analytics data excel sas sasprogramming statistical-analysis
Last synced: 24 Feb 2026
https://github.com/imahdimir/githubdata
A very simple Python package to easily download from and manage a GitHub "Data Repository"
data data-repository python-package
Last synced: 23 Jan 2026
https://github.com/mihaiconstantin/lavot
A `React` application that allows users to indicate how votes will be redistributed among candidates for the second round of Romanian presidential elections.
data data-visualization elections react sankey typescript
Last synced: 06 Feb 2026
https://github.com/aleenprd/docbt
Documentation Build Tool - Generate YAML documentation for dbt models with optional AI assistance. Built with Streamlit for an intuitive and familiar web interface.
ai analytics-engineering bigquery data data-modeling data-science dbt docker llm lmstudio ollama openai snowflake sql streamlit
Last synced: 11 Nov 2025
https://github.com/maccccd/wsoa3029a_2444372
This website serves an extension of my portfolio work. It focuses specifically on showcasing my understanding of D3.js , a JavaScript library used to create interactive data visualizations. The visualizations in here were used to provide insights on two types of cybersecurity attacks: Phishing & Ransomware.
d3js data hacking visualization
Last synced: 24 Jan 2026
https://github.com/city-of-helsinki/drupal-helfi-tyollisyyspalvelut-manuaali
Työllisyyden kuntakokeilujen palvelutietovarannon manuaali
data drupal drupal-9 unemployment
Last synced: 24 Jan 2026
https://github.com/theryston/db-mycro
A node module with a json database that saves data in a specific directory, similar to sqlite, but in JSON
base crud data database db db-mycro javascript json jsondatabase nodejs nosql typescript
Last synced: 09 Apr 2026
https://github.com/fairspec/fairspec-typescript
Fairspec TypeScript is a fast data management framework built on top of the Fairspec standard and Polars DataFrames
ckan csv data dataframe dataset excel fair json ods polars quality schema sqlite table typescript validation zenodo
Last synced: 09 Feb 2026
https://github.com/alejo1630/titanic_kaggle
This Python Notebook is a proposal to analyse the Titanic dataset for the Kaggle Competition, using several data science techniques and concepts.
data data-science jupyter-notebook notebook python titanic-survival-prediction
Last synced: 03 May 2026
https://github.com/itu-helper/data-updater
Periodically scrapes data related to ITU to be used by anyone. This data powers the ITU Helper web sites.
data istanbul-technical-university scraper selenium-python
Last synced: 29 Jan 2026
https://github.com/priyanshubiswas-tech/deloitte-daikibo-forensic-analysis-task-2
Forensic pay equity analyzer for Deloitte. Processes compensation data to classify gender equality scores into Fair/Unfair/Discriminative tiers. Outputs modified Excel with 3-tier evaluation system.
data data-analysis deloitte excel forensic-analysis
Last synced: 06 Feb 2026
https://github.com/jinsyin/datagovernance
公众号:「数据之道」
data data-governance datagovernance governance
Last synced: 30 Jan 2026
https://github.com/sandk21/etude_eau_potable_monde
Etude sur l'accès à l'eau dans le monde - Tableaux de bord avec Tableau
analysis data tableau tableau-public visualization
Last synced: 19 Mar 2026
https://github.com/simranjeet97/leetcode_practice
Practicing the Leet Code Codes for Competitive Programming
algorithms amazon coding competitive-programming data data-structures facebook google leetcode python
Last synced: 03 Aug 2025
https://github.com/simranjeet97/quotes-analysis
Kaggle Dataset on Quotes Analysis and Visualization With Python, Pandas and MatplotLib Using Jupyter Notebook.
data data-science datavisualization jupyter-notebook kaggle kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python quotes quotes-application
Last synced: 15 Apr 2026
https://github.com/inphyt/quantitative_single_neuron_modeling_competition_2009
Data for the Quantitative Single-Neuron Modeling Competition (2009).
bayesian-inference bayesian-methods bayesian-optimization bayesian-statistics challenge competition computational-neuroscience data electrophysiological-data electrophysiology-data model-calibration modeling neuronal-models neuroscience neuroscience-competition parameter-estimation simulation simulation-modeling single-neuron-model uncertainty-quantification
Last synced: 25 Feb 2026
https://github.com/dandre3000/matrix
Matrix library
algebra array data data-structure math matrix vector
Last synced: 01 Feb 2026
https://github.com/BenSFGamer/B.I.O.S.
A biographer
academia agi ai ai-tools artificial-general-intelligence artificial-neural-networks automation data fact-checking information-extraction information-retrieval self-improving self-learning self-referential semi-autonomous software-engineering specialization web-agent web-scraping writer
Last synced: 27 Sep 2025
https://github.com/openearth/rws-viewer
This viewer is created by Deltares in cooperation with Voorhoede under OpenEarth GPL License. The viewer can be used via several RWS websites, please visit https://www.informatiehuismarien.nl/, https://waterinfo-extra.rws.nl/ and https://basismonitoringwadden.waddenzee.nl/.
data mapbox-gl-js ogc-services viewer
Last synced: 01 Feb 2026
https://github.com/jub0t/eso
An application to manage all your Encryption & Decryption keys and other related tools.
data encryption encryption-decryption hacking hacking-tool keys pgp privacy private
Last synced: 07 Feb 2026
https://github.com/giladbarnea/to
A simple CLI tool to convert and diff between JSON, YAML, TOML, JSON5 and Python collections.
conversion data data-conversion json json5 parser script terminal toml yaml
Last synced: 08 Feb 2026
https://github.com/raymondcm/strawberrydata
Tool suite for fast multi-camera strawberry data collection project. The standards document houses cross compatibility/purpose implementation details.
camera cpp data intel multi-camera
Last synced: 08 Feb 2026
https://github.com/jeanmanguy/milk-sci-fi
Census of every mention of milk in sci-fi works.
Last synced: 26 Feb 2026
https://github.com/garcane/cookie-company-visual-dashboard
This Excel-based interactive dashboard provides a comprehensive overview of the Cookie Company's sales performance and key metrics.
dashboard data data-visualization excel microsoft-excel
Last synced: 09 Feb 2026