An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-exploration

A curated list of projects in awesome lists tagged with data-exploration.

https://github.com/kanaries/pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

data-analysis data-exploration dataframe matplotlib pandas plotly tableau tableau-alternative visualization

Last synced: 09 Sep 2025

https://github.com/Kanaries/pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

data-analysis data-exploration dataframe matplotlib pandas plotly tableau tableau-alternative visualization

Last synced: 26 Mar 2025

https://github.com/sfu-db/dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

apis apiwrapper cleaning connector data-exploration data-science datacleaning dataconnector dataprep datapreparation eda exploratory-data-analysis webconnector

Last synced: 14 May 2025

https://github.com/comet-ml/kangas

🦘 Explore multimedia datasets at scale

data-analysis data-exploration dataframe datagrid machine-learning

Last synced: 14 May 2025

https://github.com/keen/explorer

Data Explorer by Keen - point-and-click interface for analyzing and visualizing event data.

analysis analytics analytics-api charts data-exploration data-visualization dataviz keen-io native-analytics web-analytics

Last synced: 15 May 2025

https://github.com/marmotdata/marmot

Marmot helps teams discover, understand, and leverage their data with powerful search and lineage visualisation tools. It's designed to make data accessible for everyone.

bigdata data-catalog data-collaboration data-discovery data-exploration data-governance data-lineage data-observability datacatalog datadiscovery dataengineering lineage mcp mcp-server metadata

Last synced: 09 Apr 2026

https://github.com/desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data-cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 22 Nov 2025

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data-cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 03 Apr 2025

https://github.com/tkrabel/edaviz

edaviz - Python library for Exploratory Data Analysis and Visualization in Jupyter Notebook or Jupyter Lab

altair data-analysis data-exploration data-sciene data-visualization eda edaviz exploratory-data interactive jupyter-notebook matplotlib pandas plotly project-jupyter pyhon qgrid seaborn

Last synced: 13 Jun 2025

https://github.com/rolkra/explore

R package that makes basic data exploration radically simple (interactive data exploration, reproducible data science)

data-exploration data-visualisation decision-trees eda r rmarkdown shiny tidy

Last synced: 08 Apr 2025

https://github.com/virajbhutada/bi-projects-collection

Discover a curated collection of dynamic Power BI dashboards covering financial analytics, HR metrics, streaming service trends, real estate dynamics, and more. Meticulously designed for comprehensive data exploration, this repository continues to expand with new and impactful visualizations.

analytical-insights data-analytics data-exploration data-visualization dynamic-dashboards healthcare-analysis hr-management powerbi trends-visualization visual-reporting

Last synced: 01 Mar 2026

https://github.com/facultyai/lens

Summarise and explore Pandas DataFrames

dask data-exploration data-science data-visualisation dataframe pandas

Last synced: 14 Apr 2025

https://github.com/renumics/sliceguard

A library for detecting problematic data segments in structured and unstructured data with few lines of code.

data-analysis data-cleaning data-curation data-exploration data-science data-visualization deep-learning eda exploratory-data-analysis machine-learning python visualization

Last synced: 16 Mar 2025

https://github.com/DistrictDataLabs/cultivar

Multidimensional data explorer and visualization tool.

data-analysis data-exploration data-management visualization

Last synced: 15 Apr 2025

https://github.com/districtdatalabs/cultivar

Multidimensional data explorer and visualization tool.

data-analysis data-exploration data-management visualization

Last synced: 01 Feb 2026

https://github.com/denisotree/tuitab

A fast, keyboard-driven terminal explorer for tabular data. Open CSV, JSON, Parquet, Excel and SQLite files directly in your terminal: filter, sort, pivot, compute new columns, and visualise distributions without leaving the shell.

cli csv data-analysis data-exploration dataframe duckdb excel parquet polars ratatui rust spreadsheet sqlite tabular-data terminal tui vim

Last synced: 15 Jun 2026

https://github.com/DECODEproject/bcnnow

Light, personalized, interactive dashboards for urban data exploration.

data-exploration data-visualization urban-dashboards

Last synced: 02 Apr 2025

https://github.com/darenasc/aeda

Build a data catalog by running a single line of code

data-catalog data-exploration database eda metadata metadata-extraction

Last synced: 05 Mar 2026

https://github.com/ndleah/health-analysis

This case study is contained within the Serious SQL course by Danny Ma

data-analysis data-exploration data-with-danny database serious-sql sql

Last synced: 02 Mar 2025

https://github.com/virajbhutada/us-healthcare-analysis-powerbi

Unlock insights into the U.S. healthcare landscape from 2019 to 2020. Our PowerBI-driven analysis delves into hospital performance, patient outcomes, and payer-provider dynamics. Dive into detailed reports and visualizations for informed decision-making, empowering healthcare stakeholders, and shaping the industry's future.

data-analytics data-exploration data-modeling data-visualization datascience dax-expression decision-making healthcare-analysis healthcare-datasets insights interactive-visualizations microsoftpowerbi power-query powerbi powerbi-dashboards powerbi-desktop strategic-planning

Last synced: 04 Mar 2026

https://github.com/open-edge-platform/annflux

A research tool for exploring and annotating large datasets with Active Learning

active-learning annotation classification clustering data-exploration large-scale machine-learning multilabel-clustering

Last synced: 08 Feb 2026

https://github.com/vidhi1290/deep-learning-for-eeg-emotion-classification

This repository contains a Python script for performing emotion classification using EEG (electroencephalogram) data. Emotion classification from EEG signals is an important application in neuroscience and human-computer interaction. The code leverages deep learning techniques to analyze EEG data and predict emotional states.

coorelation data-exploration data-preprocessing data-science data-visualization deep-learning deep-learning-algorithms eeg-emotion-recognition egg-signals emotion-distribution emotion-prediction feature-analysis heatmap human-emotions machine-learning machine-learning-algorithms pie-chart spectral-analysis time-series-visualization

Last synced: 10 Apr 2025

https://github.com/aiguofer/sql_connectors

A simple wrapper for SQL connections using SQLAlchemy and Pandas read_sql to standardize SQL workflow with multiple data sources.

data-analysis data-analytics data-exploration data-science pandas relational-databases sql sqlalchemy standardized-api

Last synced: 13 Oct 2025

https://github.com/darenasc/auto-fes

Automated exploration of files in a folder structure to extract metadata and potential usage of information.

data-exploration data-profiling data-science eda plain-text python

Last synced: 16 Mar 2025

https://github.com/copyleftdev/x12-edi-tools

A comprehensive set of tools for working with X12 EDI files

data-exploration dental x12 zuub

Last synced: 01 Jul 2025

https://github.com/ideas-lab-nus/eplusr-paper

Data and code for Jia and Chong (2020): Hongyuan Jia and Adrian Chong (2020). eplusr: A framework for integrating building energy simulation and data-driven analytics. (Accepted in Energy and Buildings).

bayesian-calibration data-analysis data-driven-analytics data-exploration energyplus eplusr multi-objective-optimization parametric-analysis r

Last synced: 21 Feb 2026

https://github.com/virajbhutada/US-healthcare-analysis-powerBI

Unlock insights into the U.S. healthcare landscape from 2019 to 2020. Our PowerBI-driven analysis delves into hospital performance, patient outcomes, and payer-provider dynamics. Dive into detailed reports and visualizations for informed decision-making, empowering healthcare stakeholders, and shaping the industry's future.

data-analytics data-exploration data-modeling data-visualization datascience dax-expression decision-making healthcare-analysis healthcare-datasets insights interactive-visualizations microsoftpowerbi power-query powerbi powerbi-dashboards powerbi-desktop strategic-planning

Last synced: 09 Oct 2025

https://github.com/eikevons/pandas-paddles

Access the parent Pandas data frame in loc[], iloc[], assign(), and other Pandas helpers

data-analysis data-exploration data-science pandas pandas-dataframe pandas-library pandas-loc

Last synced: 16 Jun 2025

https://github.com/jgphilpott/polyplot

A data exploration application inspired by Ola Rosling's Trendalyzer software.

d3js data-exploration data-science ola-rosling threejs trendalyzer

Last synced: 11 Jul 2025

https://github.com/coding-chemist/datalens

A smart dashboard that provides automated insights and visualizations from your data. With just a few clicks, explore trends, statistics, and data quality to make informed decisions effortlessly.

data-cleaning data-exploration datalens matplotlib nltk numpy pandas streamlit

Last synced: 29 Jan 2026

https://github.com/mndrake/cliffnotes

visual summary of an R dataframe

data-exploration data-visualization r

Last synced: 23 Jan 2026

https://github.com/phillipdupuis/mbta-api-playground

Learn about the MBTA V3 API by building queries and exploring the results

data-exploration django django-rest-framework mbta-api pandas pandas-dataframe pandas-profiling python

Last synced: 22 Jan 2026

https://github.com/asifdotexe/covidporfolioproject

This is a SQL + Tableau project on a real-world COVID-19 dataset, from the first recorded case to 2nd March 2022, i.e. my birthday XD

dashboard data-analysis data-exploration data-visualization sql sql-server tableau

Last synced: 08 Jun 2026

https://github.com/mpolinowski/hotel-booking-dataset

Python Pandas Dataset Exploration with Hotel Demand Data.

data-exploration hotel-booking pandas python

Last synced: 20 Apr 2026

https://github.com/acook/enumerable_deep_search

Recursively search enumerable objects

data-exploration data-mining nested-objects

Last synced: 05 Jul 2025

https://github.com/cnag-biomedical-informatics/pheno-ranker

Pheno-Ranker is a tool designed for performing semantic similarity analysis on phenotypic data structured in JSON format, such as Beacon v2 Models or Phenopackets v2.

beacon-v2 bff csv data-exploration json phenopackets-v2 pxf semantic-similarity semantic-similarity-measures

Last synced: 29 Apr 2026

https://github.com/cnag-biomedical-informatics/pheno-ranker-ui

The web ui (R-Shiny application) for Pheno-Ranker, a tool designed for performing semantic similarity analysis on phenotypic data structured in JSON format, such as Beacon v2 Models or Phenopackets v2

beacon-v2 clinical-data data-exploration json phenopackets-v2 r semantic-similarity semantic-similarity-measures shiny

Last synced: 13 Oct 2025

https://github.com/gjbex/python-dashboards

Repository that contains material for training sessions on creating dashboards using Python.

dash dashboard data-analysis data-exploration data-science data-visualization panel python streamlit training training-materials visualization

Last synced: 13 Jul 2025

https://github.com/revogati/ecommerce_consumer_behaviour

This is a full data analytics project, from data cleaning, preparation, and exploration through interpretation of insights to presentation of findings and recommendations.

data-analysis data-exploration ecommerce jupyter-notebook python sql tableau-public visualization

Last synced: 16 Apr 2026

https://github.com/lefteris-souflas/sas-programming-and-machine-learning

Applied SAS techniques for data analysis and machine learning in a milestone project. Base SAS Programming and SAS Viya tools were utilized for preprocessing, customer profiling, sales analysis, promotions, supplier evaluation, and customer segmentation. Results were visualized comprehensively.

customer-profiling data-analytics data-exploration market-basket-analysis pre-processing recency-frequency-monetary sas-machine-learning sas-oda sas-programming sas-studio sas-visual-analytics sas-viya

Last synced: 05 Mar 2026

https://github.com/macdon112/layoff-analysis

SQL data cleaning & analysis of global layoffs

data-analysis data-cleaning data-exploration sql

Last synced: 21 Feb 2026

https://github.com/ahmadrazacdx/sales-data-analysis

A comprehensive data analysis project focusing on sales data exploration, cleaning, and statistical analysis. This includes key insights into sales trends, customer behavior, product categories, delivery times, and seasonality effects. Advanced statistical tests and visualizations are added to identify relationships and make data-driven decisions.

data-cleaning data-exploration exploratory-data-analysis feature-engineering hypothesis-testing time-series-analysis time-series-forecasting

Last synced: 31 May 2026

https://github.com/daodavid/titanic-exploration-data-science-project

Titanic: applying the t-test; exercises in data cleaning, data exploration, hypothesis testing, and basic text processing

data-cleaning data-exploration linear-regression t-independent-test titanic

Last synced: 06 Oct 2025

https://github.com/as16082023/covid-19--data-exploration-

Project exploring COVID-19 data using SQL

covid data-exploration mysql sql

Last synced: 10 Apr 2025

https://github.com/anushadatta/airbnb-in-seattle

🏨 Understanding the Airbnb rental landscape in Seattle using data science.

airbnb data-analysis data-exploration data-visualization datascience sentiment-analysis

Last synced: 13 Jun 2025

https://github.com/vidhi1290/hr_employee_prediction

"Welcome to the HR Employee Promotion Prediction project! This repository contains the code and resources for a machine learning project that focuses on predicting employee promotions. By analyzing various employee attributes, this project aims to provide valuable insights for HR decision-making and talent recognition within organizations.

data-exploration data-science data-visualization docker hr-employee-prediction hyperparameter-tuning machine-learning matplot model-building numpy pandas scikit-learn seaborn streamlit streamlit-webapp

Last synced: 13 Apr 2026

https://github.com/lijesh010/netflix_dataset_exploratory_data_analysis_python_project

This repository contains an Exploratory Data Analysis (EDA) Python project on the Netflix dataset. The purpose of this project is to gain insights and better understand the characteristics of the content available on Netflix, including movies and TV shows.

data-analysis data-exploration data-visualization exploratory-data-analysis jupyter-notebook python

Last synced: 20 May 2026

https://github.com/shogunbanik18/budgetify

End-to-End Budget Analysis enables effective budgeting through detailed analysis and strategic planning

analysis data data-engineering data-exploration databricks databricks-notebooks etl etl-process python3

Last synced: 09 Jun 2026

https://github.com/willie-conway/global-superstore-data-modeling-analysis

A comprehensive data modeling and analysis project for the Global Super Store, focusing on database design 🗃️, sales data analysis 📊, and interactive visualizations using MySQL 🖥️ and Tableau 📈.

business-analytics business-intelligence data-exploration data-modeling data-preprocessing data-restructuring data-visualization database-design er-diagram geographic-analysis interactive-dashboard mysql profit-analysis sales-analysis sales-performance sales-trends sql star-schema tableau time-series-analysis

Last synced: 21 Jul 2025

https://github.com/jvelezmagic/pandas-missing

A pandas extension to explore and handle missing values.

data-exploration eda missing-data missing-values pandas

Last synced: 14 Apr 2025

https://github.com/lcvriend/laserbeans

Toolbox for data exploration

altair data-exploration data-visualization

Last synced: 15 Mar 2025

https://github.com/anastasius21/imdb-movie-analysis

Analysis of IMDb's Top 1000 Movies dataset using Pandas, Matplotlib, and Seaborn. It provides visualizations and insights into various aspects of movies, such as ratings, genres, directors, and release years.

data-analysis data-exploration data-science data-visualization imdb imdb-dataset jupyter-notebook python

Last synced: 25 Apr 2026