An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with exploratory-data-analysis

A curated list of projects in awesome lists tagged with exploratory-data-analysis.

https://github.com/lux-org/lux

Automatically visualize your pandas dataframe via a single print! 📊 💡

data-science exploratory-data-analysis jupyter pandas python visualization visualization-tools

Last synced: 29 Apr 2025

https://github.com/evidence-dev/evidence

Business intelligence as code: build fast, interactive data visualizations in SQL and markdown

analytics business-intelligence dashboard data-engineering data-science data-visualization dbt duckdb exploratory-data-analysis self-hosted sql svelte tailwindcss webassembly

Last synced: 06 Feb 2026

https://github.com/sfu-db/dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data with a few lines of code.

apis apiwrapper cleaning connector data-exploration data-science datacleaning dataconnector dataprep datapreparation eda exploratory-data-analysis webconnector

Last synced: 14 May 2025

https://github.com/jadianes/data-science-your-way

Ways of doing Data Science Engineering and Machine Learning in R and Python

data-frame data-science data-science-engineering exploratory-data-analysis jupyter machine-learning notebook python r tutorial

Last synced: 04 Apr 2025

https://aeturrell.github.io/skimpy/

skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console.

data-science eda exploratory-data-analysis pandas statistics summary-statistics

Last synced: 08 Oct 2025

https://github.com/aeturrell/skimpy

skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console.

data-science eda exploratory-data-analysis pandas statistics summary-statistics

Last synced: 10 Oct 2025

https://github.com/desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data-cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 22 Nov 2025

https://github.com/mstaniak/autoEDA-resources

A list of software and papers related to automatic and fast Exploratory Data Analysis

autoeda automation eda exploratory-data-analysis visualization

Last synced: 06 May 2025

https://github.com/rasbt/musicmood

A machine learning approach to classify songs by mood.

exploratory-data-analysis lyrics machine-learning mood song-dataset

Last synced: 25 Jan 2026

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data-cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 03 Apr 2025

https://github.com/data-describe/data-describe

data⎰describe: Pythonic EDA Accelerator for Data Science

analysis data-science eda exploratory-data-analysis pypi

Last synced: 12 Apr 2025

https://github.com/amanovishnu/ineuron-full-stack-data-science-assignments

This repository features assignments and projects from the iNeuron full stack data science course, providing valuable resources for learners to enhance their skills and apply their knowledge.

computer-vision data-science datascience deep-learning exploratory-data-analysis linear-regression machine-learning natural-language-processing python recommender-system sql statistics

Last synced: 08 Apr 2025

https://github.com/yangboz/lotteryprediction

:full_moon_with_face: Lottery prediction: besides following the "law of probability" and "Probability: Independent Events", there is still the Gambler's Fallacy, that is, saying "a tail is due" or "just one more go, my luck is due to change".

exploratory-data-analysis keras mine-data predictive-analytics scatter-plot series-analysis tensoflow time-series-analysis

Last synced: 05 Apr 2025

https://github.com/amanovishnu/ineuron-full-stack-data-science-assignment-collection

This repository features assignments and projects from the iNeuron full stack data science course, providing valuable resources for learners to enhance their skills and apply their knowledge.

computer-vision data-science datascience deep-learning exploratory-data-analysis linear-regression machine-learning natural-language-processing python recommender-system sql statistics

Last synced: 28 Feb 2025

https://github.com/neerjad/DataVisualization

Tutorials on visualizing data using python packages like bokeh, plotly, seaborn and igraph

exploratory-data-analysis plotly tutorial visualisation

Last synced: 06 Apr 2025

https://github.com/alastairrushworth/inspectdf

🛠️ 📊 Tools for Exploring and Comparing Data Frames

comparison dataframe eda exploratory-data-analysis r rstats visualization

Last synced: 13 Apr 2025

https://github.com/dvgodoy/handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes

exploratory-data-analysis imputation outlier-detection pandas pyspark python spark visualization

Last synced: 05 Apr 2025

https://github.com/mirador/mirador

Tool for visual exploration of complex data.

data exploratory-data-analysis tabular-data visualization

Last synced: 17 Jan 2026

https://github.com/trr266/expandar

R Package for Interactive Panel Data Exploration

accounting eda exploratory-data-analysis finance open-science package r replication shiny shiny-apps

Last synced: 22 Oct 2025

https://github.com/trr266/ExPanDaR

R Package for Interactive Panel Data Exploration

accounting eda exploratory-data-analysis finance open-science package r replication shiny shiny-apps

Last synced: 30 Jul 2025

https://github.com/jadianes/spark-r-notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

big-data bigdata data-analysis data-science exploratory-data-analysis jupyter jupyter-notebook notebook r sparkr

Last synced: 21 Apr 2025

https://github.com/lozuwa/impy

Impy is a Python3 library with features that help you in your computer vision tasks.

dataset exploratory-data-analysis machine-learning preprocessing raw-data statistics tidy-data

Last synced: 02 Apr 2025

https://github.com/ujjwalkarn/xda

R package for exploratory data analysis

data-analysis data-science exploratory-data-analysis r

Last synced: 26 Apr 2025

https://github.com/dgwozdz/HN_SO_analysis

Is there a relationship between popularity of a given technology on Stack Overflow (SO) and Hacker News (HN)? And a few words about causality

eda exploratory-data-analysis granger-causality hackernews python relationship stackoverflow

Last synced: 17 Apr 2025

https://github.com/jadianes/data-journalism

Data journalism and easy to replicate notebooks using Python, R, and Web visualisations

data-analysis data-journalism data-visualisation data-visualization exploratory-data-analysis notebook

Last synced: 03 Oct 2025

https://github.com/duttashi/learnr

Exploratory, Inferential and Predictive data analysis. Feel free to show your :heart: by giving a star :star:

exploratory-data-analysis inferential-statistics predictive-modeling r

Last synced: 13 Jul 2025

https://github.com/kianweelee/Edator

A Python package that performs exploratory data analysis for users. It also generates three types of output files (cleaned CSV, plots, and a text report).

data-analysis data-science exploratory-data-analysis

Last synced: 08 May 2025

https://github.com/nbarrowman/vtree

An R package for calculating and drawing variable trees

data-science data-visualization exploratory-data-analysis r statistics

Last synced: 11 Oct 2025

https://github.com/ben519/mltools

Exploratory and diagnostic machine learning tools for R

exploratory-data-analysis machine-learning r

Last synced: 30 Jul 2025

https://github.com/zmjones/edarf

exploratory data analysis using random forests

exploratory-data-analysis machine-learning r random-forest rstats

Last synced: 26 Oct 2025

https://github.com/devsgnr/breadroll

breadroll 🥟 is a simple, lightweight library for data processing operations, written in TypeScript and powered by Bun.

bun csv csv-parser data-engineering data-science data-transformation eda exploratory-data-analysis tsv tsv-parser

Last synced: 11 Oct 2025

https://github.com/renumics/sliceguard

A library for detecting problematic data segments in structured and unstructured data with few lines of code.

data-analysis data-cleaning data-curation data-exploration data-science data-visualization deep-learning eda exploratory-data-analysis machine-learning python visualization

Last synced: 16 Mar 2025

https://github.com/ucd-dnp/leila

Library for data quality assessment and interaction with the datos.gov.co data portal

data-quality data-science eda espanol exploratory-data-analysis python report-generator ucd

Last synced: 05 Apr 2026

https://github.com/datamole-ai/edvart

An open-source Python library for Data Scientists & Data Analysts designed to simplify the exploratory data analysis process. Using Edvart, you can explore data sets and generate reports with minimal coding.

analysis data-analysis data-science data-visualization data-viz eda exploration exploratory-data-analysis exploratory-data-analysis-eda plots python

Last synced: 11 Feb 2026

https://github.com/spratiher9/sparkora

Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟

apache apache-spark data data-analysis data-analysis-python data-analytics easy-to-use eda exploratory-data-analysis open-source opensource pyspark python python3 toolkit

Last synced: 10 Jul 2025

https://github.com/ahmed-mohamed-sn/olliePy

OlliePy is a Python package that helps data scientists explore their data and evaluate and analyse their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information, and OlliePy will generate the rest.

ai analytics charts dashboard data data-analytics data-science data-scientist eda error-analysis exploratory-data-analysis machine-learning python visualization

Last synced: 08 May 2025

https://github.com/TysonStanley/furniture

The furniture R package contains table1 for publication-ready simple and stratified descriptive statistics, tableC for publication-ready correlation matrixes, and other tables #rstats

cran descriptive-statistics exploratory-data-analysis health table-one table1 tables tidy tidyverse

Last synced: 13 Jul 2025

https://github.com/tysonstanley/furniture

The furniture R package contains table1 for publication-ready simple and stratified descriptive statistics, tableC for publication-ready correlation matrixes, and other tables #rstats

cran descriptive-statistics exploratory-data-analysis health table-one table1 tables tidy tidyverse

Last synced: 22 Mar 2025

https://github.com/elysian01/data-purifier

A Python library for Automated Exploratory Data Analysis, Automated Data Cleaning, and Automated Data Preprocessing For Machine Learning and Natural Language Processing Applications in Python.

data-analysis data-cleaning data-cleaning-pipeline data-preprocessing data-science data-visualization datapurifier eda exploratory-data-analysis jupyter python-lib python-library python3

Last synced: 04 Oct 2025

https://github.com/mast-group/sequence-mining

Probabilistic Sequence Mining

data-mining exploratory-data-analysis

Last synced: 15 Nov 2025

https://github.com/hemansnation/python-for-data-professionals

This course is designed to help you get a good grip on Python programming, logic building, solving algorithm-based questions, data structures, data analytics, working with pandas, professional practices, and API building.

data-analytics data-professionals data-science exploratory-data-analysis logic-programming machine-learning pandas python

Last synced: 23 Jul 2025

https://github.com/daya6489/SmartEDA

An R package for exploratory data analysis

analysis exploratory-data-analysis

Last synced: 06 May 2025

https://github.com/zhihanyue/qgridnext

Advancing QGrid, an interactive grid for exploring DataFrames in JupyterLab/Notebook

datatable exploratory-data-analysis grid jupyter-extension qgrid

Last synced: 08 Apr 2025

https://github.com/aatmunbaxi/orgroamtools

Helper library for data analysis of org-roam collections

data-science emacs exploratory-data-analysis library org-roam personal-knowledge-management python

Last synced: 09 Apr 2025

https://github.com/vega/altair_ally

Altair Ally is a companion package to Altair, which provides a few shortcuts to create common plots for exploratory data analysis.

altair eda exploratory-data-analysis exploratory-data-visualizations vega-lite visualization

Last synced: 11 Mar 2026

https://github.com/anitagraser/eda-protocol-movement-data

Step-by-step exploratory movement data analysis protocol in a Jupyter notebook

data-quality-assessment data-science exploratory-data-analysis movement-data

Last synced: 25 Feb 2025

https://github.com/csbiology/fsharpgephistreamer

F# functions for streaming any kind of graph/network data to the network visualization tool gephi

data-analysis exploratory-data-analysis fsharp gephi graph-visualization streaming-graph-data visualization

Last synced: 30 Jul 2025

https://github.com/parrt/msds593

MSDS593 -- Exploratory data analysis (EDA) at the University of San Francisco

exploratory-data-analysis matplotlib numpy pandas visualization

Last synced: 04 May 2025

https://github.com/hneth/ds4psy

Data science for psychologists (ds4psy): R package supporting book and course

data-literacy data-science education exploratory-data-analysis psychology r r-package social-sciences visualisation

Last synced: 14 Apr 2025

https://github.com/rubydamodar/loan-approval-prediction-

Loan approval prediction is a popular machine learning project, especially in the banking and finance industry. The goal of this project is to build a predictive model that can determine whether a loan application will be approved or not based on the applicant's information such as income, credit history, and loan amount.

ai-in-finance banking classification classification-internal credit-risk data-science exploratory-data-analysis feature-engineering financial-analytics loan-approval machine-learning matplotlib pandas predictive-modeling python scikit-learn seaborn visualization

Last synced: 17 Jul 2025

https://github.com/alastairrushworth/tdf

🚴🏅📊 Tour de France winners and stages data

data-science dataframe exploratory-data-analysis rstats tdf tour-de-france

Last synced: 13 Apr 2025

https://github.com/m-clark/exploratory-data-analysis-tools

A survey of tools that make EDA more automated.

eda exploratory-data-analysis r

Last synced: 06 Apr 2026

https://github.com/rubydamodar/the-ultimate-pandas-bootcamp

Welcome to the Pandas for Data Science repository! This course is designed to take you from beginner to proficient in using Pandas, the powerful data manipulation library in Python. Whether you're just starting your data science journey or looking to sharpen your skills, this repository contains all the resources

beginner-friendly csv-data data-analysis data-cleaning data-manipulation data-science data-visualization dataframe exploratory-data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python python-pandas series statistical-analysis time-series titanic-dataset

Last synced: 19 Apr 2025

https://github.com/kozaka93/torpeda

toRpEDA package

exploratory-data-analysis r

Last synced: 05 Oct 2025