Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mito-ds/mito

The mitosheet package, trymito.io, and other public Mito code.

data data-analysis data-science data-visualization jupyter pandas python streamlit-component

Last synced: 11 Oct 2024

https://github.com/weijie-chen/linear-algebra-with-python

Lecture Notes for Linear Algebra Featuring Python. This series of lecture notes walks you through the must-know concepts that form the foundation of data science and advanced quantitative skill sets. Suitable for statisticians, econometricians, quantitative analysts, and data scientists who want to quickly refresh their linear algebra with the help of Python computation and visualization.

computational-science data-analysis data-science data-visualization diagonalization eigenvalues eigenvectors gram-schmidt jupyter linear-algebra linear-transformations mathematics matrix matrix-calculations multivariate-normal-distribution null-space python singular-value-decomposition symmetric-matrices vector-space

Last synced: 11 Oct 2024

https://github.com/justmarkham/pandas-videos

Jupyter notebook and datasets from the pandas video series

data-analysis data-cleaning data-science jupyter-notebook pandas python tutorial

Last synced: 15 Oct 2024

https://github.com/lana-k/sqliteviz

Instant offline SQL-powered data visualisation in your browser

charting csv data-analysis pivot pivot-table plotly plotting sql sqlite visualization

Last synced: 15 Oct 2024

https://github.com/chris1610/pbpython

Code, Notebooks and Examples from Practical Business Python

data-analysis data-visualization datascience pandas python scikit-learn

Last synced: 13 Oct 2024

https://github.com/elementary-data/elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

analytics-engineer bigquery data-analysis data-governance data-lineage data-observability data-pipeline data-pipelines data-reliability data-warehouse dataops dbt dbt-artifacts dbt-packages lineage redshift snowflake

Last synced: 18 Nov 2024

https://github.com/DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering hacktoberfest lineage llmops machine-learning mlops orchestration pandas python rag software-engineering

Last synced: 29 Oct 2024

https://github.com/h2oai/datatable

A Python package for manipulating 2-dimensional tabular data structures

data-analysis data-structure ftrl performance python

Last synced: 19 Nov 2024

https://github.com/bellingcat/octosuite

GitHub Data Analysis Framework.

data-analysis github

Last synced: 15 Oct 2024

https://github.com/visactor/vtable

VTable is not just a high-performance multidimensional data analysis table, but also a grid artist that creates art between rows and columns.

canvas-table data-analysis data-visualization database datagrid grid javascript-table javescript list-table list-tree online-excel pivot-chart pivot-grid pivot-tables react-table sparklines spreadsheet tree-table visualization vue-table

Last synced: 19 Nov 2024

https://github.com/apachecn/python_data_analysis_and_mining_action

Code notes for the book 《python数据分析与挖掘实战》 (Python Data Analysis and Mining in Action)

data-analysis data-science python3 readingnotes

Last synced: 14 Oct 2024

https://github.com/404notf0und/ai-for-security-learning

Security scenarios, AI-based security algorithms, and industry practice in security data analysis

data-analysis data-mining machine-learning security

Last synced: 15 Oct 2024

https://github.com/404notf0und/AI-for-Security-Learning

Security scenarios, AI-based security algorithms, and industry practice in security data analysis

data-analysis data-mining machine-learning security

Last synced: 11 Nov 2024

https://github.com/jadianes/spark-py-notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

big-data bigdata data-analysis data-science ipython ipython-notebook machine-learning mllib notebook pyspark python spark

Last synced: 11 Oct 2024

https://github.com/ecmadao/hacknical

Hacknical, hacker & technical. A website for GitHub users to build a better resume.

contribute-languages contributions data-analysis github github-analysis github-commits github-contributions reac react resume resume-template

Last synced: 13 Nov 2024

https://github.com/nubank/fklearn

fklearn: Functional Machine Learning

data-analysis data-science machine-learning ml python

Last synced: 14 Oct 2024

https://github.com/DataBrewery/cubes

[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis

cube data data-analysis data-warehouse multidimensional-analysis olap sql

Last synced: 29 Oct 2024

https://github.com/microsoft/responsible-ai-toolbox

Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

data-analysis data-science data-visualization error-analysis explainability explainable-ai explainable-ml fairness fairness-ai fairness-ml interpretability jupyter machine-learning machinelearning ml responsible-ai ui visualization widget widgets

Last synced: 19 Nov 2024

https://github.com/rilldata/rill-developer

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 21 Aug 2024

https://github.com/rilldata/rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 09 Oct 2024

https://github.com/patmartin/dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization

Last synced: 12 Nov 2024

https://github.com/PatMartin/Dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization

Last synced: 13 Nov 2024

https://github.com/datageartech/datagear

A data visualization and analysis platform; freely build any data dashboard you want

bi business-intelligence chart data-analysis data-analytics data-visualization echarts

Last synced: 14 Oct 2024

https://github.com/dagworks-inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering

Last synced: 19 Nov 2024

https://github.com/googlecloudplatform/data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning

Last synced: 07 Oct 2024

https://github.com/GoogleCloudPlatform/data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning

Last synced: 07 Aug 2024

https://github.com/singer-io/getting-started

This repository is a getting started guide to Singer.

data-analysis etl etl-framework python singer

Last synced: 14 Oct 2024

https://github.com/alan-turing-institute/CleverCSV

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3

Last synced: 29 Oct 2024

https://github.com/alan-turing-institute/clevercsv

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3

Last synced: 22 Oct 2024

https://github.com/intel/scikit-learn-intelex

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

ai-inference ai-machine-learning ai-training analytics big-data data-analysis gpu intel machine-learning machine-learning-algorithms oneapi python scikit-learn swrepo

Last synced: 18 Nov 2024

https://github.com/VisActor/VTable

VTable is not just a high-performance multidimensional data analysis table, but also a grid artist that creates art between rows and columns.

canvas-table data-analysis data-visualization database datagrid grid javascript-table javescript list-table list-tree online-excel pivot-chart pivot-grid pivot-tables sparklines spreadsheet table tree-chart tree-table visualization

Last synced: 17 Aug 2024

https://github.com/machow/siuba

Python library for using dplyr-like syntax with pandas and SQL

data-analysis dplyr pandas python sql

Last synced: 14 Oct 2024

https://github.com/man-group/ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 24 Oct 2024

https://github.com/man-group/arcticdb

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 15 Oct 2024

https://github.com/apachecn/pyda-2e-zh

📖 [Translation] Chinese translation of Python for Data Analysis, 2nd Edition (利用 Python 进行数据分析 · 第 2 版)

book data-analysis numpy pandas pyda python

Last synced: 29 Oct 2024

https://github.com/comet-ml/kangas

🦘 Explore multimedia datasets at scale

data-analysis data-exploration dataframe datagrid machine-learning

Last synced: 14 Oct 2024

https://github.com/d4t4x/data-selfie

Data Selfie - a browser extension to track yourself on Facebook and analyze your data.

chrome-extension data-analysis data-dashboard firefox-addon privacy

Last synced: 13 Nov 2024

https://github.com/xinglie/report-designer

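⚡ Print design, visualization, label printing, editor, designer, data analysis, report design, componentization, form design, H5 pages, questionnaires, PDF generation, flowcharts, exam papers, SVG, graphic elements, Internet of Things, label paper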

cloud-print data-analysis data-visualization editor h5-creator h5-editor h5-maker iot-demo layouts-and-renderings online-design online-printing printer snapshot visiual-editor xinglie

Last synced: 13 Oct 2024

https://github.com/GoogleCloudPlatform/DataflowJavaSDK

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

big-data data-analysis data-mining data-processing data-science google-cloud-dataflow

Last synced: 12 Nov 2024

https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks

A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book

data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials

Last synced: 12 Nov 2024

https://github.com/empathy87/the-elements-of-statistical-learning-python-notebooks

A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book

data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials

Last synced: 10 Oct 2024

https://github.com/Kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 07 Nov 2024

https://github.com/androz2091/discord-data-package-explorer

🌀 What's really in your Discord Data package?

data-analysis discord discord-data-package statistics

Last synced: 13 Nov 2024

https://github.com/Androz2091/discord-data-package-explorer

🌀 What's really in your Discord Data package?

data-analysis discord discord-data-package statistics

Last synced: 01 Nov 2024

https://github.com/visualpython/visualpython

GUI-based Python code generator for data science, extension to Jupyter Lab, Jupyter Notebook and Google Colab.

bigdata chrome-extension code-generator data-analysis jupyter-lab-extension jupyter-notebook-extension jupyterlab-extension pandas python visual-coding

Last synced: 19 Oct 2024

https://github.com/yoshoku/rumale

Rumale is a machine learning library in Ruby

artificial-intelligence data-analysis data-science machine-learning ml ruby rubyml

Last synced: 17 Nov 2024

https://github.com/chawlaavi/daily-dose-of-data-science

A collection of code snippets from the publication Daily Dose of Data Science on Substack: http://www.dailydoseofds.com/

data-analysis data-science data-science-tips data-visualization jupyter jupyter-notebook jupyter-tips matplotlib matplotlib-tips numpy pandas pandas-tips python python-tips sklearn

Last synced: 09 Oct 2024

https://github.com/JosephLai241/URS

Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

archiving command-line comments csv data-analysis data-science json livestream osint-tool praw pyo3 python reddit reddit-scraper redditor rust scraper subreddit trees wordcloud

Last synced: 28 Oct 2024

https://github.com/ipython-books/cookbook-2nd-code

Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]

computing data-analysis data-mining data-science data-visualization ipython jupyter jupyter-notebook machine-learning numerical-computation python visualization

Last synced: 13 Nov 2024

https://github.com/arvkevi/kneed

Knee point detection in Python 📈

data-analysis data-science elbow-method knee-point python scientific-computing systems

Last synced: 28 Oct 2024

https://github.com/program-spiritual/dataanalysisinaction

(Finished) Geek Time Data Analysis Practical 45 Lectures: detailed notes with Markdown, images, mind maps, code, and data. The notes can be read directly and the code has been tested.

data-analysis data-analytics in-action notebook-jupyter pipenv pyenv python python-data-analysis python-data-science python3

Last synced: 29 Oct 2024

https://github.com/mrankitgupta/data-analyst-roadmap

I am sharing my 66DaysofData journey into data analytics as a participant in Ken Jee's #66daysofdata challenge

ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau

Last synced: 12 Oct 2024

https://github.com/mpw0311/antd-umi-sys

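Enterprise BI system and data visualization platform. Main technologies: react, antd, umi, dva, es6, less, and others. Let's encourage and learn from each other; if you like it, please star ⭐.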

antd antd-umi-sys company-site d3js data-analysis data-visualization dva dvajs echarts echarts-for-react es6 gitdatav react react-redux react-router redux sankey umi umijs

Last synced: 13 Nov 2024

https://github.com/nicolaskruchten/jupyter_pivottablejs

Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js

data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables

Last synced: 10 Oct 2024

https://github.com/abixen/abixen-platform

Abixen Platform is a microservices-based software platform for building enterprise applications. Functionality is delivered by creating individual microservices and integrating them through the provided CMS.

analytics angularjs architecture aws business-intelligence businessintelligence charts cloud dashboard data-analysis data-analytics data-visualization low-code microservices netflixoss reporting spring-boot spring-cloud sql-editor visualization

Last synced: 11 Oct 2024

https://github.com/dmpe/r

Exercises (incl. analyses) with R language (math+statistics)

course data-analysis exercise r statistics

Last synced: 17 Nov 2024

https://github.com/ashishpatel26/amazing-feature-engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn

Last synced: 19 Nov 2024

https://github.com/anthonydb/practical-sql

Code and Data for the First Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2018).

data-analysis postgresql sql

Last synced: 15 Nov 2024

https://github.com/aloctavodia/bap

Bayesian Analysis with Python (Second Edition)

arviz bayesian-analysis data-analysis data-visualization errata pymc3 python

Last synced: 15 Nov 2024

https://github.com/elastic/eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

big-data data-analysis dataframe dataframes eland elasticsearch etl lightgbm machine-learning pandas python scikit-learn time-series-forecasting

Last synced: 07 Oct 2024

https://github.com/SciTools/iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data

data-analysis earth-science grib iris meteorology netcdf oceanography python spaceweather visualisation

Last synced: 06 Nov 2024