Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mito-ds/mito

The mitosheet package, trymito.io, and other public Mito code.

data data-analysis data-science data-visualization jupyter pandas python streamlit-component

Last synced: 11 Oct 2024

https://github.com/weijie-chen/linear-algebra-with-python

Lecture Notes for Linear Algebra Featuring Python. This series of lecture notes will walk you through all the must-know concepts that set the foundation of data science or advanced quantitative skillsets. Suitable for statistician/econometrician, quantitative analysts, data scientists and etc. to quickly refresh the linear algebra with the assistance of Python computation and visualization.

computational-science data-analysis data-science data-visualization diagonalization eigenvalues eigenvectors gram-schmidt jupyter linear-algebra linear-transformations mathematics matrix matrix-calculations multivariate-normal-distribution null-space python singular-value-decomposition symmetric-matrices vector-space

Last synced: 11 Oct 2024

https://github.com/justmarkham/pandas-videos

Jupyter notebook and datasets from the pandas video series

data-analysis data-cleaning data-science jupyter-notebook pandas python tutorial

Last synced: 15 Oct 2024

https://github.com/lana-k/sqliteviz

Instant offline SQL-powered data visualisation in your browser

charting csv data-analysis pivot pivot-table plotly plotting sql sqlite visualization

Last synced: 15 Oct 2024

https://github.com/chris1610/pbpython

Code, Notebooks and Examples from Practical Business Python

data-analysis data-visualization datascience pandas python scikit-learn

Last synced: 13 Oct 2024

https://github.com/DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and metadata. Runs and scales everywhere python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering hacktoberfest lineage llmops machine-learning mlops orchestration pandas python rag software-engineering

Last synced: 29 Oct 2024

https://github.com/h2oai/datatable

A Python package for manipulating 2-dimensional tabular data structures

data-analysis data-structure ftrl performance python

Last synced: 11 Oct 2024

https://github.com/bellingcat/octosuite

GitHub Data Analysis Framework.

data-analysis github

Last synced: 15 Oct 2024

https://github.com/elementary-data/elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

analytics-engineer bigquery data-analysis data-governance data-lineage data-observability data-pipeline data-pipelines data-reliability data-warehouse dataops dbt dbt-artifacts dbt-packages lineage redshift snowflake

Last synced: 15 Oct 2024

https://github.com/apachecn/python_data_analysis_and_mining_action

《python数据分析与挖掘实战》的代码笔记

data-analysis data-science python3 readingnotes

Last synced: 14 Oct 2024

https://github.com/404notf0und/ai-for-security-learning

安全场景、基于AI的安全算法和安全数据分析业界实践

data-analysis data-mining machine-learning security

Last synced: 15 Oct 2024

https://github.com/404notf0und/AI-for-Security-Learning

安全场景、基于AI的安全算法和安全数据分析业界实践

data-analysis data-mining machine-learning security

Last synced: 02 Aug 2024

https://github.com/jadianes/spark-py-notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

big-data bigdata data-analysis data-science ipython ipython-notebook machine-learning mllib notebook pyspark python spark

Last synced: 11 Oct 2024

https://github.com/ecmadao/hacknical

Hacknical, hacker & technical. A website for GitHub user to make a better resume.

contribute-languages contributions data-analysis github github-analysis github-commits github-contributions reac react resume resume-template

Last synced: 30 Oct 2024

https://github.com/nubank/fklearn

fklearn: Functional Machine Learning

data-analysis data-science machine-learning ml python

Last synced: 14 Oct 2024

https://github.com/DataBrewery/cubes

[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis

cube data data-analysis data-warehouse multidimensional-analysis olap sql

Last synced: 29 Oct 2024

https://github.com/rilldata/rill-developer

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 21 Aug 2024

https://github.com/rilldata/rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 09 Oct 2024

https://github.com/patmartin/dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization

Last synced: 29 Oct 2024

https://github.com/datageartech/datagear

数据可视化分析平台,自由制作任何您想要的数据看板

bi business-intelligence chart data-analysis data-analytics data-visualization echarts

Last synced: 14 Oct 2024

https://github.com/PatMartin/Dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization

Last synced: 02 Aug 2024

https://github.com/dagworks-inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage and metadata. Runs and scales everywhere python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering

Last synced: 11 Oct 2024

https://github.com/googlecloudplatform/data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning

Last synced: 07 Oct 2024

https://github.com/GoogleCloudPlatform/data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning

Last synced: 07 Aug 2024

https://github.com/singer-io/getting-started

This repository is a getting started guide to Singer.

data-analysis etl etl-framework python singer

Last synced: 14 Oct 2024

https://github.com/alan-turing-institute/CleverCSV

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3

Last synced: 29 Oct 2024

https://github.com/microsoft/responsible-ai-toolbox

Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

data-analysis data-science data-visualization error-analysis explainability explainable-ai explainable-ml fairness fairness-ai fairness-ml interpretability jupyter machine-learning machinelearning ml responsible-ai ui visualization widget widgets

Last synced: 11 Oct 2024

https://github.com/alan-turing-institute/clevercsv

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3

Last synced: 22 Oct 2024

https://github.com/VisActor/VTable

VTable is not just a high-performance multidimensional data analysis table, but also a grid artist that creates art between rows and columns.

canvas-table data-analysis data-visualization database datagrid grid javascript-table javescript list-table list-tree online-excel pivot-chart pivot-grid pivot-tables sparklines spreadsheet table tree-chart tree-table visualization

Last synced: 17 Aug 2024

https://github.com/visactor/vtable

VTable is not just a high-performance multidimensional data analysis table, but also a grid artist that creates art between rows and columns.

canvas-table data-analysis data-visualization database datagrid grid javascript-table javescript list-table list-tree online-excel pivot-chart pivot-grid pivot-tables sparklines spreadsheet table tree-chart tree-table visualization

Last synced: 15 Oct 2024

https://github.com/intel/scikit-learn-intelex

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

ai-inference ai-machine-learning ai-training analytics big-data data-analysis gpu intel machine-learning machine-learning-algorithms oneapi python scikit-learn swrepo

Last synced: 13 Oct 2024

https://github.com/machow/siuba

Python library for using dplyr like syntax with pandas and SQL

data-analysis dplyr pandas python sql

Last synced: 14 Oct 2024

https://github.com/man-group/ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 24 Oct 2024

https://github.com/man-group/arcticdb

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 15 Oct 2024

https://github.com/apachecn/pyda-2e-zh

:book: [译] 利用 Python 进行数据分析 · 第 2 版

book data-analysis numpy pandas pyda python

Last synced: 29 Oct 2024

https://github.com/comet-ml/kangas

🦘 Explore multimedia datasets at scale

data-analysis data-exploration dataframe datagrid machine-learning

Last synced: 14 Oct 2024

https://github.com/d4t4x/data-selfie

Data Selfie - a browser extension to track yourself on Facebook and analyze your data.

chrome-extension data-analysis data-dashboard firefox-addon privacy

Last synced: 30 Oct 2024

https://github.com/xinglie/report-designer

⚡打印设计、可视化、标签打印、编辑器、设计器、数据分析、报表设计、组件化、表单设计、h5页面、调查问卷、pdf生成、流程图、试卷、SVG、图形元素、物联网、标签纸

cloud-print data-analysis data-visualization editor h5-creator h5-editor h5-maker iot-demo layouts-and-renderings online-design online-printing printer snapshot visiual-editor xinglie

Last synced: 13 Oct 2024

https://github.com/GoogleCloudPlatform/DataflowJavaSDK

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

big-data data-analysis data-mining data-processing data-science google-cloud-dataflow

Last synced: 02 Aug 2024

https://github.com/empathy87/the-elements-of-statistical-learning-python-notebooks

A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book

data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials

Last synced: 10 Oct 2024

https://github.com/Kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 07 Nov 2024

https://github.com/androz2091/discord-data-package-explorer

🌀 What's really in your Discord Data package?

data-analysis discord discord-data-package statistics

Last synced: 30 Oct 2024

https://github.com/Androz2091/discord-data-package-explorer

🌀 What's really in your Discord Data package?

data-analysis discord discord-data-package statistics

Last synced: 01 Nov 2024

https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks

A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book

data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials

Last synced: 02 Aug 2024

https://github.com/visualpython/visualpython

GUI-based Python code generator for data science, extension to Jupyter Lab, Jupyter Notebook and Google Colab.

bigdata chrome-extension code-generator data-analysis jupyter-lab-extension jupyter-notebook-extension jupyterlab-extension pandas python visual-coding

Last synced: 19 Oct 2024

https://github.com/yoshoku/rumale

Rumale is a machine learning library in Ruby

artificial-intelligence data-analysis data-science machine-learning ml ruby rubyml

Last synced: 26 Oct 2024

https://github.com/chawlaavi/daily-dose-of-data-science

A collection of code snippets from the publication Daily Dose of Data Science on Substack: http://www.dailydoseofds.com/

data-analysis data-science data-science-tips data-visualization jupyter jupyter-notebook jupyter-tips matplotlib matplotlib-tips numpy pandas pandas-tips python python-tips sklearn

Last synced: 09 Oct 2024

https://github.com/JosephLai241/URS

Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

archiving command-line comments csv data-analysis data-science json livestream osint-tool praw pyo3 python reddit reddit-scraper redditor rust scraper subreddit trees wordcloud

Last synced: 28 Oct 2024

https://github.com/ipython-books/cookbook-2nd-code

Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]

computing data-analysis data-mining data-science data-visualization ipython jupyter jupyter-notebook machine-learning numerical-computation python visualization

Last synced: 30 Oct 2024

https://github.com/arvkevi/kneed

Knee point detection in Python :chart_with_upwards_trend:

data-analysis data-science elbow-method knee-point python scientific-computing systems

Last synced: 28 Oct 2024

https://github.com/program-spiritual/dataanalysisinaction

(Finished) Geek Time Data Analysis Practical 45 Lecture - Detailed notes containing markdown images mind map code data can be read directly code test

data-analysis data-analytics in-action notebook-jupyter pipenv pyenv python python-data-analysis python-data-science python3

Last synced: 29 Oct 2024

https://github.com/mrankitgupta/data-analyst-roadmap

I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge

ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau

Last synced: 12 Oct 2024

https://github.com/mpw0311/antd-umi-sys

企业BI系统,数据可视化平台,主要技术:react、antd、umi、dva、es6、less等,与君共勉,互相学习,如果喜欢请start ⭐。

antd antd-umi-sys company-site d3js data-analysis data-visualization dva dvajs echarts echarts-for-react es6 gitdatav react react-redux react-router redux sankey umi umijs

Last synced: 30 Oct 2024

https://github.com/nicolaskruchten/jupyter_pivottablejs

Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js

data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables

Last synced: 10 Oct 2024

https://github.com/abixen/abixen-platform

Abixen Platform is a microservices based software platform for building enterprise applications delivering functionalities through creating particular microservices and integrating by provided CMS.

analytics angularjs architecture aws business-intelligence businessintelligence charts cloud dashboard data-analysis data-analytics data-visualization low-code microservices netflixoss reporting spring-boot spring-cloud sql-editor visualization

Last synced: 11 Oct 2024

https://github.com/anthonydb/practical-sql

Code and Data for the First Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2018).

data-analysis postgresql sql

Last synced: 08 Nov 2024

https://github.com/aloctavodia/bap

Bayesian Analysis with Python (Second Edition)

arviz bayesian-analysis data-analysis data-visualization errata pymc3 python

Last synced: 01 Nov 2024

https://github.com/elastic/eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

big-data data-analysis dataframe dataframes eland elasticsearch etl lightgbm machine-learning pandas python scikit-learn time-series-forecasting

Last synced: 07 Oct 2024

https://github.com/SciTools/iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data

data-analysis earth-science grib iris meteorology netcdf oceanography python spaceweather visualisation

Last synced: 06 Nov 2024

https://github.com/planetlabs/notebooks

interactive notebooks from Planet Engineering

api data-analysis jupyter-notebooks python remote-sensing satellite-imagery

Last synced: 06 Nov 2024

https://github.com/dmpe/R

Exercises (incl. analyses) with R language (math+statistics)

course data-analysis exercise r statistics

Last synced: 05 Aug 2024