Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/scikit-learn/scikit-learn

scikit-learn: machine learning in Python

data-analysis data-science machine-learning python statistics

Last synced: 28 Oct 2024

https://github.com/pydata/pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

alignment data-analysis data-science flexible pandas python

Last synced: 09 Aug 2024

https://github.com/pandas-dev/pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

alignment data-analysis data-science flexible pandas python

Last synced: 28 Oct 2024

https://github.com/metabase/metabase

The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋

analytics bi business-intelligence businessintelligence clojure dashboard data data-analysis data-visualization database metabase mysql postgres postgresql reporting slack sql-editor visualization

Last synced: 28 Oct 2024

https://github.com/gradio-app/gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

data-analysis data-science data-visualization deep-learning deploy gradio gradio-interface hacktoberfest interface machine-learning models python python-notebook ui ui-components

Last synced: 29 Oct 2024

https://github.com/microsoft/Data-Science-For-Beginners

10 Weeks, 20 Lessons, Data Science for All!

data-analysis data-science data-visualization pandas python

Last synced: 30 Oct 2024

https://github.com/microsoft/data-science-for-beginners

10 Weeks, 20 Lessons, Data Science for All!

data-analysis data-science data-visualization pandas python

Last synced: 28 Oct 2024

https://github.com/gchq/CyberChef

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

compression data-analysis data-manipulation encoding encryption hashing parsing

Last synced: 25 Oct 2024

https://github.com/gchq/cyberchef

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

compression data-analysis data-manipulation encoding encryption hashing parsing

Last synced: 28 Oct 2024

https://gchq.github.io/CyberChef/

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

compression data-analysis data-manipulation encoding encryption hashing parsing

Last synced: 27 Oct 2024

https://github.com/allinurl/goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

analytics apache c caddy cli command-line dashboard data-analysis gdpr goaccess google-analytics monitoring ncurses nginx privacy real-time terminal tui web-analytics webserver

Last synced: 29 Oct 2024

https://github.com/dataease/dataease

An open source data visualization and analysis tool that anyone can use.

apache-doris business-intelligence data-analysis data-visualization echarts kettle superset tableau

Last synced: 09 Oct 2024

https://github.com/airbytehq/airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

bigquery change-data-capture data data-analysis data-collection data-engineering data-integration data-pipeline elt etl java mssql mysql pipeline postgresql python redshift s3 self-hosted snowflake

Last synced: 28 Oct 2024

https://github.com/Sinaptik-AI/pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql

Last synced: 29 Oct 2024

https://github.com/sinaptik-ai/pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql

Last synced: 14 Oct 2024

https://github.com/gventuri/pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql

Last synced: 03 Aug 2024

https://github.com/openrefine/openrefine

OpenRefine is a free, open source power tool for working with messy data and improving it

data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata

Last synced: 28 Oct 2024

https://github.com/OpenRefine/OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it

data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata

Last synced: 27 Oct 2024

https://github.com/guipsamora/pandas_exercises

Practice your pandas skills!

data-analysis exercise pandas practice tutorial

Last synced: 29 Oct 2024

https://github.com/kanaries/pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

data-analysis data-exploration dataframe matplotlib pandas plotly tableau tableau-alternative visualization

Last synced: 28 Oct 2024

https://github.com/Kanaries/pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

data-analysis data-exploration dataframe matplotlib pandas plotly tableau tableau-alternative visualization

Last synced: 29 Oct 2024

https://github.com/tangyudi/ai-learn

An artificial intelligence learning roadmap that compiles nearly 200 hands-on case studies and projects, with free accompanying course materials, taking learners from zero background to job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, data mining, deep learning, computer vision, natural language processing, algorithms, PyTorch, TensorFlow, Caffe, Keras, NumPy, pandas, Matplotlib, and seaborn.

algorithm artificial-intelligence caffe cv data-analysis data-mining data-science deep-learning keras machine-learning mathematics matplotlib nlp numpy pandas python pytorch seaborn tensorflow tensorflow2

Last synced: 29 Oct 2024

https://github.com/jindaxiang/akshare

AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open source financial data interface library.

academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock

Last synced: 03 Aug 2024

https://github.com/akfamily/akshare

AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open source financial data interface library.

academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock

Last synced: 28 Oct 2024

https://github.com/gonum/gonum

Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more

data-analysis go golang graph matrix scientific-computing statistics

Last synced: 28 Oct 2024

https://github.com/Alluxio/alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

alluxio data-analysis data-orchestration hadoop memory-speed presto spark tensorflow virtual-distributed-filesystem

Last synced: 29 Oct 2024

https://github.com/alluxio/alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

alluxio data-analysis data-orchestration hadoop memory-speed presto spark tensorflow virtual-distributed-filesystem

Last synced: 28 Oct 2024

https://github.com/scikit-learn-contrib/imbalanced-learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

data-analysis data-science machine-learning python statistics

Last synced: 28 Oct 2024

https://github.com/jeecgboot/JimuReport

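"Visual reporting: an open source alternative to DataV and FineReport." JimuReport is a reporting tool with an Excel-like interface and online drag-and-drop design. Features include report design, print design, chart reports, dashboard portal design, and big-screen design, all completely free. Guided by a product philosophy of "simple, easy to use, professional", it greatly reduces the difficulty of report development, shortens development cycles, and solves a wide range of reporting problems.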

bi bigscreen birt data-analysis data-visualization dataease datav echart echarts finereport highcharts ireport jasperreport jfreechart metabase print redash report superset tableau

Last synced: 30 Oct 2024

https://github.com/rhiever/data-analysis-and-machine-learning-projects

Repository of teaching materials, code, and data for my data analysis and machine learning projects.

data-analysis data-science evolutionary-algorithm ipython-notebook machine-learning python

Last synced: 11 Oct 2024

https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects

Repository of teaching materials, code, and data for my data analysis and machine learning projects.

data-analysis data-science evolutionary-algorithm ipython-notebook machine-learning python

Last synced: 27 Oct 2024

https://github.com/airbnb/knowledge-repo

A next-generation curated knowledge sharing platform for data scientists and other technical professions.

data data-analysis data-science knowledge

Last synced: 29 Oct 2024

https://github.com/microsoft/TaskWeaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.

agent ai-agents code-interpreter copilot data-analysis llm openai

Last synced: 29 Oct 2024

https://github.com/flyteorg/flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

data data-analysis data-science dataops declarative fine-tuning flyte golang grpc kubernetes kubernetes-operator llm machine-learning mlops orchestration-engine production production-grade python scale workflow

Last synced: 28 Oct 2024

https://github.com/cube2222/octosql

OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

cli csv data-analysis go golang json mysql nosql postgresql query query-engine redis sql

Last synced: 29 Oct 2024

https://github.com/SpiderClub/weibospider

⚡ A distributed crawler for Weibo, built with Celery and Requests.

data-analysis distributed-crawler python3 sina weibo weibospider

Last synced: 29 Oct 2024

https://github.com/spiderclub/weibospider

⚡ A distributed crawler for Weibo, built with Celery and Requests.

data-analysis distributed-crawler python3 sina weibo weibospider

Last synced: 15 Oct 2024

https://github.com/javascriptdata/danfojs

Danfo.js is an open source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.

danfojs data-analysis data-analytics data-manipulation data-science dataframe javascript pandas plotting-charts stream-data stream-processing table tensorflow tensors

Last synced: 13 Oct 2024

https://github.com/microsoft/taskweaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.

agent ai-agents code-interpreter copilot data-analysis llm openai

Last synced: 09 Oct 2024

https://github.com/Nyandwi/machine_learning_complete

A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.

computer-vision data-analysis data-science data-visualization datascience deep-learning keras machine-learning matplotlib neural-networks nlp numpy open-source pandas python scikit-learn seaborn tensorflow

Last synced: 05 Nov 2024

https://github.com/nyandwi/machine_learning_complete

A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.

computer-vision data-analysis data-science data-visualization datascience deep-learning keras machine-learning matplotlib neural-networks nlp numpy open-source pandas python scikit-learn seaborn tensorflow

Last synced: 10 Oct 2024

https://github.com/BrambleXu/pydata-notebook

Python for Data Analysis, 2nd edition (2017): Chinese translation notes

chinese-translation data-analysis jupyter-notebook pandas python-for-data-analysis

Last synced: 31 Oct 2024

https://github.com/bramblexu/pydata-notebook

Python for Data Analysis, 2nd edition (2017): Chinese translation notes

chinese-translation data-analysis jupyter-notebook pandas python-for-data-analysis

Last synced: 15 Oct 2024

https://github.com/whoiskatrin/sql-translator

SQL Translator is a tool for converting natural language queries into SQL code using artificial intelligence. This project is 100% free and open source.

data-analysis data-engineering dataquery datascience dataset openai postgresql query sql

Last synced: 15 Oct 2024

https://github.com/residentmario/missingno

Missing data visualization module for Python.

data-analysis data-visualization missing-data pandas python

Last synced: 29 Oct 2024

https://github.com/ResidentMario/missingno

Missing data visualization module for Python.

data-analysis data-visualization missing-data pandas python

Last synced: 26 Oct 2024

https://github.com/lancedb/lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.

apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust

Last synced: 29 Oct 2024

https://github.com/has2k1/plotnine

A Grammar of Graphics for Python

data-analysis grammar graphics plotting python

Last synced: 29 Oct 2024

https://github.com/briefercloud/briefer

Dashboards and notebooks in a single place. Create powerful and flexible dashboards using code, or build beautiful Notion-like notebooks and share them with your team.

analytics bi bigquery briefer business-intelligence businessintelligence dashboard data-analysis data-visualization jupyter notebook postgres postgresql reporting visualization

Last synced: 29 Oct 2024

https://github.com/eto-ai/lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.

apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust

Last synced: 02 Aug 2024

https://github.com/antonycourtney/tad

A desktop application for viewing and analyzing tabular data

csv data-analysis data-science database desktop-application duckdb parquet-viewer pivot-tables pivots tabular-data

Last synced: 28 Oct 2024

https://github.com/aksnzhy/xlearn

High-performance, easy-to-use, and scalable machine learning (ML) package, including linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and CLI interfaces.

data-analysis data-science factorization-machines ffm fm machine-learning statistics

Last synced: 15 Oct 2024

https://github.com/pydata/pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.

data data-analysis dataset econdb economic-data fama-french finance financial-data fred html pandas pydata python stock-data

Last synced: 14 Oct 2024

https://github.com/quadratichq/quadratic

Quadratic | Data Science Spreadsheet with Python & SQL

data data-analysis data-engineering data-science etl python quadratic spreadsheet sql wasm webgl

Last synced: 15 Oct 2024

https://github.com/jm199504/Financial-Knowledge-Graphs

A workflow for building a small financial knowledge graph (neo4j / python / cypher / KG)

cypher data-analysis graph-database neo4j python

Last synced: 30 Oct 2024

https://github.com/jm199504/financial-knowledge-graphs

A workflow for building a small financial knowledge graph (neo4j / python / cypher / KG)

cypher data-analysis graph-database neo4j python

Last synced: 10 Oct 2024

https://github.com/justinzm/gopup

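Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; "Qianlima" (fast-growing) and unicorn companies; Xinwen Lianbo news transcripts; film and TV box office data; university lists; epidemic data…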

covid19-data data data-analysis data-science datasets economic-data gopup index-data python

Last synced: 15 Oct 2024

https://github.com/ajcr/100-pandas-puzzles

100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)

data-analysis numpy pandas python

Last synced: 15 Oct 2024

https://github.com/root-project/root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically

c-plus-plus cling data-analysis geometry graphics hacktoberfest interpreter machine-learning mathematics parallel physics python root root-cern statistics visualization

Last synced: 29 Oct 2024

https://github.com/apache/incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

dashboard-friendly data data-analysis data-engineering data-integration data-transfers devops domain-layer dora etl golang hacktoberfest integration jira open-source user-friendly

Last synced: 01 Nov 2024

https://github.com/AAChartModel/AAChartKit-Swift

📈📊📱💻🖥️ An elegant modern declarative data visualization chart framework for iOS, iPadOS and macOS. Extremely powerful, supports line, spline, area, areaspline, column, bar, pie, scatter, angular gauges, arearange, areasplinerange, columnrange, bubble, box plot, error bars, funnel, waterfall and polar chart types. An exquisite and powerful modern declarative data visualization chart framework supporting dozens of infographic chart types, including column, bar, line, spline, area, areaspline, bubble, pie, donut, scatter, radar, and mixed charts, fully meeting everyday work needs.

animation area-chart bar-chart bubble-chart chart column-chart data-analysis data-visualization draw graph graphics ios line-chart model objective-c pie plot spline-chart swift view

Last synced: 04 Aug 2024

https://github.com/modelscope/data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷 Providing higher-quality, richer, and more "digestible" data for large models!

chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llava llm llms multi-modal nlp opendata pre-training pytorch sora streamlit

Last synced: 13 Oct 2024