Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/404notf0und/ai-for-security-learning

Security scenarios, AI-based security algorithms, and industry practice in security data analysis

data-analysis data-mining machine-learning security

Last synced: 01 Aug 2024

https://github.com/ecmadao/hacknical

Hacknical, hacker & technical. A website for GitHub users to make a better resume.

contribute-languages contributions data-analysis github github-analysis github-commits github-contributions reac react resume resume-template

Last synced: 31 Jul 2024

https://github.com/nubank/fklearn

fklearn: Functional Machine Learning

data-analysis data-science machine-learning ml python

Last synced: 01 Aug 2024

https://github.com/DataBrewery/cubes

[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis

cube data data-analysis data-warehouse multidimensional-analysis olap sql

Last synced: 31 Jul 2024

https://github.com/rilldata/rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 31 Jul 2024

https://github.com/rilldata/rill-developer

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 21 Aug 2024

https://github.com/modelscope/data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llava llm llms multi-modal nlp opendata pre-training pytorch sora streamlit

Last synced: 01 Aug 2024

https://github.com/datageartech/datagear

A data visualization and analysis platform; freely build any data dashboard you want

bi business-intelligence chart data-analysis data-analytics data-visualization echarts

Last synced: 01 Aug 2024

https://github.com/PatMartin/Dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization

Last synced: 02 Aug 2024

https://github.com/DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering

Last synced: 31 Jul 2024

https://github.com/dagworks-inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering

Last synced: 01 Aug 2024

https://github.com/GoogleCloudPlatform/data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning

Last synced: 07 Aug 2024

https://github.com/singer-io/getting-started

This repository is a getting started guide to Singer.

data-analysis etl etl-framework python singer

Last synced: 31 Jul 2024

https://github.com/microsoft/responsible-ai-toolbox

Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

data-analysis data-science data-visualization error-analysis explainability explainable-ai explainable-ml fairness fairness-ai fairness-ml interpretability jupyter machine-learning machinelearning ml responsible-ai ui visualization widget widgets

Last synced: 01 Aug 2024

https://github.com/alan-turing-institute/CleverCSV

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3

Last synced: 31 Jul 2024

https://github.com/VisActor/VTable

VTable is not just a high-performance multidimensional data analysis table, but also a grid artist that creates art between rows and columns.

canvas-table data-analysis data-visualization database datagrid grid javascript-table javescript list-table list-tree online-excel pivot-chart pivot-grid pivot-tables sparklines spreadsheet table tree-chart tree-table visualization

Last synced: 17 Aug 2024

https://github.com/intel/scikit-learn-intelex

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

ai-inference ai-machine-learning ai-training analytics big-data data-analysis gpu intel machine-learning machine-learning-algorithms oneapi python scikit-learn swrepo

Last synced: 31 Jul 2024

https://github.com/machow/siuba

Python library for using dplyr-like syntax with pandas and SQL

data-analysis dplyr pandas python sql

Last synced: 01 Aug 2024

https://github.com/man-group/ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 30 Jul 2024

https://github.com/man-group/arcticdb

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 31 Jul 2024

https://github.com/apachecn/pyda-2e-zh

:book: [Chinese translation] Python for Data Analysis, 2nd Edition

book data-analysis numpy pandas pyda python

Last synced: 31 Jul 2024

https://github.com/comet-ml/kangas

🦘 Explore multimedia datasets at scale

data-analysis data-exploration dataframe datagrid machine-learning

Last synced: 01 Aug 2024

https://github.com/d4t4x/data-selfie

Data Selfie - a browser extension to track yourself on Facebook and analyze your data.

chrome-extension data-analysis data-dashboard firefox-addon privacy

Last synced: 31 Jul 2024

https://github.com/xinglie/report-designer

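⚡ Print design, visualization, label printing, editor, designer, data analysis, report design, componentization, form design, H5 pages, questionnaires, PDF generation, flowcharts, exam papers, SVG, graphic elements, IoT, label paper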

cloud-print data-analysis data-visualization editor h5-creator h5-editor h5-maker iot-demo layouts-and-renderings online-design online-printing printer snapshot visiual-editor xinglie

Last synced: 01 Aug 2024

https://github.com/GoogleCloudPlatform/DataflowJavaSDK

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

big-data data-analysis data-mining data-processing data-science google-cloud-dataflow

Last synced: 02 Aug 2024

https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks

A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book

data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials

Last synced: 02 Aug 2024

https://github.com/visualpython/visualpython

GUI-based Python code generator for data science, extension to Jupyter Lab, Jupyter Notebook and Google Colab.

bigdata chrome-extension code-generator data-analysis jupyter-lab-extension jupyter-notebook-extension jupyterlab-extension pandas python visual-coding

Last synced: 03 Aug 2024

https://github.com/yoshoku/rumale

Rumale is a machine learning library in Ruby

artificial-intelligence data-analysis data-science machine-learning ml ruby rubyml

Last synced: 30 Jul 2024

https://github.com/JosephLai241/URS

Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

archiving command-line comments csv data-analysis data-science json livestream osint-tool praw pyo3 python reddit reddit-scraper redditor rust scraper subreddit trees wordcloud

Last synced: 31 Jul 2024

https://github.com/ipython-books/cookbook-2nd-code

Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]

computing data-analysis data-mining data-science data-visualization ipython jupyter jupyter-notebook machine-learning numerical-computation python visualization

Last synced: 02 Aug 2024

https://github.com/Kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 01 Aug 2024

https://github.com/mpw0311/antd-umi-sys

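Enterprise BI system and data visualization platform. Main technologies: react, antd, umi, dva, es6, less, and others. Let's encourage each other and learn together; if you like it, please star ⭐.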

antd antd-umi-sys company-site d3js data-analysis data-visualization dva dvajs echarts echarts-for-react es6 gitdatav react react-redux react-router redux sankey umi umijs

Last synced: 03 Aug 2024

https://github.com/nicolaskruchten/jupyter_pivottablejs

Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js

data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables

Last synced: 01 Aug 2024

https://github.com/arvkevi/kneed

Knee point detection in Python :chart_with_upwards_trend:

data-analysis data-science elbow-method knee-point python scientific-computing systems

Last synced: 31 Jul 2024

https://github.com/anthonydb/practical-sql

Code and Data for the First Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2018).

data-analysis postgresql sql

Last synced: 13 Aug 2024

https://github.com/elastic/eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

big-data data-analysis dataframe dataframes eland elasticsearch etl lightgbm machine-learning pandas python scikit-learn time-series-forecasting

Last synced: 02 Aug 2024

https://github.com/dmpe/R

Exercises (incl. analyses) with R language (math+statistics)

course data-analysis exercise r statistics

Last synced: 05 Aug 2024

https://github.com/planetlabs/notebooks

Interactive notebooks from Planet Engineering

api data-analysis jupyter-notebooks python remote-sensing satellite-imagery

Last synced: 01 Aug 2024

https://github.com/SciTools/iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data

data-analysis earth-science grib iris meteorology netcdf oceanography python spaceweather visualisation

Last synced: 01 Aug 2024

https://github.com/JacksonWuxs/DaPy

Easy-to-use data analysis / manipulation framework for humans

analysis data-analysis data-science efficiency pypi python statistical-reports

Last synced: 31 Jul 2024

https://github.com/LearnDataSci/articles

A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci

data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python

Last synced: 01 Aug 2024

https://github.com/starpig1129/ai-data-analysis-mulitagent

AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, data analysis, visualization, and report writing. Perfect for researchers and data scientists seeking to enhance their workflow and productivity.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 17 Sep 2024

https://github.com/binpash/pash

PaSh: Light-touch Data-Parallel Shell Processing

bash bash-scripting data-analysis parallelism pash posix-sh shell

Last synced: 01 Aug 2024

https://github.com/dfm/corner.py

Make some beautiful corner plots

data-analysis data-visualization plotting python

Last synced: 07 Aug 2024

https://github.com/akanz1/klib

Easy-to-use Python library of customized functions for cleaning and analyzing data.

data-analysis data-cleaning data-preprocessing data-science data-visualization feature-selection klib python

Last synced: 03 Aug 2024

https://github.com/shaohua0116/ICLR2020-OpenReviewData

Script that crawls metadata from the ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.

conference crawler data-analysis iclr iclr2020 machine-learning visualization

Last synced: 07 Aug 2024

https://github.com/anthonydb/practical-sql-2

Code and Data for the Second Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2022).

data-analysis postgresql sql

Last synced: 02 Aug 2024

https://github.com/openbiox/awosome-bioinformatics

A curated list of resources for learning bioinformatics.

bioinformatics data-analysis next-generation-sequencing

Last synced: 02 Aug 2024

https://github.com/xiaopujun/light-chaser

light chaser is a lightweight data visualization designer tool

blueprints data-analysis data-visualization draggable javascript typescript web-editor

Last synced: 04 Aug 2024

https://github.com/SuperCowPowers/zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

bro data-analysis kafka networking pandas python scikit-learn security spark zeek zeek-analysis

Last synced: 07 Aug 2024

https://github.com/pgalko/BambooAI

A lightweight library that leverages Language Models (LLMs) to enable natural language interactions, allowing you to source and converse with data.

ai ai-agents data-analysis data-science gemini groq llm mistral ollama openai-api pandas pinecone python vector-database

Last synced: 31 Jul 2024

https://github.com/rio-labs/rio

WebApps in pure Python. No JavaScript, HTML and CSS needed

data-analysis data-science data-visualization deep-learning machine-learning python ui webapp

Last synced: 01 Aug 2024

https://github.com/kunalj101/Data-Science-Hacks

Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and so on.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 02 Aug 2024

https://github.com/greppo-io/greppo

Build & deploy geospatial applications quickly and easily.

data-analysis data-visualization developer-tools framework geospatial machine-learning python webapp

Last synced: 01 Aug 2024

https://github.com/briefercloud/briefer

Dashboards and notebooks in a single place. Create powerful and flexible dashboards using code, or build beautiful Notion-like notebooks and share them with your team.

analytics bi bigquery briefer business-intelligence businessintelligence dashboard data-analysis data-visualization jupyter notebook postgres postgresql reporting visualization

Last synced: 11 Sep 2024

https://github.com/jkrumbiegel/Chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 04 Aug 2024

https://github.com/Niketkumardheeryan/ML-CaPsule

ML-capsule is a project for beginners and experienced data science enthusiasts who don't have a mentor or guidance and wish to learn machine learning. Using this repo, they can learn ML, DL, and many related technologies through real-world projects and become interview-ready.

analytics data-analysis data-science data-visualization datascience deep-learning deep-neural-networks deployment flask heroku-deployment machine-learning python r statistics streamlit-webapp

Last synced: 02 Aug 2024

https://github.com/Litlyx/litlyx

Analytics for developers. Setup Analytics in 30 seconds with just one line of code. Display all your data on an AI-powered dashboard. Fully self-hostable and GDPR compliant.

ai analytics angular charts data data-analysis data-visualization hacktoberfest javascript metrics nextjs nodejs nuxt open-source react statistics typescript vue website

Last synced: 29 Aug 2024

https://github.com/olavolav/uniplot

Lightweight plotting to the terminal. 4x resolution via Unicode.

data-analysis data-science plot python

Last synced: 31 Jul 2024

https://github.com/d4software/QueryTree

Data reporting and visualization for your app

analytics data-analysis data-visualization database report-builder

Last synced: 08 Aug 2024

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 09 Aug 2024

https://github.com/MouseLand/suite2p

Cell detection in calcium imaging recordings

data-analysis imaging neuroscience

Last synced: 03 Aug 2024