Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/litlyx/litlyx

All-in-one Analytics Solution. Setup in 30 seconds. Display all your data on an AI-powered dashboard. Fully self-hostable and GDPR compliant.

ai analytics angular charts data data-analysis data-visualization javascript metrics nextjs nodejs nuxt open-source react statistics typescript vue website

Last synced: 31 Oct 2024

https://github.com/JacksonWuxs/DaPy

Easy-to-use data analysis / manipulation framework for humans

analysis data-analysis data-science efficiency pypi python statistical-reports

Last synced: 31 Oct 2024

https://github.com/specterops/nemesis

An offensive data enrichment pipeline

data-analysis offensive

Last synced: 07 Nov 2024

https://github.com/LearnDataSci/articles

A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci

data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python

Last synced: 07 Nov 2024

https://github.com/starpig1129/ai-data-analysis-mulitagent

AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, data analysis, visualization, and report writing. Perfect for researchers and data scientists seeking to enhance their workflow and productivity.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 17 Sep 2024

https://github.com/binpash/pash

PaSh: Light-touch Data-Parallel Shell Processing

bash bash-scripting data-analysis parallelism pash posix-sh shell

Last synced: 13 Oct 2024

https://github.com/dfm/corner.py

Make some beautiful corner plots

data-analysis data-visualization plotting python

Last synced: 15 Oct 2024

https://github.com/ashishpatel26/Amazing-Feature-Engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn

Last synced: 07 Nov 2024

https://github.com/akanz1/klib

Easy to use Python library of customized functions for cleaning and analyzing data.

data-analysis data-cleaning data-preprocessing data-science data-visualization feature-selection klib python

Last synced: 03 Aug 2024

https://github.com/shaohua0116/ICLR2020-OpenReviewData

Script that crawls meta data from ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.

conference crawler data-analysis iclr iclr2020 machine-learning visualization

Last synced: 07 Aug 2024

https://github.com/anthonydb/practical-sql-2

Code and Data for the Second Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2022).

data-analysis postgresql sql

Last synced: 08 Nov 2024

https://github.com/pgalko/BambooAI

A lightweight library that leverages Language Models (LLMs) to enable natural language interactions, allowing you to source and converse with data.

ai ai-agents data-analysis data-science gemini groq llm mistral ollama openai-api pandas pinecone python vector-database

Last synced: 28 Oct 2024

https://github.com/pgalko/bambooai

A lightweight library that leverages Language Models (LLMs) to enable natural language interactions, allowing you to source and converse with data.

ai ai-agents data-analysis data-science gemini groq llm mistral ollama openai-api pandas pinecone python vector-database

Last synced: 10 Oct 2024

https://github.com/openbiox/awosome-bioinformatics

A curated list of resources for learning bioinformatics.

bioinformatics data-analysis next-generation-sequencing

Last synced: 02 Aug 2024

https://github.com/xiaopujun/light-chaser

light chaser is a lightweight data visualization designer tool

blueprints data-analysis data-visualization draggable javascript typescript web-editor

Last synced: 04 Aug 2024

https://github.com/SuperCowPowers/zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

bro data-analysis kafka networking pandas python scikit-learn security spark zeek zeek-analysis

Last synced: 07 Aug 2024

https://github.com/rio-labs/rio

WebApps in pure Python. No JavaScript, HTML and CSS needed

data-analysis data-science data-visualization deep-learning machine-learning python ui webapp

Last synced: 06 Nov 2024

https://github.com/kunalj101/data-science-hacks

Data Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 11 Oct 2024

https://github.com/kunalj101/Data-Science-Hacks

Data Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 02 Aug 2024

https://github.com/greppo-io/greppo

Build & deploy geospatial applications quick and easy.

data-analysis data-visualization developer-tools framework geospatial machine-learning python webapp

Last synced: 09 Nov 2024

https://github.com/cloudberrydb/cloudberrydb

Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.

ai cloudberrydb data-analysis data-warehouse database database-management gpdb greenplum greenplum-database mpp olap postgres postgresql postgresql-database sql

Last synced: 27 Oct 2024

https://github.com/jkrumbiegel/chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 30 Oct 2024

https://github.com/jkrumbiegel/Chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 04 Aug 2024

https://github.com/olavolav/uniplot

Lightweight plotting to the terminal. 4x resolution via Unicode.

data-analysis data-science plot python

Last synced: 31 Oct 2024

https://github.com/Niketkumardheeryan/ML-CaPsule

ML-capsule is a Project for beginners and experienced data science Enthusiasts who don't have a mentor or guidance and wish to learn Machine learning. Using our repo they can learn ML, DL, and many related technologies with different real-world projects and become Interview ready.

analytics data-analysis data-science data-visualization datascience deep-learning deep-neural-networks deployment flask heroku-deployment machine-learning python r statistics streamlit-webapp

Last synced: 02 Aug 2024

https://github.com/mouseland/suite2p

cell detection in calcium imaging recordings

data-analysis imaging neuroscience

Last synced: 06 Nov 2024

https://github.com/Litlyx/litlyx

Analytics for developers. Setup Analytics in 30 seconds with just one line of code. Display all your data on an AI-powered dashboard. Fully self-hostable and GDPR compliant.

ai analytics angular charts data data-analysis data-visualization hacktoberfest javascript metrics nextjs nodejs nuxt open-source react statistics typescript vue website

Last synced: 29 Aug 2024

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 13 Oct 2024

https://github.com/d4software/QueryTree

Data reporting and visualization for your app

analytics data-analysis data-visualization database report-builder

Last synced: 08 Aug 2024

https://github.com/MouseLand/suite2p

cell detection in calcium imaging recordings

data-analysis imaging neuroscience

Last synced: 03 Aug 2024

https://github.com/boostorg/histogram

Fast multi-dimensional generalized histogram with convenient interface for C++14

boost boost-libraries c-plus-plus c-plus-plus-14 convenient convenient-interface data-analysis header-only histogram statistics

Last synced: 02 Aug 2024

https://github.com/databrickslabs/tempo

API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation

data-analysis data-science pandas python scala time-series timeseries timeseries-analysis timeseries-data

Last synced: 02 Aug 2024

https://github.com/CJWorkbench/cjworkbench

The data journalism platform with built in training

data-analysis data-journalism data-science data-visualization journalism notebook

Last synced: 06 Aug 2024

https://github.com/PydPiper/pylightxl

A light weight, zero dependency, minimal functionality excel read/writer python library

api data-analysis excel microsoft office pypi python python-library python2 python3

Last synced: 31 Oct 2024

https://github.com/pydpiper/pylightxl

A light weight, zero dependency, minimal functionality excel read/writer python library

api data-analysis excel microsoft office pypi python python-library python2 python3

Last synced: 12 Oct 2024

https://github.com/farukalamai/advanced-machine-learning-engineer-roadmap-2024

A Full Stack ML (Machine Learning) Roadmap involves learning the necessary skills and technologies to become proficient in all aspects of machine learning, including data collection and preprocessing, model development, deployment, and maintenance.

aws computer-vision data-analysis data-science data-visualization deep-learning git-github machine-learning machine-learning-roadmap mlops natural-language-processing neural-network nlp opencv pandas python pytorch statistics tensorflow yolo

Last synced: 07 Nov 2024

https://github.com/helicalinsight/helicalinsight

Helical Insight software is world’s first Open Source Business Intelligence framework which helps you to make sense out of your data and make well informed decisions.

amazon-redshift big-data business-intelligence dashboard data-analysis data-visualization druid graph-database hive mongodb mysql neo4j nosql oracle-database postgresql rdbms reporting sql-editor sqllite

Last synced: 09 Oct 2024

https://github.com/X-lab2017/open-digger

Open source analysis tools

data-analysis github hacktoberfest openrank

Last synced: 28 Oct 2024

https://github.com/Derek-Jones/ESEUR-book

Issue handling for Evidence-based Software Engineering: based on the publicly available data

book data-analysis empirical-research engineering-data evidence-based human-cognitive-characteristics software-development software-engineering

Last synced: 07 Aug 2024

https://github.com/rasgointelligence/RasgoQL

Write python locally, execute SQL in your data warehouse

data-analysis data-science pandas python sql

Last synced: 08 Aug 2024

https://github.com/lucasxlu/LagouJob

Data Analysis & Mining for lagou.com

data-analysis data-mining lagou machine-learning nlp python3 web-crawler

Last synced: 06 Aug 2024

https://github.com/wizardforcel/data-science-notebook

:book: 每一个伟大的思想和行动都有一个微不足道的开始

data-analysis data-science machine-learning notebook numpy pandas sklearn tensorflow

Last synced: 10 Oct 2024

https://github.com/bears-r-us/arkouda

Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale :bear:

chapel data data-analysis data-science distributed-computing eda hpc python

Last synced: 31 Oct 2024

https://github.com/recodehive/stackoverflow-analysis

Stack overflow is a professional community for developers. This repo analysis 3 years of developer Survey done by Stackoverflow and do visualization and predict the salary of Data Scientist in future.

canva collaborate data-analysis data-science data-visualization ghdesktop github github-pages machine-learning stack-overflow student-vscode survey-analysis vscode

Last synced: 08 Nov 2024

https://github.com/tkrabel/edaviz

edaviz - Python library for Exploratory Data Analysis and Visualization in Jupyter Notebook or Jupyter Lab

altair data-analysis data-exploration data-sciene data-visualization eda edaviz exploratory-data interactive jupyter-notebook matplotlib pandas plotly project-jupyter pyhon qgrid seaborn

Last synced: 31 Oct 2024

https://github.com/nickslevine/zebras

Data analysis library for JavaScript built with Ramda

data-analysis data-science functional-programming javascript pandas ramda

Last synced: 07 Nov 2024

https://github.com/milaan9/11_python_matplotlib_module

Matplotlib is an amazing visualization library in Python for 2D plots of arrays. Matplotlib is a multi-platform data visualization library built on NumPy arrays and designed to work with the broader SciPy stack. It was introduced by John Hunter in the year 2002. One of the greatest benefits of visualization is that it allows us visual access to huge amounts of data in easily digestible visuals. Matplotlib consists of several plots like line, bar, scatter, histogram, etc

data-analysis data-visualization ipython-notebook matplotlib matplotlib-examples matplotlib-exercises matplotlib-figures matplotlib-heatmap matplotlib-pyplot matplotlib-python matplotlib-tutorial python-matplotlib python-tutor python-tutorial-github python-tutorial-notebook python-tutorials python4beginner python4datascience python4everybody tutor-milaan9

Last synced: 11 Oct 2024

https://github.com/acerbilab/vbmc

Variational Bayesian Monte Carlo (VBMC) algorithm for posterior and model inference in MATLAB

bayesian-inference data-analysis gaussian-processes machine-learning matlab variational-inference

Last synced: 30 Oct 2024

https://github.com/dataplane-app/dataplane

Dataplane is an Airflow inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a React front end.

airflow data data-analysis data-engineering data-integration data-pipelines data-science dataplane datawarehouse etl finance golang kubernetes pipelines robotics-process-automation rpa scheduler workflow workflow-automation workflows

Last synced: 02 Aug 2024

https://github.com/koldlight/curso-python-analisis-datos

Curso de python básico orientado al análisis de datos, en español

course data data-analysis folium hacktoberfest numpy pandas python requests seaborn spanish

Last synced: 09 Nov 2024

https://github.com/aws/amazon-redshift-python-driver

Redshift Python Connector. It supports Python Database API Specification v2.0.

amazon-redshift aws-redshift data-analysis data-science

Last synced: 07 Oct 2024

https://github.com/ayush1997/visualize_ML

Python package for consolidated and extensive Univariate,Bivariate Data Analysis and Visualization catering to both categorical and continuous datasets.

data-analysis machine-learning matplotlib python statisics visualization

Last synced: 25 Oct 2024

https://github.com/nshiab/simple-data-analysis.js

Easy-to-use and high-performance JavaScript library for data analysis.

data data-analysis data-science duckdb javascript nodejs typescript

Last synced: 12 Aug 2024

https://github.com/nshiab/simple-data-analysis

Easy-to-use and high-performance JavaScript library for data analysis.

data data-analysis data-science duckdb javascript nodejs typescript

Last synced: 28 Oct 2024