Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
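The four stages named above can be sketched on a toy dataset. This is only an illustration using the Python standard library; the records, the `to_float` helper, and the choice of a per-group mean as the "model" are all assumptions made for the example, not part of any project listed here.

```python
from statistics import mean

# Hypothetical raw records; the string "n/a" stands in for a dirty value.
raw = [
    {"city": "Oslo", "temp_c": "4.5"},
    {"city": "Oslo", "temp_c": "n/a"},
    {"city": "Lima", "temp_c": "19.0"},
    {"city": "Lima", "temp_c": "21.0"},
    {"city": "Oslo", "temp_c": "5.5"},
]


def to_float(value):
    """Return value as a float, or None if it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# Inspect: parse every reading and count how many fail.
parsed = [(r["city"], to_float(r["temp_c"])) for r in raw]
n_bad = sum(1 for _, t in parsed if t is None)

# Cleanse: drop the unparseable readings.
clean = [(city, t) for city, t in parsed if t is not None]

# Transform: group the readings by city.
by_city = {}
for city, t in clean:
    by_city.setdefault(city, []).append(t)

# Model (in the loosest sense): summarise each group by its mean.
summary = {city: mean(ts) for city, ts in by_city.items()}
print(n_bad, summary)
```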

https://github.com/anthonydb/practical-sql

Code and Data for the First Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2018).

data-analysis postgresql sql

Last synced: 18 Dec 2024

https://github.com/elastic/eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

big-data data-analysis dataframe dataframes eland elasticsearch etl lightgbm machine-learning pandas python scikit-learn time-series-forecasting

Last synced: 22 Dec 2024

https://github.com/aloctavodia/bap

Bayesian Analysis with Python (Second Edition)

arviz bayesian-analysis data-analysis data-visualization errata pymc3 python

Last synced: 17 Dec 2024

https://github.com/SciTools/iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data

data-analysis earth-science grib iris meteorology netcdf oceanography python spaceweather visualisation

Last synced: 06 Nov 2024

https://github.com/planetlabs/notebooks

interactive notebooks from Planet Engineering

api data-analysis jupyter-notebooks python remote-sensing satellite-imagery

Last synced: 06 Nov 2024

https://github.com/jacksonwuxs/dapy

Easy-to-use data analysis / manipulation framework for humans

analysis data-analysis data-science efficiency pypi python statistical-reports

Last synced: 17 Dec 2024

https://github.com/JacksonWuxs/DaPy

Easy-to-use data analysis / manipulation framework for humans

analysis data-analysis data-science efficiency pypi python statistical-reports

Last synced: 31 Oct 2024

https://github.com/specterops/nemesis

An offensive data enrichment pipeline

data-analysis offensive

Last synced: 21 Dec 2024

https://github.com/LearnDataSci/articles

A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci

data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python

Last synced: 07 Nov 2024

https://github.com/binpash/pash

PaSh: Light-touch Data-Parallel Shell Processing

bash bash-scripting data-analysis parallelism pash posix-sh shell

Last synced: 21 Dec 2024

https://github.com/starpig1129/ai-data-analysis-mulitagent

AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, data analysis, visualization, and report writing. Perfect for researchers and data scientists seeking to enhance their workflow and productivity.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 17 Sep 2024

https://github.com/dfm/corner.py

Make some beautiful corner plots

data-analysis data-visualization plotting python

Last synced: 18 Dec 2024

https://github.com/ashishpatel26/Amazing-Feature-Engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn

Last synced: 07 Nov 2024

https://github.com/farukalamai/advanced-machine-learning-engineer-roadmap-2024

A Full Stack ML (Machine Learning) Roadmap involves learning the necessary skills and technologies to become proficient in all aspects of machine learning, including data collection and preprocessing, model development, deployment, and maintenance.

aws computer-vision data-analysis data-science data-visualization deep-learning git-github machine-learning machine-learning-roadmap mlops natural-language-processing neural-network nlp opencv pandas python pytorch statistics tensorflow yolo

Last synced: 22 Dec 2024

https://github.com/anthonydb/practical-sql-2

Code and Data for the Second Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2022).

data-analysis postgresql sql

Last synced: 20 Dec 2024

https://github.com/akanz1/klib

Easy to use Python library of customized functions for cleaning and analyzing data.

data-analysis data-cleaning data-preprocessing data-science data-visualization feature-selection klib python

Last synced: 15 Nov 2024

https://github.com/pgalko/bambooai

A lightweight library that leverages Language Models (LLMs) to enable natural language interactions, allowing you to source and converse with data.

ai ai-agents data-analysis data-science gemini groq llm mistral ollama openai-api pandas pinecone python vector-database

Last synced: 21 Dec 2024

https://github.com/shaohua0116/ICLR2020-OpenReviewData

Script that crawls metadata from the ICLR OpenReview webpage. Includes tutorials on installing and using Selenium and ChromeDriver on Ubuntu.

conference crawler data-analysis iclr iclr2020 machine-learning visualization

Last synced: 27 Nov 2024

https://github.com/openbiox/awosome-bioinformatics

A curated list of resources for learning bioinformatics.

bioinformatics data-analysis next-generation-sequencing

Last synced: 13 Nov 2024

https://github.com/pgalko/BambooAI

A lightweight library that leverages Language Models (LLMs) to enable natural language interactions, allowing you to source and converse with data.

ai ai-agents data-analysis data-science gemini groq llm mistral ollama openai-api pandas pinecone python vector-database

Last synced: 28 Oct 2024

https://github.com/xiaopujun/light-chaser

light chaser is a lightweight data visualization designer tool

blueprints data-analysis data-visualization draggable javascript typescript web-editor

Last synced: 19 Nov 2024

https://github.com/supercowpowers/zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

bro data-analysis kafka networking pandas python scikit-learn security spark zeek zeek-analysis

Last synced: 22 Dec 2024

https://github.com/SuperCowPowers/zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

bro data-analysis kafka networking pandas python scikit-learn security spark zeek zeek-analysis

Last synced: 27 Nov 2024

https://github.com/Niketkumardheeryan/ML-CaPsule

ML-capsule is a project for beginners and experienced data science enthusiasts who lack a mentor or guidance and want to learn machine learning. Using the repo, they can learn ML, DL, and many related technologies through real-world projects and become interview-ready.

analytics data-analysis data-science data-visualization datascience deep-learning deep-neural-networks deployment flask heroku-deployment machine-learning python r statistics streamlit-webapp

Last synced: 13 Nov 2024

https://github.com/rio-labs/rio

WebApps in pure Python. No JavaScript, HTML and CSS needed

data-analysis data-science data-visualization deep-learning machine-learning python ui webapp

Last synced: 06 Nov 2024

https://github.com/kunalj101/Data-Science-Hacks

Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 13 Nov 2024

https://github.com/kunalj101/data-science-hacks

Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 11 Oct 2024

https://github.com/apache/cloudberry

Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.

ai cloudberrydb data-analysis data-warehouse database database-management gpdb greenplum greenplum-database mpp olap postgres postgresql postgresql-database sql

Last synced: 22 Dec 2024

https://github.com/greppo-io/greppo

Build and deploy geospatial applications quickly and easily.

data-analysis data-visualization developer-tools framework geospatial machine-learning python webapp

Last synced: 09 Nov 2024

https://github.com/cloudberrydb/cloudberrydb

Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.

ai cloudberrydb data-analysis data-warehouse database database-management gpdb greenplum greenplum-database mpp olap postgres postgresql postgresql-database sql

Last synced: 27 Oct 2024

https://github.com/jkrumbiegel/chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 21 Dec 2024
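The idea of piping a value through a series of transformations can be sketched in Python. This is not Chain.jl's macro syntax, only an analogy: the `pipe` helper and the sample steps below are hypothetical names introduced for this illustration.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass value through each step in order, feeding each result to the next."""
    return reduce(lambda acc, step: step(acc), steps, value)


result = pipe(
    [3, 1, 4, 1, 5, 9, 2, 6],
    lambda xs: [x for x in xs if x % 2 == 1],  # keep the odd values
    sorted,                                    # order them ascending
    lambda xs: xs[-3:],                        # take the three largest
    sum,                                       # add them up
)
print(result)
```

Each step receives the previous step's result, so the pipeline reads top to bottom instead of as nested calls like `sum(sorted(...)[-3:])`.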

https://github.com/ptyadana/data-science-and-machine-learning-projects-dojo

Collections of data science, machine learning, and data visualization projects using pandas, sklearn, matplotlib, tensorflow2, and Keras, with various ML algorithms such as random forest classifiers and boosting.

boosting-algorithms data-analysis data-science data-visualization deep-learning keras machine-learning machine-learning-algorithms natural-language-processing pandas probability-statistics scikit-learn seaborn tensorflow

Last synced: 16 Dec 2024

https://github.com/jkrumbiegel/Chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 19 Nov 2024

https://github.com/weijie-chen/econometrics-with-python

Tutorials on econometrics featuring Python programming. This is a crash course for reviewing the most important concepts and techniques of basic econometrics; the theory is presented lightly, without the hassle of derivations, and the Python code is straightforward.

data-analysis data-science econometrics economics python statistics time-series

Last synced: 22 Dec 2024

https://github.com/mouseland/suite2p

cell detection in calcium imaging recordings

data-analysis imaging neuroscience

Last synced: 19 Dec 2024

https://github.com/MouseLand/suite2p

cell detection in calcium imaging recordings

data-analysis imaging neuroscience

Last synced: 14 Nov 2024

https://github.com/olavolav/uniplot

Lightweight plotting to the terminal. 4x resolution via Unicode.

data-analysis data-science plot python

Last synced: 31 Oct 2024

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 20 Dec 2024

https://github.com/d4software/QueryTree

Data reporting and visualization for your app

analytics data-analysis data-visualization database report-builder

Last synced: 27 Nov 2024

https://github.com/toxictoskey/dex-autotrader-bot

This is a cryptocurrency trading bot for DeFi that works across multiple chains with unique trading strategies. It performs automated technical analysis of cryptocurrencies, manages risk, reduces slippage, and has customizable strategies such as Stop Loss and Buy the Dip.

arbitrum automated-trading base-network binance binance-smart-chain blockchain bybit curve data-analysis defi dydx eth fraxtal kucoin layer2 polygon sniping-bot solana starknet zksync

Last synced: 22 Dec 2024

https://github.com/databrickslabs/tempo

API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation

data-analysis data-science pandas python scala time-series timeseries timeseries-analysis timeseries-data

Last synced: 11 Nov 2024
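Two of the operations named in the description above, AS OF joins and rolling statistics, can be illustrated in plain Python. This sketch does not use tempo or Spark and does not reflect tempo's API; the `asof_join` and `rolling_mean` functions and the sample series are assumptions made for the example.

```python
from bisect import bisect_right
from collections import deque


def asof_join(left, right):
    """For each (time, value) in left, attach the latest right value with time <= t.

    Both inputs are lists of (time, value) tuples sorted by time.
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, value in left:
        i = bisect_right(right_times, t)  # number of right rows with time <= t
        match = right[i - 1][1] if i > 0 else None
        joined.append((t, value, match))
    return joined


def rolling_mean(values, window):
    """Mean over the trailing `window` values (fewer at the start of the series)."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out


# Hypothetical series: trades at times 1, 5, 9 and quotes at times 2, 4, 8.
trades = [(1, 100.0), (5, 101.0), (9, 99.0)]
quotes = [(2, 100.5), (4, 100.8), (8, 99.5)]

joined = asof_join(trades, quotes)
smoothed = rolling_mean([v for _, v in trades], window=2)
print(joined)
print(smoothed)
```

The first trade has no earlier quote, so its match is `None`; every later trade picks up the most recent quote at or before its own timestamp.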

https://github.com/boostorg/histogram

Fast multi-dimensional generalized histogram with convenient interface for C++14

boost boost-libraries c-plus-plus c-plus-plus-14 convenient convenient-interface data-analysis header-only histogram statistics

Last synced: 22 Dec 2024

https://github.com/pydpiper/pylightxl

A lightweight, zero-dependency, minimal-functionality Excel reader/writer Python library

api data-analysis excel microsoft office pypi python python-library python2 python3

Last synced: 21 Dec 2024

https://github.com/helicalinsight/helicalinsight

Helical Insight software is the world’s first open source Business Intelligence framework, which helps you make sense of your data and make well-informed decisions.

amazon-redshift big-data business-intelligence dashboard data-analysis data-visualization druid graph-database hive mongodb mysql neo4j nosql oracle-database postgresql rdbms reporting sql-editor sqllite

Last synced: 16 Dec 2024

https://github.com/CJWorkbench/cjworkbench

The data journalism platform with built-in training

data-analysis data-journalism data-science data-visualization journalism notebook

Last synced: 24 Nov 2024

https://github.com/PydPiper/pylightxl

A lightweight, zero-dependency, minimal-functionality Excel reader/writer Python library

api data-analysis excel microsoft office pypi python python-library python2 python3

Last synced: 31 Oct 2024

https://github.com/X-lab2017/open-digger

Open source analysis tools

data-analysis github hacktoberfest openrank

Last synced: 28 Oct 2024

https://github.com/Derek-Jones/ESEUR-book

Issue handling for Evidence-based Software Engineering: based on the publicly available data

book data-analysis empirical-research engineering-data evidence-based human-cognitive-characteristics software-development software-engineering

Last synced: 26 Nov 2024

https://github.com/rasgointelligence/RasgoQL

Write python locally, execute SQL in your data warehouse

data-analysis data-science pandas python sql

Last synced: 27 Nov 2024

https://github.com/wizardforcel/data-science-notebook

:book: Every great thought and action has a humble beginning

data-analysis data-science machine-learning notebook numpy pandas sklearn tensorflow

Last synced: 18 Dec 2024

https://github.com/lucasxlu/LagouJob

Data Analysis & Mining for lagou.com

data-analysis data-mining lagou machine-learning nlp python3 web-crawler

Last synced: 25 Nov 2024

https://github.com/kde/labplot

LabPlot is a FREE, open source and cross-platform Data Visualization and Analysis software accessible to everyone.

data-analysis data-science data-visualization fitting graph graph2d plotting scientific-plotting scientific-visualization

Last synced: 18 Dec 2024

https://github.com/bears-r-us/arkouda

Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale :bear:

chapel data data-analysis data-science distributed-computing eda hpc python

Last synced: 16 Dec 2024

https://github.com/Bears-R-Us/arkouda

Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale :bear:

chapel data data-analysis data-science distributed-computing eda hpc python

Last synced: 20 Nov 2024

https://github.com/curiositry/eegrunt

A Collection of Python EEG (+ ECG) Analysis Utilities for OpenBCI and Muse

data-analysis data-visualization ecg eeg muse neuroscience openbci python

Last synced: 18 Nov 2024

https://github.com/recodehive/stackoverflow-analysis

Stack Overflow is a professional community for developers. This repo analyzes three years of the Stack Overflow Developer Survey, visualizes the results, and predicts future salaries for data scientists.

canva collaborate data-analysis data-science data-visualization ghdesktop github github-pages machine-learning stack-overflow student-vscode survey-analysis vscode

Last synced: 21 Dec 2024

https://github.com/CICIFLY/Data-Analytics-Projects

This repository contains projects on collecting, assessing, cleaning, visualizing, and analyzing data

data-analysis data-visualization jupyter-notebook matplotlib numpy pandas seaborn

Last synced: 12 Nov 2024

https://github.com/tkrabel/edaviz

edaviz - Python library for Exploratory Data Analysis and Visualization in Jupyter Notebook or Jupyter Lab

altair data-analysis data-exploration data-sciene data-visualization eda edaviz exploratory-data interactive jupyter-notebook matplotlib pandas plotly project-jupyter pyhon qgrid seaborn

Last synced: 18 Dec 2024