An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/JosephLai241/URS

Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

archiving command-line comments csv data-analysis data-science json livestream osint-tool praw pyo3 python reddit reddit-scraper redditor rust scraper subreddit trees wordcloud

Last synced: 24 Mar 2025

https://github.com/Litlyx/litlyx

Powerful Analytics Solution. Setup in 30 seconds. Display all your data on a Simple, AI-powered dashboard. Fully self-hostable and GDPR compliant.

ai analytics angular charts data data-analysis data-visualization javascript metrics nextjs nodejs nuxt open-source react statistics typescript vue website

Last synced: 22 Dec 2024

https://github.com/ipython-books/cookbook-2nd-code

Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]

computing data-analysis data-mining data-science data-visualization ipython jupyter jupyter-notebook machine-learning numerical-computation python visualization

Last synced: 12 Apr 2025

https://github.com/arvkevi/kneed

Knee point detection in Python

data-analysis data-science elbow-method knee-point python scientific-computing systems

Last synced: 23 Mar 2025
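
As a rough illustration of what knee point detection means, the sketch below picks the point on a curve that lies farthest from the straight line joining the curve's endpoints. It is a simplified heuristic using only the standard library, not the kneed package's algorithm or API; the function name `find_knee` and the sample data are hypothetical.

```python
from typing import Sequence


def find_knee(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the x value of the point farthest from the chord joining the endpoints."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")
    x0, y0, x1, y1 = x[0], y[0], x[-1], y[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5
    if norm == 0:
        raise ValueError("endpoints coincide")
    best_i, best_d = 0, -1.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        # Perpendicular distance from (xi, yi) to the line through the endpoints.
        d = abs(dy * (xi - x0) - dx * (yi - y0)) / norm
        if d > best_d:
            best_i, best_d = i, d
    return x[best_i]


# Hypothetical saturating curve: fast growth that flattens out.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 5, 8, 9, 9.5, 9.8, 9.9, 10]
knee_x = find_knee(xs, ys)
print(knee_x)
```

In practice the axes are usually normalized before measuring distances, so that the result does not depend on the units of x and y; this sketch skips that step for brevity.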

https://github.com/program-spiritual/dataanalysisinaction

(Finished) Notes for the Geek Time course "Data Analysis Practical 45 Lectures": detailed notes with Markdown, images, mind maps, code, and data. The notes can be read directly and the code has been tested.

data-analysis data-analytics in-action notebook-jupyter pipenv pyenv python python-data-analysis python-data-science python3

Last synced: 15 Jan 2025

https://github.com/nicolaskruchten/jupyter_pivottablejs

Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js

data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables

Last synced: 07 Apr 2025

https://github.com/ashishpatel26/Amazing-Feature-Engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn

Last synced: 10 Apr 2025

https://github.com/ashishpatel26/amazing-feature-engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn

Last synced: 12 Apr 2025

https://github.com/dmpe/r

Exercises (incl. analyses) with R language (math+statistics)

course data-analysis exercise r statistics

Last synced: 12 Apr 2025

https://github.com/mpw0311/antd-umi-sys

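Enterprise BI system and data visualization platform. Main technologies: react, antd, umi, dva, es6, less, and others. Let's encourage each other and learn together; if you like it, please star ⭐.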

antd antd-umi-sys company-site d3js data-analysis data-visualization dva dvajs echarts echarts-for-react es6 gitdatav react react-redux react-router redux sankey umi umijs

Last synced: 04 Apr 2025

https://github.com/abixen/abixen-platform

Abixen Platform is a microservices-based software platform for building enterprise applications. Functionality is delivered by creating individual microservices and integrating them through the provided CMS.

analytics angularjs architecture aws business-intelligence businessintelligence charts cloud dashboard data-analysis data-analytics data-visualization low-code microservices netflixoss reporting spring-boot spring-cloud sql-editor visualization

Last synced: 12 Apr 2025

https://github.com/anthonydb/practical-sql

Code and Data for the First Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2018).

data-analysis postgresql sql

Last synced: 12 Apr 2025

https://github.com/dmpe/R

Exercises (incl. analyses) with R language (math+statistics)

course data-analysis exercise r statistics

Last synced: 22 Nov 2024

https://github.com/elastic/eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

big-data data-analysis dataframe dataframes eland elasticsearch etl lightgbm machine-learning pandas python scikit-learn time-series-forecasting

Last synced: 14 Apr 2025

https://github.com/aloctavodia/bap

Bayesian Analysis with Python (Second Edition)

arviz bayesian-analysis data-analysis data-visualization errata pymc3 python

Last synced: 12 Apr 2025

https://github.com/dmnfarrell/pandastable

Table analysis in Tkinter using pandas DataFrames.

data-analysis dataframe pandas plotting scientific tkinter

Last synced: 09 Apr 2025

https://github.com/SciTools/iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data

data-analysis earth-science grib iris meteorology netcdf oceanography python spaceweather visualisation

Last synced: 07 Apr 2025

https://github.com/planetlabs/notebooks

interactive notebooks from Planet Engineering

api data-analysis jupyter-notebooks python remote-sensing satellite-imagery

Last synced: 06 Apr 2025

https://github.com/farukalamai/advanced-machine-learning-engineer-roadmap-2024

A Full Stack ML (Machine Learning) Roadmap involves learning the necessary skills and technologies to become proficient in all aspects of machine learning, including data collection and preprocessing, model development, deployment, and maintenance.

aws computer-vision data-analysis data-science data-visualization deep-learning git-github machine-learning machine-learning-roadmap mlops natural-language-processing neural-network nlp opencv pandas python pytorch statistics tensorflow yolo

Last synced: 04 Apr 2025

https://github.com/jacksonwuxs/dapy

Easy-to-use data analysis / manipulation framework for humans

analysis data-analysis data-science efficiency pypi python statistical-reports

Last synced: 05 Apr 2025

https://github.com/JacksonWuxs/DaPy

Easy-to-use data analysis / manipulation framework for humans

analysis data-analysis data-science efficiency pypi python statistical-reports

Last synced: 28 Mar 2025

https://github.com/binpash/pash

PaSh: Light-touch Data-Parallel Shell Processing

bash bash-scripting data-analysis parallelism pash posix-sh shell

Last synced: 14 Apr 2025

https://github.com/specterops/nemesis

An offensive data enrichment pipeline

data-analysis offensive

Last synced: 04 Apr 2025

https://github.com/LearnDataSci/articles

A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci

data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python

Last synced: 13 Apr 2025

https://github.com/anthonydb/practical-sql-2

Code and Data for the Second Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press.

data-analysis postgresql sql

Last synced: 14 Apr 2025

https://github.com/pgalko/bambooai

A Python library powered by Language Models (LLMs) for conversational data discovery and analysis.

ai ai-agents data-analysis data-science gemini groq llm mistral ollama openai-api pandas pinecone python vector-database

Last synced: 07 Apr 2025

https://github.com/dfm/corner.py

Make some beautiful corner plots

data-analysis data-visualization plotting python

Last synced: 10 Apr 2025

https://github.com/akanz1/klib

Easy to use Python library of customized functions for cleaning and analyzing data.

data-analysis data-cleaning data-preprocessing data-science data-visualization feature-selection klib python

Last synced: 15 Nov 2024

https://github.com/s-shemmee/sql-101

Get started with SQL database programming. This beginner's guide provides step-by-step tutorials, practical examples, exercises, and resources to master SQL. Let's unlock the power of data with SQL!

data-analysis data-science sql sql-challenges sql-commands sql-database sql-injection sql-server

Last synced: 05 Apr 2025

https://github.com/shaohua0116/ICLR2020-OpenReviewData

Script that crawls metadata from the ICLR OpenReview webpage. Includes tutorials on installing and using Selenium and ChromeDriver on Ubuntu.

conference crawler data-analysis iclr iclr2020 machine-learning visualization

Last synced: 27 Nov 2024

https://github.com/openbiox/awosome-bioinformatics

A curated list of resources for learning bioinformatics.

bioinformatics data-analysis next-generation-sequencing

Last synced: 13 Nov 2024

https://github.com/pgalko/BambooAI

A lightweight library that leverages Language Models (LLMs) to enable natural language interactions, allowing you to source and converse with data.

ai ai-agents data-analysis data-science gemini groq llm mistral ollama openai-api pandas pinecone python vector-database

Last synced: 23 Mar 2025

https://github.com/supercowpowers/zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

bro data-analysis kafka networking pandas python scikit-learn security spark zeek zeek-analysis

Last synced: 09 Apr 2025

https://github.com/ptyadana/data-science-and-machine-learning-projects-dojo

Collections of data science, machine learning, and data visualization projects using pandas, sklearn, matplotlib, TensorFlow 2, Keras, and various ML algorithms such as random forest classifiers and boosting.

boosting-algorithms data-analysis data-science data-visualization deep-learning keras machine-learning machine-learning-algorithms natural-language-processing pandas probability-statistics scikit-learn seaborn tensorflow

Last synced: 05 Apr 2025

https://github.com/xiaopujun/light-chaser

Light Chaser is a lightweight data visualization design tool

blueprints data-analysis data-visualization draggable javascript typescript web-editor

Last synced: 19 Nov 2024

https://github.com/SuperCowPowers/zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

bro data-analysis kafka networking pandas python scikit-learn security spark zeek zeek-analysis

Last synced: 27 Nov 2024

https://github.com/Niketkumardheeryan/ML-CaPsule

ML-capsule is a project for beginners and experienced data science enthusiasts who lack a mentor or guidance and want to learn machine learning. Using this repo, they can learn ML, DL, and many related technologies through real-world projects and become interview-ready.

analytics data-analysis data-science data-visualization datascience deep-learning deep-neural-networks deployment flask heroku-deployment machine-learning python r statistics streamlit-webapp

Last synced: 13 Nov 2024

https://github.com/kunalj101/data-science-hacks

Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 13 Feb 2025

https://github.com/kunalj101/Data-Science-Hacks

Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 13 Nov 2024

https://github.com/rio-labs/rio

WebApps in pure Python. No JavaScript, HTML and CSS needed

data-analysis data-science data-visualization deep-learning machine-learning python ui webapp

Last synced: 09 Apr 2025

https://github.com/olavolav/uniplot

Lightweight plotting to the terminal. 4x resolution via Unicode.

data-analysis data-science plot python

Last synced: 29 Mar 2025

https://github.com/weijie-chen/econometrics-with-python

Econometrics tutorials featuring Python programming. This is a crash course for reviewing the most important concepts and techniques of basic econometrics. The theory is presented lightly, without the hassle of derivations, and the Python code is straightforward.

data-analysis data-science econometrics economics python statistics time-series

Last synced: 05 Apr 2025

https://github.com/greppo-io/greppo

Build & deploy geospatial applications quickly and easily.

data-analysis data-visualization developer-tools framework geospatial machine-learning python webapp

Last synced: 18 Apr 2025

https://github.com/discoveryjs/discovery

A framework for ad hoc JSON data analysis, shareable server-less reports and dashboards

data-analysis data-visualization json

Last synced: 15 Apr 2025

https://github.com/jkrumbiegel/chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 12 Apr 2025

https://github.com/mouseland/suite2p

cell detection in calcium imaging recordings

data-analysis imaging neuroscience

Last synced: 12 Apr 2025

https://github.com/jkrumbiegel/Chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 19 Nov 2024

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 13 Apr 2025

https://github.com/MouseLand/suite2p

cell detection in calcium imaging recordings

data-analysis imaging neuroscience

Last synced: 14 Nov 2024

https://github.com/CICIFLY/Data-Analytics-Projects

This repository contains projects related to collecting, assessing, cleaning, visualizing, and analyzing data

data-analysis data-visualization jupyter-notebook matplotlib numpy pandas seaborn

Last synced: 02 May 2025

https://github.com/d4software/QueryTree

Data reporting and visualization for your app

analytics data-analysis data-visualization database report-builder

Last synced: 27 Nov 2024

https://github.com/toxictoskey/dex-autotrader-bot

This is a cryptocurrency trading bot for DeFi that works across multiple chains with its own trading strategies. It performs automated technical analysis of cryptocurrencies, manages risk, reduces slippage, and has customizable strategies such as Stop Loss and Buy the Dip.

arbitrum automated-trading base-network binance binance-smart-chain blockchain bybit curve data-analysis defi dydx eth fraxtal kucoin layer2 polygon sniping-bot solana starknet zksync

Last synced: 09 Apr 2025

https://github.com/databrickslabs/tempo

API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation

data-analysis data-science pandas python scala time-series timeseries timeseries-analysis timeseries-data

Last synced: 29 Apr 2025
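
To make the "AS OF join" idea concrete, here is a minimal sketch in plain Python: each row of a left time series is matched with the most recent row of a right series whose timestamp is not later than its own. This only illustrates the concept and is not tempo's API or its Spark implementation; `asof_join` and the sample `trades` and `quotes` data are hypothetical.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Attach to each left row the latest right value with timestamp <= the left timestamp.

    Both inputs are lists of (timestamp, value) tuples sorted by timestamp.
    Rows with no earlier right-hand match get None.
    """
    right_ts = [ts for ts, _ in right]
    joined = []
    for ts, value in left:
        # Index of the last right timestamp that is <= ts, or -1 if none exists.
        idx = bisect_right(right_ts, ts) - 1
        match = right[idx][1] if idx >= 0 else None
        joined.append((ts, value, match))
    return joined


# Hypothetical data: trades matched to the latest quote at or before each trade.
trades = [(1, "t1"), (5, "t2"), (10, "t3")]
quotes = [(2, 100.0), (4, 101.5), (9, 99.0)]
result = asof_join(trades, quotes)
print(result)
```

The binary search keeps each lookup logarithmic in the size of the right series, which is why the inputs are assumed to be sorted by timestamp.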

https://github.com/pydpiper/pylightxl

A lightweight, zero-dependency, minimal-functionality Excel reader/writer library for Python

api data-analysis excel microsoft office pypi python python-library python2 python3

Last synced: 08 Apr 2025

https://github.com/helicalinsight/helicalinsight

Helical Insight software is the world’s first Open Source Business Intelligence framework, which helps you make sense of your data and make well-informed decisions.

amazon-redshift big-data business-intelligence dashboard data-analysis data-visualization druid graph-database hive mongodb mysql neo4j nosql oracle-database postgresql rdbms reporting sql-editor sqllite

Last synced: 06 Apr 2025

https://github.com/PydPiper/pylightxl

A lightweight, zero-dependency, minimal-functionality Excel reader/writer library for Python

api data-analysis excel microsoft office pypi python python-library python2 python3

Last synced: 29 Mar 2025

https://github.com/boostorg/histogram

Fast multi-dimensional generalized histogram with convenient interface for C++14

boost boost-libraries c-plus-plus c-plus-plus-14 convenient convenient-interface data-analysis header-only histogram statistics

Last synced: 05 Apr 2025

https://github.com/CJWorkbench/cjworkbench

The data journalism platform with built in training

data-analysis data-journalism data-science data-visualization journalism notebook

Last synced: 24 Nov 2024

https://github.com/kde/labplot

LabPlot is a FREE, open source and cross-platform Data Visualization and Analysis software accessible to everyone.

data-analysis data-science data-visualization fitting graph graph2d plotting scientific-plotting scientific-visualization

Last synced: 12 Apr 2025

https://github.com/pavelkomarov/exportify

Export Spotify playlists using the Web API. Analyze them in the Jupyter notebook.

data-analysis github-pages-website javascript javascript-promise jupyter-notebook spotify spotify-api spotify-web-api

Last synced: 12 Apr 2025