Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2024-07-29 13:36:31 UTC
https://github.com/boostorg/histogram
Fast multi-dimensional generalized histogram with convenient interface for C++14
boost boost-libraries c-plus-plus c-plus-plus-14 convenient convenient-interface data-analysis header-only histogram statistics
Last synced: 02 Aug 2024
https://github.com/Technion-Kishony-lab/quibbler
Your data - interactive!
data-analysis data-science data-visualization declarative graphics gui interactive jupyter matplotlib python widgets
Last synced: 31 Jul 2024
https://github.com/e-m-b-a/embark
EMBArk - The firmware security scanning environment
data-analysis django embedded-linux embedded-systems firmware firmware-analysis firmware-tools hacking iot linux penetration-testing pentesting scanner security security-automation security-scanner security-testing security-tools ubuntu-server vulnerability-scanners
Last synced: 02 Aug 2024
https://github.com/sn3fru/datascience_course
Data Science course in Portuguese
artificial-intelligence brasil curso dados data data-analysis data-science data-science-learning dataset deep-learning machine-learning python
Last synced: 02 Aug 2024
https://github.com/databrickslabs/tempo
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation
data-analysis data-science pandas python scala time-series timeseries timeseries-analysis timeseries-data
Last synced: 02 Aug 2024
https://github.com/CJWorkbench/cjworkbench
The data journalism platform with built-in training
data-analysis data-journalism data-science data-visualization journalism notebook
Last synced: 06 Aug 2024
https://github.com/swcarpentry/python-novice-inflammation
Programming with Python
automation carpentries data-analysis data-visualization english functions lesson loops matplotlib numpy programming python software-carpentry stable
Last synced: 01 Aug 2024
https://swcarpentry.github.io/python-novice-inflammation/
Programming with Python
automation carpentries data-analysis data-visualization english functions lesson loops matplotlib numpy programming python software-carpentry stable
Last synced: 03 Aug 2024
https://github.com/PydPiper/pylightxl
A lightweight, zero-dependency, minimal-functionality Excel reader/writer library for Python
api data-analysis excel microsoft office pypi python python-library python2 python3
Last synced: 31 Jul 2024
https://github.com/deepgraph/deepgraph
Analyze Data with Pandas-based Networks. Documentation:
data-analysis data-mining data-science data-structures data-visualization graph-database graph-theory graphs graphviz interfacing iterative-methods multilayer-networks network network-analysis network-visualization networkx pandas parallel partitioning
Last synced: 01 Aug 2024
https://github.com/Derek-Jones/ESEUR-book
Issue handling for Evidence-based Software Engineering: based on the publicly available data
book data-analysis empirical-research engineering-data evidence-based human-cognitive-characteristics software-development software-engineering
Last synced: 07 Aug 2024
https://github.com/X-lab2017/open-digger
Open source analysis tools
data-analysis github hacktoberfest openrank
Last synced: 31 Jul 2024
https://github.com/rasgointelligence/RasgoQL
Write python locally, execute SQL in your data warehouse
data-analysis data-science pandas python sql
Last synced: 08 Aug 2024
https://github.com/gnudatalanguage/gdl
GDL - GNU Data Language
antlr astronomy data-analysis dicom eigen3 fits-files geophysics grib gsl-library hdf hdf5 mapping netcdf plotting plplot programming-language pv-wave python scientific-computing scientific-visualization
Last synced: 01 Aug 2024
https://github.com/cloudberrydb/cloudberrydb
Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.
ai cloudberrydb data-analysis data-warehouse database database-management gpdb greenplum greenplum-database mpp olap postgres postgresql postgresql-database sql
Last synced: 31 Jul 2024
https://github.com/glotzerlab/freud
Powerful, efficient particle trajectory analysis in scientific Python.
analysis computational-chemistry computational-physics data-analysis hacktoberfest molecular-dynamics monte-carlo-simulation particle-system python science scientific-computing spatial-analysis
Last synced: 02 Aug 2024
https://github.com/lucasxlu/LagouJob
Data Analysis & Mining for lagou.com
data-analysis data-mining lagou machine-learning nlp python3 web-crawler
Last synced: 06 Aug 2024
https://github.com/bucketeer-io/bucketeer
Feature Flag Management and A/B Testing platform
ab-testing beta-testing bucketeer dark-launch data-analysis data-driven experiments feature-flags feature-toggles go multivariate-testing python react trunk-based-development typescript
Last synced: 02 Aug 2024
https://github.com/zavtech/morpheus-core
The foundational library of the Morpheus data science framework
data-analysis data-analytics dataframe dataframe-library datascience finance principal-component-analysis quantitative-finance regression regression-models statistical-analysis statistics
Last synced: 01 Aug 2024
https://github.com/przemek83/volbx
Graphical tool for data manipulation written in C++/Qt.
c-plus-plus c-plus-plus-17 cpp cpp17 csv data data-analysis data-export data-filtering data-import data-manipulation data-visualization dynamic graphical ods plots qt spreadsheet statistical-analysis xlsx
Last synced: 01 Aug 2024
https://github.com/graphia-app/graphia
A visualisation tool for the creation and analysis of graphs
analysis data data-analysis data-science data-visualization graphs interpretation networks visualisation visualization
Last synced: 01 Aug 2024
https://github.com/Yu-Group/covid19-severity-prediction
Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈
coronavirus coronavirus-tracking county-health-data county-level covid-19 covid-19-data covid-19-data-analysis data-analysis data-science epidemic-model forecasting outbreak outbreak-severity python3 response4life risk-assessment risk-modelling statistics ventilator visualization
Last synced: 08 Aug 2024
https://github.com/nickslevine/zebras
Data analysis library for JavaScript built with Ramda
data-analysis data-science functional-programming javascript pandas ramda
Last synced: 01 Aug 2024
https://github.com/Jean-njoroge/Breast-cancer-risk-prediction
Classification of Breast Cancer diagnosis Using Support Vector Machines
breast-cancer-prediction breast-cancer-tumor breastcancer-classification classification data-analysis dataprocessing exploratory-data-analysis notebook pipelines prediction-model python supervised-learning svm
Last synced: 04 Aug 2024
https://github.com/GZTipDM/TipDM
TipDM modeling platform, an open source data mining tool.
bigdata data-analysis data-analysis-python data-mining graph-schedule machine-learning tensorflow workflow
Last synced: 31 Jul 2024
https://github.com/acerbilab/vbmc
Variational Bayesian Monte Carlo (VBMC) algorithm for posterior and model inference in MATLAB
bayesian-inference data-analysis gaussian-processes machine-learning matlab variational-inference
Last synced: 31 Jul 2024
https://github.com/dataplane-app/dataplane
Dataplane is an Airflow inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a React front end.
airflow data data-analysis data-engineering data-integration data-pipelines data-science dataplane datawarehouse etl finance golang kubernetes pipelines robotics-process-automation rpa scheduler workflow workflow-automation workflows
Last synced: 02 Aug 2024
https://github.com/dylan-profiler/visions
Type System for Data Analysis in Python
data-analysis data-science hacktoberfest numpy pandas python spark type-inference type-system
Last synced: 03 Aug 2024
https://github.com/milaan9/DataScience_Interview_Questions
My Solutions to 120 commonly asked data science interview questions.
data-analysis data-science interview-preparation interview-questions machine-learning predective-modeling probability product-metrics python-jupyter-notebooks python-tutorial-github python4datascience statistical-inference tutor-milaan9
Last synced: 02 Aug 2024
https://github.com/h2oai/nitro
Create apps 10x quicker, without JavaScript/HTML/CSS.
app apps data-analysis data-science developer-tools devtools graphics h2o-nitro low-code python ui ui-components user-interface web-application webapp widget-library widgets
Last synced: 01 Aug 2024
https://github.com/storieswithsiva/Data-Science-Resources
👨🏽‍🏫 You can learn about what data science is and why it's important in today's modern world. Are you interested in data science? 🔋
artificial-intelligence artificial-neural-networks data data-analysis data-analytics data-mining data-science data-science-resource data-science-resources data-scientist data-scientists data-visualization data-world datascience dataset learning learning-kit machine-learning python repository
Last synced: 01 Aug 2024
https://github.com/aws/amazon-redshift-python-driver
Redshift Python Connector. It supports Python Database API Specification v2.0.
amazon-redshift aws-redshift data-analysis data-science
Last synced: 01 Aug 2024
https://github.com/ayush1997/visualize_ML
Python package for consolidated and extensive univariate and bivariate data analysis and visualization, catering to both categorical and continuous datasets.
data-analysis machine-learning matplotlib python statisics visualization
Last synced: 30 Jul 2024
https://github.com/apachecn/matplotlib-doc-zh
:book: [Translation] Matplotlib User Guide
data-analysis data-visualization documentation matplotlib python
Last synced: 02 Aug 2024
https://github.com/simbafl/Data-analysis
Data analysis, mining, and modeling.
data-analysis kaggle matplotlib numpy pandas python3 scipy sklearn
Last synced: 31 Jul 2024
https://github.com/NeuralEnsemble/elephant
Elephant is the Electrophysiology Analysis Toolkit
data-analysis electrophysiology hacktoberfest neurophysiology neuroscience python statistics
Last synced: 09 Aug 2024
https://github.com/nshiab/simple-data-analysis
Easy-to-use and high-performance JavaScript library for data analysis.
data data-analysis data-science duckdb javascript nodejs typescript
Last synced: 31 Jul 2024
https://github.com/nshiab/simple-data-analysis.js
Easy-to-use and high-performance JavaScript library for data analysis.
data data-analysis data-science duckdb javascript nodejs typescript
Last synced: 12 Aug 2024
https://github.com/eurostat/gridviz
A package for visualizing gridded data 🌐
cartography csv d3 data data-analysis data-science data-visualization datascience geospatial gis gridded-statistics grids gridviz map map-making mapping mapping-tools maps visualization webgl
Last synced: 04 Aug 2024
https://github.com/codekitchen/pipeline
the `pipeline` shell command
data-analysis data-mining shell-scripting
Last synced: 01 Aug 2024
https://github.com/SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
big-data data-analysis data-engineering data-science data-transformation dataset etl etl-pipeline framework machine-learning modularization pipeline scala setl spark
Last synced: 01 Aug 2024
https://github.com/Azure/DataScienceVM
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
ai azure big-data data-analysis data-science deep-learning dsvm machine-learning ml python r sqlserver
Last synced: 08 Aug 2024
https://github.com/briatte/ida
Introduction to Data Analysis, using R (2013)
Last synced: 09 Aug 2024
https://github.com/totalhack/zillion
Make sense of it all. Semantic data modeling and analytics with a sprinkle of AI. https://totalhack.github.io/zillion/
ai analytics data-analysis data-warehousing datasources openai python query-builder reporting semantic-data-model semantic-layer sql text-to-sql warehouse
Last synced: 01 Aug 2024
https://github.com/markovmodel/deeptime
Deep learning meets molecular dynamics.
autoencoder computational-biology computational-chemistry data-analysis deep-learning dimension-reduction machine-learning markov-model python pytorch tensorflow time-series
Last synced: 31 Jul 2024
https://github.com/calculist/calculist
the open source thinking tool for problem solvers
data-analysis note-taking tree-structure
Last synced: 01 Aug 2024
https://github.com/toobigdata/papa
A browser-side data crawler, a data assistant for everyone
chrome data-analysis kickstarter spider
Last synced: 01 Aug 2024
https://github.com/varchar-io/nebula
A distributed block-based data storage and compute engine
access-control analytics big-data data-analysis data-visualization distributed-computing distributed-systems real-time
Last synced: 03 Aug 2024
https://github.com/cuducos/calculadora-do-cidadao
💵 Tool for Brazilian Reais monetary adjustment/correction
brasil brazil data-analysis hacktoberfest monetary python
Last synced: 30 Jul 2024
https://github.com/archd3sai/Customer-Survival-Analysis-and-Churn-Prediction
In this project, I used survival analysis models to see how the likelihood of customer churn changes over time and to calculate customer LTV. I also implemented a Random Forest model to predict whether a customer is going to churn and deployed the model as a Flask web app.
customer-churn-prediction customer-survival-analysis data-analysis explainable-ai flask-application hazard partial-dependence-plot random-forest shap-values survival-analysis
Last synced: 01 Aug 2024
https://github.com/opensource9ja/dnotebook
Dnotebook is a Jupyter-like library for JavaScript environments. It allows you to create and share pages that contain live code, text, and visualizations.
data-analysis interactive-visualizations javascript live-code notebook notebook-javascript
Last synced: 02 Aug 2024
https://github.com/ayush1997/YouTube-Like-predictor
YouTube Like Count Predictions using Machine Learning
data-analysis data-science machine-learning predictive-analysis random-forest visualization youtube-api
Last synced: 07 Aug 2024
https://github.com/DanielBok/copulae
Multivariate data modelling with Copulas in Python
conda copula copula-models copulae copulas data data-analysis dependency-analysis dependency-modeling modeling pypi pypi-packages python python3 statistics
Last synced: 31 Jul 2024
https://github.com/dogoncouch/logdissect
CLI utility and Python module for analyzing log files and other data.
cli command-line data-analysis data-science forensic-analysis forensics json library log-analysis log-parser module parser parsing parsing-library python-library python-module python-modules security syslog
Last synced: 31 Jul 2024
https://github.com/DataHaskell/dh-core
Functional data science
data-analysis data-mining data-science dataframes datahaskell datasets machine-learning numerical-methods
Last synced: 31 Jul 2024
https://github.com/moosetechnology/Moose
MOOSE - Platform for software and data analysis.
data-analysis moose pharo smalltalk software-analysis
Last synced: 03 Aug 2024
https://github.com/dirkhovy/text_analysis_for_social_science
Code for the CUP Elements on text analysis in Python for social scientists
analysis classification-models clustering data-analysis embeddings neural-networks prediction predictive-modeling social-sciences text-analysis text-classification topic-modeling
Last synced: 31 Jul 2024
https://github.com/senbox-org/snap-desktop
Desktop GUI for SNAP based on NetBeans Platform
data-analysis data-processing data-visualization desktop-application earth-observation eo linux macos remote-sensing windows
Last synced: 31 Jul 2024
https://github.com/mattansb/Practical-Applications-in-R-for-Psychologists
Lesson files for Practical Applications in R for Psychologists.
data-analysis easystats psychologists regression rstats statistics tidyverse
Last synced: 05 Aug 2024
https://github.com/globeandmail/startr
A template for data journalism in R
data data-analysis data-journalism data-visualization journalism news r
Last synced: 13 Aug 2024
https://github.com/MBB-team/VBA-toolbox
The VBA toolbox
data-analysis matlab modeling statistics toolbox vba-toolbox
Last synced: 08 Aug 2024
https://github.com/hbuschme/TextGridTools
Read, write, and manipulate Praat TextGrid files with Python
annotation data-analysis elan linguistics praat python textgrid
Last synced: 07 Aug 2024
https://github.com/davebraze/FDBeye
R tools for eyetracker workflows.
data-analysis doi eye-tracking eyelink open-source psychology-experiments r
Last synced: 31 Jul 2024
https://github.com/hay/dataknead
Effortless conversion between data formats like JSON, XML and CSV
csv data-analysis data-conversion json python python3
Last synced: 04 Aug 2024
https://github.com/apachecn/ds100-textbook-zh
:book: [Translation] UCB DS100: Principles and Techniques of Data Science
data-analysis ds100 machine-learning python textbook ucb
Last synced: 01 Aug 2024
https://github.com/abhiamishra/ggshakeR
An analysis and visualization R package that works with publicly available soccer data
analysis data-analysis data-visualization football-analytics library machine-learning plotting r soccer soccer-analytics visualization
Last synced: 02 Aug 2024
https://github.com/bccp/nbodykit
Analysis kit for large-scale structure datasets, the massively parallel way
astrophysics clustering cosmology data-analysis large-scale-structure mpi mpi4py parallel-computing python
Last synced: 09 Aug 2024
https://github.com/deanmarchiori/analysis-flow
Data Analysis Workflows & Reproducibility Learning Resources
data-analysis reproducibility reproducible-data-science reproducible-science tooling workflow
Last synced: 13 Aug 2024
https://github.com/acerbilab/pyvbmc
PyVBMC: Variational Bayesian Monte Carlo algorithm for posterior and model inference in Python
bayesian-inference data-analysis gaussian-processes machine-learning python variational-inference
Last synced: 02 Aug 2024
https://github.com/innat/ML-Resource
A concise resource repository for machine learning
data-analysis data-science deep-learning kaggle machine-learning python spark
Last synced: 02 Aug 2024
https://github.com/Nesvilab/philosopher
PeptideProphet, PTMProphet, ProteinProphet, iProphet, Abacus, and FDR filtering
bioinformatics data-analysis go mass-spectrometry ms-data proteomics
Last synced: 02 Aug 2024
https://github.com/nicohlr/ipychart
The power of Chart.js with Python
charting-library chartjs charts data data-analysis data-science data-visualization ipywidgets javascript-es6 jupyter jupyter-notebook notebook python
Last synced: 01 Aug 2024
https://github.com/sissa-data-science/DADApy
Distance-based Analysis of DAta-manifolds in python
data-analysis data-science density-based-clustering density-estimation intrinsic-dimension machine-learning manifolds python
Last synced: 02 Aug 2024
https://github.com/dysonance/Temporal.jl
Time series implementation for the Julia language focused on efficiency and flexibility
data-analysis data-structures data-visualization econometrics economics finance io julia julia-language quantitative-finance quantitative-trading scientific-computing time-series time-series-analysis timeseries
Last synced: 02 Aug 2024
https://github.com/nla-group/classix
Fast and explainable clustering in Python
algorithm classification clustering cython data-analysis data-mining data-science database dataset explainable-ml machine-learning python unsupervised-learning unsupervised-machine-learning visualization
Last synced: 03 Aug 2024
https://github.com/PetoLau/TSrepr
TSrepr: R package for time series representations
data-analysis data-mining data-mining-algorithms data-science r r-package representation time-series time-series-analysis time-series-classification time-series-clustering time-series-data-mining time-series-representations
Last synced: 01 Aug 2024
https://github.com/SciRuby/daru-view
daru-view is for easy and interactive plotting in web applications and IRuby notebooks. daru-view is a plugin gem for the existing daru gem.
charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra
Last synced: 31 Jul 2024
https://github.com/sciruby/daru-view
daru-view is for easy and interactive plotting in web applications and IRuby notebooks. daru-view is a plugin gem for the existing daru gem.
charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra
Last synced: 03 Aug 2024
https://github.com/NCAS-CMS/cf-python
A CF-compliant Earth Science data analysis library
cf cfdm cfunits data-analysis earth-science metadata netcdf pp python um
Last synced: 08 Aug 2024
https://github.com/Coorsaa/shinyMlr
shiny-mlr: Integration of the mlr package into shiny
data-analysis data-visualization machine-learning mlr r r-package shiny shiny-apps
Last synced: 13 Aug 2024
https://github.com/tidypyverse/tidypandas
A grammar of data manipulation for pandas inspired by tidyverse
data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse
Last synced: 01 Aug 2024
https://github.com/talegari/tidypandas
A grammar of data manipulation for pandas inspired by tidyverse
data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse
Last synced: 12 Aug 2024
https://github.com/firmai/business-analytics-and-mathematics-python-book
Advanced Business Analytics and Mathematics with Python (by @firmai)
analytics business data-analysis data-science mathematics python
Last synced: 04 Aug 2024
https://github.com/synthesized-io/fairlens
Identify bias and measure fairness of your data
bias data data-analysis data-science fairness pandas python statistics
Last synced: 03 Aug 2024
https://github.com/connervieira/Predator
A multi-purpose camera system focused on offline license plate and object recognition
alpr computer-vision dashcam data-analysis gps gpx-files license-plate-detection license-plate-recognition object-detection object-recognition openalpr opencv-python realtime security tensorflow video
Last synced: 06 Aug 2024
https://github.com/leerob/facebook-data-analyzer
📊 Python script to analyze the contents of your Facebook data export
beautifulsoup data-analysis facebook python
Last synced: 07 Aug 2024
https://github.com/woz-u/DS-Student-Resources
Data Science Student Companion Notebooks and Data Lake
data-analysis data-science data-visualization machine-learning nosql python r sql statistics
Last synced: 08 Aug 2024
https://github.com/jepegit/cellpy
extract and tweak data from electrochemical tests of cells
battery chemistry data-analysis electrochemistry opensource physics
Last synced: 03 Aug 2024
https://github.com/CarlosBergillos/ts2vg
Time series to visibility graphs.
cli data-analysis graph igraph network networkx python snap time-series visibility-graph
Last synced: 30 Jul 2024
https://github.com/kianweelee/Edator
A Python package that performs exploratory data analysis for users. It also generates three types of output files (a cleaned CSV, plots, and a text report).
data-analysis data-science exploratory-data-analysis
Last synced: 03 Aug 2024
https://github.com/PetoLau/petolau.github.io
Blog about time series data mining in R.
artificial-intelligence blog data-analysis data-mining data-science data-visualization forecasting machine-learning r time-series time-series-analysis time-series-clustering time-series-data-mining time-series-forecasting time-series-prediction
Last synced: 02 Aug 2024
https://github.com/capitalone/dataCompareR
dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
compare-data data data-analysis data-science r
Last synced: 13 Aug 2024
https://github.com/PolyMathOrg/DataFrame
DataFrame in Pharo - tabular data structures for data analysis
data-analysis data-frame data-science data-visualization gsoc hacktoberfest pharo pharo-smalltalk smalltalk statistics tabular-data
Last synced: 03 Aug 2024
https://github.com/hsbc/tslumen
A library for Time Series EDA (exploratory data analysis)
analysis data-analysis data-science data-visualization eda exploratory-data-analysis exploratory-data-visualizations pandas profiling python time-series time-series-analysis time-series-eda time-series-profiling timeseries timeseries-analysis timeseries-eda
Last synced: 01 Aug 2024
https://github.com/apachecn/pandas-cookbook-code-notes
:book: Pandas Cookbook annotated source code
code data-analysis notes pandas python
Last synced: 02 Aug 2024
https://github.com/paezha/spatial-analysis-r
Open Educational Resource for teaching spatial data analysis and statistics with R
data-analysis open-educational-resource r r-package r-spatial rstats spatial-data-analysis spatial-statistics statistics
Last synced: 31 Jul 2024
https://github.com/cvjena/libmaxdiv
Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.
anomalydetection anomalydiscovery data-analysis data-mining datamining machine-learning machine-learning-library machinelearning time-series timeseries
Last synced: 01 Aug 2024
https://github.com/datadesk/lapd-crime-classification-analysis
A Los Angeles Times analysis of serious assaults misclassified by LAPD
crime crime-data data-analysis data-journalism journalism jupyter-notebook machine-learning news pandas python
Last synced: 01 Aug 2024
https://github.com/404notf0und/FXY
Security-Scenes-Feature-Engineering-Toolkit, Continuous Integration. A security data featurization tool.
data-analysis data-mining feature-engineering machine-learning security security-scenes
Last synced: 04 Aug 2024
https://github.com/jmwoloso/pychattr
Python Channel Attribution (pychattr) - A Python implementation of the excellent R ChannelAttribution library
channel-attribution data-analysis data-science machine-learning python python-channel-attribution rpy2 wrapper
Last synced: 02 Aug 2024