Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/boostorg/histogram

Fast multi-dimensional generalized histogram with a convenient interface for C++14

boost boost-libraries c-plus-plus c-plus-plus-14 convenient convenient-interface data-analysis header-only histogram statistics

Last synced: 02 Aug 2024

https://github.com/databrickslabs/tempo

API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, downsampling, and interpolation

data-analysis data-science pandas python scala time-series timeseries timeseries-analysis timeseries-data

Last synced: 02 Aug 2024

https://github.com/CJWorkbench/cjworkbench

The data journalism platform with built-in training

data-analysis data-journalism data-science data-visualization journalism notebook

Last synced: 06 Aug 2024

https://github.com/PydPiper/pylightxl

A lightweight, zero-dependency, minimal-functionality Excel reader/writer Python library

api data-analysis excel microsoft office pypi python python-library python2 python3

Last synced: 31 Jul 2024

https://github.com/Derek-Jones/ESEUR-book

Issue handling for Evidence-based Software Engineering: based on the publicly available data

book data-analysis empirical-research engineering-data evidence-based human-cognitive-characteristics software-development software-engineering

Last synced: 07 Aug 2024

https://github.com/X-lab2017/open-digger

Open source analysis tools

data-analysis github hacktoberfest openrank

Last synced: 31 Jul 2024

https://github.com/rasgointelligence/RasgoQL

Write Python locally, execute SQL in your data warehouse

data-analysis data-science pandas python sql

Last synced: 08 Aug 2024

https://github.com/cloudberrydb/cloudberrydb

Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.

ai cloudberrydb data-analysis data-warehouse database database-management gpdb greenplum greenplum-database mpp olap postgres postgresql postgresql-database sql

Last synced: 31 Jul 2024

https://github.com/lucasxlu/LagouJob

Data Analysis & Mining for lagou.com

data-analysis data-mining lagou machine-learning nlp python3 web-crawler

Last synced: 06 Aug 2024

https://github.com/nickslevine/zebras

Data analysis library for JavaScript built with Ramda

data-analysis data-science functional-programming javascript pandas ramda

Last synced: 01 Aug 2024

https://github.com/acerbilab/vbmc

Variational Bayesian Monte Carlo (VBMC) algorithm for posterior and model inference in MATLAB

bayesian-inference data-analysis gaussian-processes machine-learning matlab variational-inference

Last synced: 31 Jul 2024

https://github.com/dataplane-app/dataplane

Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule, and design data pipelines and workflows. Dataplane is written in Golang with a React front end.

airflow data data-analysis data-engineering data-integration data-pipelines data-science dataplane datawarehouse etl finance golang kubernetes pipelines robotics-process-automation rpa scheduler workflow workflow-automation workflows

Last synced: 02 Aug 2024

https://github.com/aws/amazon-redshift-python-driver

Redshift Python Connector. It supports Python Database API Specification v2.0.

amazon-redshift aws-redshift data-analysis data-science

Last synced: 01 Aug 2024

https://github.com/ayush1997/visualize_ML

Python package for consolidated and extensive univariate and bivariate data analysis and visualization, catering to both categorical and continuous datasets.

data-analysis machine-learning matplotlib python statisics visualization

Last synced: 30 Jul 2024

https://github.com/nshiab/simple-data-analysis

Easy-to-use and high-performance JavaScript library for data analysis.

data data-analysis data-science duckdb javascript nodejs typescript

Last synced: 31 Jul 2024

https://github.com/nshiab/simple-data-analysis.js

Easy-to-use and high-performance JavaScript library for data analysis.

data data-analysis data-science duckdb javascript nodejs typescript

Last synced: 12 Aug 2024

https://github.com/codekitchen/pipeline

the `pipeline` shell command

data-analysis data-mining shell-scripting

Last synced: 01 Aug 2024

https://github.com/Azure/DataScienceVM

Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)

ai azure big-data data-analysis data-science deep-learning dsvm machine-learning ml python r sqlserver

Last synced: 08 Aug 2024

https://github.com/briatte/ida

Introduction to Data Analysis, using R (2013)

course data-analysis r

Last synced: 09 Aug 2024

https://github.com/totalhack/zillion

Make sense of it all. Semantic data modeling and analytics with a sprinkle of AI. https://totalhack.github.io/zillion/

ai analytics data-analysis data-warehousing datasources openai python query-builder reporting semantic-data-model semantic-layer sql text-to-sql warehouse

Last synced: 01 Aug 2024

https://github.com/calculist/calculist

the open source thinking tool for problem solvers

data-analysis note-taking tree-structure

Last synced: 01 Aug 2024

https://github.com/toobigdata/papa

A browser-side data crawler: a data assistant for everyone

chrome data-analysis kickstarter spider

Last synced: 01 Aug 2024

https://github.com/cuducos/calculadora-do-cidadao

💵 Tool for Brazilian Reais monetary adjustment/correction

brasil brazil data-analysis hacktoberfest monetary python

Last synced: 30 Jul 2024

https://github.com/archd3sai/Customer-Survival-Analysis-and-Churn-Prediction

In this project, I have used survival analysis models to see how the likelihood of customer churn changes over time and to calculate customer LTV. I have also implemented a Random Forest model to predict whether a customer is going to churn and deployed the model using a Flask web app.

customer-churn-prediction customer-survival-analysis data-analysis explainable-ai flask-application hazard partial-dependence-plot random-forest shap-values survival-analysis

Last synced: 01 Aug 2024

https://github.com/opensource9ja/dnotebook

Dnotebook is a Jupyter-like library for JavaScript environments. It allows you to create and share pages that contain live code, text, and visualizations.

data-analysis interactive-visualizations javascript live-code notebook notebook-javascript

Last synced: 02 Aug 2024

https://github.com/moosetechnology/Moose

MOOSE - Platform for software and data analysis.

data-analysis moose pharo smalltalk software-analysis

Last synced: 03 Aug 2024

https://github.com/mattansb/Practical-Applications-in-R-for-Psychologists

Lesson files for Practical Applications in R for Psychologists.

data-analysis easystats psychologists regression rstats statistics tidyverse

Last synced: 05 Aug 2024

https://github.com/hbuschme/TextGridTools

Read, write, and manipulate Praat TextGrid files with Python

annotation data-analysis elan linguistics praat python textgrid

Last synced: 07 Aug 2024

https://github.com/hay/dataknead

Effortless conversion between data formats like JSON, XML and CSV

csv data-analysis data-conversion json python python3

Last synced: 04 Aug 2024

https://github.com/apachecn/ds100-textbook-zh

:book: [Translation] UCB DS100: Principles and Techniques of Data Science

data-analysis ds100 machine-learning python textbook ucb

Last synced: 01 Aug 2024

https://github.com/abhiamishra/ggshakeR

An analysis and visualization R package that works with publicly available soccer data

analysis data-analysis data-visualization football-analytics library machine-learning plotting r soccer soccer-analytics visualization

Last synced: 02 Aug 2024

https://github.com/bccp/nbodykit

Analysis kit for large-scale structure datasets, the massively parallel way

astrophysics clustering cosmology data-analysis large-scale-structure mpi mpi4py parallel-computing python

Last synced: 09 Aug 2024

https://github.com/deanmarchiori/analysis-flow

Data Analysis Workflows & Reproducibility Learning Resources

data-analysis reproducibility reproducible-data-science reproducible-science tooling workflow

Last synced: 13 Aug 2024

https://github.com/acerbilab/pyvbmc

PyVBMC: Variational Bayesian Monte Carlo algorithm for posterior and model inference in Python

bayesian-inference data-analysis gaussian-processes machine-learning python variational-inference

Last synced: 02 Aug 2024

https://github.com/innat/ML-Resource

A concise resource repository for machine learning

data-analysis data-science deep-learning kaggle machine-learning python spark

Last synced: 02 Aug 2024

https://github.com/Nesvilab/philosopher

PeptideProphet, PTMProphet, ProteinProphet, iProphet, Abacus, and FDR filtering

bioinformatics data-analysis go mass-spectrometry ms-data proteomics

Last synced: 02 Aug 2024

https://github.com/SciRuby/daru-view

daru-view is for easy and interactive plotting in web applications and IRuby notebooks. daru-view is a plugin gem for the existing daru gem.

charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra

Last synced: 31 Jul 2024

https://github.com/sciruby/daru-view

daru-view is for easy and interactive plotting in web applications and IRuby notebooks. daru-view is a plugin gem for the existing daru gem.

charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra

Last synced: 03 Aug 2024

https://github.com/NCAS-CMS/cf-python

A CF-compliant Earth Science data analysis library

cf cfdm cfunits data-analysis earth-science metadata netcdf pp python um

Last synced: 08 Aug 2024

https://github.com/Coorsaa/shinyMlr

shiny-mlr: Integration of the mlr package into shiny

data-analysis data-visualization machine-learning mlr r r-package shiny shiny-apps

Last synced: 13 Aug 2024

https://github.com/tidypyverse/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 01 Aug 2024

https://github.com/talegari/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 12 Aug 2024

https://github.com/firmai/business-analytics-and-mathematics-python-book

Advanced Business Analytics and Mathematics with Python (by @firmai)

analytics business data-analysis data-science mathematics python

Last synced: 04 Aug 2024

https://github.com/synthesized-io/fairlens

Identify bias and measure fairness of your data

bias data data-analysis data-science fairness pandas python statistics

Last synced: 03 Aug 2024

https://github.com/leerob/facebook-data-analyzer

📊Python script to analyze the contents of your Facebook data export

beautifulsoup data-analysis facebook python

Last synced: 07 Aug 2024

https://github.com/woz-u/DS-Student-Resources

Data Science Student Companion Notebooks and Data Lake

data-analysis data-science data-visualization machine-learning nosql python r sql statistics

Last synced: 08 Aug 2024

https://github.com/jepegit/cellpy

Extract and tweak data from electrochemical tests of cells

battery chemistry data-analysis electrochemistry opensource physics

Last synced: 03 Aug 2024

https://github.com/kianweelee/Edator

A Python package that performs exploratory data analysis for users. It also generates three types of output files (a cleaned CSV, plots, and a text report).

data-analysis data-science exploratory-data-analysis

Last synced: 03 Aug 2024

https://github.com/capitalone/dataCompareR

dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.

compare-data data data-analysis data-science r

Last synced: 13 Aug 2024

https://github.com/apachecn/pandas-cookbook-code-notes

:book: Pandas Cookbook source code with annotations

code data-analysis notes pandas python

Last synced: 02 Aug 2024

https://github.com/paezha/spatial-analysis-r

Open Educational Resource for teaching spatial data analysis and statistics with R

data-analysis open-educational-resource r r-package r-spatial rstats spatial-data-analysis spatial-statistics statistics

Last synced: 31 Jul 2024

https://github.com/cvjena/libmaxdiv

Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.

anomalydetection anomalydiscovery data-analysis data-mining datamining machine-learning machine-learning-library machinelearning time-series timeseries

Last synced: 01 Aug 2024

https://github.com/404notf0und/FXY

Security-Scenes-Feature-Engineering-Toolkit, Continuous Integration. A security data featurization tool.

data-analysis data-mining feature-engineering machine-learning security security-scenes

Last synced: 04 Aug 2024

https://github.com/jmwoloso/pychattr

Python Channel Attribution (pychattr) - A Python implementation of the excellent R ChannelAttribution library

channel-attribution data-analysis data-science machine-learning python python-channel-attribution rpy2 wrapper

Last synced: 02 Aug 2024