Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists analyze and prepare data, and their findings inform high-level decisions in many organizations.

https://github.com/minusxai/minusx

MinusX is an AI Data Scientist for Analytics Apps you already use and love. Currently it supports Jupyter, Metabase, & Posthog.

artificial-intelligence data-analytics data-science jupyter metabase

Last synced: 11 Oct 2024

https://github.com/whitews/flowkit

A Python toolkit for flow cytometry analysis supporting GatingML and FlowJo workspaces

cytometry data-science fcs fcs-files flow-cytometry flow-cytometry-analysis flowjo gatingml immunology python

Last synced: 11 Nov 2024

https://github.com/emilhvitfeldt/r-text-data

List of textual data sources to be used for text mining in R

data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext

Last synced: 30 Oct 2024

https://github.com/mybridge/learn-python

Python Top 45 Articles of 2017

algorithm data-science machine-learning python python3

Last synced: 07 Nov 2024

https://github.com/EmilHvitfeldt/R-text-data

List of textual data sources to be used for text mining in R

data-science nlp rstats text-analysis text-analytics-in-r text-mining tidytext

Last synced: 05 Aug 2024

https://github.com/voila-dashboards/voici

Voici turns any Jupyter Notebook into a static web application

dashboards data-science emscripten jupyter jupyterlite voila-dashboard wasm

Last synced: 04 Sep 2024

https://github.com/rivasiker/ggHoriPlot

A user-friendly, highly customizable R package for building horizon plots in ggplot2

data-science data-visualization ggplot2 horizon-plots r r-package

Last synced: 12 Nov 2024

https://github.com/aws-samples/aws-ml-jp

A collection of notebooks and teaching materials for learning how to build, train, and deploy machine learning models with SageMaker

aws data-science deep-learning jupyter-notebook machine-learning mlops sagemaker

Last synced: 08 Nov 2024

https://github.com/arabacibahadir/sup-res

A great companion for finding key support and resistance levels on financial charts, including cryptocurrencies.

algotrade analysis binance binance-api bitcoin cryptocurrency data-science finance pandas pinescript python stock telegram telegram-bot tradingview

Last synced: 27 Oct 2024

https://github.com/apache/incubator-liminal

Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a domain-specific language to build ML workflows on top of Apache Airflow.

ai airflow big-data data-science machine-learning ml workflows

Last synced: 01 Oct 2024

https://github.com/dlab-berkeley/R-Fundamentals-Legacy

D-Lab's 12-hour introduction to R Fundamentals. Learn how to create variables and functions, manipulate data frames, make visualizations, use control flow structures, and more, using R in RStudio.

automation data-science data-visualization data-wrangling r

Last synced: 11 Nov 2024

https://github.com/rk2900/drsa

Deep Recurrent Survival Analysis, an auto-regressive deep model for time-to-event data analysis with censorship handling. An implementation of our AAAI 2019 paper and a benchmark for several (Python) implemented survival analysis methods.

data-science deep-learning machine-learning survival-analysis

Last synced: 07 Nov 2024

https://github.com/jupyterhub/repo2docker-action

A GitHub action to build data science environment images with repo2docker and push them to registries.

actions binder data-science datascience docker jupyter jupyter-notebook repo2docker repo2docker-action

Last synced: 08 Nov 2024

https://github.com/h2oai/wave-apps

Sample AI Apps built with H2O Wave.

data-science h2oai hacktoberfest low-code machine-learning python3

Last synced: 06 Nov 2024

https://rivasiker.github.io/ggHoriPlot/

A user-friendly, highly customizable R package for building horizon plots in ggplot2

data-science data-visualization ggplot2 horizon-plots r r-package

Last synced: 02 Aug 2024

https://github.com/picnicml/doddle-model

:cake: doddle-model: machine learning in Scala.

breeze data-science doddle-model machine-learning scala

Last synced: 04 Aug 2024

https://github.com/hamelsmu/seq2seq_tutorial

Code For Medium Article "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models"

data-science deep-learning deeplearning keras keras-tutorials machine-learning medium-article nlp-machine-learning rnn-encoder-decoder seq2seq-tutorial sequence-to-sequence

Last synced: 27 Oct 2024

https://github.com/hamelsmu/Seq2Seq_Tutorial

Code For Medium Article "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models"

data-science deep-learning deeplearning keras keras-tutorials machine-learning medium-article nlp-machine-learning rnn-encoder-decoder seq2seq-tutorial sequence-to-sequence

Last synced: 29 Oct 2024

https://github.com/gzuidhof/zarr.js

JavaScript implementation of Zarr

array data-science gehlenborglab javascript typescript zarr

Last synced: 30 Oct 2024

https://github.com/jacobgil/confidenceinterval

The long-missing library for Python confidence intervals

data-science machine-learning metrics statistics

Last synced: 12 Nov 2024

https://github.com/ing-bank/probatus

Validation (like Recursive Feature Elimination for SHAP) of (multiclass) classifiers & regressors and data used to develop them.

binary-classifiers data-analysis data-science feature-elimination machine-learning multi-class-classification recursive-feature-elimination regressors shap statistics tree-model

Last synced: 08 Nov 2024

https://github.com/morganjwilliams/pyrolite

A set of tools for getting the most from your geochemical data.

chemistry data-science geochemical-data geochemistry geoscience pyrolite ternary-diagrams

Last synced: 25 Oct 2024

https://github.com/njtierney/rmd4sci

Rmarkdown for Scientists

book bookdown data-science r rmarkdown rstats science

Last synced: 27 Oct 2024

https://github.com/machine-learning-apps/ml-template-azure

Template for getting started with automated ML Ops on Azure Machine Learning

aml azure azure-machine-learning data-science machine-learning machine-learning-lifecycle mlops

Last synced: 02 Nov 2024

https://github.com/RamiKrispin/Introduction-to-Docker

(WIP) Getting started with Docker - An introduction to Docker with data science and engineering applications

data-engineering data-science docker dockerfile

Last synced: 25 Oct 2024

https://github.com/scitime/scitime

Training time estimation for scikit-learn algorithms

data-science machine-learning python scikit-learn timer

Last synced: 01 Nov 2024

https://github.com/romanmichaelpaolucci/AI_Stock_Trading

Design pattern for critical stages in the development process of an AI Stock Trading Bot

artificial-intelligence data-science machine-learning neural-network python trading trading-algorithms trading-bot trading-strategies

Last synced: 07 Nov 2024

https://github.com/suji04/normalizednerd

Codes for the videos of my YouTube channel

data-science machine-learning python tutorial youtube

Last synced: 10 Nov 2024

https://github.com/scrapinghub/python-simhash

An efficient simhash implementation for python

data-science

Last synced: 10 Nov 2024

https://github.com/vkoul/Econ-Data-Science

Articles, journals, and videos related to Economics :chart_with_upwards_trend: and Data Science :bar_chart:

casual-inference data-science econometrics economics economist machine-learning social-sciences

Last synced: 02 Aug 2024

https://github.com/winvector/pyvtreat

vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.

data-science machine-learning pydata python

Last synced: 07 Nov 2024

https://github.com/autoviml/deep_autoviml

Build tensorflow keras model pipelines in a single line of code. Now with mlflow tracking. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.

autokeras automl data-science deep-learning gcp keras machine-learning mlflow mljar pycaret python tensorflow tensorflow2 tpot

Last synced: 10 Oct 2024

https://github.com/jadianes/spark-r-notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

big-data bigdata data-analysis data-science exploratory-data-analysis jupyter jupyter-notebook notebook r sparkr

Last synced: 09 Nov 2024

https://github.com/napjon/krisk

Statistical Interactive Visualization with pandas+Jupyter integration on top of Echarts.

dashboard data-science data-visualization echarts interactive-charts jupyter-notebook python

Last synced: 31 Oct 2024

https://github.com/yandexdataschool/roc_comparison

The fast version of DeLong's method for computing the covariance of unadjusted AUC.

data-science statistics

Last synced: 06 Nov 2024

https://github.com/autoviml/pandas_dq

Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.

data data-science dataquality dataqualitycheck machine-learning pandas python scikit-learn

Last synced: 31 Oct 2024

https://github.com/diffusionkinetics/open

DiffusionKinetics open-source monorepo

data-science haskell

Last synced: 11 Nov 2024

https://github.com/WinVector/pyvtreat

vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.

data-science machine-learning pydata python

Last synced: 05 Aug 2024

https://github.com/winvector/data_algebra

Codd method-chained SQL generator and Pandas data processing in Python.

data-analysis data-science pandas python

Last synced: 07 Nov 2024

https://github.com/medtagger/MedTagger

A collaborative framework for annotating medical datasets using crowdsourcing.

crowdsourcing data-science data-validation deep-learning labeling medical-imaging

Last synced: 03 Aug 2024

https://github.com/LankyCyril/pyvenn

Python module for plotting Venn diagrams of 2..6 sets

data-science matplotlib matplotlib-venn venn venn-diagram venndiagram visualization

Last synced: 03 Aug 2024

https://github.com/mybridge/learn-machine-learning

Learn to Build a Machine Learning Application from Top Articles

computer-vision data-science deep-learning machine-learning neural-networks

Last synced: 07 Nov 2024

https://github.com/ColtAllen/btyd

Buy Till You Die and Customer Lifetime Value statistical models in Python.

bayesian buy-til-you-die customer-lifetime-value data-science python

Last synced: 02 Aug 2024

https://github.com/ujjwalkarn/xda

R package for exploratory data analysis

data-analysis data-science exploratory-data-analysis r

Last synced: 11 Nov 2024

https://github.com/alexandervnikitin/tsgm

Generation and evaluation of synthetic time series datasets (also, augmentations, visualizations, a collection of popular datasets)

augmentations data-augmentation data-science datasets deep-learning generative-model keras machine-learning python synthetic-data synthetic-time-series tensorflow2 time-series vae

Last synced: 13 Oct 2024

https://github.com/solegalli/hyperparameter-optimization

Code repository for the online course Hyperparameter Optimization for Machine Learning

data-science hyperopt hyperparameter-optimization machine-learning optuna python scikit-optimize

Last synced: 30 Oct 2024

https://github.com/jovianhq/jovian-py

Collaboration platform for data science projects & Jupyter notebooks

data-science deep-learning jupyter-notebook machine-learning ml

Last synced: 02 Nov 2024

https://github.com/JovianHQ/jovian-py

Collaboration platform for data science projects & Jupyter notebooks

data-science deep-learning jupyter-notebook machine-learning ml

Last synced: 11 Oct 2024

https://github.com/jayantgoel001/jayantgoel001

JayantGoel001's profile with 111 stars ⭐ and 110 forks 🎉.

android data-science devops git github mean-stack portfolio profile readme web-development

Last synced: 12 Nov 2024

https://github.com/lawmurray/Birch

A probabilistic programming language that combines automatic differentiation, automatic marginalization, and automatic conditioning within Monte Carlo methods.

autodiff bayesian bayesian-inference bayesian-methods bayesian-statistics data-science machine-learning machine-learning-algorithms machine-learning-projects monte-carlo-methods monte-carlo-sampling probabilistic-programming-languages statistics

Last synced: 30 Oct 2024

https://github.com/lsys/forestplot

A Python package to make publication-ready but customizable coefficient plots.

coefficientplot data-science data-visualization dataviz forestplot matplotlib python visualization

Last synced: 02 Nov 2024

https://github.com/innat/ML-Resource

A concise resource repository for machine learning

data-analysis data-science deep-learning kaggle machine-learning python spark

Last synced: 11 Nov 2024

https://github.com/imsanjoykb/data-science-regular-bootcamp

Regular practice in Data Science, Machine Learning, and Deep Learning, covering ML project problems and analytical issues, to regularly boost my knowledge. The goal is to help learners with learning resources in the Data Science field.

artificial-intelligence data-analysis data-science data-science-notebook data-science-projects data-visualization database-connection deep-learning etl-pipeline etl-process feature-engineering machine-learning mysql-database neural-network numpy pandas postgresql python python-automation sqlite

Last synced: 12 Oct 2024