https://github.com/ksachdeva/rethinking-tensorflow-probability

Statistical Rethinking (2nd Ed) with Tensorflow Probability

bayesian-inference bayesian-statistics markov-chain-monte-carlo statistical-rethinking tensorflow tensorflow-probability tutorials variational-inference

# Statistical Rethinking (2nd Edition) with Tensorflow Probability

This repository provides Jupyter notebooks that port the R code fragments found in the
chapters of [Statistical Rethinking 2nd Edition](https://xcelab.net/rm/statistical-rethinking/) by Professor Richard McElreath to Python, using the TensorFlow Probability (TFP) framework.

Note - These notebooks are based on the 8th December 2019 draft. I will update the notebooks once the book is released.

## Misc Notes

* **Why Tensorflow Probability?** There are many great probabilistic programming frameworks (PPLs) out there. I especially like `Numpyro` and `PyMC3`. There are two main reasons why I chose to do this exercise in TFP:
  * The first and main reason is to avoid relying on library magic. Higher-level libraries sometimes hide details that one needs in order to truly understand the subject. As a matter of fact, working with TFP has made me more appreciative of these high-level libraries: they not only provide great helpers but also make the code easy to read and reuse.
  * The second is that I have other investments in the TensorFlow ecosystem, so I am not keen on switching to PyTorch, even though I really like what the Pyro team has done.

> For production use, I strongly recommend one of these higher-level libraries, i.e. `Numpyro` or `PyMC3`.

* **What worked?** Well, of course, this book is the best there is in this area. The community is also great: I got quick responses from the TensorFlow Probability team whenever I asked questions on the TFP Google group.

* **What was hard?** This may be a tad subjective, because I am **challenged** when it comes to manipulating shapes (high-dimensional arrays). I find NumPy difficult, and TensorFlow is much harder when it comes to working with multi-dimensional arrays. This is one of the main problems I have faced and continue to face. Another problem is that the stack traces generated by TFP can be really difficult to understand. This is mostly a side effect of graph execution, which makes debugging difficult. Quite often things would work as long as I used only one chain, but working with multiple chains requires paying special attention to the shapes and batch dimensions of the various tensors and distributions.

* **Visualization** I have made use of `arviz`, and to do that I converted the output of the various sampling procedures to the format/structure it requires. This led me to discover and learn `xarray`. It was well worth the effort and made the graphs easy to plot.
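
The two shape issues mentioned above (multiple chains and the layout expected by `arviz`) can be illustrated with a small sketch. It is not code from the notebooks: it uses plain NumPy as a stand-in for TensorFlow so that it runs without extra dependencies, and the function names `log_prob_per_chain` and `to_chain_draw` are hypothetical. It assumes the usual conventions that a multi-chain log-probability function returns one value per chain, that `tfp.mcmc.sample_chain` returns samples with the draw axis first and the chain axis second, and that `arviz` expects the chain axis first.

```python
import numpy as np


def log_prob_per_chain(mu, data, sigma=1.0):
    """Gaussian log-likelihood of `data`, evaluated once per chain.

    mu:   shape (num_chains,) or a scalar, one parameter value per chain
    data: shape (num_obs,), shared by every chain
    returns: shape (num_chains,), or a scalar for a scalar `mu`
    """
    mu = np.asarray(mu, dtype=float)
    # Add a trailing axis so (num_chains, 1) broadcasts against (num_obs,)
    # and produces (num_chains, num_obs).
    resid = (data - mu[..., np.newaxis]) / sigma
    per_obs = -0.5 * resid**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)
    # Sum over observations only; the chain axis must survive.
    return per_obs.sum(axis=-1)


def to_chain_draw(samples):
    """Reorder (num_draws, num_chains, ...) to (num_chains, num_draws, ...)."""
    return np.swapaxes(samples, 0, 1)


rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=50)

num_chains, num_draws = 4, 100
mu = np.zeros(num_chains)          # one starting value per chain
lp = log_prob_per_chain(mu, data)
print(lp.shape)                    # (4,)

# Random numbers standing in for real sampler output, draw axis first.
fake_draws = rng.normal(size=(num_draws, num_chains))
posterior = {"mu": to_chain_draw(fake_draws)}
print(posterior["mu"].shape)       # (4, 100)
```

A dictionary shaped like `posterior` is the kind of input that `arviz.from_dict(posterior=...)` accepts. Forgetting the trailing axis in `log_prob_per_chain` is the sort of mistake that goes unnoticed with a single chain and only fails, or silently broadcasts the wrong way, once more chains are added.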

## Chapters

If you prefer a read-only view of the notebooks (HTML pages), use this link - [https://ksachdeva.github.io/rethinking-tensorflow-probability/](https://ksachdeva.github.io/rethinking-tensorflow-probability/)

If you want to run the notebooks locally -

```bash
# install the requirements
pip install -r requirements.txt
# install jupyter in your virtual environment
pip install -r requirements-extra.txt
# do the dev setup (as some common code resides in rethinking module)
pip install -e .
```

If you prefer to run the notebooks in Binder, click here [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ksachdeva/rethinking-tensorflow-probability/master)

*Clicking on the links below will open the notebooks in Google Colab.*

* [Preface](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/preface.ipynb)

* [Chapter 1 - The Golem of Prague](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/01_the_golem_of_prague.ipynb)

* [Chapter 2 - Small worlds & large worlds](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/02_small_worlds_and_large_worlds.ipynb)

* [Chapter 3 - Sampling the Imaginary](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/03_sampling_the_imaginary.ipynb)

* [Chapter 4 - Geocentric Models](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/04_geocentric_models.ipynb)

* [Chapter 5 - The Many Variables and The Spurious Waffles](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/05_the_many_variables_and_the_spurious_waffles.ipynb)

* [Chapter 6 - The Haunted DAG & The Causal Terror](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/06_the_haunted_dag_and_the_causal_terror.ipynb)

* [Chapter 7 - Ulysses' Compass](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/07_ulysses_compass.ipynb)

* [Chapter 8 - Conditional Manatees](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/08_conditional_manatees.ipynb)

* [Chapter 9 - Markov Chain Monte Carlo](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/09_markov_chain_monte_carlo.ipynb)

* [Chapter 10 - Big Entropy and The Generalized Linear Model](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/10_big_entropy_and_the_generalized_linear_model.ipynb)

* [Chapter 11 - God Spiked the Integers (WIP)](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/11_god_spiked_the_integers.ipynb)

* [Chapter 12 - Monsters and Mixtures (WIP)](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/12_monsters_and_mixtures.ipynb)

* [Chapter 13 - Models with Memory](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/13_models_with_memory.ipynb)

* [Chapter 14 - Adventures in Covariance (WIP)](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/14_adventures_in_covariance.ipynb)

* [Chapter 15 - Missing data & Other Opportunities (WIP)](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/15_missing_data_and_other_opportunities.ipynb)

* [Chapter 16 - Generalized Linear Madness](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/16_generalized_linear_madness.ipynb)

* [Chapter 17 - Horoscopes](https://colab.research.google.com/github/ksachdeva/rethinking-tensorflow-probability/blob/master/notebooks/17_horoscopes.ipynb)

## Acknowledgements

My immense gratitude goes to Professor Richard McElreath for writing such a wonderful book. His method of teaching has made the somewhat difficult subject of Bayesian statistics approachable, interesting and, to some extent, fun as well. We need more educators like you, Sir!

Another person I want to thank is Du Phan (https://github.com/fehiepsi). He is the main author of [Numpyro](https://github.com/pyro-ppl/numpyro), a great framework for Bayesian analysis. He has ported Statistical Rethinking (2nd Ed) to Numpyro, and his notebooks were not only inspirational but also of great help to me in creating graphs. I borrowed most of his code fragments for plotting the figures with matplotlib.