# Uncertainty Baselines

[![Tests](https://github.com/google/uncertainty-baselines/actions/workflows/ci.yml/badge.svg)](https://github.com/google/uncertainty-baselines/actions/workflows/ci.yml)

The goal of Uncertainty Baselines is to provide a template for researchers to
build on. The baselines can be a starting point for any new ideas, applications,
and/or for communicating with other uncertainty and robustness researchers. This
is done in three ways:

1. Provide high-quality implementations of standard and state-of-the-art methods
   on standard tasks.
2. Have minimal dependencies on other files in the codebase.
   Baselines should be easily forkable without relying on other baselines and
   generic modules.
3. Prescribe best practices for uncertainty and robustness benchmarking.

__Motivation.__ There are many uncertainty and robustness implementations across
GitHub. However, they are typically one-off experiments for a specific paper
(many papers don't even have code). There are no clear examples that uncertainty
researchers can build on to quickly prototype their work. Everyone must
implement their own baseline. In fact, even on standard tasks, every project
differs slightly in its experiment setup, whether in architectures,
hyperparameters, or data preprocessing. This makes it difficult to compare
properly against baselines.

## Installation

To install the latest development version, run

```sh
pip install "git+https://github.com/google/uncertainty-baselines.git#egg=uncertainty_baselines"
```

There is not yet a stable version (nor an official release of this library). All
APIs are subject to change. Installing `uncertainty_baselines` does not
automatically install any backend. For TensorFlow, you will need to install
TensorFlow (`tensorflow` or `tf-nightly`), TensorFlow Addons
(`tensorflow-addons` or `tfa-nightly`), and TensorBoard (`tensorboard` or
`tb-nightly`). See `setup.py` for the extra dependencies you can install.

## Usage

### Baselines

The
[`baselines/`](https://github.com/google/uncertainty-baselines/tree/main/baselines)
directory includes all the baselines, organized by their training dataset.
For example,
[`baselines/cifar/deterministic.py`](https://github.com/google/uncertainty-baselines/tree/main/baselines/cifar/deterministic.py)
is a Wide ResNet 28-10 that obtains 96.0% test accuracy on CIFAR-10.

__Launching with TPUs.__ You often need TPUs to reproduce baselines. There are three options:
1. __Colab.__
   [Colab offers free TPUs](https://colab.research.google.com/notebooks/tpu.ipynb).
   This is the most convenient and budget-friendly option. You can experiment
   with a baseline by copying its script and running it from scratch. This works
   well for simple experimentation. However, be careful relying on Colab
   long-term: TPU access isn't guaranteed, and Colab can only go so far for
   managing multiple long experiments.

2. __Google Cloud.__
   This is the most flexible option. First, you'll need to create a virtual
   machine instance (details
   [here](https://cloud.google.com/compute/docs/instances/create-start-instance)).

    Here's an example that launches the BatchEnsemble baseline on CIFAR-10. We
    assume a few environment variables which are set up with the Cloud TPU
    (details [here](https://cloud.google.com/tpu/docs/quickstart)).

    ```sh
    export BUCKET=gs://bucket-name
    export TPU_NAME=ub-cifar-batchensemble
    export DATA_DIR=$BUCKET/tensorflow_datasets
    export OUTPUT_DIR=$BUCKET/model

    python baselines/cifar/batchensemble.py \
        --tpu=$TPU_NAME \
        --data_dir=$DATA_DIR \
        --output_dir=$OUTPUT_DIR
    ```

    Note that the TPU's accelerator type must align with the number of cores for
    the baseline (`num_cores` flag). In this example, BatchEnsemble uses a
    default of `num_cores=8`, so the TPU must be set up with
    `accelerator_type=v3-8`.

3. __Change the flags.__ For example, go from 8 TPU cores to 8 GPUs, or reduce
   the number of cores to train the baseline.

    ```sh
    python baselines/cifar/batchensemble.py \
        --data_dir=/tmp/tensorflow_datasets \
        --output_dir=/tmp/model \
        --use_gpu=True \
        --num_cores=8
    ```

    Results may be similar, but ultimately all bets are off. GPU vs. TPU may not
    make much of a difference in practice, especially if you use the same
    numerical precision. However, changing the number of cores matters a lot.
    The total batch size during each training step is often determined by
    `num_cores`, so be careful!
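The caveat above can be made concrete in a few lines. This is a hedged sketch of how the global batch size typically scales with `num_cores`; the name `per_core_batch_size` is an assumption for illustration, so check each baseline script for its exact flags:

```python
# Sketch: how the effective (global) batch size relates to the number of
# cores. Many scripts compute it this way, but verify against each baseline.

def global_batch_size(per_core_batch_size: int, num_cores: int) -> int:
    """Total number of examples consumed per training step."""
    return per_core_batch_size * num_cores

# Halving the cores without touching the per-core size halves the global
# batch size, which usually also requires re-tuning the learning rate.
print(global_batch_size(per_core_batch_size=64, num_cores=8))  # 512
print(global_batch_size(per_core_batch_size=64, num_cores=4))  # 256
```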
### Datasets

The
[`ub.datasets`](https://github.com/google/uncertainty-baselines/tree/main/uncertainty_baselines/datasets)
module consists of datasets following the
[TensorFlow Datasets](https://www.tensorflow.org/datasets) API.
They add minimal logic such as default data preprocessing.
Note: in an IPython/Colab notebook, you may need to enable TensorFlow eager
execution mode with `tf.compat.v1.enable_eager_execution()`.

```python
import uncertainty_baselines as ub

# Load CIFAR-10, holding out 10% for validation.
dataset_builder = ub.datasets.Cifar10Dataset(split='train',
                                             validation_percent=0.1)
train_dataset = dataset_builder.load(batch_size=FLAGS.batch_size)
for batch in train_dataset:
  ...  # Apply code over batches of the data.
```

You can also use `get` to instantiate datasets from strings (e.g., command-line
flags).

```python
dataset_builder = ub.datasets.get(dataset_name, split=split, **dataset_kwargs)
```

To use the datasets in Jax and PyTorch:

```python
for batch in tfds.as_numpy(ds):
  train_step(batch)
```

Note that `tfds.as_numpy` calls `tensor.numpy()`.
This invokes an unnecessary
copy compared to `tensor._numpy()`.

```python
for batch in iter(ds):
  train_step(jax.tree.map(lambda y: y._numpy(), batch))
```

### Models

The
[`ub.models`](https://github.com/google/uncertainty-baselines/tree/main/uncertainty_baselines/models)
module consists of models following the
[`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model)
API.

```python
import uncertainty_baselines as ub

model = ub.models.wide_resnet(input_shape=(32, 32, 3),
                              depth=28,
                              width_multiplier=10,
                              num_classes=10,
                              l2=1e-4)
```

## Metrics

We define the metrics used across datasets below. All results are reported to
roughly 3 significant figures and averaged over 10 runs.

1. __# Parameters.__ Number of parameters in the model used to make predictions
   after training.
2. __Test Accuracy.__ Accuracy over the test set. For a dataset of `N`
   input-output pairs `(xn, yn)`, where the label `yn` takes on 1 of `K` values,
   the accuracy is

    ```
    1/N \sum_{n=1}^N 1[ \argmax{ p(yn | xn) } = yn ],
    ```

    where `1` is the indicator function, which is 1 when the model's predicted
    class equals the label and 0 otherwise.
3. __Test Cal. Error.__ Expected calibration error (ECE) over the test set
   ([Naeini et al., 2015](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4410090)).
   ECE discretizes the probability interval `[0, 1]` into equally spaced bins
   and assigns each predicted probability to the bin that encompasses it. The
   calibration error of a bin is the difference between the fraction of
   predictions in the bin that are correct (accuracy) and the mean of the
   probabilities in the bin (confidence).
   The expected calibration error averages these differences across bins.

    For a dataset of `N` input-output pairs `(xn, yn)`, where the label `yn`
    takes on 1 of `K` values, ECE computes a weighted average

    ```
    \sum_{b=1}^B n_b / N | acc(b) - conf(b) |,
    ```

    where `B` is the number of bins, `n_b` is the number of predictions in bin
    `b`, and `acc(b)` and `conf(b)` are the accuracy and confidence of bin `b`,
    respectively.
4. __Test NLL.__ Negative log-likelihood over the test set (measured in nats).
   For a dataset of `N` input-output pairs `(xn, yn)`, the negative
   log-likelihood is

    ```
    -1/N \sum_{n=1}^N \log p(yn | xn).
    ```

    Up to a constant, it is equivalent to the KL divergence from the true data
    distribution to the model's, therefore capturing the overall goodness of
    fit to the true distribution
    ([Murphy, 2012](https://www.cs.ubc.ca/~murphyk/MLbook/)). It can also be
    interpreted as the number of bits (nats) needed to explain the data
    ([Grunwald, 2004](https://arxiv.org/abs/math/0406077)).
5. __Train/Test Runtime.__ Training runtime is the total wall-clock time to
   train the model, including any intermediate test set evaluations. Test
   runtime is the time it takes to run a forward pass on the GPU/TPU, i.e., the
   duration for which the device is not idle. Note that test runtime does not
   include time on the coordinator: this makes comparisons across baselines
   more precise, because including the coordinator adds overhead in GPU/TPU
   scheduling and data fetching, producing high-variance results.
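The accuracy, ECE, and NLL formulas above can be sketched directly in NumPy. This is a hedged illustration of the definitions, not the library's own metric code; it assumes `probs` is an `(N, K)` array of predicted class probabilities and `labels` an `(N,)` array of integer labels:

```python
import numpy as np


def accuracy(probs, labels):
  """1/N sum 1[argmax p(yn | xn) == yn]."""
  return np.mean(np.argmax(probs, axis=1) == labels)


def nll(probs, labels):
  """-1/N sum log p(yn | xn), in nats."""
  return -np.mean(np.log(probs[np.arange(len(labels)), labels]))


def ece(probs, labels, num_bins=15):
  """Expected calibration error over equally spaced confidence bins."""
  confidences = np.max(probs, axis=1)
  correct = (np.argmax(probs, axis=1) == labels).astype(float)
  bin_edges = np.linspace(0.0, 1.0, num_bins + 1)
  total = 0.0
  for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
    in_bin = (confidences > lo) & (confidences <= hi)
    n_b = in_bin.sum()
    if n_b > 0:
      acc_b = correct[in_bin].mean()       # acc(b)
      conf_b = confidences[in_bin].mean()  # conf(b)
      total += (n_b / len(labels)) * abs(acc_b - conf_b)
  return total
```

For example, two confident, correct predictions with confidences 0.9 and 0.8 give an ECE of 0.5·|1 − 0.9| + 0.5·|1 − 0.8| = 0.15.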
__Viewing metrics.__
Uncertainty Baselines writes TensorFlow summaries to the `model_dir`, which can
be consumed by TensorBoard. This includes the TensorBoard hyperparameters
plugin, which can be used to analyze hyperparameter tuning sweeps.

If you wish to upload to the *PUBLICLY READABLE* [tensorboard.dev](https://tensorboard.dev), use:

```sh
tensorboard dev upload --logdir MODEL_DIR --plugins "scalars,graphs,hparams" --name "My experiment" --description "My experiment details"
```

## References

If you'd like to cite Uncertainty Baselines, use the following BibTeX entry.

> Z. Nado, N. Band, M. Collier, J. Djolonga, M. Dusenberry,
> S. Farquhar, A. Filos, M. Havasi, R. Jenatton, G.
> Jerfel, J. Liu, Z. Mariet, J. Nixon, S. Padhy, J. Ren, T.
> Rudner, Y. Wen, F. Wenzel, K. Murphy, D. Sculley, B.
> Lakshminarayanan, J. Snoek, Y. Gal, and D. Tran.
> [Uncertainty Baselines: Benchmarks for uncertainty & robustness in deep learning](https://arxiv.org/abs/2106.04015),
> _arXiv preprint arXiv:2106.04015_, 2021.

```
@article{nado2021uncertainty,
  author = {Zachary Nado and Neil Band and Mark Collier and Josip Djolonga and Michael Dusenberry and Sebastian Farquhar and Angelos Filos and Marton Havasi and Rodolphe Jenatton and Ghassen Jerfel and Jeremiah Liu and Zelda Mariet and Jeremy Nixon and Shreyas Padhy and Jie Ren and Tim Rudner and Yeming Wen and Florian Wenzel and Kevin Murphy and D. Sculley and Balaji Lakshminarayanan and Jasper Snoek and Yarin Gal and Dustin Tran},
  title = {{Uncertainty Baselines}: Benchmarks for Uncertainty \& Robustness in Deep Learning},
  journal = {arXiv preprint arXiv:2106.04015},
  year = {2021},
}
```

### Papers using Uncertainty Baselines
The following papers have used code from Uncertainty Baselines:

1. [A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection](https://arxiv.org/abs/2106.09022)
2. [BatchEnsemble: An Alternative Approach to Efficient Ensembles and Lifelong Learning](https://arxiv.org/abs/2002.06715)
3. [DEUP: Direct Epistemic Uncertainty Prediction](https://arxiv.org/abs/2102.08501)
4. [Distilling Ensembles Improves Uncertainty Estimates](https://openreview.net/forum?id=Lzi5IMyJTFX)
5. [Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors](https://arxiv.org/abs/2005.07186)
6. [Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit](https://arxiv.org/abs/2010.07355)
7. [Hyperparameter Ensembles for Robustness and Uncertainty Quantification](https://arxiv.org/abs/2006.13570)
8. [Measuring Calibration in Deep Learning](https://arxiv.org/abs/1904.01685)
9. [Measuring and Improving Model-Moderator Collaboration using Uncertainty Estimation](https://arxiv.org/abs/2107.04212)
10. [Neural networks with late-phase weights](https://arxiv.org/abs/2007.12927)
11. [On the Practicality of Deterministic Epistemic Uncertainty](https://arxiv.org/abs/2107.00649)
12. [Prediction-Time Batch Normalization for Robustness under Covariate Shift](https://arxiv.org/abs/2006.10963)
13. [Refining the variational posterior through iterative optimization](http://bayesiandeeplearning.org/2019/papers/8.pdf)
14. [Revisiting One-vs-All Classifiers for Predictive Uncertainty and Out-of-Distribution Detection in Neural Networks](https://arxiv.org/abs/2007.05134)
15. [Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness](https://proceedings.neurips.cc/paper/2020/file/543e83748234f7cbab21aa0ade66565f-Paper.pdf)
16. [Training independent subnetworks for robust prediction](https://openreview.net/forum?id=OGg9XnKxFAH)
17. [Plex: Towards Reliability Using Pretrained Large Model Extensions](https://goo.gle/plex-paper), with code available [here](https://goo.gle/plex-code)

## Contributing

### Formatting Code

Before committing code, make sure the file is formatted with yapf:

```
yapf -i --style yapf [source file]
```

### Adding a Baseline
1. Write a script that loads the fixed training dataset and model. Typically,
   this is forked from other baselines.
2. After tuning, set the default flag values to the best hyperparameters.
3. Add the baseline's performance to the table of results in the corresponding
   `README.md`.

### Adding a Dataset

1. Add the BibTeX reference to [`references.md`](https://github.com/google/uncertainty-baselines/blob/main/references.md).
2. Add the dataset definition to the `datasets/` directory. Every file should
   have a subclass of `datasets.base.BaseDataset`, which at a minimum requires
   implementing a constructor, a `tfds.core.DatasetBuilder`, and
   `_create_process_example_fn`.
3. Add a test that at a minimum constructs the dataset and checks the shapes of
   elements.
4. Add the dataset to `datasets/datasets.py` for easy access.
5. Add the dataset class to `datasets/__init__.py`.

For an example of adding a dataset, see [this pull request](https://github.com/google/uncertainty-baselines/pull/175).

### Adding a Model

1. Add the BibTeX reference to [`references.md`](https://github.com/google/uncertainty-baselines/blob/main/references.md).
2. Add the model definition to the `models/` directory. Every file should have
   a `create_model` function with the following signature:

    ```python
    def create_model(
        batch_size: int,
        ...
        **unused_kwargs: Dict[str, Any]) -> tf.keras.models.Model:
    ```

3. Add a test that at a minimum constructs the model and does a forward pass.
4. Add the model to `models/models.py` for easy access.
5. Add the `create_model` function to `models/__init__.py`.
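As an illustration of the `create_model` convention above, here is a hedged sketch of a minimal model file. Only the signature follows the repository's convention; the MLP architecture and the `hidden_dim` parameter are invented for this example:

```python
# Hedged sketch of a model file following the create_model convention.
# The architecture below is made up for illustration; real files in
# models/ define their own hyperparameters.
from typing import Any, Dict

import tensorflow as tf


def create_model(
    batch_size: int,
    num_classes: int = 10,
    hidden_dim: int = 128,
    **unused_kwargs: Dict[str, Any]) -> tf.keras.models.Model:
  """Builds a small MLP classifier as a tf.keras.Model."""
  inputs = tf.keras.Input(shape=(32, 32, 3), batch_size=batch_size)
  x = tf.keras.layers.Flatten()(inputs)
  x = tf.keras.layers.Dense(hidden_dim, activation='relu')(x)
  outputs = tf.keras.layers.Dense(num_classes)(x)
  return tf.keras.Model(inputs=inputs, outputs=outputs, name='example_mlp')
```

A matching test (step 3 above) would then construct the model and run a forward pass on a dummy batch.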