===============================
Fully Bayesian Forecast Example
===============================

Overview
--------

:Name: Fully Bayesian Forecast Example
:Author: Thomas Gessey-Jones
:Version: 1.0.1
:Homepage: https://github.com/ThomasGesseyJones/FullyBayesianForecastsExample
:Letter: https://ui.adsabs.harvard.edu/abs/2024PhRvD.109l3541G/abstract

.. image:: https://img.shields.io/badge/python-3.8-blue.svg
   :target: https://www.python.org/downloads/
   :alt: Python version
.. image:: https://img.shields.io/badge/license-MIT-blue.svg
   :target: https://github.com/ThomasGesseyJones/ErrorAffirmations/blob/main/LICENSE
   :alt: License information
.. image:: https://img.shields.io/badge/arXiv-2309.06942-b31b1b.svg?style=flat
   :target: https://ui.adsabs.harvard.edu/abs/2023arXiv230906942G
   :alt: arXiv link

Example of a fully Bayesian forecast performed using an Evidence Network.
This code also replicates the analysis of
Gessey-Jones et al. (2024).
This repository thus serves the dual purposes of providing an example code base others
can modify to perform their own fully Bayesian forecasts and also providing a
reproducible analysis pipeline for the letter.

The overall goal of the code is to produce a fully Bayesian forecast of
the chance of a REACH-like experiment
making a significant detection of the 21-cm global signal from within foregrounds and noise. It also produces
figures showing how this conclusion changes with different astrophysical parameter values
and validates the forecast through blind coverage
tests and comparison to PolyChord.

Installation
------------

The repository is intended to be installed locally
by cloning it directly from GitHub. To do this, run the following command in a terminal:

.. code:: bash

    git clone git@github.com:ThomasGesseyJones/FullyBayesianForecastsExample.git

This will create a local copy of the repository. The pipeline can
then be run from the terminal (see below).

Structure and Usage
-------------------

The code is split into two main parts. The first part is the
modules which provide the general functionality of evidence networks,
data simulators, and prior samplers. The second part
is the scripts which run the fully Bayesian forecast.

There are three modules included in the repository:

- evidence_networks: This module contains the code for the evidence network
  class, which is used to build the evidence network used in the forecasts.
  The module also provides an implementation of the l-POP exponential loss
  function.
  See the class docstring for more details of its capabilities and usage.
- priors: This module contains the code to generate functions that
  sample standard prior distributions. These include
  uniform, log-uniform, and Gaussian priors.
- simulators: This module defines simulators. In our code, these are functions
  that take the number of data simulations to run and return that many mock
  data sets alongside the values of any parameters that were used in the
  simulations. Submodules of this module define functions to generate specific
  simulators for noise, foregrounds, and the 21-cm signal.
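
As an illustration of the l-POP exponential loss mentioned above, the following
is a minimal NumPy sketch of the form given in the evidence networks literature:
the leaky parity-odd power transform is J(x) = x + x|x|^(alpha - 1), and the loss
for a data set with model label m is exp((1/2 - m) J(f(x))), so that J(f(x))
approximates the log Bayes ratio after training. The function names, the label
convention (0 for the no-signal model, 1 for the with-signal model), and the
default alpha = 2 are assumptions made for this example. The evidence_networks
module may implement the loss differently.

.. code:: python

    import numpy as np


    def lpop(x, alpha=2.0):
        """Leaky parity-odd power transform: x + x * |x|**(alpha - 1)."""
        x = np.asarray(x, dtype=float)
        return x + x * np.abs(x) ** (alpha - 1.0)


    def lpop_exponential_loss(model_labels, network_output, alpha=2.0):
        """Mean l-POP exponential loss over a batch.

        model_labels: 0 for data from the no-signal model, 1 for the
            with-signal model (an assumed convention).
        network_output: raw network output f(x); lpop(f(x)) plays the
            role of the estimated log Bayes ratio.
        """
        m = np.asarray(model_labels, dtype=float)
        return float(np.mean(np.exp((0.5 - m) * lpop(network_output, alpha))))


    # Confident, correct outputs give a small loss ...
    good = lpop_exponential_loss([1, 0], [2.0, -2.0])
    # ... while confident, wrong outputs are penalised exponentially.
    bad = lpop_exponential_loss([1, 0], [-2.0, 2.0])

With equal numbers of simulations from each model, minimising this loss drives
lpop(f(x)) towards the log of the Bayes ratio between the with-signal and
no-signal models.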

These three modules are used in the three analysis scripts:

- verification_with_polychord.py: This script generates a range of mock data
  sets from both the no-signal model and the with-signal model, and then
  performs a Bayesian analysis on each of them, evaluating the Bayes ratio
  between the two models of the data using PolyChord. These results are then
  stored in the verification_data directory for later comparison with the
  results from the evidence network to verify its accuracy. It should be run
  first, ideally as a large number of parallel instances, as it is very
  computationally expensive but splits simply into one task per data set.
- train_evidence_network.py: This script builds the evidence network object and
  the data simulator functions, then trains the evidence network. Once trained,
  it stores the evidence network in the models directory, then runs a blind
  coverage test on the network and validates its performance against the
  PolyChord Bayes ratio evaluations from the previous script. It should
  be run second.
- visualize_forecasts.py: This script loads the evidence network from the
  models directory and uses it to forecast the chance of a REACH-like
  experiment detecting the 21-cm global signal by applying it to many
  data sets generated from the noisy-signal model. It then plots this result
  for fixed astrophysical parameters as in Figure 1 of the letter. This is
  done for detection significance thresholds of 2, 3 and 5 sigma. Selected
  numerical values are also output to a .txt file. It should be run last.
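
The forecasting step described above reduces to counting how often the
estimated Bayes ratio clears a significance threshold. The sketch below shows
one way this could be done using only the standard library. The conversion from
a sigma level to a Bayes ratio threshold (two-sided Gaussian probability,
equal prior model probabilities) is an assumption made for this example and is
not necessarily the convention used in visualize_forecasts.py; the function
names and the example values are hypothetical.

.. code:: python

    import math


    def sigma_to_log_bayes_threshold(n_sigma):
        """Log Bayes ratio matching an n-sigma two-sided Gaussian significance.

        Assumes equal prior model probabilities, so the posterior probability
        of the signal model is K / (1 + K) for Bayes ratio K.
        """
        p = math.erf(n_sigma / math.sqrt(2.0))  # probability within n sigma
        return math.log(p / (1.0 - p))


    def detection_probability(log_bayes_ratios, n_sigma):
        """Fraction of mock data sets whose log Bayes ratio clears the threshold."""
        threshold = sigma_to_log_bayes_threshold(n_sigma)
        detections = sum(1 for log_k in log_bayes_ratios if log_k >= threshold)
        return detections / len(log_bayes_ratios)


    # Hypothetical log Bayes ratios for five mock data sets drawn from the
    # noisy-signal model (in practice these would come from the trained network).
    example_log_k = [-1.0, 2.5, 4.0, 7.5, 20.0]
    for n_sigma in (2, 3, 5):
        print(n_sigma, detection_probability(example_log_k, n_sigma))

Because the network is amortized, the same trained model can be applied to
as many mock data sets as needed, which is what makes this kind of counting
estimate affordable.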

All three scripts have docstrings describing their role in more detail, as
well as giving advice on how to run them most efficiently. The
scripts can be run from the terminal using the following commands:

.. code:: bash

    python verification_with_polychord.py 0
    python train_evidence_network.py
    python visualize_forecasts.py

These commands use the default noise level of 15 mK and replicate the
analysis from Gessey-Jones et al. (2024).
Alternatively, you can pass
the scripts a command line argument to specify the experiment's noise level in K. For example,
to run with a noise level of 100 mK you would run the following commands:

.. code:: bash

    python verification_with_polychord.py 0 0.1
    python train_evidence_network.py 0.1
    python visualize_forecasts.py 0.1
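
For readers adapting the scripts, an optional positional noise level with a
default can be handled with argparse along the following lines. This is only a
sketch of the pattern for a script that takes the noise level as its single
argument; the function and constant names are hypothetical, and the actual
scripts may parse their arguments differently.

.. code:: python

    import argparse

    DEFAULT_NOISE_SIGMA_K = 0.015  # the 15 mK default quoted above, in K


    def parse_noise_sigma(argv=None):
        """Return the noise level in K from the command line, or the default."""
        parser = argparse.ArgumentParser(
            description="Run a forecast step at a given noise level."
        )
        parser.add_argument(
            "noise_sigma",
            nargs="?",  # optional positional argument
            type=float,
            default=DEFAULT_NOISE_SIGMA_K,
            help="experiment noise level in K (default: 0.015)",
        )
        return parser.parse_args(argv).noise_sigma


    # No argument falls back to the default; "0.1" selects 100 mK.
    default_sigma = parse_noise_sigma([])
    custom_sigma = parse_noise_sigma(["0.1"])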

Two other files of interest are:

- fbf_utilities.py: which defines IO functions
  needed by the three scripts, utility functions to assemble the data
  simulators for the noise-only and noisy-signal model, and standard
  whitening transforms.
- configuration.yaml: which defines several parameters used in the code
  including the experimental frequency resolution, the priors on the
  astrophysical and foreground parameters, and the astrophysical parameters
  which are plotted in the forecast figures. If you change the priors or
  resolution, the entire pipeline needs to be rerun to get accurate results.
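
To make the roles of the prior samplers, simulators, and whitening transforms
more concrete, here is a self-contained NumPy sketch of how such pieces could
fit together. Everything in it is illustrative: the function names, the toy
Gaussian-trough signal (standing in for the emulated 21-cm signal), the
frequency band, the prior ranges, and the per-channel standardisation used as
the whitening step are all assumptions, and foregrounds are omitted for brevity.
It is not the code in fbf_utilities.py.

.. code:: python

    import numpy as np

    rng = np.random.default_rng(0)


    def make_uniform_sampler(low, high):
        """Return a function that draws n samples from U(low, high)."""
        return lambda n: rng.uniform(low, high, size=n)


    def make_log_uniform_sampler(low, high):
        """Return a function that draws n samples uniform in log(value)."""
        return lambda n: np.exp(rng.uniform(np.log(low), np.log(high), size=n))


    def make_noise_only_simulator(freqs, noise_sigma):
        """Simulator for a noise-only model: n -> (data, parameters)."""
        def simulator(n):
            data = rng.normal(0.0, noise_sigma, size=(n, freqs.size))
            return data, np.empty((n, 0))  # no free parameters in this toy model
        return simulator


    def make_noisy_signal_simulator(freqs, noise_sigma, depth_sampler, centre_sampler):
        """Simulator for a noisy-signal model built from a toy Gaussian trough."""
        noise_simulator = make_noise_only_simulator(freqs, noise_sigma)
        width = 10.0  # MHz, held fixed for simplicity

        def simulator(n):
            depth = depth_sampler(n)
            centre = centre_sampler(n)
            offset = (freqs[None, :] - centre[:, None]) / width
            signal = -depth[:, None] * np.exp(-0.5 * offset**2)
            noise, _ = noise_simulator(n)
            return signal + noise, np.column_stack([depth, centre])
        return simulator


    def make_whitening_transform(training_data):
        """Return a function scaling each channel to zero mean and unit variance."""
        mean = training_data.mean(axis=0)
        std = training_data.std(axis=0)
        return lambda data: (data - mean) / std


    freqs = np.linspace(50.0, 200.0, 151)  # MHz, illustrative band
    simulator = make_noisy_signal_simulator(
        freqs,
        0.015,  # noise level in K
        make_log_uniform_sampler(0.01, 0.5),  # trough depth in K
        make_uniform_sampler(60.0, 180.0),  # trough centre in MHz
    )
    data, params = simulator(1000)
    whiten = make_whitening_transform(data)
    whitened = whiten(data)

Each simulator follows the interface described earlier: it takes the number of
simulations to run and returns that many mock data sets together with the
parameter values used to generate them.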

The various figures produced in the analysis are stored in the
figures_and_results directory alongside the timing_data used to assess the
performance of the methodology and some summary statistics of the evidence
network's performance. The figures and data generated in the
analysis for Gessey-Jones et al. (2024) are provided in this
repository for reference, alongside the figures generated for an earlier
version of the letter which did not model foregrounds.

Licence and Citation
--------------------

The software is free to use under the MIT open source license.
If you use the software for academic purposes then we request that you cite
the letter::

    Gessey-Jones, T. and W. J. Handley. “Fully Bayesian forecasts with evidence
    networks.” (June 2024). Physical Review D, Volume 109, Issue 12, 123541

If you are using BibTeX, you can use the following to cite the letter:

.. code:: bibtex

    @ARTICLE{2024PhRvD.109l3541G,
           author = {{Gessey-Jones}, T. and {Handley}, W.~J.},
            title = "{Fully Bayesian forecasts with evidence networks}",
          journal = {\prd},
             year = 2024,
            month = jun,
           volume = {109},
           number = {12},
              eid = {123541},
            pages = {123541},
              doi = {10.1103/PhysRevD.109.123541},
           adsurl = {https://ui.adsabs.harvard.edu/abs/2024PhRvD.109l3541G},
          adsnote = {Provided by the SAO/NASA Astrophysics Data System}}

Note that some of the packages used in this code (see below) have their own licenses that
require citation when used for academic purposes (e.g. globalemu and
pypolychord). Please check the licenses of these packages for more details.

Requirements
------------

To run the code you will need the following additional packages:

- globalemu
- tensorflow
- numpy
- keras
- matplotlib
- nvidia-cudnn-cu11
- pandas
- PyYAML
- pypolychord
- scipy
- mpi4py
- scikit-learn
- anesthetic

The code was developed using Python 3.8. It has not been tested on other versions
of Python. Exact versions of the packages used in our analysis
can be found in the
requirements.txt file
for reproducibility.

Additional packages that were used for linting, versioning, and pre-commit hooks
are also listed in the requirements.txt file.

Issues and Questions
--------------------

If you have any issues or questions about the code, please raise an
issue
on the GitHub page.

Alternatively, you can contact the author directly at
`[email protected] `__.