Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/vindaar/timepixanalysis

Contains code related to calibration and analysis of Timepix based detector + CAST related code
https://github.com/vindaar/timepixanalysis

nim nim-lang timepix

Last synced: about 1 month ago
JSON representation

Contains code related to calibration and analysis of Timepix based detector + CAST related code

Awesome Lists containing this project

README

        

* Timepix analysis & calibration
[[https://github.com/Vindaar/TimepixAnalysis/workflows/TPA%20CI/badge.svg]]
[[https://gitter.im/TimepixAnalysis/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge][file:https://badges.gitter.im/TimepixAnalysis/Lobby.svg]]

This repository contains code related to the data analysis of Timepix
based gaseous detectors.

It contains code to calibrate a Timepix ASIC and perform event shape
analysis of data to differentiate between background events (mainly
cosmic muons) and signal events (X-rays).

The software in this repository is at the heart of my PhD thesis,

https://phd.vindaar.de

All data required to reproduce the results of my thesis, including
results reconstructed with this code, can be found here:

https://zenodo.org/uploads/10521887

** CAST

Many parts of this repository are specifically related to an InGrid
based X-ray detector in use at the CERN Axion Solar Telescope:
[[http://cast.web.cern.ch/CAST/]]

* Project structure
This repository contains a big project combining several tools used to
analyze data based on Timepix detectors as well as the CAST
experiment.

*NOTE:* If you are mainly interested in using the reconstruction and analysis
utilities for TOS data, the [[file:Analysis/][Analysis]] folder is what you're looking
for. See the [[Installation]] section for more information.

*UPDATE* <2024-09-11 Wed 16:10>: For a more updated overview of the
repository structure, see the overview here:
https://phd.vindaar.de/html/software.html#sec:appendix:timepix_analysis

- [[file:Analysis/][Analysis]]:
Is the =ingrid= module, which contains the major programs of this
repository [[file:Analysis/ingrid/raw_data_manipulation.nim][raw_data_manipulation]] and [[file:Analysis/ingrid/reconstruction.nim][reconstruction]] and to a lesser
extent (depending on your use case) [[file:Analysis/ingrid/likelihood.nim][likelihood]].
- [[file:Analysis/ingrid/raw_data_manipulation.nim][raw_data_manipulation]]:
Reads folders of raw TOS data and outputs to a HDF5 file.
Supported TOS data types:
- old ~2015 era Virtex V6 TOS
- current Virtex V6 TOS
- current SRS TOS
- [[file:Analysis/ingrid/reconstruction.nim][reconstruction]]:
Takes the output of the above program and performs reconstruction
of clusters within the data, i.e. calculate geometric properties.
- [[file:Analysis/ingrid/likelihood.nim][likelihood]]:
Performs an event shape likelihood based analysis on
the reconstructed data comparing with reference X-ray datasets.
The other files in the folder are imported by these programs. An
exception is skeleton program [[file:Analysis/ingrid/analysis.nim][analysis]], which will eventually become
a wrapper of the other programs so that a nicer interface can be
provided. A combination of a https://github.com/yglukhov/nimx based
GUI with a =readline= based command line interface will be
developed.
- [[file:InGridDatabase/][InGridDatabase]]:
A Nim program which provides, writes to and reads from the /InGrid
database/. If the a folder describing the used detector is given to
it (containing =fsr=, =threshold=, =thresholdMeans=, =ToT=
calibration and / or =SCurves= and an additional file containing the
chip name and additional information) it can be added to that
database, which is simply a HDF5 file. The analysis progam makes use
of this database to read calibration relevant data from it.
TODO: link to explanation of required folder structure and add files
/ folders for current chips part of database.
- [[file:LogReader/][LogReader]]:
A Nim tool to read and process CAST slow control and tracking log
files. From these environmental sensors can be read if needed for
data analysis puposes of CAST data as well as information about when
solar trackings took place. If a HDF5 file is given the tracking
information is added to the appropriate runs.
- [[file:NimUtil][NimUtil]]:
The =helpers= nimble module. It contains general procedures used in the rest
of the code, which are unrelated to CAST or Timepix detectors.
- [[file:Plotting/][Plotting]]:
A Nim tool to create plots of Timepix calibration data. Reads from
the InGrid database and plots =ToT= calibration (+ fits) and
SCurves.
- [[file:README.org][README.org]]: this file. :)
- [[file:resources/][resources]]:
Contains data, which is needed for analysis purposes,
e.g. information about run numbers for data taking periods, the
2014/15 background rates etc.
TODO: maybe add folders for known chips for InGrid database in here
or at least an example directory.
- [[file:Tools/][Tools]]:
Directory for other smaller tools, for which a separate directory in
the root of the repository does not make sense (either used too
infrequently or are very specific and small tools).
- [[file:SolarEclipticToEarth][SolarEclipticToEarth]]:
A simple Python tool part of solar chameleon
analysis, which calculates the projection of the solar ecliptic onto
Earth (chameleon flux potentially varies greatly depending on solar
latitude).
TODO: should be moved to [[file:Tools/][Tools]].
- [[file:Tests/][Tests]]:
Some very simple "test cases", which typically just test new
features separately from the rest of the analysis programs.
- [[file:VerticalShiftProblem/][VerticalShiftProblem]]:
A simple Python tool to plot CAST log data to debug a problem with
the belt, which slipped and caused misalignment. That problem has
since been fixed.
TODO: should be moved to [[file:Tools/][Tools]].
- [[file:CDL-RootToHdf5/][CDL-RootToHdf5]]:
A Python tool to (currently only) convert X-ray calibration data
from the CAST detector lab from ROOT trees to HDF5 files. This could
be easily extended to be a ROOT to HDF5 converter.
TODO: this should be moved to [[file:Tools/][Tools]].
- [[file:endTimeExtractor/][endTimeExtractor]]:
A Nim tool to extract the following information from a TOS run:
- start of the Run
- end of the Run
- total run time
and output it as an Org date string.
TODO: should be moved to [[file:Tools/][Tools]].
- [[file:InGrid-Python/][InGrid-Python]]:
An (outdated) Python module containing additional functions used in the Nim
analysis (fit of Fe55 spectrum and polya gas gain fit done using
https://github.com/yglukhov/nimpy) and the Python plotting tool (see
below).
- [[file:Figs/][Figs]]:
Plots, which are created from the analysis and have been used in a
talk etc.

* Installation

The project has only a few dependencies, which are all mostly easy to
install. The Nim compiler is *only* a dependency to compile the Nim
programs. But if you just wish to run the built binaries, the Nim
compiler is *not* a dependency! E.g. compiling the
=raw_data_manipulation= and =reconstruction= on an x86-64 linux system
creates an (almost) dependency free binary.

The following shared libraries are linked at runtime:
- =libhdf5=
- =libnlopt=
- =libmpfit=
- =libpcre=
Their installation procedures are explained below.

For instructions to install the dependencies, see sec. [[#sec:deps]]. Note
that for ~NLopt~ and ~MPFIT~ the build tool (see
sec. [[#sec:install:build]])

** Nim

Nim is obviously required to compile the Nim projects of this
repository. There are two approaches to install the Nim
compiler. Using =choosenim= or cloning the Nim repository.

*** Clone the Nim repository and build the compiler locally

Go to some folder where you wish to store the Nim compiler, e.g. [[file:~/src/][~/src]]
or create a folder if does not exist:
#+BEGIN_SRC sh
cd ~/
mkdir src
#+END_SRC
Please replace this directory by your choice in the rest of this
section.

Then clone the git repository from GitHub (assuming =git= is
installed):
#+BEGIN_SRC
git clone https://github.com/nim-lang/nim
#+END_SRC
enter the folder:
#+BEGIN_SRC sh
cd nim
#+END_SRC
and if you're on a Unix system run:
#+BEGIN_SRC sh
sh build_all.sh
#+END_SRC
to build the compiler and additional tools like =nimble= (Nim's
package manager), =nimsuggest= (allows smart auto complete for Nim
procs), etc.

Now add the following to your =PATH= variable in your shell's
configuration file, e.g. [[file:~/.bashrc][~/.bashrc]]:
#+BEGIN_SRC sh
# add location of Nim's binaries to PATH
export PATH=$PATH:$HOME/src/nim/bin
#+END_SRC
and finally reload the shell via
#+BEGIN_SRC sh
source ~/.bashrc
#+END_SRC
or the appropriate shell config (or start a new shell).

With this approach updating the Nim compiler is trivial. First update
your local git repository by pulling from the =devel= branch:
#+BEGIN_SRC sh
cd ~/src/nim
git pull origin devel
#+END_SRC
and finally use Nim's build tool =koch= to update the Nim compiler:
#+BEGIN_SRC sh
./koch boot -d:release
#+END_SRC

*** Choosenim
An alternative to the above mentioned method is to use =choosenim=.
Type the following into your terminal:
#+BEGIN_SRC sh
curl https://nim-lang.org/choosenim/init.sh -sSf | sh
#+END_SRC
Then follow the instructions and extend the =PATH= variable in your
shell's configuration file, e.g. [[file:~/.bashrc][~/.bashrc]].
Finally reload that file via:
#+BEGIN_SRC sh
source ~/.bashrc
#+END_SRC
or simply start a new shell.

** Install the TimepixAnalysis framework

Once the dependencies are installed, we can prepare the framework.

*** Preparing the =TimepixAnalysis= repository

We start by cloning the =TimepixAnalysis= repository somewhere, e.g.:
#+BEGIN_SRC sh
cd ~/src
git clone https://github.com/Vindaar/TimepixAnalysis
#+END_SRC

*** External dependency overview

On a fresh build of Debian, installing the following packages should
have you covered in terms of dependencies:
#+begin_src sh
git \
build-essential \
locate \
cmake \
libhdf5-dev \
libnlopt0 \
libnlopt-dev \
libcairo2-dev \
liblapack-dev \
libpcre3-dev \
libblosc1 \
libblosc-dev \
libgtk-3-dev \
libwebkit2gtk-4.0
#+end_src

~locate~ may also be called ~mlocate~. After installing it, make sure
to run ~sudo updatedb~ to update the ~locate~ database.

*** Using the build tool to build (most) binaries
:PROPERTIES:
:CUSTOM_ID: sec:install:build
:END:

As of <2024-09-11 Wed 18:48> there is now a build tool to automate the
compilation of (most; all relevant for the majority of users)
binaries. In addition, the nimble (i.e. other Nim packages)
dependencies are now fixed using a lock file, so that precisely the
versions that are fixed are pulled and used. The latter should
hopefully remove the occurrence for spurious compilation failures due
to random version mismatches.

*NOTE*: <2024-09-12 Thu 12:19>
As of right now, before running the ~nimble setup~ command below, you
will need to manually install ~weave~ using ~nimble~. There is a
current issue causing the setup step to fail otherwise. So run:
#+begin_src sh
nimble install weave
#+end_src

First we need to setup the Nimble dependencies:
#+begin_src sh
cd Analysis
nimble setup
#+end_src
The command pulls all dependencies written in
[[file:Analysis/nimble.lock]]. Afterwards, any compilation within the
~Analysis~ directory will _only_ use those packages.

Next, we compile the build tool found in the root of the repository:
#+begin_src sh
nim c buildTpa
#+end_src

#+begin_src sh :results code
./buildTpa -h
#+end_src

#+begin_src sh
Usage:
main [optional-params]
Options:
-h, --help print this cligen-erated help

--help-syntax advanced: prepend,plurals,..

-l=, --locateTool= string "locate" Program to use to detect installed shared
libraries on the system.

-a, --allowClone bool true If true will automatically clone a git
repository and build shared library
dependencies.

-c=, --clonePath= string "~/src" Base path in which cloned directories will
be installed.

--args= string "" An additional command line argument string
passed to all programs being compiled.
#+end_src

A few things to note:
- it tries to use ~locate~ to determine if the NLopt (~libnlopt.so~)
and MPFIT (~libmpfit.so~) shared libraries can be found by ~ld.so~
- if not as long as ~allowClone~ is ~true~ it will pull the code for
these libraries and build them manually. In that case you still need
to make sure the shared libraries can be found by ~ld.so~ on your
system. By default (changed via ~--clonePath:/foo/bar~) the
repositories will be cloned into =~/src=. See sec [[#sec:deps]] for more
information.

All you need to do to build the binaries then is:
#+begin_src sh
./buildTpa
#+end_src

It builds:
- ~parse_raw_tpx3~
- ~raw_data_manipulation~
- ~reconstruction~
- ~runAnalysisChain~
- ~fake_event_generator~
- ~plotBackgroundRate~
- ~plotBackgroundClusters~
- ~plotData~

Symbolic links to the location of the binaries are found in the [[fe:][./bin]]
directory of this repository. I recommend to add the path to that
directory to your ~.zshrc~ / ~.bashrc~ (or whatever else your setup
looks like).

Assuming TPA is located in =~/src/TimepixAnalysis= that might look like:

**** Zsh

#+begin_src sh
path+=$HOME/src/TimepixAnalysis/bin
#+end_src

**** Bash

#+begin_src sh
export PATH=$PATH:$HOME/src/TimepixAnalysis/bin
#+end_src

*** Troubleshooting

If you run into problems trying to run one of the programs, it might
be an easy fix.

An error such as
#+BEGIN_EXAMPLE
could not import: H5P_LST_FILE_CREATE_g
#+END_EXAMPLE
means that you compiled against a different HDF5 libary version than
the one you have installed and is being tried to link at run time.
_Solution:_ compile the program with the =-d:H5_LEGACY= option, e.g.:
#+BEGIN_SRC sh
nim c -d:release --threads:on -d:H5_LEGACY raw_data_manipulation.nim
#+END_SRC

Another common problem is an error such as:
#+BEGIN_SRC sh
Error: cannot open file: docopt
#+END_SRC
This indicates that the module named =docopt= (only an example) could
not be imported. Most likely a simple
#+BEGIN_SRC sh
nimble install docopt
#+END_SRC
would suffice. A call to =nimble install= with a package name will try
to install a package from the path declared in the =packages.json=
from here:
https://github.com/nim-lang/packages/blob/master/packages.json

If you know that you need the =#head= of such a package, you can
install it via
#+BEGIN_SRC sh
nimble install "docopt@#head"
#+END_SRC
_Note:_ depending on your shell the ="= may not be needed.
_Note 2:_ instead of a simple package name, you may also hand nimble a
full path to a git or mercurial repository. This is necessary in some
cases, e.g. for the =seqmath= module, because we depend on a fork:
#+BEGIN_SRC sh
nimble install "https://github.com/vindaar/seqmath#head"
#+END_SRC

*** List of nimble dependencies

For a list of Nimble dependencies, see [[file:Analysis/ingrid.nimble]].

** Dependencies
:PROPERTIES:
:CUSTOM_ID: sec:deps
:END:

*** HDF5
The major dependency of the Nim projects is HDF5. On a reasonably
modern Linux distribution the =libhdf5= should be part of the package
repositories. The supported HDF5 versions are:
- =1.8=: as a legacy mode, compile the Nim projects with
=-d:H5_LEGACY=
- =1.10=: the current HDF5 version and the default
- versions newer than ~1.10~ might require the ~-d:H5_FUTURE~
compilation flag.

If the HDF5 library is not available on your OS, you may download the
binaries or the source code from the [[url:https://www.hdfgroup.org/downloads/hdf5/][HDF group]].

**** Ubuntu

On Ubuntu systems the following packages install all you need:
#+begin_src sh
sudo apt-get install libhdf5-103 libhdf5-dev
#+end_src

In addition ~hdf5-tools~ might come in handy.

**** Void Linux

On Void you need:
#+begin_src sh
sudo xbps-install -S hdf5 hdf5-devel
#+end_src

**** HDF View
HDF View is a very useful tool to look at HDF5 files with a graphical
user interface. For HEP users: it is very similar to ROOT's TBrowser.

Although many package repositories contain a version of HDF View, it
is typically relatively old. The current version is version 3.0.0,
which has some nice features, so it may be a good idea to install it
manually.

*** NLopt

The NLopt library is a nonlinear optimization library, which is used
in this project to fit the rotation angle of clusters and perform fits of
the gas gain. The Nim wrapper is found at
[[https://github.com/vindaar/nimnlopt]]. To build the C library follow the
following instructions, (taken from [[https://github.com/vindaar/nimnlopt/c_header][here]]):
#+BEGIN_SRC sh
git clone https://github.com/stevengj/nlopt # clone the repository
cd nlopt
mkdir build
cd build
cmake ..
make
sudo make install
#+END_SRC
This introduces =cmake= as a dependency. Note that this installs the
=libnlopt.so= system wide. If you do not wish to do that, you need to
set your =LD_PRELOAD_PATH= accordingly!

Afterwards installation of the Nim =nlopt= module is sufficient (done
automatically later).

**** Ubuntu

On Ubuntu systems the following packages install all you need:
#+begin_src sh
sudo apt-get install libnlopt0 libnlopt-dev
#+end_src

**** Void Linux

#+begin_src sh
sudo xbps-install -S nlopt nlopt-devel
#+end_src

*** MPfit

MPfit is a non-linear least squares fitting library. It is required as
a dependency, since it's used to perform different fits in the
analysis. The Nim wrapper is located at
[[https://github.com/vindaar/nim-mpfit]]. Compilation of this shared
object is easiest by cloning the git repository of the Nim wrapper:
#+BEGIN_SRC sh
cd ~/src
git clone https://github.com/vindaar/nim-mpfit
cd nim-mpfit
#+END_SRC
And then build the library from the =c_src= directory as follows:
#+BEGIN_SRC sh
cd c_src
gcc -c -Wall -Werror -fpic mpfit.c mpfit.h
gcc -shared -o libmpfit.so mpfit.o
#+END_SRC
which should create the =libmpfit.so=. Now install that library system
wide (again to avoid having to deal with =LD_PRELOAD_PATH=
manually). Depending on your system, a suitable choice may be
[[file:/usr/local/lib/]]:
#+BEGIN_SRC sh
sudo cp libmpfit.so /usr/local/lib
#+END_SRC

Finally, you may install the Nim wrapper via
#+BEGIN_SRC sh
nimble install
#+END_SRC
or tell =nimble= to point to the directory of the respitory here via:
#+BEGIN_SRC sh
nimble develop
#+END_SRC
The latter makes updating the package much easier, since updating the
git repository is enough.

*** PCRE
Perl Compatible Regular Expressions (PCRE) is a library for regular
expression matching. On almost any unix system, this library is
already available. For some distributions (possibly some CentOS or
Scientific Linux) it may not be.

This currently means you'll have to build this library by yourself.

**** Different RE implementations

The default RE library in Nim is a wrapper around PCRE, due to PCRE's
very high performance. However, the performance critical parts do not
depend on PCRE anymore.
In principle we could thus replace the =re= module with
https://github.com/nitely/nim-regex, a purely Nim based regex
engine. PRs welcome! :)

*** Blosc :optional:

[[https://github.com/Blosc/c-blosc][Blosc]] is a compression library used to compress the binary data in the
HDF5 files. By default however =Zlib= compression is used, so this is
typically not needed.
If one wishes to read Timepix3 based HDF5 files, ~blosc~ support is
mandatory (in [[file:Analysis/ingrid/parse_raw_tpx3.nim]] and after that
in [[file:Analysis/ingrid/raw_data_manipulation.nim]]).

**** Ubuntu

On Ubuntu systems the following packages install all you need:
#+begin_src sh
sudo apt-get install libblosc1 libblosc-dev
#+end_src

**** Void Linux

#+begin_src sh
sudo xbps-install -S c-blosc c-blosc-devel
#+end_src

* Usage

*NOTE*: <2024-09-11 Wed 16:16> This is also a bit outdated. Again, see
the instructions from:
https://phd.vindaar.de/html/software.html#sec:appendix:timepix_analysis

In general the usage of the analysis programs is straight forward and
explained in the docstring, which can be echoed by calling a program
with the =-h= or =--help= option:
#+BEGIN_SRC sh
raw_data_manipulation -h
#+END_SRC
would print:
#+BEGIN_SRC
Usage:
main [REQUIRED,optional-params]
Version: 12e1820 built on: 2024-09-11 at 14:52:49
Options:
-h, --help print this cligen-erated help

--help-syntax advanced: prepend,plurals,..

-p=, --path= string REQUIRED set path

-r=, --runType= RunTypeKind REQUIRED Select run type (Calib | Back | Xray)
The following are parsed case insensetive:
Calib = {"calib", "calibration", "c"}
Back = {"back", "background", "b"}
Xray = {"xray", "xrayfinger", "x"}

-o=, --out= string "" Filename of output file. If none
given will be set to run_file.h5.

-n, --nofadc bool false Do not read FADC files.

-i, --ignoreRunList bool false If set ignores the run list 2014/15
to indicate using any rfOldTos run

-c=, --config= string "" Path to the configuration file to use. Default is config.toml
in directory of this source file.

--overwrite bool false If set will overwrite runs already existing in the
file. By default runs found in the file will be skipped.
HOWEVER: overwriting is assumed, if you only hand a
run folder!

-t, --tpx3 bool false Convert data from a Timepix3 H5 file
to TPA format instead of a Tpx1 run
directory

-k, --keepExtracted bool false If a .tar.gz archive is given for a run folder and this flag is true
we won't remove the extracted archive afterwards.

-e=, --extractTo= string "/tmp/" If a .tar.gz archive is given extract
the data to this directory.
#+END_SRC
similar docstrings are available for all programs.

In order to analyze a raw TOS run, we'd perform the following
steps. The command line arguments are examples. Those required will be
explained, for the others see the doc strings.

** Raw data manipulation

Assuming we have a TOS run folder located in
=~/data/Run_168_180702-15-24/=:
#+BEGIN_SRC sh
raw_data_manipulation -p ~/data/Run_168_180702-15-24/ --runType=calibration --out=run_168.h5
#+END_SRC
where we give the =runType= (either calibration, background or X-ray
finger run), which is useful to store in the resulting HDF5 file. For
calibration runs several additional reconstruction steps are also done
automatically during the reconstruction phase. We also store the data
in a file called =run168.h5=. The default filename is
=run_file.h5=. The HDF5 file now contains two groups (=runs= and
=reconstruction=). =runs= stores the raw data. =reconstruction is
still mainly empty, some datasets are linked from the =runs= group.

Alternatively you may also hand a directory, which contains several
run folders. So if you had several runs located in =~/data=, simply
handing that would work. The program would work on all runs in =data=
after another. Each run is stored in its own group in the resulting
HDF5 file.

** Reconstruction

Afterwards we go on to the reconstruction phase. Here the raw data is
read back from the HDF5 file and clusters within events are separated
and geometric properties calculated. This is done by:
#+BEGIN_SRC sh
reconstruction -i run_168.h5 --out reco_168.h5
#+END_SRC

After the reconstruction is done and depending on whether the run type
is calibration or background / X-ray finger run, you can continue to
calculate futher properties, e.g. the energy of all clusters.

The next step is to apply the ToT calibration to calculate the charge
of all clusters via:
#+BEGIN_SRC sh
reconstruction -i reco_168.h5 --only_charge
#+END_SRC
_Note:_ this requires an entry for your chip in the ingrid
database. See below for more information.

Once the charges are calibrated, you may calculate the gas gain of
the run via:
#+BEGIN_SRC sh
reconstruction -i run_168.h5 --only_gas_gain
#+END_SRC

A purely pixel based energy calculation is available via
~--only_energy=~ and a gas gain based one
via ~--only_energy_from_e~. However, the latter requires calibration
runs that need to be analyzed before hand.

** Likelihood :optional:

The likelihood analysis is the final step done in order to filter out
events, which are not X-ray like, based on a likelihood cut or MLP
classifier. The likelihood program however, needs two different input
files. This is not yet as streamlined as it should be, which is why
it's not explained here in detail. Take a look at the docstring of the
program or ask me (@Vindaar).

*TODO:* make the CDL data part of the repository somehow?

** Adding a chip to the InGrid database :optional:

If you wish to perform charge calibration and from that energy
calibration, you need to add your chip to the ingrid database.

See the explanation in [[file:InGridDatabase/README.org]] for details on
how to do this. Existing data files are found in
[[file:resources/ChipCalibrations/]].

** Plotting

There are a large number of different tools available to visualize the
data created by the programs in this repository.

* License

The code in this repository is published under the MIT license.