https://github.com/camdavidsonpilon/eem_analysis

Last synced: 2 months ago
JSON representation

Host: GitHub
URL: https://github.com/camdavidsonpilon/eem_analysis
Owner: CamDavidsonPilon
Created: 2020-04-09T03:06:55.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2020-05-05T13:00:52.000Z (about 6 years ago)
Last Synced: 2025-01-13T13:31:14.222Z (over 1 year ago)
Language: Python
Size: 5.08 MB
Stars: 1
Watchers: 3
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

## Autoencoding EEMs

### Analysis of EEMs

EEMs (excitation emission matrices) are measurements of a sample's fluorescence intensity at varying excitation and emission wavelengths.

Traditionally, EEMs have been analyzed using linear matrix decomposition methods like PARAFAC. To interpret the decomposition, PARAFAC relies on some strong _chemical_ assumptions (not just statistical), namely:

1. There are no inner filter effects occurring
2. No quenching is present
3. Beer-Lambert law is satisfied
4. No additional scattering is present

If we generalize to non-linear decomposition, and ignore any attempt at interpretation, we can expand the models used. Namely, we can try a convolutional autoencoder to project the 2D EEMs to a lower space, and perform analysis there. The convolutional autoencoder has a much more accurate compression than alternative methods like PARAFAC. (This also means that the decompression is more accurate, as seen in the image below.)

![comparison](https://i.imgur.com/2t2CdT4l.png)

PARAFAC does do a better job when scattering is reduced. If we apply a naive Rayleigh scattering filter to our EEMS:

![comparision2](https://i.imgur.com/XGMIzNnl.png)

In the comparison above, the convolutional autoencoder, henceforth CNN-AE, squeezes the 28x28 data into 12 dimensions. From these 12 dimensions, further dimensionality reduction can be applied, like PCA. The following figure is a PCA-reduced dataset of four vegetables' EEMS:

![pca](https://i.imgur.com/AwDAdrVl.png)

We can clearly see the clusters of vegetables are almost perfectly separated, hence their original EEMs have enough information to distinguish vegetables.

### Existing CNN-AE network

Encoder -> Decoder.

![network](https://i.imgur.com/FRYHunI.png)

### Installation

1. Clone/download the repo to a local directory.
2. Optional: create a virtualenv for this.
3. From the command line:
```
python setup.py install
```

### Configuration

1. Currently the supported EEMs must be NxN (a square). One can use image / scientific software to resize EEMs to be square. Change the `INPUTS` variable in `src/utils.py`.
2. Data, in the form of csv (with `.csv` extension), should be put into the folder `data/flat_files`.
3. To added labeling information, you can user `-` delimiters in the filename and edit the `Labels` in `src/utils.py`.

### Running on an example dataset

```
python src/keras_conv_ae_training.py && python src/keras_encoder_reconstruction.py
```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/camdavidsonpilon/eem_analysis

Awesome Lists containing this project

README