https://github.com/camdavidsonpilon/eem_analysis
- Host: GitHub
- URL: https://github.com/camdavidsonpilon/eem_analysis
- Owner: CamDavidsonPilon
- Created: 2020-04-09T03:06:55.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-05-05T13:00:52.000Z (about 6 years ago)
- Last Synced: 2025-01-13T13:31:14.222Z (over 1 year ago)
- Language: Python
- Size: 5.08 MB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
## Autoencoding EEMs
### Analysis of EEMs
EEMs (excitation emission matrices) are measurements of a sample's fluorescence intensity at varying excitation and emission wavelengths.
Traditionally, EEMs have been analyzed using linear matrix decomposition methods like PARAFAC. To interpret the decomposition, PARAFAC relies on some strong _chemical_ assumptions (not just statistical), namely:
1. There are no inner filter effects occurring
2. No quenching is present
3. Beer-Lambert law is satisfied
4. No additional scattering is present
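
For reference, the trilinear model that PARAFAC fits to a stack of EEMs is usually written as below. The index names are a notational choice made here, not something taken from this repository: `i` runs over samples, `j` over emission wavelengths, `k` over excitation wavelengths, and `F` is the number of components.

```latex
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk}
```

Here `a_{if}` is the score of component `f` in sample `i`, `b_{jf}` and `c_{kf}` are its emission and excitation loadings, and `e_{ijk}` is the residual. Reading `a_{if}` as proportional to a fluorophore's concentration is only justified when the assumptions listed above hold.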
If we generalize to non-linear decomposition and give up on interpreting the components, we can expand the set of models used. Namely, we can use a convolutional autoencoder to project the 2D EEMs into a lower-dimensional space and perform the analysis there. The convolutional autoencoder compresses the EEMs much more accurately than alternative methods like PARAFAC. (This also means that the decompression is more accurate, as seen in the image below.)

PARAFAC does do a better job when scattering is reduced. If we apply a naive Rayleigh scattering filter to our EEMs:
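The exact filter used for that comparison is not spelled out in this README. As a minimal sketch of what a naive filter could look like, the snippet below blanks out the band where the emission wavelength is close to the excitation wavelength, which is where first-order Rayleigh scatter shows up. The function name `mask_rayleigh`, the `half_width_nm` parameter, the fill value, and the wavelength grids are all assumptions made for illustration.

```python
import numpy as np


def mask_rayleigh(eem, excitation_nm, emission_nm, half_width_nm=15.0, fill_value=0.0):
    """Return a copy of `eem` with the first-order Rayleigh band replaced.

    `eem` is assumed to have shape (n_excitation, n_emission).
    """
    eem = np.asarray(eem, dtype=float)
    ex = np.asarray(excitation_nm, dtype=float)[:, None]  # column vector
    em = np.asarray(emission_nm, dtype=float)[None, :]    # row vector
    # True wherever emission is within `half_width_nm` of excitation.
    band = np.abs(em - ex) <= half_width_nm
    filtered = eem.copy()
    filtered[band] = fill_value
    return filtered


# Synthetic stand-in data on made-up wavelength grids, not a real measurement.
excitation = np.linspace(250, 450, 28)
emission = np.linspace(300, 600, 28)
demo_eem = np.ones((28, 28))
demo_filtered = mask_rayleigh(demo_eem, excitation, emission)
```

Second-order scatter (emission near twice the excitation wavelength) is not handled here, and replacing the band with a constant is the crudest option; interpolating across the band is a common alternative.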

In the comparison above, the convolutional autoencoder, henceforth CNN-AE, squeezes the 28x28 data into 12 dimensions. Further dimensionality reduction, like PCA, can then be applied to these 12 dimensions. The following figure is a PCA-reduced dataset of four vegetables' EEMs:

The vegetable clusters are almost perfectly separated, so the original EEMs carry enough information to distinguish the vegetables.
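The PCA step on the 12-dimensional codes can be sketched with plain NumPy. This is an illustration under assumptions, not the repository's code: `pca_project` is a hypothetical helper, and the random array only stands in for real encoder output.

```python
import numpy as np


def pca_project(codes, n_components=2):
    """Project rows of `codes` onto their first principal components."""
    codes = np.asarray(codes, dtype=float)
    # PCA works on mean-centered data.
    centered = codes - codes.mean(axis=0)
    # Rows of `vt` are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Stand-in for the encoder output: 40 samples, 12 latent dimensions each.
rng = np.random.default_rng(0)
fake_codes = rng.normal(size=(40, 12))
projected = pca_project(fake_codes, n_components=2)
```

Plotting the two columns of `projected`, colored by sample label, gives the kind of scatter plot described above.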
### Existing CNN-AE network
Encoder -> Decoder.
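The layer layout of the existing network is not reproduced in this text. Purely as an illustration of the encoder -> decoder shape, here is one way a 28x28 -> 12 -> 28x28 convolutional autoencoder could be written in Keras. The filter counts, strides, activations, and the name `build_autoencoder` are assumptions, not the layout used in `src/keras_conv_ae_training.py`.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_autoencoder(size=28, latent_dim=12):
    """Build a small convolutional autoencoder; `size` must be divisible by 4."""
    inputs = keras.Input(shape=(size, size, 1))

    # Encoder: two strided convolutions halve the spatial size twice.
    x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    code = layers.Dense(latent_dim, name="code")(x)

    # Decoder: mirror the encoder with transposed convolutions.
    reduced = size // 4
    x = layers.Dense(reduced * reduced * 8, activation="relu")(code)
    x = layers.Reshape((reduced, reduced, 8))(x)
    x = layers.Conv2DTranspose(8, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
    # Sigmoid output assumes the EEMs were scaled to [0, 1] beforehand.
    outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)

    autoencoder = keras.Model(inputs, outputs)
    encoder = keras.Model(inputs, code)
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder, encoder


autoencoder, encoder = build_autoencoder()
```

Training would call `autoencoder.fit(x, x, ...)` with the EEMs as both input and target, and `encoder.predict(x)` would then return the 12-dimensional codes.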

### Installation
1. Clone/download the repo to a local directory.
2. Optional: create a virtualenv for this.
3. From the command line:
```
python setup.py install
```
### Configuration
1. Currently, supported EEMs must be NxN (square). You can use image or scientific software to resize EEMs to be square. Change the `INPUTS` variable in `src/utils.py`.
2. Data, in the form of CSV files (with a `.csv` extension), should be put into the folder `data/flat_files`.
3. To add labeling information, you can use `-` delimiters in the filename and edit the `Labels` in `src/utils.py`.
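
To check files before training, the conventions above can be sketched as follows. This is a hypothetical helper pair, not part of the repository: it assumes each CSV holds only numeric values with no header row, and the example filename is invented.

```python
import csv
from pathlib import Path


def labels_from_filename(path):
    """Split the filename (without extension) on `-` into label parts."""
    return Path(path).stem.split("-")


def load_square_eem(path):
    """Read a headerless numeric CSV and check that it is NxN."""
    with open(path, newline="") as handle:
        rows = [[float(value) for value in row] for row in csv.reader(handle) if row]
    n = len(rows)
    if n == 0 or any(len(row) != n for row in rows):
        raise ValueError(f"{path} is not a square NxN matrix")
    return rows


# Invented filename, used only to show the `-` convention.
example_labels = labels_from_filename("data/flat_files/carrot-batch1-01.csv")
```

How the label parts map onto `Labels` in `src/utils.py` depends on that file and is not shown here.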
### Running on an example dataset
```
python src/keras_conv_ae_training.py && python src/keras_encoder_reconstruction.py
```