https://github.com/fmolivato/ramandatagenerator
Generator useful to handle Raman spectra data augmentation for deep learning models
augmentation deep-learning keras raman-spectra raman-spectroscopy tensorflow
- Host: GitHub
- URL: https://github.com/fmolivato/ramandatagenerator
- Owner: fmolivato
- License: mit
- Created: 2021-04-27T15:00:44.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-08-21T19:25:17.000Z (over 1 year ago)
- Last Synced: 2024-04-29T00:15:28.927Z (9 months ago)
- Topics: augmentation, deep-learning, keras, raman-spectra, raman-spectroscopy, tensorflow
- Language: Python
- Homepage:
- Size: 15.6 KB
- Stars: 18
- Watchers: 3
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Raman Data Generator
[![Generic badge](https://img.shields.io/badge/python-v3.6+-.svg)]() [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
This project aims to offer a fast :zap: and reliable data augmentation generator for **Raman spectra**.
## Download
You can download the Python class by running the following command in your target directory:
```
wget https://raw.githubusercontent.com/fmolivato/RamanDataGenerator/master/raman_data_generator.py
```
## Usage
```python
from raman_data_generator import RamanDataGenerator

dataset = RamanDataGenerator(...)
for batch in dataset:
    print(batch.shape)  # do something with the batch
```
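
The `df` argument is described as a DataFrame with one column per Raman shift value plus a `labels` column. As a minimal sketch of that layout, the snippet below builds a synthetic one. The shift axis, the random intensities, the numeric column names, and the integer labels are all assumptions made for illustration; check the class docstrings for the exact types it expects.

```python
import numpy as np
import pandas as pd

# Hypothetical Raman shift axis: 400 to 1790 in steps of 10 (assumed values).
shifts = np.arange(400, 1800, 10)

# Synthetic intensities standing in for real spectra: 6 samples, one per row.
rng = np.random.default_rng(seed=0)
intensities = rng.random((6, shifts.size))

# One column per shift value, plus the "labels" column with each sample's class.
df = pd.DataFrame(intensities, columns=shifts)
df["labels"] = [0, 0, 1, 1, 2, 2]

print(df.shape)  # (6, 141): 140 shift columns + "labels"
```

With this toy data there are three categories, so `max_classes` would presumably be 3.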
## Arguments

### Basic
Param|Type|Description
---|---|---
df|pandas.DataFrame|A pandas DataFrame with the Raman shift values as columns, plus a column called "labels" for the categories
batch_size|int|Number of samples per batch
max_classes|int|Number of categories in the labels

### Advanced
The default parameters were validated on a Raman task; however, if you need greater customization you can still tweak them!

The augmentation process works as follows.
For each $sample_i$ of the current batch, it randomly takes another sample of the same class, $sample_j$, and performs:
1. __roll__ $sample_j$ by some __roll_factor__ (Raman shift values). This is a horizontal shift; I use the term _roll_ because _horizontal shift_ is easily confused with the _Raman shift_.
2. a __weighted sum__ with respect to some probability variable $a$
$$ sample_k = a·sample_i + (1-a)·sample_j $$
This augmentation step is based on the assumption that two samples of the same class are semantically equal (natural class variability) + some sensor noise.
3. apply to $sample_k$ a __slope__ of some __slope_factor__, a linear baseline error that emulates the fluorescence issue of some sensors.
4. apply additive white Gaussian noise to $sample_k$.

Param|Type|Description
---|---|---
roll|bool|Enable/disable the __roll__ step during the augmentation
roll_factor|int|The signal is rolled (shifted horizontally) by this number of positions along the dataframe columns. For example, if the signal has a precision of 10 Raman shifts, meaning that consecutive columns are 10 shifts apart, a roll factor of 5 actually shifts it by 10*5 = 50 shifts
slope|bool|Enable/disable the __slope__ step during the augmentation
slope_factor|float|The slope angle of the linear baseline error
noise|bool|Enable/disable the __noise__ step during the augmentation
noise_range|tuple|The noise factor is sampled from this range, e.g. (min, max)

## Requirements
The Python libraries needed are:

- random
- dataclasses
- pandas
- numpy
- tensorflow

---

The code is documented for more insightful information :wink:!

Contributors are welcome :thumbsup:
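
The four augmentation steps described in the Advanced section (roll, weighted sum, slope, noise) can be sketched in plain NumPy as follows. This is an illustrative re-implementation, not the code in `raman_data_generator.py`: the function name `augment_pair`, the default values, the way the roll amount, weight, slope, and noise level are sampled, and the reading of `slope_factor` as a maximum slope are all assumptions.

```python
import numpy as np


def augment_pair(sample_i, sample_j, rng, roll_factor=5, slope_factor=0.1,
                 noise_range=(0.0, 0.05)):
    """Hypothetical sketch of the four augmentation steps on two same-class spectra."""
    n = sample_i.shape[0]

    # 1. Roll sample_j by a random number of columns in [-roll_factor, roll_factor].
    #    np.roll wraps around at the edges (assumed to be acceptable here).
    shift = int(rng.integers(-roll_factor, roll_factor + 1))
    rolled_j = np.roll(sample_j, shift)

    # 2. Weighted sum: sample_k = a * sample_i + (1 - a) * sample_j, with a in [0, 1).
    a = rng.random()
    sample_k = a * sample_i + (1.0 - a) * rolled_j

    # 3. Add a linear baseline with a slope drawn from [-slope_factor, slope_factor].
    slope = rng.uniform(-slope_factor, slope_factor)
    sample_k = sample_k + slope * np.linspace(0.0, 1.0, n)

    # 4. Add white Gaussian noise with a standard deviation drawn from noise_range.
    sigma = rng.uniform(*noise_range)
    return sample_k + rng.normal(0.0, sigma, size=n)


# Usage on two synthetic single-peak "spectra" of the same class.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 140)
sample_i = np.exp(-((x - 0.50) ** 2) / 0.001)
sample_j = np.exp(-((x - 0.52) ** 2) / 0.001)
sample_k = augment_pair(sample_i, sample_j, rng)
print(sample_k.shape)  # (140,)
```

Setting `roll_factor=0`, `slope_factor=0.0`, and `noise_range=(0.0, 0.0)` in this sketch reduces it to the weighted sum alone, which is a convenient way to inspect each step in isolation.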