https://github.com/guildai/rare-event
- Host: GitHub
- URL: https://github.com/guildai/rare-event
- Owner: guildai
- Created: 2019-08-04T19:30:46.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-08-05T19:26:27.000Z (almost 7 years ago)
- Last Synced: 2025-03-10T19:25:25.650Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 3.86 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Rare Event Prediction
[These models](guild.yml) are adapted from the following blog posts by [Chitta Ranjan](https://www.linkedin.com/in/chitta-ranjan-b0851911/):
- [Extreme Rare Event Classification using Autoencoders in Keras](https://towardsdatascience.com/extreme-rare-event-classification-using-autoencoders-in-keras-a565b386f098)
- [LSTM Autoencoder for Extreme Rare Event Classification in Keras](https://towardsdatascience.com/lstm-autoencoder-for-extreme-rare-event-classification-in-keras-ce209a224cfb)
The original source is included as Notebooks:
- [autoencoder_classifier.ipynb](autoencoder_classifier.ipynb)
- [lstm_autoencoder_classifier.ipynb](lstm_autoencoder_classifier.ipynb)
To train the models, use:

    $ guild run ae:train
    $ guild run lstm:train

The LSTM does not include validation accuracy.
## To Do
- [ ] Generate sample log (treat as simulation problem)
  - Contains mostly normal log events of any kind (negative examples)
  - Supports SIGTERM or some other signal
  - Prints signal
  - After some period with a random component, logs a "crash"
    (positive example)
- [ ] Convert simulated logs into a format we can train on
- [x] Activation functions (elu, leaky relu, etc.) (see advanced
  activations in Keras)
- [x] More or fewer layers
- [ ] Different optimizers
- [ ] Within the LSTM:
  - Dropout
  - ???
- [x] Bump epochs to 1000
- [x] Add early stopping (Keras callback)
- [ ] Learning rate schedules
- [ ] Use custom Keras metric for roc_auc (unless it slows training)
- [ ] Check whether the LSTM metrics are slowing training
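
The first To Do item (a simulated log that ends in a "crash") could be sketched roughly as follows. This is only a starting point, not part of the project: the function names, the message strings, the `INFO`/`ERROR` levels, and the `base_events`/`jitter` parameters are all hypothetical, and it uses only the Python standard library.

```python
import random
import signal
import sys
import time

# Hypothetical "normal" messages; real logs would be more varied.
NORMAL_MESSAGES = ["heartbeat ok", "request served", "cache refreshed"]


def generate_events(rng, base_events=50, jitter=20):
    """Return (label, message) pairs: normal events (0), then one crash (1)."""
    # The crash arrives after a fixed period plus a random component.
    n_normal = base_events + rng.randint(0, jitter)
    events = [(0, rng.choice(NORMAL_MESSAGES)) for _ in range(n_normal)]
    events.append((1, "crash"))
    return events


def run(seed=None, delay=0.1, out=sys.stdout):
    """Write events to `out` until the crash, or until a signal arrives."""
    received = []

    def on_signal(signum, frame):
        # Only record the signal here; it is printed from the main loop.
        received.append(signum)

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, on_signal)

    for label, message in generate_events(random.Random(seed)):
        if received:
            print(f"signal {signal.Signals(received[0]).name}", file=out)
            return
        level = "ERROR" if label else "INFO"
        print(f"{time.time():.3f} {level} {message}", file=out)
        time.sleep(delay)
```

Calling `run(seed=0)` from a script would print one line per event and stop either at the crash line or when the process receives SIGTERM or SIGINT. Keeping `generate_events` separate from the printing loop makes it possible to produce labeled examples for training without waiting on the delays.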
### Bug in data processing
- A column is being lost somewhere
- The original author uses the row number in the xs, which masks the missing column
------------------
- Highlight feature engineering in data preparation (converting raw
  data to prepared data: time shift of y values)
- Use validation data for examples
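
The "time shift of y values" noted above can be illustrated with a small sketch. It assumes the goal is to predict an event a few steps ahead: the rows just before each event are relabeled as positive, and the event row itself is dropped. The function name `shift_labels` and the `lead` parameter are hypothetical, and dropping the event row is an assumed design choice rather than something this README states.

```python
def shift_labels(ys, lead=2):
    """Label the `lead` rows before each event as positive.

    Returns (keep, labels): the indices of rows to keep (event rows are
    dropped) and the shifted label for each kept row.
    """
    shifted = [0] * len(ys)
    for i, y in enumerate(ys):
        if y == 1:
            # Mark up to `lead` rows immediately preceding the event.
            for j in range(max(0, i - lead), i):
                shifted[j] = 1
    # Drop the original event rows (assumed: the task is early warning).
    keep = [i for i, y in enumerate(ys) if y != 1]
    return keep, [shifted[i] for i in keep]
```

For `ys = [0, 0, 0, 1, 0, 0]` and `lead=2`, the two rows before the event become positive and the event row is removed, so the kept indices are `[0, 1, 2, 4, 5]` with labels `[0, 1, 1, 0, 0]`. Returning the kept indices lets the same selection be applied to the feature rows.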