https://github.com/cvjena/libmaxdiv
Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.
https://github.com/cvjena/libmaxdiv
anomalydetection anomalydiscovery data-analysis data-mining datamining machine-learning machine-learning-library machinelearning time-series timeseries
Last synced: 4 days ago
JSON representation
Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.
- Host: GitHub
- URL: https://github.com/cvjena/libmaxdiv
- Owner: cvjena
- License: lgpl-3.0
- Created: 2016-12-07T08:46:22.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2021-05-25T10:13:55.000Z (over 4 years ago)
- Last Synced: 2025-09-10T07:39:39.064Z (5 months ago)
- Topics: anomalydetection, anomalydiscovery, data-analysis, data-mining, datamining, machine-learning, machine-learning-library, machinelearning, time-series, timeseries
- Language: C++
- Size: 42.2 MB
- Stars: 63
- Watchers: 7
- Forks: 13
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: COPYING
Awesome Lists containing this project
- awesome-time-series - libmaxdiv - temporal time-series. (📦 Packages / Python)
README
Maximally Divergent Intervals for Anomaly Detection
===================================================
The **Maximally Divergent Intervals (MDI) Algorithm** can be used to detect anomalous *intervals* (as opposed to
anomalous points) in multi-variate spatio-temporal time-series. A description of the algorithm along with a variety
of application examples can be found in the following article:
> [**Detecting Regions of Maximal Divergence for Spatio-Temporal Anomaly Detection.**][paper]
> Björn Barz, Erik Rodner, Yanira Guanche Garcia, Joachim Denzler.
> *IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018*.
An efficient C++ implementation called `libmaxdiv` is provided in `maxdiv/libmaxdiv` and may be used stand-alone. If it has been
built in `maxdiv/libmaxdiv/bin`, it will be used automatically by the GUI and the `maxdiv` function in the `maxdiv.maxdiv` Python
package. See [`maxdiv/libmaxdiv/README.md`](maxdiv/libmaxdiv/README.md) for build instructions.
Otherwise, the pure Python implementation of the MDI algorithm will be used, which is not recommended, since it is extremely slow
and lacks some features such as support for spatial data.
The directories `experiments` and `tools` contain some benchmarks, experiments and scripts we've been using for development and
evaluation of the algorithm; for example, detecting severe North Sea storms like the Hamburg-Flut shown below.
You don't need those directories if you just want to use `libmaxdiv`.
.gif)
More examples of what can be done with the MDI algorithm can be found on the [project page](https://cvjena.github.io/libmaxdiv/)
and in the article mentioned above.
GUI
---

Dependencies of the Python implementation and the GUI
-----------------------------------------------------
- Python >= 2.7 or Python 3
- `numpy`
- `matplotlib`
- `scipy`
- `scikit-learn <= 0.19`
- `PIL`/`Pillow`
`libmaxdiv` has its own dependencies in addition. Please refer to [`maxdiv/libmaxdiv/README.md`](maxdiv/libmaxdiv/README.md) for build instructions.
Getting started
---------------
After having installed the dependencies mentioned above and built `libmaxdiv` according to the instructions in `maxdiv/libmaxdiv/README.md`,
just run `python launch-gui.py` to start the interactive interface.
If you are using Windows, pre-built binaries of the `libmaxdiv` library are available, so that it can be used out of the box.
However, the default `launch-gui.py` script will always search for self-built libraries only. To use the pre-built ones, launch the GUI by
running `python win32\launch-gui.py` instead.
A comprehensive installation guide and user manual for the GUI can be found in [`libmaxdiv user guide.pdf`](libmaxdiv%20user%20guide.pdf).
You may also use the maxdiv algorithm programmatically via the [`maxdiv`](maxdiv/maxdiv.py#L733) function in the [`maxdiv.maxdiv`](maxdiv/maxdiv.py)
package or by using the `libmaxdiv` library directly from your application. It provides a C-style procedural interface defined in
[`libmaxdiv.h`](maxdiv/libmaxdiv/libmaxdiv.h) for maximum inter-operability.
Note that not all functions of `libmaxdiv` are made available through the high-level interface `maxdiv` python function in the `maxdiv.maxdiv` package.
In particular, it can only process temporal, but not spatio-temporal data.
However, you can still use python to interact with `libmaxdiv` by calling the C-style functions defined in [`libmaxdiv.h`](maxdiv/libmaxdiv/libmaxdiv.h)
from python using the wrapper provided in [`maxdiv.libmaxdiv_wrapper`](maxdiv/libmaxdiv_wrapper.py).
### A simple step-by-step example for using the GUI
In this simple example, we will use the graphical user interface (GUI) for detecting hurricanes in a non-spatial time-series comprising measurements
of significant wave height (Hs), sea level pressure (SLP), and wind speed (W). The measurements have been taken in 2012 at a location near the
Bahamas and are provided by [NOAA](http://www.ndbc.noaa.gov/).
The hurricane season was particularly active in that year and 3 major hurricanes passed the Bahamas: Isaac (August 22-25), Rafael (October 12-18),
and Sandy (October 22-29).
We have already converted the data to the correct format for the libmaxdiv GUI and provide it in the file
[`experiments/HPW_2012_41046_standardized.csv`](experiments/HPW_2012_41046_standardized.csv).
First, we launch the user interface loading that file directly on start-up by running the following command on the command line:
python launch-gui.py experiments/HPW_2012_41046_standardized.csv
You may also omit the argument and will then be prompted to select the file using a dialog window.
Since the measured variables have very different scales, it is a bit difficult to see anything in the plot of the time-series.
Thus, we first check the box labeled "Plot pre-processed instead of original data", which will show a normalized version of the time-series,
which is also used by `libmaxdiv`.
Since hurricanes usually last at least 12 hours and are considered two independent storms if they last longer than 3 days, we set the "minimum
interval length" to 12 and the "maximum interval length" to 72.
In order to keep the visualization clean and the analysis easy, we set the "number of detections" to 5.
After hitting the "Detect Anomalous Intervals" button, the time-series will be overlayed with the top 5 detections, shown as red regions.
The first three detections should correspond to the three hurricanes mentioned above.
You can use the buttons next to the visualization for zooming and panning or navigating between the individual detections.
[paper]: https://ieeexplore.ieee.org/document/8352745/