{"id":17131448,"url":"https://github.com/vc1492a/pynomaly","last_synced_at":"2025-05-15T00:13:05.876Z","repository":{"id":41063301,"uuid":"91844483","full_name":"vc1492a/PyNomaly","owner":"vc1492a","description":"Anomaly detection using LoOP: Local Outlier Probabilities, a local density based outlier detection method providing an outlier score in the range of [0,1].","archived":false,"fork":false,"pushed_at":"2024-12-02T20:19:23.000Z","size":35756,"stargazers_count":322,"open_issues_count":7,"forks_count":37,"subscribers_count":24,"default_branch":"main","last_synced_at":"2025-05-14T04:33:34.933Z","etag":null,"topics":["anomalies","anomaly-detection","machine-learning","nearest-neighbors","outlier-detection","outlier-scores","outliers","probability"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vc1492a.png","metadata":{"files":{"readme":"readme.md","changelog":"changelog.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["vc1492a"]}},"created_at":"2017-05-19T20:51:20.000Z","updated_at":"2025-04-18T23:23:54.000Z","dependencies_parsed_at":"2024-09-17T21:07:43.375Z","dependency_job_id":"8d2471a7-1f9e-427c-9f68-5965c87d3d37","html_url":"https://github.com/vc1492a/PyNomaly","commit_stats":{"total_commits":130,"total_committers":8,"mean_commits":16.25,"dds":0.09230769230769231,"last_synced_commit":"6f5077e57850f1814652860932aea9a82765b7c8"},"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vc1492
a%2FPyNomaly","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vc1492a%2FPyNomaly/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vc1492a%2FPyNomaly/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vc1492a%2FPyNomaly/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vc1492a","download_url":"https://codeload.github.com/vc1492a/PyNomaly/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254249206,"owners_count":22039029,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomalies","anomaly-detection","machine-learning","nearest-neighbors","outlier-detection","outlier-scores","outliers","probability"],"created_at":"2024-10-14T19:23:43.179Z","updated_at":"2025-05-15T00:13:00.852Z","avatar_url":"https://github.com/vc1492a.png","language":"Python","funding_links":["https://github.com/sponsors/vc1492a"],"categories":[],"sub_categories":[],"readme":"# PyNomaly\n\nPyNomaly is a Python 3 implementation of LoOP (Local Outlier Probabilities).\nLoOP is a local density based outlier detection method by Kriegel, Kröger, Schubert, and Zimek which provides outlier\nscores in the range of [0,1] that are directly interpretable as the probability of a sample being an outlier. \n\nPyNomaly is a core library of [deepchecks](https://github.com/deepchecks/deepchecks) and [pysad](https://github.com/selimfirat/pysad). 
\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![PyPi](https://img.shields.io/badge/pypi-0.3.4-blue.svg)](https://pypi.python.org/pypi/PyNomaly/0.3.4)\n[![Downloads](https://img.shields.io/pypi/dm/PyNomaly.svg?logoColor=blue)](https://pypistats.org/packages/pynomaly)\n![Tests](https://github.com/vc1492a/PyNomaly/actions/workflows/tests.yml/badge.svg)\n[![Coverage Status](https://coveralls.io/repos/github/vc1492a/PyNomaly/badge.svg?branch=main)](https://coveralls.io/github/vc1492a/PyNomaly?branch=main)\n[![JOSS](http://joss.theoj.org/papers/f4d2cfe680768526da7c1f6a2c103266/status.svg)](http://joss.theoj.org/papers/f4d2cfe680768526da7c1f6a2c103266)\n\nThe outlier score of each sample is called the Local Outlier Probability.\nLike the Local Outlier Factor (LOF), it measures the local deviation in density of a given sample with\nrespect to its neighbors, but it provides normalized\noutlier scores in the range [0,1]. These outlier scores are directly interpretable\nas the probability of an object being an outlier. Since Local Outlier Probabilities provides scores in the\nrange [0,1], practitioners are free to interpret the results according to the application.\n\nLike LOF, it is local in that the anomaly score depends on how isolated the sample is\nwith respect to the surrounding neighborhood. Locality is given by k-nearest neighbors,\nwhose distance is used to estimate the local density. 
By comparing the local density of a sample to the\nlocal densities of its neighbors, one can identify samples that lie in regions of lower\ndensity compared to their neighbors and thus identify samples that may be outliers according to their Local\nOutlier Probability.\n\nThe authors' 2009 paper detailing LoOP's theory, formulation, and application is provided by\nLudwig-Maximilians University Munich - Institute for Informatics:\n[LoOP: Local Outlier Probabilities](http://www.dbs.ifi.lmu.de/Publikationen/Papers/LoOP1649.pdf).\n\n## Implementation\n\nThis Python 3 implementation uses Numpy and the formulas outlined in\n[LoOP: Local Outlier Probabilities](http://www.dbs.ifi.lmu.de/Publikationen/Papers/LoOP1649.pdf)\nto calculate the Local Outlier Probability of each sample.\n\n## Dependencies\n- Python 3.6 - 3.13\n- numpy \u003e= 1.16.3\n- python-utils \u003e= 2.3.0\n- (optional) numba \u003e= 0.45.1\n\nNumba just-in-time (JIT) compiles the function that calculates the Euclidean \ndistance between observations, reducing computation time \n(significantly so when a large number of observations are scored). Numba is not a \nrequirement and PyNomaly may still be used solely with numpy if desired\n(details below). \n\n## Quick Start\n\nFirst install the package from the Python Package Index:\n\n```shell\npip install PyNomaly # or pip3 install ... 
if you're using both Python 3 and 2.\n```\n\nAlternatively, you can use conda to install the package from conda-forge:\n\n```shell\nconda install conda-forge::pynomaly\n```\nThen you can do something like this:\n\n```python\nfrom PyNomaly import loop\nm = loop.LocalOutlierProbability(data).fit()\nscores = m.local_outlier_probabilities\nprint(scores)\n```\nwhere *data* is an NxM (N rows, M columns; 2-dimensional) set of data as either a Pandas DataFrame or Numpy array.\n\nLocalOutlierProbability sets the *extent* (an integer value of 1, 2, or 3) and *n_neighbors* (must be greater than 0) parameters with the default\nvalues of 3 and 10, respectively. You're free to set these parameters on your own as below:\n\n```python\nfrom PyNomaly import loop\nm = loop.LocalOutlierProbability(data, extent=2, n_neighbors=20).fit()\nscores = m.local_outlier_probabilities\nprint(scores)\n```\n\nThis implementation of LoOP also includes an optional *cluster_labels* parameter. This is useful in cases where regions\nof varying density occur within the same set of data. 
When using *cluster_labels*, the Local Outlier Probability of a\nsample is calculated with respect to its cluster assignment.\n\n```python\nfrom PyNomaly import loop\nfrom sklearn.cluster import DBSCAN\ndb = DBSCAN(eps=0.6, min_samples=50).fit(data)\nm = loop.LocalOutlierProbability(data, extent=2, n_neighbors=20, cluster_labels=list(db.labels_)).fit()\nscores = m.local_outlier_probabilities\nprint(scores)\n```\n\n**NOTE**: Unless your data is all on the same scale, it may be a good idea to normalize your data with z-scores or another\nnormalization scheme prior to using LoOP, especially when working with multiple dimensions of varying scale.\nUsers must also appropriately handle missing values prior to using LoOP, as LoOP does not support Pandas\nDataFrames or Numpy arrays with missing values.\n\n### Utilizing Numba and Progress Bars\n\nIt may be helpful to use just-in-time (JIT) compilation in cases where many \nobservations are scored. Numba, a JIT compiler for Python, may be used \nwith PyNomaly by setting `use_numba=True`:\n\n```python\nfrom PyNomaly import loop\nm = loop.LocalOutlierProbability(data, extent=2, n_neighbors=20, use_numba=True, progress_bar=True).fit()\nscores = m.local_outlier_probabilities\nprint(scores)\n```\n\nNumba must be installed for the above to use JIT compilation and improve the \nspeed of multiple calls to `LocalOutlierProbability()`, and PyNomaly has been \ntested with Numba version 0.45.1. An example of the speed difference that can \nbe realized by using Numba is available in `examples/numba_speed_diff.py`. \n\nYou may also choose to print progress bars _with or without_ the use of numba \nby passing `progress_bar=True` to the `LocalOutlierProbability()` method as above.\n\n### Choosing Parameters\n\nThe *extent* parameter controls the sensitivity of the scoring in practice. 
The parameter corresponds to\nthe statistical notion of an outlier defined as an object deviating more than a given lambda (*extent*)\ntimes the standard deviation from the mean. A value of 2 implies outliers deviating more than 2 standard deviations\nfrom the mean, and corresponds to 95.0% in the empirical \"three-sigma\" rule. The appropriate parameter should be selected\naccording to the level of sensitivity needed for the input data and application. The question to ask is whether it is\nmore reasonable to assume outliers in your data are 1, 2, or 3 standard deviations from the mean, and then to select the value\nmost appropriate to your data and application.\n\nThe *n_neighbors* parameter defines the number of neighbors to consider around\neach sample (neighborhood size) when determining its Local Outlier Probability with respect to the density\nof the sample's defined neighborhood. The ideal number of neighbors to consider is dependent on the\ninput data. However, the notion of an outlier implies it would be considered as such regardless of the number\nof neighbors considered. One potential approach is to use a number of different neighborhood sizes and average\nthe results for each observation. Those observations which rank highly with varying neighborhood sizes are\nmore than likely outliers. Another approach is to\nselect a value that scales with the number of observations, such as an odd-valued integer close to the square root\nof the number of observations in your data (*sqrt(n_observations)*).\n\n## Iris Data Example\n\nWe'll be using the well-known Iris dataset to show LoOP's capabilities. 
There are a few things you'll need for this\nexample beyond the standard prerequisites listed above:\n- matplotlib 2.0.0 or greater\n- PyDataset 0.2.0 or greater\n- scikit-learn 0.18.1 or greater\n\nFirst, let's import the packages and libraries we will need for this example.\n\n```python\nfrom PyNomaly import loop\nimport pandas as pd\nfrom pydataset import data\nimport numpy as np\nfrom sklearn.cluster import DBSCAN\nimport matplotlib.pyplot as plt\nfrom mpl_toolkits.mplot3d import Axes3D\n```\n\nNow let's create two sets of Iris data for scoring: one with clustering and the other without.\n\n```python\n# import the data and remove any non-numeric columns\niris = pd.DataFrame(data('iris').drop(columns=['Species']))\n```\n\nNext, let's cluster the data using DBSCAN and generate two sets of scores. In both cases, we will use the default\nvalues for both *extent* (3) and *n_neighbors* (10).\n\n```python\ndb = DBSCAN(eps=0.9, min_samples=10).fit(iris)\nm = loop.LocalOutlierProbability(iris).fit()\nscores_noclust = m.local_outlier_probabilities\nm_clust = loop.LocalOutlierProbability(iris, cluster_labels=list(db.labels_)).fit()\nscores_clust = m_clust.local_outlier_probabilities\n```\n\nOrganize the data into two separate Pandas DataFrames.\n\n```python\niris_clust = pd.DataFrame(iris.copy())\niris_clust['scores'] = scores_clust\niris_clust['labels'] = db.labels_\niris['scores'] = scores_noclust\n```\n\nAnd finally, let's visualize the scores provided by LoOP in both cases (with and without clustering).\n\n```python\nfig = plt.figure(figsize=(7, 7))\nax = fig.add_subplot(111, projection='3d')\nax.scatter(iris['Sepal.Width'], iris['Petal.Width'], iris['Sepal.Length'],\nc=iris['scores'], cmap='seismic', s=50)\nax.set_xlabel('Sepal.Width')\nax.set_ylabel('Petal.Width')\nax.set_zlabel('Sepal.Length')\nplt.show()\nplt.clf()\nplt.cla()\nplt.close()\n\nfig = plt.figure(figsize=(7, 7))\nax = fig.add_subplot(111, projection='3d')\nax.scatter(iris_clust['Sepal.Width'], 
iris_clust['Petal.Width'], iris_clust['Sepal.Length'],\nc=iris_clust['scores'], cmap='seismic', s=50)\nax.set_xlabel('Sepal.Width')\nax.set_ylabel('Petal.Width')\nax.set_zlabel('Sepal.Length')\nplt.show()\nplt.clf()\nplt.cla()\nplt.close()\n\nfig = plt.figure(figsize=(7, 7))\nax = fig.add_subplot(111, projection='3d')\nax.scatter(iris_clust['Sepal.Width'], iris_clust['Petal.Width'], iris_clust['Sepal.Length'],\nc=iris_clust['labels'], cmap='Set1', s=50)\nax.set_xlabel('Sepal.Width')\nax.set_ylabel('Petal.Width')\nax.set_zlabel('Sepal.Length')\nplt.show()\nplt.clf()\nplt.cla()\nplt.close()\n```\n\nYour results should look like the following:\n\n**LoOP Scores without Clustering**\n![LoOP Scores without Clustering](https://github.com/vc1492a/PyNomaly/blob/main/images/scores.png)\n\n**LoOP Scores with Clustering**\n![LoOP Scores with Clustering](https://github.com/vc1492a/PyNomaly/blob/main/images/scores_clust.png)\n\n**DBSCAN Cluster Assignments**\n![DBSCAN Cluster Assignments](https://github.com/vc1492a/PyNomaly/blob/main/images/cluster_assignments.png)\n\n\nNote the differences between using LocalOutlierProbability with and without clustering. In the example without clustering, samples are\nscored according to the distribution of the entire data set. In the example with clustering, each sample is scored\naccording to the distribution of each cluster. 
Which approach is suitable depends on the use case.\n\n**NOTE**: Data was not normalized in this example, but it's probably a good idea to do so in practice.\n\n## Using Numpy\n\nWhen using numpy, make sure to use 2-dimensional arrays in tabular format:\n\n```python\nimport numpy as np\nfrom PyNomaly import loop\n\ndata = np.array([\n    [43.3, 30.2, 90.2],\n    [62.9, 58.3, 49.3],\n    [55.2, 56.2, 134.2],\n    [48.6, 80.3, 50.3],\n    [67.1, 60.0, 55.9],\n    [421.5, 90.3, 50.0]\n])\n\nscores = loop.LocalOutlierProbability(data, n_neighbors=3).fit().local_outlier_probabilities\nprint(scores)\n\n```\n\nThe shape of the input array corresponds to the rows (observations) and columns (features) in the data:\n\n```python\nprint(data.shape)\n# (6, 3), which matches the number of observations and features in the above example\n```\n\nThe same pattern works with randomly generated data:\n\n```python\ndata = np.random.rand(100, 5)\nscores = loop.LocalOutlierProbability(data).fit().local_outlier_probabilities\nprint(scores)\n```\n\n## Specifying a Distance Matrix\n\nPyNomaly provides the ability to specify a distance matrix so that any\ndistance metric can be used (a neighbor index matrix must also be provided).\nThis can be useful when you want to use a distance other than the Euclidean distance.\n\nNote that in order to maintain alignment with the LoOP definition of closest neighbors, \nan additional neighbor is added when using [scikit-learn's NearestNeighbors](https://scikit-learn.org/1.5/modules/neighbors.html) since `NearestNeighbors` \nincludes the point itself when calculating the closest neighbors (whereas the LoOP method does not include distances to the point itself). 
\n\n```python\nimport numpy as np\nfrom sklearn.neighbors import NearestNeighbors\nfrom PyNomaly import loop\n\ndata = np.array([\n    [43.3, 30.2, 90.2],\n    [62.9, 58.3, 49.3],\n    [55.2, 56.2, 134.2],\n    [48.6, 80.3, 50.3],\n    [67.1, 60.0, 55.9],\n    [421.5, 90.3, 50.0]\n])\n\n# Generate distance and neighbor matrices\nn_neighbors = 3 # the number of neighbors according to the LoOP definition \nneigh = NearestNeighbors(n_neighbors=n_neighbors+1, metric='hamming')\nneigh.fit(data)\nd, idx = neigh.kneighbors(data, return_distance=True)\n\n# Remove self-distances (the first column) - you MUST do this to preserve the same results as intended by the definition of LoOP\nidx = np.delete(idx, 0, 1)\nd = np.delete(d, 0, 1)\n\n# Fit and return scores; each matrix now has n_neighbors columns\nm = loop.LocalOutlierProbability(distance_matrix=d, neighbor_matrix=idx, n_neighbors=n_neighbors).fit()\nscores = m.local_outlier_probabilities\n```\n\nThe visualization below shows the results for a few known distance metrics:\n\n**LoOP Scores by Distance Metric**\n![LoOP Scores by Distance Metric](https://github.com/vc1492a/PyNomaly/blob/main/images/scores_by_distance_metric.png)\n\n## Streaming Data\n\nPyNomaly also contains an implementation of Hamlet et al.'s modifications\nto the original LoOP approach [[4](http://www.tandfonline.com/doi/abs/10.1080/23742917.2016.1226651?journalCode=tsec20)],\nwhich may be used for applications involving streaming data or where rapid calculations may be necessary.\nFirst, the standard LoOP algorithm is used on \"training\" data, with certain attributes of the fitted data\nstored from the original LoOP approach. Then, as new points are considered, these fitted attributes are\nreused to score the incoming streaming data, relying on averages from the initial\nfit, such as a global value for the expected value of the probabilistic distance. 
Despite the potential\nfor increased error when compared to the standard approach, it may be effective in streaming applications where\nrefitting the standard approach over all points could be computationally expensive.\n\nWhile the iris dataset is not streaming data, we'll use it in this example by taking the first 120 observations\nas training data and take the remaining 30 observations as a stream, scoring each observation\nindividually.\n\nSplit the data.\n```python\niris = iris.sample(frac=1) # shuffle data\niris_train = iris.iloc[:, 0:4].head(120)\niris_test = iris.iloc[:, 0:4].tail(30)\n```\n\nFit to each set.\n```python\nm = loop.LocalOutlierProbability(iris).fit()\nscores_noclust = m.local_outlier_probabilities\niris['scores'] = scores_noclust\n\nm_train = loop.LocalOutlierProbability(iris_train, n_neighbors=10)\nm_train.fit()\niris_train_scores = m_train.local_outlier_probabilities\n```\n\n```python\niris_test_scores = []\nfor index, row in iris_test.iterrows():\n    array = np.array([row['Sepal.Length'], row['Sepal.Width'], row['Petal.Length'], row['Petal.Width']])\n    iris_test_scores.append(m_train.stream(array))\niris_test_scores = np.array(iris_test_scores)\n```\n\nConcatenate the scores and assess.\n\n```python\niris['stream_scores'] = np.hstack((iris_train_scores, iris_test_scores))\n# iris['scores'] from earlier example\nrmse = np.sqrt(((iris['scores'] - iris['stream_scores']) ** 2).mean(axis=None))\nprint(rmse)\n```\n\nThe root mean squared error (RMSE) between the two approaches is approximately 0.199 (your scores will vary depending on the data and specification).\nThe plot below shows the scores from the stream approach.\n\n```python\nfig = plt.figure(figsize=(7, 7))\nax = fig.add_subplot(111, projection='3d')\nax.scatter(iris['Sepal.Width'], iris['Petal.Width'], iris['Sepal.Length'],\nc=iris['stream_scores'], cmap='seismic', 
s=50)\nax.set_xlabel('Sepal.Width')\nax.set_ylabel('Petal.Width')\nax.set_zlabel('Sepal.Length')\nplt.show()\nplt.clf()\nplt.cla()\nplt.close()\n```\n\n**LoOP Scores using Stream Approach with n=10**\n![LoOP Scores using Stream Approach with n=10](https://github.com/vc1492a/PyNomaly/blob/main/images/scores_stream.png)\n\n### Notes\nWhen calculating the LoOP score of incoming data, the original fitted scores are not updated.\nIn some applications, it may be beneficial to refit the data periodically. The stream functionality\nalso assumes that either data or a distance matrix (or value) will be used in both fitting\nand streaming, with no changes in specification between steps.\n\n## Contributing\n\nPlease use the issue tracker to report any erroneous behavior or to request \nfeatures. \n\nIf you would like to contribute to development, please fork the repository and make \nany changes to a branch which corresponds to an open issue. Hot fixes \nand bug fixes can be represented by branches with the prefix `fix/` versus \n`feature/` for new capabilities or code improvements. Pull requests will \nthen be made from these branches into the repository's `dev` branch \nprior to being pulled into `main`. \n\n### Commit Messages and Releases\n\n**Your commit messages are important** - here's why. \n\nPyNomaly leverages [release-please](https://github.com/googleapis/release-please-action) to help automate the release process using the [Conventional Commits](https://www.conventionalcommits.org/) specification. When pull requests are opened to the `main` branch, release-please will collate the git commit messages and prepare an organized changelog and release notes. This process can be completed because of the Conventional Commits specification. \n\nConventional Commits provides an easy set of rules for creating an explicit commit history, which makes it easier to write automated tools on top of. 
This convention dovetails with SemVer, by describing the features, fixes, and breaking changes made in commit messages. You can check out examples [here](https://www.conventionalcommits.org/en/v1.0.0/#examples). Make a best effort to use the specification when contributing to PyNomaly code as it dramatically eases the documentation around releases and their features, breaking changes, bug fixes and documentation updates. \n\n### Tests\nWhen contributing, please be sure to run the unit tests and add additional tests as \nnecessary if adding new functionality. To run the unit tests, use `pytest`: \n\n```\npython3 -m pytest --cov=PyNomaly -s -v\n```\n\nTo run the tests with Numba enabled, simply set the flag `NUMBA` in `test_loop.py` \nto `True`. Note that a drop in coverage is expected due to portions of the code \nbeing compiled upon code execution. \n\n## Versioning\n[Semantic versioning](http://semver.org/) is used for this project. If contributing, please conform to semantic\nversioning guidelines when submitting a pull request.\n\n## License\nThis project is licensed under the Apache 2.0 license.\n\n## Research\nIf citing PyNomaly, use the following: \n\n```\n@article{Constantinou2018,\n  doi = {10.21105/joss.00845},\n  url = {https://doi.org/10.21105/joss.00845},\n  year  = {2018},\n  month = {oct},\n  publisher = {The Open Journal},\n  volume = {3},\n  number = {30},\n  pages = {845},\n  author = {Valentino Constantinou},\n  title = {{PyNomaly}: Anomaly detection using Local Outlier Probabilities ({LoOP}).},\n  journal = {Journal of Open Source Software}\n}\n```\n\n\n## References\n1. Breunig M., Kriegel H.-P., Ng R., Sander J. LOF: Identifying Density-based Local Outliers. ACM SIGMOD International Conference on Management of Data (2000). [PDF](http://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf).\n2. Kriegel H., Kröger P., Schubert E., Zimek A. LoOP: Local Outlier Probabilities. 18th ACM conference on Information and knowledge management, CIKM (2009). 
[PDF](http://www.dbs.ifi.lmu.de/Publikationen/Papers/LoOP1649.pdf).\n3. Goldstein M., Uchida S. A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data. PLoS ONE 11(4): e0152173 (2016).\n4. Hamlet C., Straub J., Russell M., Kerlin S. An incremental and approximate local outlier probability algorithm for intrusion detection and its evaluation. Journal of Cyber Security Technology (2016). [DOI](http://www.tandfonline.com/doi/abs/10.1080/23742917.2016.1226651?journalCode=tsec20).\n\n## Acknowledgements\n- The authors of LoOP (Local Outlier Probabilities)\n    - Hans-Peter Kriegel\n    - Peer Kröger\n    - Erich Schubert\n    - Arthur Zimek\n- [NASA Jet Propulsion Laboratory](https://jpl.nasa.gov/)\n    - [Kyle Hundman](https://github.com/khundman)\n    - [Ian Colwell](https://github.com/iancolwell)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvc1492a%2Fpynomaly","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvc1492a%2Fpynomaly","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvc1492a%2Fpynomaly/lists"}