Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/c-bata/outlier-utils
Utility library for detecting and removing outliers from normally distributed datasets using the Smirnov-Grubbs test.
- Host: GitHub
- URL: https://github.com/c-bata/outlier-utils
- Owner: c-bata
- License: mit
- Created: 2015-07-28T05:46:54.000Z (over 9 years ago)
- Default Branch: main
- Last Pushed: 2023-09-07T04:45:01.000Z (over 1 year ago)
- Last Synced: 2025-01-03T10:09:27.429Z (10 days ago)
- Topics: outliers, python, statistics
- Language: Python
- Homepage: https://pypi.python.org/pypi/outlier-utils
- Size: 41 KB
- Stars: 56
- Watchers: 4
- Forks: 18
- Open Issues: 2
Metadata Files:
- Readme: README.rst
- Changelog: CHANGES.rst
- License: LICENSE
README
=============
outlier-utils
=============

Utility library for detecting and removing outliers from normally distributed datasets using the Smirnov-Grubbs_ test.
Requirements
------------

- Python_ (version 3.8 or later)
- SciPy_
- NumPy_

Overview
--------

Both the two-sided and the one-sided versions of the test are supported. The two-sided test extracts outliers from both ends of the dataset, whereas a one-sided test considers only the minimum or only the maximum. When a test is run, outliers are removed repeatedly until none can be found in the dataset. The output is flexible enough to match several use cases: by default the outlier-free data is returned, but the test can also return the outliers themselves or their indices in the original dataset.
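To make the repeated removal concrete, here is a minimal sketch of a two-sided Grubbs test written directly against NumPy and SciPy. It is not this library's implementation: the names `grubbs_critical_value` and `two_sided_grubbs_sketch` are hypothetical, and the sketch assumes the standard critical-value formula based on the Student t distribution.

```python
import numpy as np
from scipy import stats


def grubbs_critical_value(n, alpha):
    """Two-sided Grubbs critical value for a sample of size n (assumed textbook formula)."""
    # Upper critical value of the t distribution with n - 2 degrees of freedom.
    t = stats.t.isf(alpha / (2 * n), n - 2)
    return (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))


def two_sided_grubbs_sketch(values, alpha=0.05):
    """Hypothetical helper: drop the most extreme point while it is significant.

    Returns (cleaned_values, outlier_indices); the indices refer to the
    positions in the original input.
    """
    data = np.asarray(values, dtype=float)
    keep = list(range(len(data)))  # original indices still in the sample
    outliers = []
    while len(keep) > 2:  # the test needs at least three points
        sample = data[keep]
        sd = sample.std(ddof=1)
        if sd == 0:
            break  # all remaining values are identical
        deviations = np.abs(sample - sample.mean())
        pos = int(deviations.argmax())
        g = deviations[pos] / sd  # Grubbs statistic for the most extreme point
        if g <= grubbs_critical_value(len(sample), alpha):
            break  # nothing significant is left
        outliers.append(keep.pop(pos))
    return data[keep], outliers


cleaned, outlier_indices = two_sided_grubbs_sketch([1, 8, 9, 10, 9], alpha=0.05)
```

Under these assumptions, the call above flags index 0 (the value 1) and keeps the remaining four points. A one-sided variant would look only at the minimum or the maximum deviation and use `alpha / n` in place of `alpha / (2 * n)`.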
Examples
--------

- Two-sided Grubbs test with a Pandas series input

  ::

     >>> from outliers import smirnov_grubbs as grubbs
     >>> import pandas as pd
     >>> data = pd.Series([1, 8, 9, 10, 9])
     >>> grubbs.test(data, alpha=0.05)
     1     8
     2     9
     3    10
     4     9
     dtype: int64

- Two-sided Grubbs test with a NumPy array input::

     >>> import numpy as np
     >>> data = np.array([1, 8, 9, 10, 9])
     >>> grubbs.test(data, alpha=0.05)
     array([ 8,  9, 10,  9])

- One-sided (min) test returning outlier indices::

     >>> grubbs.min_test_indices([8, 9, 10, 1, 9], alpha=0.05)
     [3]

- One-sided (max) tests returning outliers::

     >>> grubbs.max_test_outliers([8, 9, 10, 1, 9], alpha=0.05)
     []
     >>> grubbs.max_test_outliers([8, 9, 10, 50, 9], alpha=0.05)
     [50]

.. _Smirnov-Grubbs: https://en.wikipedia.org/wiki/Grubbs%27_test_for_outliers
.. _SciPy: https://www.scipy.org/
.. _NumPy: http://www.numpy.org/
.. _Python: https://www.python.org/

License
=======

This software is licensed under the MIT License.