https://github.com/oliverhennhoefer/r-averaged-difference-algorithm

R-Implementation of the "Averaged Difference Algorithm" for Spatial Outlier Detection conceived by Yufeng Kou and Chang-Tien Lu (2006).

# Averaged Difference Algorithm
R implementation of the "Averaged Difference" algorithm for spatial outlier detection, proposed by Yufeng Kou and Chang-Tien Lu in their 2006 paper "Spatial Weighted Outlier Detection". The algorithm detects point observations whose attribute values differ markedly from those of their surrounding neighbors.

The algorithm is demonstrated here on agricultural yield data and is particularly well suited to precision farming applications.

:round_pushpin: disy Informationssysteme GmbH. https://www.disy.net/de/

:seedling: iFAROS. https://www.ifaros-ictagri.com/

## Dependencies:

:wrench: __sp__-package, for geometry types

:wrench: __data.table__-package, as faster alternative for _base::data.frame_

:wrench: __FNN__-package, for k-nearest-neighbor search algorithm
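
As a rough illustration of how the three packages fit together, the sketch below builds a small point data set, runs a k-nearest-neighbor search on its coordinates, and stores per-point values in a data.table. The data and the variable names (`pts`, `spdf`, `knn`, `dt`) are made up for this example and are not taken from the repository.

```r
library(sp)
library(data.table)
library(FNN)

# Made-up example data: 6 georeferenced points with a single "yield" attribute
pts <- data.frame(x = c(0, 1, 2, 0, 1, 2),
                  y = c(0, 0, 0, 1, 1, 1),
                  yield = c(5.1, 5.3, 5.2, 5.0, 9.8, 5.4))

# sp: combine coordinates and attributes into a SpatialPointsDataFrame
spdf <- SpatialPointsDataFrame(coords = pts[, c("x", "y")],
                               data = pts["yield"])

# FNN: k-nearest-neighbor search on the coordinates.
# get.knn() searches within the data set and excludes each point itself.
knn <- get.knn(coordinates(spdf), k = 2)
dim(knn$nn.index)  # one row per point, one column per neighbor

# data.table: container for per-point results
dt <- data.table(index = seq_len(nrow(pts)), yield = spdf$yield)
```

`knn$nn.index` holds the row indices of each point's neighbors and `knn$nn.dist` the matching distances, which is the information a neighborhood-based outlier score needs.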

## Parameters:

- Input: _SpatialPointsDataFrame_, georeferenced point data with attribute(s)

- Input: _k_, number of neighbors taken into account (as in _k-Nearest-Neighbor_)

- Output: _data.table_, containing each point's index and its _averaged difference_, sorted in decreasing order

The function returns a data.table with the point indices and the _averaged difference_ of each point. Because the table is sorted, the top _n_ outliers can be removed by their indices. The number of outliers to remove is left to the user.
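
The ranking-and-removal workflow can be sketched in base R as follows. This is a minimal illustration, not the repository's code: the helper name `avg_diff_scores` is hypothetical, the inverse-distance weighting is an assumption about how neighbors are weighted, a plain data.frame stands in for the data.table, and a full distance matrix replaces the FNN search, so it is only practical for small data sets.

```r
# Hypothetical helper: weighted average of absolute attribute differences
# between each point and its k nearest neighbors (assumes distinct coordinates).
avg_diff_scores <- function(coords, values, k) {
  n <- nrow(coords)
  stopifnot(length(values) == n, k >= 1, k < n)
  d <- as.matrix(dist(coords))  # full pairwise distances, fine for small n
  diag(d) <- Inf                # a point is never its own neighbor
  score <- vapply(seq_len(n), function(i) {
    nn <- order(d[i, ])[seq_len(k)]       # indices of the k nearest neighbors
    w <- 1 / d[i, nn]                     # assumed weighting: inverse distance
    w <- w / sum(w)                       # normalize weights to sum to 1
    sum(w * abs(values[i] - values[nn]))  # weighted mean absolute difference
  }, numeric(1))
  ranked <- data.frame(index = seq_len(n), avg_diff = score)
  ranked[order(ranked$avg_diff, decreasing = TRUE), ]
}

# Synthetic data: smooth trend on a 10 x 10 grid plus one injected error
set.seed(42)
grid <- expand.grid(x = 1:10, y = 1:10)
yield <- 5 + 0.1 * grid$x + rnorm(100, sd = 0.05)
yield[37] <- 12  # obvious measurement error

ranked <- avg_diff_scores(as.matrix(grid), yield, k = 8)

# Remove the top n outliers by index; n is chosen by the user
n_remove <- 1
outlier_idx <- ranked$index[seq_len(n_remove)]
cleaned <- yield[-outlier_idx]
```

Here the injected error at index 37 ends up at the top of `ranked`, since its value differs far more from its neighbors than any other point's does.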

For the example shown, 1,200 of roughly 8,000 points were removed. The number of nearest neighbors considered (_k_) was set (arbitrarily) to 355. Parameter values should be chosen according to the total number of data points and the "severity" of the visible measurement errors. Larger neighborhoods tend to reveal global outliers, while smaller neighborhoods are better suited to identifying local outliers on a smaller spatial scale.

Execution time for 8,000 point observations: ~3 seconds.

## Demonstration:





[1] Yufeng Kou and Chang-Tien Lu, "Spatial Weighted Outlier Detection" (2006). http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.9899&rep=rep1&type=pdf