Filters spatial point data by removing erroneous observations based on geometric, statistical, and spatial criteria. The function implements a sequential depuration workflow commonly used in precision agriculture.
Arguments
- x
An sf object with POINT geometries.
- y
A character string indicating the variable name used for filtering. If missing and only one attribute column is present, it is used by default.
- toremove
A character vector specifying which procedures to apply. Options are "edges", "outlier", and "inlier". The order of execution is fixed and cannot be modified.
- crs
Coordinate reference system used when transforming longitude/latitude data. Can be an EPSG code or proj4string.
- buffer
A numeric value indicating the distance (in meters) for edge removal. Negative values are recommended to shrink boundaries.
- ylimitmax
Numeric upper bound for y. If NA, Inf is used.
- ylimitmin
Numeric lower bound for y. If NA, -Inf is used.
- sdout
Numeric multiplier for standard deviation used to detect global outliers.
- ldist
Numeric lower distance bound for neighborhood definition.
- udist
Numeric upper distance bound for neighborhood definition.
- criteria
Character vector specifying spatial outlier detection methods: "LM" (Local Moran) and/or "MP" (Moran Plot).
- zero.policy
Logical. If TRUE, allows empty neighbor sets; if FALSE, stops with an error.
- poly_border
Optional sf polygon defining field boundaries. If NULL, a hull is computed automatically.
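To make the interaction between buffer and poly_border concrete, the following is a minimal sketch of edge removal with a negative buffer. It is not paar's internal code: the synthetic grid, the 15 m distance, and the use of a convex hull as the boundary are assumptions chosen for illustration.

```r
library(sf)

# Synthetic 10 x 10 grid of points, 10 m apart, in a projected CRS (meters).
grid <- expand.grid(X = seq(0, 90, by = 10), Y = seq(0, 90, by = 10))
pts <- st_as_sf(grid, coords = c("X", "Y"), crs = 32720)

# Field boundary: convex hull of all points (assumed stand-in for poly_border).
border <- st_convex_hull(st_union(pts))

# A negative buffer shrinks the boundary inward by 15 m.
inner <- st_buffer(border, dist = -15)

# Points that do not fall inside the shrunken polygon count as edge points.
is_edge <- lengths(st_intersects(pts, inner)) == 0
table(is_edge)
```

In this sketch only the points at least 15 m inside the hull are kept; a positive buffer would enlarge the polygon instead and remove nothing.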
Value
An object of class paar (list) with:
- depurated_data
Filtered sf object
- condition
Character vector indicating the reason each observation was removed (or NA if retained)
Details
The depuration process is applied in a fixed sequence:
1. Edge removal ("edges")
2. Global outlier removal ("outlier")
3. Spatial outlier removal ("inlier")
The toremove argument controls which of these steps are applied,
but **does not modify the order of execution**.
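The fixed ordering can be illustrated with a small base R sketch. The helper resolve_steps() is hypothetical and is not part of paar; it only shows how a requested set of steps maps onto the documented sequence.

```r
# Execution order as documented in Details.
step_order <- c("edges", "outlier", "inlier")

# Hypothetical helper: validates the request and returns the selected
# steps in execution order, whatever order they were supplied in.
resolve_steps <- function(toremove) {
  toremove <- match.arg(toremove, step_order, several.ok = TRUE)
  step_order[step_order %in% toremove]
}

resolve_steps(c("inlier", "edges"))
# "edges" "inlier"
```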
Available procedures are:
- edges
Removes points located within a specified buffer distance from the field boundary. The boundary is computed using a concave hull (concaveman) or a convex hull if the package is not available.
- outlier
Removes global outliers based on:
  - user-defined limits (ylimitmin, ylimitmax)
  - statistical thresholds defined as \(mean \pm sdout \times sd\)
- inlier
Identifies and removes spatial outliers using:
  - Local Moran's I statistic ("LM")
  - Moran scatterplot influence ("MP")
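The global outlier rule can be sketched in base R as follows. The helper flag_global_outliers() is hypothetical, and computing the mean and standard deviation only on values that passed the user-defined limits is an assumption of this sketch, not a documented detail of depurate(). It also assumes y has no missing values.

```r
# Hypothetical helper illustrating the "outlier" step; not paar's code.
flag_global_outliers <- function(y, sdout, ylimitmin = NA, ylimitmax = NA) {
  # NA limits fall back to -Inf / Inf, as documented for the arguments.
  lower <- if (is.na(ylimitmin)) -Inf else ylimitmin
  upper <- if (is.na(ylimitmax)) Inf else ylimitmax

  # First criterion: user-defined limits.
  outside_limits <- y < lower | y > upper

  # Second criterion: mean +/- sdout * sd, computed here on the values
  # that passed the limits (assumption of this sketch).
  m <- mean(y[!outside_limits])
  s <- sd(y[!outside_limits])
  outside_sd <- !outside_limits & abs(y - m) > sdout * s

  outside_limits | outside_sd
}

# Illustrative data: ten plausible values, one high value, one absurd value.
y <- c(2.1, 2.4, 2.2, 2.3, 2.5, 2.0, 2.6, 2.2, 2.4, 2.3, 9.0, 50)
which(flag_global_outliers(y, sdout = 2, ylimitmax = 20))
# 11 12
```

Here the value 50 is caught by ylimitmax, while 9.0 is caught by the standard deviation threshold once 50 no longer inflates the spread.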
Default parameter values are tuned for precision agriculture datasets (e.g., yield maps).
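The two spatial outlier criteria can be approximated directly with spdep, as in the sketch below. This is an illustration under stated assumptions rather than depurate()'s implementation: the synthetic data, the distance bounds, the row-standardized weights, and the flagging rules (negative Local Moran's I with p < 0.05 for "LM", influential points of the Moran scatterplot regression for "MP") are all choices made here. It also assumes a recent spdep version in which moran.plot() returns a data frame with an is_inf column.

```r
library(spdep)

set.seed(1)
# Synthetic 10 x 10 grid with a west-east trend and one inconsistent value.
coords <- as.matrix(expand.grid(x = as.numeric(1:10), y = as.numeric(1:10)))
y <- 5 + 0.3 * coords[, "x"] + rnorm(100, sd = 0.2)
y[48] <- 2  # low value surrounded by high values

# Neighborhood: points whose distance lies between ldist and udist.
ldist <- 0
udist <- 1.5
nb <- dnearneigh(coords, d1 = ldist, d2 = udist)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

# "LM": Local Moran's I; column 5 holds the p-value.
lm_res <- localmoran(y, lw, zero.policy = TRUE)
lm_flag <- lm_res[, "Ii"] < 0 & lm_res[, 5] < 0.05

# "MP": influential observations in the Moran scatterplot regression.
mp_res <- moran.plot(y, lw, zero.policy = TRUE, plot = FALSE)
mp_flag <- mp_res$is_inf

# Indices flagged by either criterion.
which(lm_flag | mp_flag)
```

Points flagged by "MP" often include neighbors of the anomalous observation, because its value pulls their spatially lagged mean away from the regression line.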
References
Vega, A., Córdoba, M., Castro-Franco, M. et al. (2019). Protocol for automating error removal from yield maps. Precision Agriculture, 20, 1030–1044. doi:10.1007/s11119-018-09632-8
Examples
library(sf)
data(barley, package = 'paar')
# Convert to an sf object
barley <- st_as_sf(barley, coords = c("X", "Y"), crs = 32720)
depurated <- depurate(barley, "Yield")
#> Concave hull algorithm is computed with
#> concavity = 2 and length_threshold = 0
# Summary of depurated data
summary(depurated)
#>       normal point             border spatial outlier MP spatial outlier LM
#>         5673 (77%)          964 (13%)         343 (4.6%)         309 (4.2%)
#>         global min            outlier
#>          99 (1.3%)         6 (0.081%)
# Keep only the depurated data
depurated_data <- depurated$depurated_data
# Combine the condition for all data
all_data_condition <- cbind(depurated, barley)
