Skip to contents

MULTISPATI-PCA clustering

Usage

kmspc(
  data,
  variables,
  number_cluster = 3:5,
  explainedVariance = 70,
  ldist = 0,
  udist = 40,
  center = TRUE,
  fuzzyness = 1.2,
  distance = "euclidean",
  zero.policy = FALSE,
  only_spca_results = TRUE,
  all_results = FALSE
)

Arguments

data

sf object

variables

variables to use for clustering, if missing, all numeric variables will be used

number_cluster

numeric vector with number of final clusters

explainedVariance

numeric number in percentage of explained variance from PCA analysis to keep and make cluster process

ldist

numeric lower distance bound to identify neighbors

udist

numeric upper distance bound to identify neighbors

center

a logical or numeric value, centring option if TRUE, centring by the mean if FALSE no centring if a numeric vector, its length must be equal to the number of columns of the data frame df and gives the decentring

fuzzyness

A number greater than 1 giving the degree of fuzzification.

distance

character Must be one of the following: If "euclidean", the mean square error, if "manhattan", the mean absolute error is computed. Abbreviations are also accepted.

zero.policy

default NULL, use global option value; if FALSE stop with error for any empty neighbors sets, if TRUE permit the weights list to be formed with zero-length weights vectors

only_spca_results

logical; should return both PCA and sPCA results (FALSE), or only sPCA results (TRUE)? This can be a time consuming process if there are multiple variables.

all_results

logical; should return the results from the sPCA and PCA call?

Value

a list with classification results and indices to select best number of clusters.

Examples

library(sf)
data(wheat, package = 'paar')

# Transform the data.frame into a sf object
wheat_sf <- st_as_sf(wheat,
                     coords = c('x', 'y'),
                     crs = 32720)

# Run the kmspc function
kmspc_results <- kmspc(wheat_sf,
                       number_cluster = 2:4)
#> Warning: All numeric Variables will be used to make clusters

# Print the summaryResults
kmspc_results$summaryResults
#>   Clusters Iterations      SSDW
#> 1        2         18 1.8713082
#> 2        3         56 1.3057409
#> 3        4         23 0.9948927

# Print the indices
kmspc_results$indices
#>   Num. Cluster     Xie Beni Partition Coefficient Entropy of Partition
#> 1            2 3.520996e-05             0.9611975           0.06490128
#> 2            3 5.479347e-05             0.9391130           0.10430426
#> 3            4 5.827060e-05             0.9293032           0.12351250
#>   Summary Index
#> 1      1.281105
#> 2      1.597481
#> 3      1.713107

# Print the cluster
head(kmspc_results$cluster, 5)
#>      Cluster_2 Cluster_3 Cluster_4
#> [1,] "1"       "3"       "2"      
#> [2,] "1"       "3"       "2"      
#> [3,] "1"       "3"       "2"      
#> [4,] "1"       "2"       "2"      
#> [5,] "1"       "2"       "2"      

# Combine the results in a single object
wheat_clustered <- cbind(wheat_sf, kmspc_results$cluster)

# Plot the results
plot(wheat_clustered[, "Cluster_2"])