I have a data frame like this:
df <- data.frame(a = runif(100, -10, 10),
                 b = runif(100, -10, 10),
                 c = runif(100, -10, 10),
                 d = runif(100, -10, 10),
                 e = runif(100, -10, 10))
Now, I’d like to have a function that extracts n rows with the maximum dispersal.
To explain a little better, imagine a one-dimensional data set like c(1:9). The function I am looking for would be something like fun(3, c(1:9)) (with n=3) and return c(1, 5, 9). fun(5, c(1:9)) would return c(1, 3, 5, 7, 9).
A valid result for fun(3, c(1:10)) would be c(1, 5, 10) or c(1, 6, 9). In that case, the function should randomly select one of the valid outputs.
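To make the one-dimensional case concrete, here is a rough sketch of the behavior I mean. The name fun1d is hypothetical, and the sketch rests on two assumptions that are not part of the requirement: the minimum and maximum are always kept, and the remaining values are the ones closest to evenly spaced targets, with ties broken at random. It does not handle the data frame case.

```r
# Hypothetical 1-D helper (illustration only): pick the n values of x that lie
# closest to n evenly spaced targets between min(x) and max(x).
fun1d <- function(n, x) {
  stopifnot(n >= 1, n <= length(x))
  targets <- seq(min(x), max(x), length.out = n)
  picked <- integer(0)
  for (target in targets) {
    free <- setdiff(seq_along(x), picked)  # indices not chosen yet
    d <- abs(x[free] - target)
    best <- free[d == min(d)]              # all equally close candidates
    # sample() on a single number k samples from 1:k, so guard that case
    choice <- if (length(best) == 1) best else sample(best, 1)
    picked <- c(picked, choice)
  }
  sort(x[picked])
}

fun1d(3, 1:9)
fun1d(5, 1:9)
fun1d(3, 1:10)  # middle value is 5 or 6, chosen at random
```

For 1:10 with n=3 the middle target is 5.5, so 5 and 6 are equally close and one of them is drawn at random.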
A while ago, I wrote this:
voronoiFilter <- function(occ, select){
  n <- nrow(occ) - select
  subset <- occ
  dropped <- rep(NA, n)
  for (i in 1:n) {
    v <- voronoi.mosaic(x = subset[,1], y = subset[,2], duplicate = "remove")
    info <- cells(v)
    areas <- unlist(lapply(info,function(x) x$area))
    smallest <- which(areas == min(areas, na.rm = TRUE))
    dropped[i] <- which(occ[,1] == subset[smallest,1] & occ[,2] == subset[smallest,2])
    subset <- subset[-smallest,]
  }
  outVec <- 1:nrow(occ)
  return(outVec[-dropped])
}
where occ is the input data frame and select is the number of rows to keep (the n from above).
This sometimes works OK-ish, but I often receive this warning:
Warning in hist.default(i, plot = FALSE, freq = TRUE, breaks = seq(0.5, :
argument ‘freq’ is not made use of
I think this happens because there are duplicated elements in the data.
Does anybody have another idea of what such a function could look like?