I was reading about the Valeriepieris circle here: https://en.wikipedia.org/wiki/Valeriepieris_circle . The problem is to find the smallest possible circle that contains half of the world's population. I am trying to replicate this myself as a learning exercise.
To simplify things, instead of using an actual world map, I imagined a rectangular world. This rectangular world is a network graph of 600 nodes (a 30 x 20 grid), where each node is connected exactly once to each of its horizontal and vertical neighbors. The nodes have ids from 1 to 600, and each node is assigned a random value representing the population at that point.
Here is how everything looks:
library(igraph)
width <- 30
height <- 20
num_nodes <- width * height
# Create a grid
x <- rep(1:width, each = height)
y <- rep(1:height, times = width)
g <- make_empty_graph(n = num_nodes, directed = FALSE)
# Function to get node index
get_node_index <- function(i, j) (i - 1) * height + j
# Add edges
edges <- c()
for(i in 1:width) {
  for(j in 1:height) {
    current_node <- get_node_index(i, j)
    # Connect to right neighbor
    if(i < width) edges <- c(edges, current_node, get_node_index(i + 1, j))
    # Connect to bottom neighbor
    if(j < height) edges <- c(edges, current_node, get_node_index(i, j + 1))
  }
}
g <- add_edges(g, edges)
# igraph uses the x and y vertex attributes as plot coordinates
V(g)$x <- x
V(g)$y <- y
par(mfrow=c(1,2))
V(g)$name <- 1:num_nodes
plot(g, vertex.size = 7, vertex.label = V(g)$name, vertex.label.cex = 0.6, main = "Map with Node Indices")
# random population values (no seed is set, so values differ between runs)
V(g)$value <- sample(1:100, num_nodes, replace = TRUE)
plot(g, vertex.size = 7, vertex.label = V(g)$value, vertex.label.cex = 0.6, main = "Map with Population Values")
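As a quick sanity check of the indexing: node (i, j) should get id (i - 1) * height + j, so (2, 3) must map to id (2 - 1) * 20 + 3 = 23 and carry coordinates x = 2, y = 3.
stopifnot(get_node_index(2, 3) == 23)        # (2 - 1) * 20 + 3 = 23
stopifnot(V(g)$x[23] == 2, V(g)$y[23] == 3)  # coordinates round-trip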
Circles are quite difficult to work with, so instead of circles I decided to work with squares made of 4 nodes. My task is now to find the square with the largest node sum. I made an exhaustive list of all (30 - 1) * (20 - 1) = 551 squares and recorded their sums:
library(dplyr)
squares <- list()
square_id <- 1
for(i in 1:(width-1)) {
  for(j in 1:(height-1)) {
    top_left <- get_node_index(i, j)
    top_right <- get_node_index(i+1, j)
    bottom_left <- get_node_index(i, j+1)
    bottom_right <- get_node_index(i+1, j+1)
    square <- c(top_left, top_right, bottom_left, bottom_right)
    squares[[square_id]] <- square
    square_id <- square_id + 1
  }
}
result_df <- data.frame(
  square_id = seq_along(squares),
  nodes_id_selected = sapply(squares, function(s) paste(s, collapse = ", ")),
  value = sapply(squares, function(s) sum(V(g)$value[s]))
)
print(head(result_df %>% arrange(-value)))
square_id   nodes_id_selected     value
      334   351, 371, 352, 372      365
       51   53, 73, 54, 74          350
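As a check on the enumeration order: squares are stored column by column, so square_id = (i - 1) * (height - 1) + j. For the top result, i = 18 and j = 11 give (18 - 1) * 19 + 11 = 334, which matches the nodes listed above.
squares[[334]]  # should print 351 371 352 372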
Based on this work, I have the following questions:
- Is there a more efficient way to enumerate all possible squares? E.g., use parallel computing so that multiple squares are evaluated at the same time by different cores, or keep only the current best square instead of a running list of all squares, etc.? (See the first sketch after this list.)
- Is there a way to generalize this approach to other shapes, e.g. triangles, hexagons, etc.? Is it possible to write a function that can carry out these comparisons for an arbitrary shape? (See the second sketch after this list.)
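While writing this up I also tried to sketch answers to both questions myself; I am not sure these are idiomatic, so corrections are welcome.

On efficiency: since the node values live on a regular grid, I believe the 2 x 2 sums can be computed without any explicit loop by reshaping the values into a height x width matrix and adding four shifted copies. This is only a sketch, and it assumes the column-major layout implied by get_node_index above.
# vals[j, i] is the population of the node in column i, row j
vals <- matrix(V(g)$value, nrow = height, ncol = width)
# add the four corners of every 2 x 2 square in one vectorized step
square_sums <- vals[-height, -width] + vals[-height, -1] +
  vals[-1, -width] + vals[-1, -1]
# (j, i) position of the best square's top-left node
which(square_sums == max(square_sums), arr.ind = TRUE)

On generalization: any shape that can be described as a set of (di, dj) offsets from an anchor node could be handled by a single function that slides the offset set over the grid. The helper below (shape_sums is a name I made up) is a minimal sketch under that assumption, with non-negative offsets only; a triangle or hexagon would have to be approximated by whichever grid nodes best cover it.
# sum node values over an arbitrary shape given as (di, dj) offsets
shape_sums <- function(vals, offsets) {
  di <- sapply(offsets, `[`, 1)  # column offsets
  dj <- sapply(offsets, `[`, 2)  # row offsets
  i_max <- ncol(vals) - max(di)  # last anchor column where the shape still fits
  j_max <- nrow(vals) - max(dj)  # last anchor row where the shape still fits
  sums <- matrix(0, nrow = j_max, ncol = i_max)
  for (k in seq_along(offsets)) {
    sums <- sums + vals[(1 + dj[k]):(j_max + dj[k]), (1 + di[k]):(i_max + di[k])]
  }
  sums
}
# the 2 x 2 square from above, expressed as offsets; should equal square_sums
shape_sums(vals, list(c(0, 0), c(1, 0), c(0, 1), c(1, 1)))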