I’m trying to iterate through a list of dataframes and use indexing to pass in each dataframe in the list as an argument to a function. The list is simply a list of each dataframe.
myFunc <- function (df1) {
#this is the animals data
my_plot_1 <- ggplot(df1, aes(x=size, y=avg_sleep_hours))
}
rating <- 1:4
animal <- c('koala', 'hedgehog', 'sloth', 'panda')
country <- c('Australia', 'Italy', 'Peru', 'China')
avg_sleep_hours <- c(21, 18, 17, 10)
size <- c(221, 418, 417, 410)
super_sleepers <- data.frame(rating, animal, country, avg_sleep_hours, size)
rating <- 1:4
animal <- c('bear', 'moose', 'alpaca', 'snake')
country <- c('USA', 'Mongolia', 'Argentina', 'Japan')
avg_sleep_hours <- c(1, 8, 7, 1)
size <- c(261, 558, 227, 2)
no_sleepers <- data.frame(rating, animal, country, avg_sleep_hours, size)
animal_parameters <- list(super_sleepers, no_sleepers)
for (i in 1:length(animal_parameters)){
myFunc(animal_parameters[[i]])
}
Thanks.
First of all, we do not need to clutter the environment with all those vectors. Instead, we can define the columns via = directly inside data.frame(), like
super_sleepers = data.frame(rating=seq(4), animal=c('koala', 'hedgehog', 'sloth', 'panda'),
country=c('Australia', 'Italy', 'Peru', 'China'),
avg_sleep_hours=c(21, 18, 17, 10), size=c(221, 418, 417, 410))
no_sleepers = data.frame(rating=seq(4), animal= c('bear', 'moose', 'alpaca', 'snake'),
country=c('USA', 'Mongolia', 'Argentina', 'Japan'),
avg_sleep_hours=c(1, 8, 7, 1), size=c(261, 558, 227, 2))
l = list(super_sleepers, no_sleepers)
Secondly, myFunc() in its most simplified form can be rewritten as
library(ggplot2)
myFunc = \(df) ggplot(df, aes(x=size, y=avg_sleep_hours)) + geom_point()
However, I recommend against this. It does not make much sense to hard-code the xy-variables inside the function. Passing them as arguments is much preferred.
Notice that \(x) is shorthand for function(x), available since R 4.1.0; we do not need {...} if the body is a single expression, and assigning the result or calling return() is also unnecessary inside such short functions.
For instance, we can do
library(ggplot2)
library(rlang)
myFunc2 = \(df, x, y) ggplot(df, aes(x={{x}}, y={{y}})) + geom_point()
instead. The {{ }} (embrace) operator injects the column names passed as x and y into aes(). Finally, we can execute, or execute and assign, via
# result=
lapply(l, myFunc2, size, avg_sleep_hours)
If we use a loop instead, we need to wrap the call in print(), because ggplot objects are not auto-printed inside a for loop.
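To illustrate the point about print(), here is a minimal sketch of the loop version. The data values are copied from the question; plot_sleep and plots are hypothetical names chosen for this example, and the x and y columns are hard-coded only to keep it short.

```r
library(ggplot2)

# Two small data frames with the values from the question
super_sleepers <- data.frame(
  animal = c("koala", "hedgehog", "sloth", "panda"),
  avg_sleep_hours = c(21, 18, 17, 10),
  size = c(221, 418, 417, 410)
)
no_sleepers <- data.frame(
  animal = c("bear", "moose", "alpaca", "snake"),
  avg_sleep_hours = c(1, 8, 7, 1),
  size = c(261, 558, 227, 2)
)
animal_parameters <- list(super_sleepers, no_sleepers)

# plot_sleep is a hypothetical helper name, not part of the original code
plot_sleep <- function(df) {
  ggplot(df, aes(x = size, y = avg_sleep_hours)) + geom_point()
}

# Pre-allocate a list so the plots are also kept for later use
plots <- vector("list", length(animal_parameters))
for (i in seq_along(animal_parameters)) {
  plots[[i]] <- plot_sleep(animal_parameters[[i]])
  print(plots[[i]])  # explicit print(): nothing is auto-printed inside a loop
}
```

seq_along() is used instead of 1:length() so that the loop body is skipped cleanly if the list happens to be empty.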
There were a couple of things wrong with your code. First of all, myFunc() did not create a complete ggplot2 plot, because it was missing the geom layer needed to draw the data. Secondly, myFunc() ended with an assignment, so its result was returned only invisibly and nothing was displayed. Here is the corrected version of your code:
library(ggplot2)
myFunc <- function(df1) {
  # this is the animals data
  my_plot_1 <- ggplot(df1, aes(x = size, y = avg_sleep_hours, color = animal)) +
    geom_point(stat = "identity")
  return(my_plot_1)
}
animal_parameters <- list(super_sleepers, no_sleepers)
# apply function to list of data frames
ggps <- lapply(animal_parameters,myFunc)
# access each plot in the list using its index number
ggps[[1]]
With this, your plots are stored in a list and can be accessed by index number.
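As a small variation on index-based access, here is a sketch that names the list elements so each plot can also be looked up by name. It relies on lapply() keeping the names of its input list; plot_animals is a hypothetical name used only here, and the data values are copied from the question.

```r
library(ggplot2)

# Name the list elements when building the list (values from the question)
animal_parameters <- list(
  super_sleepers = data.frame(
    animal = c("koala", "hedgehog", "sloth", "panda"),
    avg_sleep_hours = c(21, 18, 17, 10),
    size = c(221, 418, 417, 410)
  ),
  no_sleepers = data.frame(
    animal = c("bear", "moose", "alpaca", "snake"),
    avg_sleep_hours = c(1, 8, 7, 1),
    size = c(261, 558, 227, 2)
  )
)

# plot_animals is a hypothetical helper name for this sketch
plot_animals <- function(df) {
  ggplot(df, aes(x = size, y = avg_sleep_hours, color = animal)) +
    geom_point()
}

# lapply() carries the input names over to the result
ggps <- lapply(animal_parameters, plot_animals)

ggps[["no_sleepers"]]  # access by name
ggps[[1]]              # access by index still works
```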