set.seed(2024)
df <- data.frame(
  x1 = rep(1:4, each = 12),
  x2 = rep(c("A", "B", "C"), each = 4, times = 4),
  y  = runif(48),
  z  = rep(c(12, 4, 19, 8, 58, 2, 33, 32, 25, 70, 5, 25), each = 4)
)
I’m trying to select, for each unique letter in x2, the rows that have the highest value in z.
In my example, for A these would be the rows:
37 4 A 0.0541251726 70
38 4 A 0.9661198864 70
39 4 A 0.0221378489 70
40 4 A 0.6473674888 70
For B:
17 2 B 0.1103110805 58
18 2 B 0.8632152062 58
19 2 B 0.1310691827 58
20 2 B 0.2954267922 58
For C, the largest z value is 25, both when x1==3 and when x1==4, so select the rows with the lowest value in x1:
33 3 C 0.1762014020 25
34 3 C 0.9733794478 25
35 3 C 0.6751627198 25
36 3 C 0.6809547509 25
Together, this is the expected result:
x1 x2 y z
37 4 A 0.0541251726 70
38 4 A 0.9661198864 70
39 4 A 0.0221378489 70
40 4 A 0.6473674888 70
17 2 B 0.1103110805 58
18 2 B 0.8632152062 58
19 2 B 0.1310691827 58
20 2 B 0.2954267922 58
33 3 C 0.1762014020 25
34 3 C 0.9733794478 25
35 3 C 0.6751627198 25
36 3 C 0.6809547509 25
This almost does the job:
library(dplyr)
df %>% group_by(x2) %>% slice(which.max(z))
but it returns only one row per letter, and I need to select all the other rows belonging to the same x1 as well.
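A minimal sketch of one way this could be done, assuming dplyr is acceptable and that ties in z should be broken by the lowest x1 as described above. The name result is only illustrative, and the output is a tibble, so the original row names (37, 38, ...) are not preserved:

```r
library(dplyr)

set.seed(2024)
df <- data.frame(
  x1 = rep(1:4, each = 12),
  x2 = rep(c("A", "B", "C"), each = 4, times = 4),
  y  = runif(48),
  z  = rep(c(12, 4, 19, 8, 58, 2, 33, 32, 25, 70, 5, 25), each = 4)
)

result <- df %>%
  group_by(x2) %>%
  filter(z == max(z)) %>%    # keep every row tied for the letter's largest z
  filter(x1 == min(x1)) %>%  # if several x1 share that z, keep the lowest x1
  ungroup() %>%
  arrange(x2)                # order by letter, as in the expected result

result
```

Unlike slice(which.max(z)), which keeps only the first matching row, filter(z == max(z)) keeps every row that reaches the group maximum; the second filter() still runs per letter because the grouping is retained until ungroup().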