I have a data frame with a group column (a vs. b), an id column (samples 1-14), and multiple columns holding the expression of different genes, in which I would like to identify outliers. I want code that gives me the sample id of each outlier and whether that outlier is extreme or not (I have used identify_outliers so far), without having to run the code separately for every single gene.
Example data
I already found this kind of code for the Shapiro-Wilk test and the t-test, and I adapted it for the Wilcoxon test.
However, I'm new to the summarise function and don't understand all the arguments it needs just yet.
Identifying outliers:
df %>%
  group_by(group) %>%
  summarise_all(.funs = funs(Sample = identify_outliers(.)$sample, outlier = identify_outliers(.)$is.extreme))
However, every version of this code that I tried gave me an error:
Error in summarise():
ℹ In argument: Sample_sample = identify_outliers(Sample)$sample.
ℹ In group 1: group = "b".
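One direction that might avoid this error, sketched under assumptions: identify_outliers takes a data frame as its first argument, whereas inside summarise_all the `.` is a single column vector, which is probably why the call fails. Reshaping the data to long format and grouping by both group and gene lets identify_outliers run once for all genes. The toy data, the gene column names (geneA, geneB), and the new column names (gene, expression) below are hypothetical stand-ins, not the real data.

```r
library(dplyr)
library(tidyr)
library(rstatix)

# Hypothetical toy data: 14 samples, two groups, two made-up gene columns.
df <- data.frame(
  sample = 1:14,
  group  = rep(c("a", "b"), each = 7),
  geneA  = c(5.0, 5.1, 4.9, 5.2, 5.0, 4.8, 20.0,
             5.1, 5.0, 4.9, 5.2, 5.0, 5.1, 4.8),
  geneB  = c(2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.8,
             2.0, 2.2, 1.8, 2.4, 2.0, 2.6, 3.6)
)

outliers <- df %>%
  # One row per sample/gene pair; every column except sample and group is a gene.
  pivot_longer(cols = -c(sample, group),
               names_to = "gene", values_to = "expression") %>%
  # identify_outliers() is applied within each group/gene combination.
  group_by(group, gene) %>%
  identify_outliers(expression) %>%
  # Keep only the rows' identifying columns and the two outlier flags.
  select(group, gene, sample, expression, is.outlier, is.extreme)

print(outliers)
```

The result has one row per detected outlier, so the sample id, the gene, and the is.extreme flag sit side by side. Genes and groups without outliers simply contribute no rows.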