My data consists of the moving averages of weekly cases for multiple districts. Below is a short excerpt; the full dataset has over 100 columns and 500 rows:
State A State B State C State D
1 9.5 64.5 10.0 5.5
2 28.0 64.0 6.0 3.0
3 38.5 104.5 17.0 4.5
4 20.5 118.0 23.0 5.0
5 10.5 99.5 17.0 5.0
6 7.5 78.0 14.0 3.0
7 12.0 73.0 20.5 2.5
8 17.5 74.5 20.0 2.0
9 19.0 69.0 12.5 5.0
10 14.5 64.5 15.0 7.0
I'd like to flag an outbreak wherever a value has increased over three consecutive rows, labeling each row TRUE or FALSE. This should be applied to all columns.
Each row will be labeled ‘TRUE’ only if it is bigger than the previous row AND the previous row is bigger than the row before it.
So technically, if x[i] > x[i-1] & x[i-1] > x[i-2] is true, then x[i] should be labeled TRUE, and FALSE otherwise.
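For a single column, the rule above could be sketched roughly as follows. The helper name is_rising and the vector state_a are made up for illustration; the first two positions stay FALSE because they have no two predecessors to compare against.

```r
# Hypothetical helper (name is an assumption, not from my real code):
# TRUE at position i when x[i] > x[i - 1] and x[i - 1] > x[i - 2].
is_rising <- function(x) {
  n <- length(x)
  out <- logical(n)          # starts as all FALSE; positions 1 and 2 stay FALSE
  if (n >= 3) {
    i <- 3:n
    out[i] <- x[i] > x[i - 1] & x[i - 1] > x[i - 2]
  }
  out
}

# The "State A" column from the example above
state_a <- c(9.5, 28.0, 38.5, 20.5, 10.5, 7.5, 12.0, 17.5, 19.0, 14.5)
is_rising(state_a)
```

The comparison is done on whole index vectors at once, so no explicit loop or if statement is needed.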
The result should be like this:
State A State B State C State D
1 FALSE FALSE FALSE FALSE
2 FALSE FALSE FALSE FALSE
3 TRUE FALSE FALSE FALSE
4 FALSE TRUE TRUE TRUE
5 FALSE FALSE FALSE FALSE
6 FALSE FALSE FALSE FALSE
7 FALSE FALSE FALSE FALSE
8 TRUE FALSE FALSE FALSE
9 TRUE FALSE FALSE FALSE
10 FALSE FALSE FALSE TRUE
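To make the expected table reproducible, here is a minimal sketch that rebuilds the ten example rows and applies the rule column by column. The helper name rising3 is hypothetical, and the sketch assumes every column is numeric; it is one possible approach, not necessarily the best one for the full dataset.

```r
# Example rows from the excerpt above
data <- data.frame(
  "State A" = c(9.5, 28.0, 38.5, 20.5, 10.5, 7.5, 12.0, 17.5, 19.0, 14.5),
  "State B" = c(64.5, 64.0, 104.5, 118.0, 99.5, 78.0, 73.0, 74.5, 69.0, 64.5),
  "State C" = c(10.0, 6.0, 17.0, 23.0, 17.0, 14.0, 20.5, 20.0, 12.5, 15.0),
  "State D" = c(5.5, 3.0, 4.5, 5.0, 5.0, 3.0, 2.5, 2.0, 5.0, 7.0),
  check.names = FALSE        # keep the spaces in the column names
)

# Hypothetical helper: TRUE where a value exceeds the previous one,
# which in turn exceeds the one before it.
rising3 <- function(x) {
  n <- length(x)
  flag <- logical(n)         # all FALSE; rows 1 and 2 stay FALSE
  if (n >= 3) {
    i <- 3:n
    flag[i] <- x[i] > x[i - 1] & x[i - 1] > x[i - 2]
  }
  flag
}

# Apply to every column while keeping the data frame shape and names
outbreak <- data
outbreak[] <- lapply(data, rising3)
print(outbreak)
```

Assigning into outbreak[] keeps the original column names and row labels, which data.frame(sapply(...)) may alter.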
I have tried using the following code:
for (i in 3:nrow(data)) {
  # Check for positive increase in each preceding row
  if (data[i] > data[i - 1] & data[i - 1] > data[i - 2]) {
    data[i] <- "TRUE"
  }
  else {
    data[i] <- "FALSE"
  }
}
but it gives me an error: "Error in if (test[i] > test[i - 1] & test[i - 1] > test[i - 2]) { :
the condition has length > 1"
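One possible reason for this error, assuming the object is a data frame: single-bracket indexing with one index selects a whole column, so the comparison yields one logical value per row, whereas if() expects a single value. The toy data frame below is made up purely to illustrate the difference.

```r
# Toy data frame (hypothetical, not my real data)
toy <- data.frame(a = c(1, 2, 3), b = c(2, 3, 1))

# toy[2] and toy[1] are whole columns, so this compares every row at once
cond_cols <- toy[2] > toy[1]
length(cond_cols)            # one logical value per row

# Indexing by row AND column compares a single pair of cells
cond_cell <- toy[3, "a"] > toy[2, "a"]
length(cond_cell)            # a single logical value, usable in if ()
```

So a loop version would probably need to index both the row and the column, or the rule could be applied to each column as a vector instead.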
My teammate also wrote the following code, but it does not identify the outbreaks correctly:
checkfun <- function(x) {
  c(F, F, head(diff(x) > 0, -1) & tail(diff(x) > 0, -1))
}
outbreak = data.frame(sapply(na.omit(data), checkfun))