I’m trying to count the number of new employees a manager gained between time 1 and time 2. I have a string of all employee ids that roll up under that manager at each time point.
My code below always says there is 1 new employee, but as you can see, there are 2. How do I find out how many new employees there are? The ids aren’t guaranteed to always be in the same order, but they will always be separated by ", ".
library(dplyr)
library(stringr)
# First dataset
mydata_q2 <- tibble(
  leader = 1,
  reports_q2 = "2222, 3333, 4444"
)
# Second dataset
mydata_q3 <- tibble(
  leader = 1,
  reports_q3 = "2222, 3333, 4444, 55555, 66666"
)
# Function to count number of new employees
calculate_number_new_emps <- function(reports_time1, reports_time2) {
  time_1_reports <- ifelse(is.na(reports_time1), character(0), str_split(reports_time1, " ,\\s*")[[1]])
  time_2_reports <- str_split(reports_time2, " ,\\s*")[[1]]
  num_new_employees <- length(setdiff(time_1_reports, time_2_reports))
  num_new_employees
}
# Join data and count number of new staff (gives the wrong answer)
mydata_q2 %>%
  left_join(mydata_q3) %>%
  mutate(new_staff_count = calculate_number_new_emps(reports_q2, reports_q3))
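
For comparison, here is a minimal sketch of the order-insensitive count described above: ids that appear in the time 2 string but not in the time 1 string. It makes several assumptions that are not in the code above. The helper name `count_new_ids` is hypothetical. It splits on a bare comma and trims whitespace. It uses `purrr::map2_int()` to apply the helper one row at a time. It treats a missing string as "no reports".

```r
library(dplyr)
library(stringr)
library(purrr)  # assumption: purrr is available for row-wise mapping

# Hypothetical helper: number of ids in reports_time2 that are absent
# from reports_time1, for a single pair of comma-separated strings.
count_new_ids <- function(reports_time1, reports_time2) {
  # Split one string into a character vector of trimmed ids;
  # an NA string is treated as having no reports (an assumption).
  split_ids <- function(x) {
    if (is.na(x)) return(character(0))
    str_trim(str_split(x, ",")[[1]])
  }
  # setdiff(a, b) keeps the elements of a that are not in b,
  # so the later time point goes first.
  length(setdiff(split_ids(reports_time2), split_ids(reports_time1)))
}

# Example data with the ids deliberately in a different order at time 2
result <- tibble(
  leader = 1,
  reports_q2 = "2222, 3333, 4444",
  reports_q3 = "4444, 2222, 3333, 55555, 66666"
) %>%
  mutate(new_staff_count = map2_int(reports_q2, reports_q3, count_new_ids))

result$new_staff_count
# 2
```

The helper works on one pair of strings at a time, so `map2_int()` calls it once per row instead of passing whole columns into it. The split also uses `if` rather than `ifelse()`, because `ifelse()` returns a result the same length as its test and would keep only the first id.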