I’m trying to run a custom function, QA2Char_dec, over every row of a specific column in my dataframe and append the output as a new column.
QA2Char_dec takes a decimal/integer and outputs a character vector of length 7. If the function’s argument is an integer, I only want the string at index 4 in the new column. If the function’s argument is NA, I just need the new column value at the same row to be NA as well.
Ideally, this should be easy to do, but I’ve run into two key issues: making sure that I have the right classes in case_when() and editing my custom function to allow vectors as arguments.
My custom function, QA2Char_dec, looks like this:
QA2Char_dec <- function(num) {
  if (!is.na(num)) {
    # Decimal to binary
    char <- paste(sapply(strsplit(paste(rev(intToBits(num))), ""), `[[`, 2), collapse = "")
    # Extract the last 16 digits
    char <- substr(char, 17, 32)
    # QA array
    qa.arr <- c(substr(char, 1, 1),    # 15 Reserved
                substr(char, 2, 3),    # 13-14 Aerosol Model
                substr(char, 4, 4),    # 12 Glint Mask
                substr(char, 5, 8),    # 8-11 QA AOD
                substr(char, 9, 11),   # 5-7 Adjacency Mask
                substr(char, 12, 13),  # 3-4 Land Water Snow/Ice Mask
                substr(char, 14, 16))  # 0-2 Cloud Mask
  } else {
    qa.arr <- rep(NA_character_, 7)
  }
  return(qa.arr)
}
This function takes a decimal (e.g. 1291) and returns a character vector: "0" "00" "0" "0101" "000" "01" "011". I only need the string at index 4: 0101
As an example, I should be able to run this function over a very basic dataframe with 1 column:
library(tidyverse)
QA.values <- c(1, 1291, NA, 1288, 1000, NA) #numeric
df <- data.frame(QA.values)
And append the output as a new column, QA.check:
df <- df %>%
  mutate(QA.check = case_when(QA.values >= 0 ~ QA2Char_dec(QA.values)[4],
                              TRUE ~ NA_character_))
  QA.values QA.check
1         1     0000
2      1291     0101
3        NA       NA
4      1288     0101
5      1000     0011
6        NA       NA
The dataframe above is what I need the final df to look like. However, I keep getting this error message: Error in if (!is.na(num)) { : the condition has length > 1. I know that the custom function doesn’t take vectors as arguments, but applying a function down a column and mutating the output to a new column shouldn’t be this difficult!
I have also tried using mutate(QA_Check = sapply(QA.values, QA2Char_dec)[4]), but it ignores NA values and seems to just use the output of the first QA.value and repeat it for the rest of the column.
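The behaviour of that sapply() attempt can be checked outside mutate(). The following is a minimal sketch, where qa_fields is a hypothetical compact stand-in for QA2Char_dec (same 7 fields, scalar input only) and not the function itself. It suggests that sapply() returns a 7 x 6 character matrix, so [4] picks one single element (row 4 of column 1), which would then be recycled down the column.

```r
# Hypothetical compact stand-in for QA2Char_dec: scalar in, 7 field strings out.
qa_fields <- function(num) {
  if (is.na(num)) return(rep(NA_character_, 7))
  # Lowest 16 bits, most significant bit first, as one string
  bits <- paste(rev(as.integer(intToBits(num))[1:16]), collapse = "")
  # Start and end positions of the 7 QA fields
  substring(bits, c(1, 2, 4, 5, 9, 12, 14), c(1, 3, 4, 8, 11, 13, 16))
}

QA.values <- c(1, 1291, NA, 1288, 1000, NA)

m <- sapply(QA.values, qa_fields)
dim(m)   # 7 rows (fields) by 6 columns (input values)
m[4]     # one element only: row 4 of column 1
m[4, ]   # row 4 across all columns: one QA AOD string per input value
```

Indexing the row with m[4, ] instead of m[4] keeps one value per input, including the NA rows.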
What can I change?
As mentioned by LMc, you have multiple options. I just want to add the option to include lapply() in your function and add a parameter to select an element from the array.
QA2Char_dec <- function(num, arr_sel = 1:7) {
  # Decimal to binary (one 32-character string per element of num)
  char <- lapply(num, function(.x) {
    paste(sapply(strsplit(paste(rev(intToBits(.x))), ""), `[[`, 2),
          collapse = "")
  })
  # Extract the last 16 digits
  char <- substr(char, 17, 32)
  # Create NAs from the input itself, so that a real 0 is not turned into NA
  char <- ifelse(is.na(num), NA_character_, char)
  # Split each string into the 7 QA fields and keep the ones in arr_sel;
  # the result is a list with one character vector per element of num
  qa.arr <- lapply(char, function(.x) {
    c(substr(.x, 1, 1),     # 15 Reserved
      substr(.x, 2, 3),     # 13-14 Aerosol Model
      substr(.x, 4, 4),     # 12 Glint Mask
      substr(.x, 5, 8),     # 8-11 QA AOD
      substr(.x, 9, 11),    # 5-7 Adjacency Mask
      substr(.x, 12, 13),   # 3-4 Land Water Snow/Ice Mask
      substr(.x, 14, 16))[arr_sel]  # 0-2 Cloud Mask
  })
  return(qa.arr)
}
library(tidyverse)
QA.values <- c(1, 1291, NA, 1288, 1000, NA) #numeric
df <- data.frame(QA.values)
df %>%
  mutate(QA.check = QA2Char_dec(QA.values, 4))
  QA.values QA.check
1         1     0000
2      1291     0101
3        NA       NA
4      1288     0101
5      1000     0011
6        NA       NA
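Another of the possible options, sketched here in base R only: if a plain character column is preferred over a list column, vapply() can call a scalar helper once per row and enforce exactly one string per value. The helper name to_bits16 is an assumption made for this illustration, and the positions 5 to 8 are the QA AOD field from the function above.

```r
# Hypothetical helper: lowest 16 bits of one number as a string, MSB first.
to_bits16 <- function(num) {
  paste(rev(as.integer(intToBits(num))[1:16]), collapse = "")
}

df <- data.frame(QA.values = c(1, 1291, NA, 1288, 1000, NA))

# One call per row; character(1) makes vapply() fail loudly if any call
# returns something other than a single string.
df$QA.check <- vapply(
  df$QA.values,
  function(x) if (is.na(x)) NA_character_ else substr(to_bits16(x), 5, 8),
  character(1)
)

print(df)
```

The same vapply() call should also work as the right-hand side inside mutate(), with QA.values in place of df$QA.values.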