I am processing multiple manually filled xlsx files (all with the same structure) containing numeric and text data.
I read them once as text. Some of the ranges are later converted with as.numeric(). This function warns the user if any text data could not be converted to numeric:
Warning message:
NAs introduced by coercion
But this warning is displayed after the whole batch has been processed and thus is useless.
I need to interrupt the batch processing and manually check the specific file, to decide whether the typo (Excel cell data type mismatch etc.) can be interpreted as numeric or not, and so prevent data loss.
I was using this workaround:
my.as.numeric <- function(x, ...){
  xx <- as.numeric(x, ...)
  stopifnot(all(x == xx))
  xx
}
And it was OK until I met an Excel file with a cell containing exactly 0.756 (numeric data type; "," is the decimal separator in my locale).
It was read by readxl::read_xlsx(col_types = "text") as "0.75600000000000001" and converted by as.numeric() to 0.756, causing my.as.numeric() to stop. Replacing all(x == xx) with isTRUE(all.equal(x, xx)) did not help, as expected, because the types differ.
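The failure can be reproduced without the Excel file. This is a minimal sketch that assumes only the string reported above; x and xx are the same names as in my.as.numeric():

```r
# The string as it came back from read_xlsx(col_types = "text")
x <- "0.75600000000000001"
xx <- as.numeric(x)

# `==` between a character and a numeric value coerces the number to
# character, so the long string is compared with the short printed form
as.character(xx)
x == xx

# all.equal() reports a type mismatch instead of TRUE,
# so isTRUE() fails as well
isTRUE(all.equal(x, xx))
```

So the value itself is converted fine; it is only the character-versus-numeric comparison that trips the check.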
Of course I can find the indices where !is.na(xx) holds and exclude them from the all(x == xx) comparison, but I hope there is a more elegant solution.
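The index-based check would look roughly like this. It is only a sketch: the name strict.as.numeric is made up here, and suppressWarnings() is my assumption so that only the explicit error is reported.

```r
# Hypothetical helper: stop as soon as the cast creates a new NA
strict.as.numeric <- function(x, ...) {
  xx <- suppressWarnings(as.numeric(x, ...))
  # positions that were not NA before the cast but are NA after it
  bad <- which(!is.na(x) & is.na(xx))
  if (length(bad) > 0) {
    stop("cannot convert to numeric at position(s) ",
         paste(bad, collapse = ", "), ": ",
         paste0('"', x[bad], '"', collapse = ", "))
  }
  xx
}

# A missing value in the input is allowed to stay NA
strict.as.numeric(c("1", "0.75600000000000001", NA))

# A comma as decimal separator cannot be parsed, so this stops
try(strict.as.numeric(c("1", "0,756")))
```

It works, but it replaces one line with a dozen, which is why I am looking for something built in.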
I wonder if there is a way to make the as.numeric() cast throw an error and stop, instead of issuing the standard warning?
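The behaviour I am after would be roughly what the following sketch does, by wrapping the cast in tryCatch() and re-raising any warning as an error. The name as.numeric.or.stop is made up for illustration, and I do not know whether this is the recommended approach.

```r
# Hypothetical wrapper: any warning raised by as.numeric() becomes an error
as.numeric.or.stop <- function(x, ...) {
  tryCatch(
    as.numeric(x, ...),
    warning = function(w) {
      stop("as.numeric() failed: ", conditionMessage(w), call. = FALSE)
    }
  )
}

# A clean numeric string converts without complaint
as.numeric.or.stop("0.75600000000000001")

# An unparseable string stops immediately instead of warning later
try(as.numeric.or.stop("0,756"))
```

Setting options(warn = 2) also turns warnings into errors, but it does so for every warning in the session, which may be too broad for a batch job.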