I’m starting to learn R, and I don’t understand an error message I am receiving from write.fwf() from the gdata package. I created two data frames, one using the following code:
testdata <- data.frame (VAR1=c(1),
VAR2=c(2))
And another from importing a test .csv file:
VAR1,VAR2
1,2
My full code is:
library(tidyverse) # Loads the `tidyverse` collection
library(readxl) # Reads Excel files (read_csv() comes from readr, loaded by tidyverse)
library(gdata) # For write.fwf
testdata <- data.frame (VAR1=c(1),
VAR2=c(2))
read_testdata <- read_csv ("test.csv")
write.fwf(x=testdata)
write.fwf(x=read_testdata)
The two data frames seem identical to me. When I run write.fwf(x=testdata), the function returns successfully:
> write.fwf(x=testdata)
VAR1 VAR2
1 2
However, when I use the imported version of the data by running write.fwf(x=read_testdata), I get the following error message:
Error in `[<-`:
! Assigned data `format(...)` must be compatible with existing data.
✖ Existing data has 1 row.
✖ Assigned data has 4 rows.
ℹ Row updates require a list value. Do you need `list()` or `as.list()`?
Caused by error in `vectbl_recycle_rhs_rows()`:
! Can't recycle input of size 4 to size 1.
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
In x2[!test] <- format(y[!test, i], justify = justify, width = tmp, :
number of items to replace is not a multiple of replacement length
I don’t understand what this error message means. I tried to simplify the inputs as much as I could to see where the two inputs differ, but I’m still not seeing what is causing the error. Do I need to process the imported data somehow before using it in write.fwf()? Any insight is appreciated. Thank you.
The read_csv() function imports the data as a tibble, whereas write.fwf() expects a plain data.frame.
You can import your CSV file with read.csv() instead, which returns a data.frame:
read_testdata2 <- read.csv("test.csv")
write.fwf(x=read_testdata2)
Or, as proposed by @margusl, first convert the tibble to a data.frame:
df = as.data.frame(read_testdata)
write.fwf(x=df)
You can check that the objects have different classes with:
class(testdata)
class(read_testdata)
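If you keep working with readr, one option is to wrap the conversion in a small helper so it cannot be forgotten. This is only a sketch: write_fwf_plain() is a hypothetical name, not part of gdata, and it assumes the gdata and tibble packages are installed.

```r
library(gdata)   # provides write.fwf()
library(tibble)  # only used here to build an example tibble

# Hypothetical helper (not part of gdata): drop the tibble classes,
# then hand the plain data.frame to write.fwf().
write_fwf_plain <- function(x, ...) {
  gdata::write.fwf(x = as.data.frame(x), ...)
}

tbl <- tibble(VAR1 = 1, VAR2 = 2)

class(tbl)
#> [1] "tbl_df"     "tbl"        "data.frame"
class(as.data.frame(tbl))
#> [1] "data.frame"

# Prints the fixed-width table to the console instead of raising an error
write_fwf_plain(tbl)
```

Any extra arguments, such as file = or sep =, are passed through to write.fwf() unchanged.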
Just to add to the previous answer; perhaps this helps with approaching such issues in the future.
The error itself might throw you off a bit (what 4 rows?!), but the warning message and this piece of code provide some hints:
x2[!test] <- format(y[!test, i], ...)
One difference between tibbles (returned by all readr::read_*() functions, those with underscores, read_csv() etc.) and data.frames is how subsetting works: https://tibble.tidyverse.org/articles/tibble.html#subsetting .
In the case of a data.frame, y[, 1] would (by default) return a simple vector, but with a tibble you’d get a single-column tibble, which under the hood is still a list of column vectors and can easily break something down the road if it’s assumed to be a vector. For example, format() goes a little wacky and starts to process lines from the tibble’s print() output (3 lines from header, 1 from rows, adds up to 4 in your error message; 3 (header) + 3 (rows) = 6 in my example):
# data.frame:
testdata_df <- data.frame(VAR1 = c(1:3), VAR2 = c(4:6))
class(testdata_df)
#> [1] "data.frame"
# tibble:
testdata_tbl <- readr::read_csv(
"VAR1,VAR2
1,4
2,5
3,6", show_col_types = FALSE)
class(testdata_tbl)
#> [1] "spec_tbl_df" "tbl_df" "tbl" "data.frame"
# subsetting a single column:
(df_1 <- testdata_df[,1])
#> [1] 1 2 3
str(df_1)
#> int [1:3] 1 2 3
(tbl_1 <- testdata_tbl[,1])
#> # A tibble: 3 × 1
#> VAR1
#> <dbl>
#> 1 1
#> 2 2
#> 3 3
str(tbl_1)
#> tibble [3 × 1] (S3: tbl_df/tbl/data.frame)
#> $ VAR1: num [1:3] 1 2 3
### how it breaks write.fwf()
# prep
(nRow <- nrow(testdata_df))
#> [1] 3
(x2 <- character(length=nRow))
#> [1] "" "" ""
# expected behaviour:
(x2[] <- format(testdata_df[, 1]))
#> [1] "1" "2" "3"
# with tibble:
(x2[] <- format(testdata_tbl[, 1]))
#> Warning in x2[] <- format(testdata_tbl[, 1]): number of items to replace is not
#> a multiple of replacement length
#> [1] "# A tibble: 3 × 1" " VAR1" " <dbl>"
#> [4] "1 1" "2 2" "3 3"
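For comparison, here is a small sketch of ways to pull a single column out of a tibble as a plain vector, so that format() returns one string per row again. It assumes the tibble package is installed; the variable names are only illustrative.

```r
library(tibble)

tbl <- tibble(VAR1 = c(1, 2, 3), VAR2 = c(4, 5, 6))

# `[` on a tibble keeps the tibble structure, even for a single column
is.data.frame(tbl[, 1])
#> [1] TRUE

# Each of these returns a plain numeric vector instead
v1 <- tbl[[1]]
v2 <- tbl$VAR1
v3 <- tbl[, 1, drop = TRUE]

# format() now yields one string per row, so the lengths match
x2 <- character(length = nrow(tbl))
x2[] <- format(v1)
x2
#> [1] "1" "2" "3"
```

This does not change what write.fwf() does internally; it only shows why the same call behaves differently once the input is a vector rather than a one-column tibble.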
The actual error occurs when it’s time to write the formatted vector back to the data.frame: the lengths do not match anymore. So even if the error itself seems to make no sense, do keep an eye on the warnings too.
For the relevant code block, you can check the source: R/write.fwf.R#L96
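As a follow-up to the advice about warnings, one way to make them harder to miss while debugging is to catch them as conditions. The sketch below uses base R only; assign_formatted() is a hypothetical stand-in that mimics the mismatched assignment and is not code from gdata. The message text shown assumes an English locale.

```r
# Hypothetical stand-in for the failing step: 6 strings into a length-3 vector
assign_formatted <- function() {
  x2 <- character(length = 3)
  x2[] <- c("# A tibble: 3 x 1", "VAR1", "<dbl>", "1 1", "2 2", "3 3")
  x2
}

# Catch the warning so its message can be inspected directly
msg <- tryCatch(
  assign_formatted(),
  warning = function(w) conditionMessage(w)
)
msg
#> [1] "number of items to replace is not a multiple of replacement length"

# Alternatively, promote every warning to an error for a debugging session:
# old <- options(warn = 2)
# ... run the suspicious code ...
# options(old)
```

With options(warn = 2), execution stops at the first warning, so traceback() points at the assignment rather than at the later error.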