I am trying to prepare a data frame for a mixed multinomial logit model using the mlogit package.
My data looks as follows:
PART_ID | Trial_no | Expected_Return | Risk | Curr_item_Price | Rating_distortion | Popularity_distortion | Choice | Option_ID |
---|---|---|---|---|---|---|---|---|
14203F28 | 11 | 4 | 4 | 1108 | -0.9 | -27 | 1 | 1 |
14203F28 | 11 | 2 | 6 | 936 | 1.4 | 8 | 0 | 2 |
14203F28 | 11 | 2 | 6 | 1094 | 1.4 | 8 | 0 | 3 |
14203F28 | 11 | 2 | 5 | 1150 | -1.7 | 29 | 0 | 4 |
14203F28 | 12 | 8 | 2 | 879 | 2.3 | 18 | 0 | 1 |
Each participant (PART_ID) has a number of choice opportunities (Trial_no), and in each of them they choose one (Choice) of the available options (Option_ID).
I am trying to reorganise the data for the mlogit function using dfidx, as in the “nested example” from the dfidx docs.
Even though I checked that there is exactly one observation for each combination of PART_ID, Trial_no and Option_ID, the error message says the opposite.
library(dplyr)
library(dfidx)

# Load the example data set used in the dfidx docs
data("JapaneseFDI", package = "mlogit")
JapaneseFDI <- dplyr::select(JapaneseFDI, 1:8)

# Check for duplicated (firm, country, region) combinations in the example data
JapaneseFDI |>
  group_by(firm, country, region) |>
  summarise(count = n()) |>
  filter(count != 1)
# Output: 0 rows

# Same check on my own data frame `res` (rows shown in the table above)
res |>
  group_by(PART_ID, Trial_no, Option_ID) |>
  summarise(count = n()) |>
  filter(count != 1)
# Output: 0 rows

JP1b <- dfidx(JapaneseFDI,
              idx = list("firm", c("region", "country")),
              choice = "choice")
# Works fine

mlogit_data <- dfidx(res,
                     idx = list("PART_ID", c("Trial_no", "Option_ID")),
                     choice = "Choice")
# Error in dfidx(res, idx = list("PART_ID", c("Trial_no", "Option_ID")), :
#   the two indexes don't define unique observations
I see this error, and I am aware that the dfidx function is built for two indexes rather than three, but I think my problem is similar to the nested example from the documentation.
Can someone please explain to me what I did wrong?
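In case it helps to reproduce this, here is a minimal sketch that rebuilds only the five rows from the table above in base R and counts repeated key combinations. The names `res_sample` and `n_dup` are made up for this example and are not part of dfidx or mlogit. I do not know which columns dfidx compares internally, so the pairwise counts are only a diagnostic, not a statement about what the function does.

```r
# Five rows copied from the table above (the real data set is larger)
res_sample <- data.frame(
  PART_ID               = rep("14203F28", 5),
  Trial_no              = c(11, 11, 11, 11, 12),
  Expected_Return       = c(4, 2, 2, 2, 8),
  Risk                  = c(4, 6, 6, 5, 2),
  Curr_item_Price       = c(1108, 936, 1094, 1150, 879),
  Rating_distortion     = c(-0.9, 1.4, 1.4, -1.7, 2.3),
  Popularity_distortion = c(-27, 8, 8, 29, 18),
  Choice                = c(1, 0, 0, 0, 0),
  Option_ID             = c(1, 2, 3, 4, 1)
)

# Hypothetical helper: number of rows whose key combination
# already appeared in an earlier row
n_dup <- function(data, cols) {
  sum(duplicated(data[, cols, drop = FALSE]))
}

n_dup(res_sample, c("PART_ID", "Trial_no", "Option_ID"))  # 0
n_dup(res_sample, c("PART_ID", "Trial_no"))               # 3
n_dup(res_sample, c("PART_ID", "Option_ID"))              # 1
```

So the three columns together identify each row in this sample, but no pair of them that includes PART_ID does on its own.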
> sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Linux Mint 21.3
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lmtest_0.9-40 zoo_1.8-12 mlogit_1.1-1 dfidx_0.0-5 lubridate_1.9.3 forcats_1.0.0
[7] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[13] ggplot2_3.5.1 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] utf8_1.2.4 generics_0.1.3 lattice_0.22-6 stringi_1.8.4 hms_1.1.3 magrittr_2.0.3
[7] grid_4.4.0 timechange_0.3.0 Formula_1.2-5 fansi_1.0.6 scales_1.3.0 Rdpack_2.6
[13] cli_3.6.3 rlang_1.1.4 crayon_1.5.3 rbibutils_2.2.16 bit64_4.0.5 munsell_0.5.1
[19] withr_3.0.0 tools_4.4.0 parallel_4.4.0 tzdb_0.4.0 colorspace_2.1-0 vctrs_0.6.5
[25] R6_2.5.1 lifecycle_1.0.4 bit_4.0.5 vroom_1.6.5 MASS_7.3-61 pkgconfig_2.0.3
[31] pillar_1.9.0 gtable_0.3.5 glue_1.7.0 statmod_1.5.0 xfun_0.45 tidyselect_1.2.1
[37] rstudioapi_0.16.0 knitr_1.47 compiler_4.4.0