I am analyzing the response of varible_A (continuous) to variable_B (continuous), variable_C (continuous), and variable_D (categorical with levels “old” and “new”).
dat <- structure(list(variable_A = c(-0.0295856251366527, -1.24738195367725,
0.648945031694853, 0.580402160337185, 0.303715429301314, -0.553980060288217,
-0.607948144698327, 2.71061895113126, -1.53757429143602, 0.129366769790572,
0.16626208677442, -1.41291646270657, -0.200839313526748, 2.03483046591584,
-0.112578482123941, -0.561740791028714, -0.272330872554233, 0.71323537344756,
0.841184675731212, 0.707842765365241, 0.412738113173535, -0.700128662154531,
-1.68043030412451, -1.53152621643461, 0.242993861464165, -0.588067091651035,
-0.997892233042602, -1.58228732028551, -0.309374655641981, -0.891086439260152,
0.560086614100886, -1.00551936564398, 0.216711500378051, -0.452286951494992,
0.728493371811804, -0.720834439782271, 0.693795575916618, 0.0231649761405714,
0.287283950920568, -0.201169463943841, -0.680455750771586, 0.433109625250988,
1.23677460340502, -0.499702557524854, -1.17988096076043, -0.34533911717475,
0.733757575575177, 0.803025086512173, -0.956314331656592, 2.0610849115608,
1.26283438981272, -0.13524932927448, 1.18775411592947, -0.429116253594837,
1.24239940798592, 2.30364888316214, 1.60875733136455, -0.567027215576018,
0.129962056869789, -2.23963324058254, 0.481214590815243, 0.551147992003243,
-0.410903943379136, -0.367157561647619, -0.995447457890487, 0.468528939223885,
-0.295017486543682, 1.5471259341995, 0.149609004075832, -1.37356176484758,
-0.530120009280847), variable_B = c(0.909414301702991, -0.943449516247587,
0.781524506629484, -0.600911949819743, -1.00324860948739, -0.879024346862077,
0.148702302376609, 1.03680921377107, -0.818036316970879, -1.35838203594376,
0.716858807832897, -0.828797106744759, -0.668603665623089, 1.21332444520722,
1.35303750102937, -0.35838533300562, 0.586314940239296, 0.814575853257065,
0.789492242485147, 1.19281997662545, -0.910500570977607, 0.179361003357409,
-1.20694554173343, -0.865556160009293, -1.18138003775792, -0.460239810467593,
-0.347479702200378, -0.376211854380412, 0.695718104445771, 0.311847367815062,
0.422743354183219, -1.31000649682009, 0.688080878453372, -2.6910231963726,
0.959527136748011, 0.816023979122475, 2.47961675089516, 0.417574474989369,
-0.784547963068128, 0.660568497111298, -1.17583280451095, 0.910685709373682,
0.0893873399996857, -0.519872558390938, -0.371391026023119, 1.16814083225284,
0.942464870838783, 1.08748559762075, -0.156271108557384, 1.03813800412845,
0.912245187233877, -0.864089336414513, -0.519872558390938, 0.322936960762913,
0.395796664489034, -0.0134883824090008, 0.880364781167251, -1.02201375308388,
0.946978950240417, -0.182339630757281, 0.566561585615522, -0.663533281364679,
0.732415185925742, -0.466304170570451, -0.672450145062805, -0.519872558390938,
-3.00315063163361, 1.86543741543252, -0.954446227371451, -1.15998107435428,
0.824664738419366), variable_C = c(0.419730106958754, 0.20723938151934,
-0.923643462344663, -0.840808094800484, 2.1592728253865, 0.569193922310206,
-0.756171958396649, -0.963260377257096, -1.0695057399768, -0.606708143045197,
-0.541880464097579, 0.484557785906372, 1.88555595871878, -0.891229622870853,
-0.887628085151541, -0.511267393483426, -0.76517580269493, -0.974064990415032,
-0.756171958396649, -0.939850382081567, -0.554485846115171, -0.165519772429464,
-0.502263549185146, -0.367205884710942, 1.34892683854128, 1.49658988503307,
-0.540079695237923, -0.448240483395464, -0.891229622870853, 0.333293201695263,
-0.808394255326675, -0.70575043032628, -0.864218089976013, 0.171224004326219,
-0.945252688660535, -0.56889199699242, -0.414025875061999, -0.675137359712127,
-0.385213573307502, -0.806593486467019, 0.313484744239047, -1.02628728734506,
-0.224945144798113, -0.781382722431834, -0.619313525062789, -0.644524289097974,
-1.08571265971371, -0.990271910151936, -0.594102761027605, 3.22712875916254,
-0.550884308395859, 0.506167012222245, 0.913140774504513, -0.597704298746917,
-0.269964366289515, -0.657129671115566, -0.884026547432229, -0.356401271553005,
-1.04069343822231, -0.00705211277973086, -0.462646634272713,
0.08838863678204, -0.981268065853656, -0.414025875061999, -0.716555043484216,
0.194633999501747, -0.619313525062789, -0.907436542607758, -0.775980415852866,
-0.289772823745731, -0.597704298746917), variable_D = structure(c(2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L,
1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L,
1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 1L, 2L), levels = c("old", "new"), class = "factor")), row.names = 97:167, class = "data.frame")
I hypothesize that the response of variable_A to variable_B is dependent on variable_C and that this differs between levels of variable_D. Thus, I am interested in the interaction of all variables. I fitted a linear model:
fit <- lm(variable_A ~ variable_B*variable_C*variable_D, data = dat)
summary(fit)
Output:
Call:
lm(formula = variable_A ~ variable_B * variable_C * variable_D,
data = dat)
Residuals:
Min 1Q Median 3Q Max
-1.92931 -0.58884 -0.06014 0.56585 2.57492
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.18132 0.15653 1.158 0.251063
variable_B 0.51698 0.13378 3.865 0.000266 ***
variable_C 0.07710 0.16378 0.471 0.639432
variable_Dnew -0.42993 0.27514 -1.563 0.123157
variable_B:variable_C 0.14710 0.16839 0.874 0.385679
variable_B:variable_Dnew -0.18376 0.33968 -0.541 0.590441
variable_C:variable_Dnew -0.05801 0.35809 -0.162 0.871824
variable_B:variable_C:variable_Dnew -0.78161 0.39110 -1.999 0.049982 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8798 on 63 degrees of freedom
Multiple R-squared: 0.3034, Adjusted R-squared: 0.226
F-statistic: 3.92 on 7 and 63 DF, p-value: 0.001315
I would like to use the package marginaeffects
to further investigate the significant three-way interaction (variable_B:variable_C:variable_Dnew
). How can I test in which direction the response of variable_A to variable_B is influenced by variable_C? How can I test how this differs between levels of variable_D and for which subsets the influence of variable_D is significant?
I managed to plot the model predictions, which gives an initial impression of the directions of the effects. With higher values of variable_C (variable_C was used for faceting), the relationship between variable_B (on the x-axis) and variable_A (on the y-axis) seems to invert for the “new” level of variable_D (used for coloring) but not for the “old” level. I am searching for a way to support this visualization with statistics, preferably using functions from the marginal effects
package.
marginaleffects::plot_predictions(fit, condition = list("variable_B", "variable_D", "variable_C"))