Data sample below.
I’m working on an analysis involving a complex nested dataset and I need to implement a mixed-effects model in R. Here’s a brief overview of my situation:
- Objective: Determine the effect of personality traits (OCEAN), emotional states, situational perception, and emotional perception on the evaluation of valence of images generated by artificial intelligence.
- Problem: I’ve previously run regression models without fully accounting for the nested structure in the data. The regression coefficients from these models are extremely small, and I’m not confident in their accuracy.
- Data Structure: The data is nested, with images nested within participants, and participants nested within surveys.
Data Details:
- I conducted 60 surveys, each containing 20 different images, and applied these surveys to 20 participants.
- The surveys also include 5 additional images that are common across all surveys.
- For the analysis, my dataset is structured so that the first 20 rows correspond to the first participant’s reactions to the 20 images of the survey. The next 20 rows correspond to the next participant, and so on. The participant’s personality and emotional states are repeated across those 20 rows, since they are constant for the same person, while the situational and emotional perceptions vary from row to row. This repetition is the repeated-measures part.
Here is a sample of the data for the non-repeated images, in case the explanation above is unclear.
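To make the layout concrete, here is a minimal sketch of the same long format built from invented values. It is not the real sample: the toy sizes (2 surveys, 3 participants per survey, 4 images per survey) and all numbers are placeholders, and only one trait column (O) is included.

```r
# Toy sizes, chosen only for illustration (NOT the real study dimensions)
n_surveys <- 2
n_participants <- 3   # participants per survey
n_images <- 4         # images per survey

set.seed(1)
# One row per participant, with a person-level trait that stays constant
participants <- data.frame(
  survey_index   = rep(seq_len(n_surveys), each = n_participants),
  participant_id = seq_len(n_surveys * n_participants),
  O              = round(rnorm(n_surveys * n_participants), 2)
)

# Repeat each participant row once per image to get the long format
toy <- participants[rep(seq_len(nrow(participants)), each = n_images), ]
rownames(toy) <- NULL

# Image labels are unique to a survey (each image belongs to one survey)
toy$image_name <- paste0("s", toy$survey_index, "_img",
                         rep(seq_len(n_images), times = nrow(participants)))

# Invented ratings that vary within a participant
toy$valence <- round(rnorm(nrow(toy)), 2)

# First participant: same O on every row, different image and valence
head(toy, n_images)
```

Each participant contributes one block of consecutive rows, with the trait column constant inside the block and the image and rating changing row by row.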
What I’ve Tried:
- Separating the analysis of the 5 common images and the uncommon images.
- Analyzing each of the common images separately.
- Computing the mean of the 5 common images to account for repeated measures, reducing the data to one row per participant for these images.
- Running simpler regression models that only have the participant as a higher level.
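A minimal sketch of the averaging step for the common images, on invented data. The column names follow my dataset, but the three participants, the single trait column (E), and all values are placeholders.

```r
# Invented toy data: 3 participants each rating the same 5 common images
set.seed(2)
common <- data.frame(
  participant_id = rep(1:3, each = 5),
  image_name     = rep(paste0("common_", 1:5), times = 3),
  E              = rep(c(0.2, -1.1, 0.7), each = 5),  # constant per person
  valence        = round(runif(15, min = 1, max = 9), 1)
)

# Average valence within each participant. E is constant within a person,
# so adding it to the grouping keeps it in the output without adding rows.
common_means <- aggregate(valence ~ participant_id + E, data = common, FUN = mean)

common_means  # one row per participant
```

Collapsing to one row per participant removes the repeated measures, but it also discards the image-to-image variation, which is one reason I would prefer a mixed model here.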
Here is the code for the non-repeated images:
library(lme4)  # provides lmer()
# Define the formula for the model for valence
valence_formula <- valence ~ water + ground + mountain + tree + snow + O + C + E + A + N + angry + good + bad + happy + sad + nice +
  unfriendly + tense + relaxed + curiosity + rage + fear + sexy + careness + sadness + playfulness + work + intellectual +
  threat + romance + positive + negative + deceit + communication + (1 | survey_index/image_name)
# Fit the regression model for valence using lmer
valence_model <- lmer(valence_formula, data = df)
Here is the code for the common images:
# Define the formula for the model for valence
valence_formula <- valence ~ water + ground + mountain + tree + O + C + E + A + N + angry + good + bad + happy + sad + nice +
  unfriendly + tense + relaxed + curiosity + rage + fear + sexy + careness + sadness + playfulness + work + intellectual +
  threat + romance + positive + negative + deceit + communication + (1 | image_name) + (1 | participant_id)
# Fit the regression model for valence using lmer
valence_model <- lmer(valence_formula, data = df)
As the two formulas show, I treat the non-repeated images as a nested design (images within surveys) and the common images as a crossed design (images crossed with participants). More on that here.
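One way to check which grouping factors are nested and which are crossed is to count how many levels of the outer factor each inner level appears with. The sketch below uses base R only. The helper is_nested() is a hypothetical name introduced here for illustration, and the toy data assumes that every participant in a survey rates the same images of that survey.

```r
# Hypothetical helper: `inner` is nested in `outer` if every level of
# `inner` occurs together with exactly one level of `outer`.
is_nested <- function(inner, outer) {
  all(tapply(outer, inner, function(x) length(unique(x))) == 1)
}

# Invented toy data: 2 surveys, 2 participants per survey, 2 images per survey
toy <- data.frame(
  survey_index   = c(1, 1, 1, 1, 2, 2, 2, 2),
  participant_id = c("p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4"),
  image_name     = c("a", "b", "a", "b", "c", "d", "c", "d")
)

is_nested(toy$image_name, toy$survey_index)      # TRUE: each image is in one survey
is_nested(toy$participant_id, toy$survey_index)  # TRUE: each participant took one survey
is_nested(toy$image_name, toy$participant_id)    # FALSE: images are shared by participants
```

Under that assumption, images and participants are each nested within surveys but crossed with each other inside a survey. lme4 also provides isNested() for a similar check on factors.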
Questions:
- How can I properly implement a mixed-effects model in R to account for the nested structure of my data (images within participants within surveys)?
- What packages and functions should I use for this type of analysis?
Any guidance or example code snippets would be greatly appreciated!