I need help creating an actual vs. predicted plot for a Tweedie GLM in R that was fitted with weights.
I have a Tweedie GLM in R from which I have derived the coefficients (factors) for a multiplicative risk premium model. My explanatory variables are vehicle age and total weight, and this is the model:
Risk premium model:
weight = antriar
response variable = cost_t
distribution = tweedie
Here, antriar is the number of insurance years and cost_t is the truncated known damage cost. The code I used to fit this model is shown further down.
As I understand it (and as my supervisor explained), the risk premium is modeled here because we use antriar (number of insurance years) as a weight. This is because the risk premium is defined as the sum of the claim payments divided by the sum of the insurance years (antriar).
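To make that definition concrete, here is a minimal sketch of the per-group calculation. The data frame `toy` and all of its values are made up purely for illustration; only the column names `cost_t`, `antriar`, and `V_age` are taken from my model.

```r
# Made-up illustration data: one row per policy record (values are hypothetical)
toy <- data.frame(
  V_age   = c("0-5", "0-5", "6-10", "6-10"),
  antriar = c(1.0, 0.5, 1.0, 1.0),   # insurance years (exposure)
  cost_t  = c(300, 0, 100, 500)      # truncated claim cost
)

# Sum claim cost and insurance years within each vehicle age group
sums <- aggregate(cbind(cost_t, antriar) ~ V_age, data = toy, FUN = sum)

# Risk premium = total claim cost / total insurance years
sums$risk_premium <- sums$cost_t / sums$antriar
print(sums)
```

Note that this is an exposure-weighted quantity: it is generally not the same as the plain mean of the per-record ratios cost_t / antriar.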
Now, I want to generate an actual vs predicted plot for the risk premium for the explanatory variable vehicle age, and I am wondering how I can do this while accounting for the weight. Here is what I have done so far:
# Change the reference level
LSLP_D$Totalvikt_grp <- relevel(LSLP_D$Totalvikt_grp, ref = "751-1050")
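# Fit GLM model with Tweedie distribution
library(statmod)  # provides the tweedie() family
glm_model_DELK <- glm(
  cost_t ~ V_age + V_weight,
  data = LSLP_D,
  family = tweedie(link.power = 0),  # link.power = 0 is the log link; var.power is left at its default
  weights = antriar
)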
summary(glm_model_DELK)
# Exponentiate coefficients to get factors
coefficients_DELK <- coef(glm_model_DELK)
exponentiated_coefficients_DELK <- exp(coefficients_DELK)
# Print exponentiated coefficients
print(exponentiated_coefficients_DELK)
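As a sanity check on what "multiplicative" means here, the sketch below shows that with a log link the prediction for one cell equals the product of the exponentiated coefficients. It is only an illustration under stated assumptions: the data are made up, the level names ("new", "old", "light", "heavy") are hypothetical, and `quasipoisson(link = "log")` from base R is used as a stand-in for the Tweedie family so that the example runs without extra packages.

```r
# Made-up illustration data (hypothetical levels and values)
toy <- data.frame(
  V_age    = factor(c("new", "new", "old", "old", "new", "old")),
  V_weight = factor(c("light", "heavy", "light", "heavy", "light", "heavy")),
  cost_t   = c(120, 200, 80, 150, 100, 170)
)

# Stand-in for the Tweedie fit: a different family, but the same log link
fit <- glm(cost_t ~ V_age + V_weight, data = toy,
           family = quasipoisson(link = "log"))

# Exponentiated coefficients = multiplicative factors
factors <- exp(coef(fit))

# Prediction for an "old", "light" vehicle built by hand:
# base level x age factor x weight factor
# (reference levels are "new" and "heavy" under R's default coding)
by_hand <- factors["(Intercept)"] * factors["V_ageold"] * factors["V_weightlight"]

# The same prediction from predict() on the response scale
new_row <- data.frame(
  V_age    = factor("old",   levels = levels(toy$V_age)),
  V_weight = factor("light", levels = levels(toy$V_weight))
)
from_predict <- predict(fit, newdata = new_row, type = "response")

print(c(by_hand = unname(by_hand), from_predict = unname(from_predict)))
```

The two numbers agree up to numerical precision, which is why the exponentiated coefficients can be read as factors relative to the reference levels.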
I have calculated the actual risk premium as sum(cost_t) / sum(antriar), but I am not sure if my predicted values are correct. According to my understanding, the predicted values should already represent the risk premium, since they are predicted from the Tweedie GLM model with the weight antriar. However, the plot suggests that something is wrong.
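For reference, one way to line up actual and predicted values per vehicle age group is sketched below. It is only a sketch on made-up data: the column `pred` is a hypothetical placeholder for the model's predictions and is assumed to be on the per-insurance-year scale. If the predictions are instead on the scale of cost_t per record, the comparable quantity would be sum(pred) / sum(antriar).

```r
# Made-up illustration data; `pred` stands in for model predictions
# (hypothetical values), assumed to be expected cost per insurance year
toy <- data.frame(
  V_age   = c("0-5", "0-5", "6-10", "6-10"),
  antriar = c(1.0, 0.5, 1.0, 1.0),
  cost_t  = c(300, 0, 100, 500),
  pred    = c(210, 210, 290, 290)
)

# Expected cost per record = predicted rate x insurance years
toy$pred_cost <- toy$pred * toy$antriar

# Aggregate actual cost, expected cost, and exposure per vehicle age group
agg <- aggregate(cbind(cost_t, pred_cost, antriar) ~ V_age, data = toy, FUN = sum)
agg$actual    <- agg$cost_t    / agg$antriar
agg$predicted <- agg$pred_cost / agg$antriar
print(agg)

# Plot both series against the group index, labelled by vehicle age group
idx <- seq_len(nrow(agg))
matplot(idx, as.matrix(agg[, c("actual", "predicted")]), type = "b",
        pch = c(16, 1), lty = c(1, 2), col = c("black", "red"),
        xaxt = "n", xlab = "Vehicle age group", ylab = "Risk premium")
axis(1, at = idx, labels = agg$V_age)
legend("topleft", legend = c("Actual", "Predicted"),
       pch = c(16, 1), lty = c(1, 2), col = c("black", "red"))
```

Both series are divided by the same exposure total per group, so any remaining gap reflects the model rather than a difference in scale.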
I guess I am calculating the actual values correctly because my supervisor instructed me on how to do this. But am I predicting the values incorrectly? Or am I wrong in thinking that it is the risk premium that we get when we predict? Thank you in advance!