I am trying to fit a polynomial or loess curve to a short data series: crop coefficient (Kc) as a function of growing degree days (gdd). The actual data set has only six points:
Kc <- c(0.1, 0.41, 0.54, 0.66, 0.85, 1)
gdd <- c(0, 90, 225, 270, 360, 630)
To produce smooth ends, I am putting three shifted copies of this data set together as follows:
Kc <- c(0.1, 0.41, 0.54, 0.66, 0.85, 1, 1.1, 1.41, 1.54, 1.66, 1.85, 2, 2.1, 2.41, 2.54, 2.66, 2.85, 3)
gdd <- c(0, 90, 225, 270, 360, 630, 630, 720, 855, 900, 990, 1260, 1260, 1350, 1485, 1530, 1620, 1890)
df <- data.frame(gdd = gdd, Kc = Kc)
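For reference, the tiled series can also be built programmatically rather than typed out. This is only a sketch: it assumes each copy is shifted by 1 in Kc and by the full gdd range (630) relative to the previous one, which is what the hand-typed vectors above show. The names Kc0, gdd0, shifts, and span are illustrative.

```r
# Original short series
Kc0  <- c(0.1, 0.41, 0.54, 0.66, 0.85, 1)
gdd0 <- c(0, 90, 225, 270, 360, 630)

# Assumption: three copies, each shifted by one full span of the series
shifts <- 0:2
span   <- max(gdd0)  # 630

# outer() builds a 6 x 3 matrix (one column per copy);
# as.vector() flattens it column by column, i.e. copy after copy
Kc  <- as.vector(outer(Kc0,  shifts,        "+"))
gdd <- as.vector(outer(gdd0, shifts * span, "+"))

df <- data.frame(gdd = gdd, Kc = Kc)
```

Note that the join points (gdd = 630 and gdd = 1260) appear twice with different Kc values, exactly as in the hand-typed version.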
A typical polynomial fit in R would look like:
Kc_fit <- lm(Kc ~ poly(gdd,4))
These results are quite poor, however, and increasing the degree of the polynomial does not make a big difference.
ggplot(df, aes(x=gdd, y=Kc)) + geom_point() + stat_smooth(method='lm', formula = Kc ~ poly(gdd,4), size = 1)
The ggplot code above produces strange results that I can’t make sense of, and I’m not sure why. Beyond that, does anyone have any ideas that would allow me to fit a smooth polynomial, loess, or other curve to the data (where the fit goes through the points and the amount of smoothing is minimal)?
I haven’t run across this issue in R before, and it’s kind of strange that it’s having trouble fitting a polynomial to this data. Thanks.
I’m not quite sure what you mean by “quite poor”:
Kc_fit <- lm(Kc ~ poly(gdd,4), df)
summary(Kc_fit)
Residual standard error: 0.1032 on 13 degrees of freedom
Multiple R-squared: 0.9898, Adjusted R-squared: 0.9866
(On the other hand, a simple linear model (Kc ~ gdd) has a multiple R^2 of 0.9873 and an adjusted R^2 of 0.9865, so adding the polynomial terms doesn’t do much; the AIC values of the two models differ by only about 2.1 units …)
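To check this comparison yourself, here is a minimal base-R sketch that fits both models to the tiled data from the question and prints the fit statistics side by side. The object names fit_lin and fit_poly are just illustrative.

```r
# Tiled data from the question
Kc  <- c(0.1, 0.41, 0.54, 0.66, 0.85, 1, 1.1, 1.41, 1.54, 1.66, 1.85, 2,
         2.1, 2.41, 2.54, 2.66, 2.85, 3)
gdd <- c(0, 90, 225, 270, 360, 630, 630, 720, 855, 900, 990, 1260,
         1260, 1350, 1485, 1530, 1620, 1890)
df <- data.frame(gdd = gdd, Kc = Kc)

# Straight line vs. 4th-order (orthogonal) polynomial
fit_lin  <- lm(Kc ~ gdd, data = df)
fit_poly <- lm(Kc ~ poly(gdd, 4), data = df)

# Multiple and adjusted R-squared for each model
r2 <- sapply(list(linear = fit_lin, quartic = fit_poly), function(m) {
  s <- summary(m)
  c(r.squared = s$r.squared, adj.r.squared = s$adj.r.squared)
})
print(round(r2, 4))

# AIC for both models (lower is better; the df column counts
# the coefficients plus the residual variance)
print(AIC(fit_lin, fit_poly))
```

Because the linear model is nested in the quartic one, the multiple R^2 can only go up when the extra terms are added; the adjusted R^2 and AIC are the fairer comparison since they penalize the three additional coefficients.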
The problem with your plot is that the formula for stat_smooth always has to be specified in terms of x and y, not the original variables from the data frame, i.e. in this case stat_smooth(method='lm', formula = y ~ poly(x,4)):