I am using the auto.arima function as the backbone for forecasting stock prices; an example follows.
First, I set up the parameters and download the price data, using Walmart (WMT) as an example.
library(quantmod)
library(forecast)
ticker<-"WMT"
start_date<-"2018-04-01"
end_date<-"2024-07-30"
x<-getSymbols(ticker,from=start_date,to=end_date,auto.assign=FALSE)
x<-x[,6] # extract the adjusted close price
# parenthesize the index: 1:length(x)-1 is parsed as (1:length(x))-1
plot(as.vector(x[1:(length(x)-1)]),
as.vector(x[2:length(x)]),main="Plotting WMT’s stock price against itself with a lag of 1")
The plot below shows a strong linear relationship, which supports an autoregressive structure.
Next, the returns of x are calculated and plotted.
x.return<-diff(as.vector(x))/as.vector(x)[1:(length(x)-1)] # simple daily returns
plot(x.return, type="l",main="return")
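The simple return used here is the day-over-day price change divided by the previous day's price. A minimal sketch on a made-up price vector (the numbers are illustrative, not WMT data) shows the calculation and why the index range needs parentheses: in R, `1:n-1` is parsed as `(1:n)-1`.

```r
# Toy prices, purely illustrative (not WMT data)
p <- c(100, 102, 101, 105)
n <- length(p)

# Simple return: (p[t] - p[t-1]) / p[t-1]
simple.return <- diff(p) / p[1:(n - 1)]

# Equivalent form using head(), which avoids index arithmetic altogether
simple.return.alt <- diff(p) / head(p, -1)

# Operator precedence: ":" binds tighter than "-"
idx.unparenthesized <- 1:n-1      # starts at 0, not 1
idx.parenthesized   <- 1:(n - 1)  # the intended range

print(simple.return)
```

Using `head(p, -1)` is one way to sidestep the precedence issue entirely.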
Next, I run an ADF test to check for stationarity.
library(tseries) # provides adf.test
adf.test(x.return)
Augmented Dickey-Fuller Test
data: x.return
Dickey-Fuller = -11.745, Lag order = 11, p-value = 0.01
alternative hypothesis: stationary
Next, I create the in-sample data set and set the out-of-sample period.
numbers.of.days<-length(x.return)
days.out.of.sample=30
in.sample<-x.return[1:(numbers.of.days-days.out.of.sample)]
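A sketch of the same split on a simulated return vector, this time also keeping the held-out tail so that forecasts can later be compared against it. The simulated data and the name `out.of.sample` are assumptions made for illustration; they do not appear in the code above.

```r
set.seed(1)
r <- rnorm(250, mean = 0.0007, sd = 0.01)  # simulated returns, illustrative only

days.out.of.sample <- 30
n <- length(r)

in.sample     <- head(r, n - days.out.of.sample)  # first n - 30 observations
out.of.sample <- tail(r, days.out.of.sample)      # last 30 observations (hypothetical name)

# The two pieces partition the series with no overlap
stopifnot(length(in.sample) + length(out.of.sample) == n)
```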
Then I let the auto.arima function do the heavy lifting.
> model=auto.arima(in.sample)
> model
Series: in.sample
ARIMA(1,0,0) with non-zero mean
Coefficients:
          ar1   mean
      -0.0801  7e-04
s.e.   0.0252  3e-04
sigma^2 = 0.0001843: log likelihood = 4506.08
AIC=-9006.16 AICc=-9006.14 BIC=-8990.09
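Written out, the printed fit corresponds to the AR(1) model below. This assumes the usual convention of R's ARIMA routines, in which the reported `mean` is the mean of the series rather than a regression intercept; the symbols $r_t$, $\mu$, $\phi$ and $\varepsilon_t$ are notation introduced here.

```latex
r_t - \mu = \phi\,(r_{t-1} - \mu) + \varepsilon_t,
\qquad \hat{\phi} = -0.0801,\quad
\hat{\mu} = 7\times 10^{-4},\quad
\hat{\sigma}^2 = 0.0001843
```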
Next, I forecast with the model built by auto.arima and plot the result:
futures_returns=forecast(model,
h=days.out.of.sample,
level=c(99))
plot(futures_returns) # futures_returns is already a forecast object
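For an AR(1) model with a mean, such as the one reported above, the h-step-ahead point forecast is the mean plus the last deviation from the mean scaled by the AR coefficient raised to the power h. A rough sketch using the coefficients printed above; the last in-sample return `r.last` is a hypothetical value chosen only for illustration.

```r
phi    <- -0.0801  # ar1 coefficient reported above
mu     <- 7e-04    # mean reported above
r.last <- 0.01     # hypothetical last in-sample return (assumption)

h <- 1:5
# h-step-ahead AR(1) point forecast: mu + phi^h * (r.last - mu)
point.forecast <- mu + phi^h * (r.last - mu)

# Distance of each forecast from the mean shrinks by a factor |phi| per step
print(abs(point.forecast - mu))
```

With a coefficient this small, the point forecast is practically equal to the mean after a few steps, which is why the forecast line in the plot looks flat.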
I always find this kind of plot somewhat confusing to read; converting the forecast returns back to price levels seems a bit clearer.
x.vec=as.vector(x)
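stock.prices.in.sample=
x.vec[1:(length(x.vec)-days.out.of.sample)]
# compound_forecasts was not defined above; assumed here to be the cumulative
# growth factors implied by the forecast returns
compound_forecasts=cumprod(1+as.vector(futures_returns$mean))
stock.prices.forecasted=
stock.prices.in.sample[length(stock.prices.in.sample)]*compound_forecasts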
forecasted=
c(x.vec[100:(length(x.vec)-days.out.of.sample)],
stock.prices.forecasted)
plot(forecasted,
type="l")
lines(x.vec[100:length(x.vec)],col="red")
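The conversion from forecast returns to prices relies on compounding: each forecast price is the last observed price times the cumulative product of (1 + return) up to that day. A minimal sketch with made-up numbers; `last.price`, `forecast.returns` and `compound.factors` are hypothetical names used only for this illustration.

```r
last.price       <- 70                        # hypothetical last in-sample price
forecast.returns <- c(0.001, -0.002, 0.0015)  # hypothetical forecast returns

# Cumulative growth factors: (1+r1), (1+r1)(1+r2), ...
compound.factors <- cumprod(1 + forecast.returns)

# Implied price path over the forecast horizon
price.path <- last.price * compound.factors
print(price.path)
```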
Finally, I extract the max and min of the last 20 values of the series, which are forecasted values. So over roughly the coming 20 days, it would seem relatively safe to buy at 67.75 (the minimum of the predicted series) and sell at 68.67 (the maximum of the predicted series).
> forecast_values<-tail(forecasted,20)
> max(forecast_values)
[1] 68.67026
> min(forecast_values)
[1] 67.74756
I realize the setup above is very basic. I would love to hear ideas or comments on how I can make it better. Many thanks for your help.