Week 2: ARIMA Models
Johnny Y -
This week, I focused on building and optimizing my baseline ARIMA (autoregressive integrated moving average) model. I first built an ARIMA model to predict continuous prices (the kind of task ARIMA tends to perform better on). I chose the parameters using the Augmented Dickey-Fuller (ADF) test together with autocorrelation function (ACF) and partial autocorrelation function (PACF) plots, and used Dogecoin data for training and testing. I also experimented with different time frames to see how they affected accuracy.
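To illustrate how an ACF plot informs the parameter choice, here is a minimal pure-Python sketch. It is not the project code: the function names `sample_acf` and `significant_lags` and the toy series are hypothetical, and the ±1.96/√n band is the usual approximate 95% white-noise bound. Lags whose autocorrelation falls outside that band are the ones typically considered when picking q (the PACF plays the same role for p).

```python
import math

def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (series must not be constant)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)  # lag-0 autocovariance times n
    acf = []
    for lag in range(max_lag + 1):
        num = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        acf.append(num / denom)
    return acf

def significant_lags(acf, n):
    """Lags > 0 whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(n)
    return [lag for lag, r in enumerate(acf) if lag > 0 and abs(r) > bound]

# Hypothetical toy series that flips sign every step (strongly autocorrelated).
toy = [1.0, -1.0] * 20
acf = sample_acf(toy, max_lag=3)
print(acf)
print(significant_lags(acf, len(toy)))
```

In practice a library would normally handle this; for example, statsmodels provides `adfuller` in `statsmodels.tsa.stattools` and `plot_acf` / `plot_pacf` in `statsmodels.graphics.tsaplots`.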
I then built a separate ARIMA model to predict only whether the price would go up or down on a given day (since that is closer to what sentiment analysis does best) and measured its accuracy using percentage returns as data. With an 80%-20% train-test split on 12 months of data and parameters (p, d, q) = (25, 0, 25), the model achieved 42.47% accuracy, which is worse than a random coin flip. This suggests that ARIMA may not be the best baseline for up/down predictions, though the huge swings in the crypto market (e.g. 2.5x in 1 day because of the election) certainly don’t help. Debugging my code took quite a bit of time, and for some time frames and parameter choices the model gives uniform recommendations (i.e. predicting “up” every day or “down” every day); I haven’t figured out why yet. Another challenge is that manual entry is only weekly, so I really only have 1 year’s worth of data. I will be meeting with my site advisor to discuss the results.
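The evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project code: all function names and the toy prices are hypothetical, a zero return is counted as "down", and the stand-in forecast is simply the mean training return rather than an ARIMA model. It shows the chronological 80%-20% split, the up/down accuracy measure, and a check for uniform predictions. A forecast whose sign never changes will always produce uniform calls; whether that explains the behavior described above has not been established.

```python
def pct_returns(prices):
    """Day-over-day percentage returns as fractions (0.1 means +10%)."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def chrono_split(values, train_frac=0.8):
    """Split in time order (no shuffling), so the test set is the most recent data."""
    cut = int(len(values) * train_frac)
    return values[:cut], values[cut:]

def direction(ret):
    """Assumption: a return of exactly zero is labeled 'down'."""
    return "up" if ret > 0 else "down"

def directional_accuracy(predicted, actual):
    """Fraction of days where the predicted and actual directions agree."""
    hits = sum(direction(p) == direction(a) for p, a in zip(predicted, actual))
    return hits / len(actual)

def is_uniform(predicted):
    """True if every prediction points the same way (all 'up' or all 'down')."""
    return len({direction(p) for p in predicted}) == 1

# Hypothetical toy prices, not real Dogecoin data.
prices = [10, 11, 10, 12, 11, 13, 12, 14, 13, 15, 14]
returns = pct_returns(prices)
train, test = chrono_split(returns)

# Stand-in forecast: repeat the mean training return for every test day.
mean_return = sum(train) / len(train)
predicted = [mean_return] * len(test)

print(directional_accuracy(predicted, test))
print(is_uniform(predicted))
```

Running the uniformity check alongside the accuracy figure makes it easier to tell a model that is genuinely wrong about direction from one that is just repeating the same call every day.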
Past papers generally used ARIMA as a baseline for predicting continuous prices rather than up/down movement, so I don’t expect ARIMA’s low accuracy on up/down movement to be an issue, though I will raise this with my site advisor as well.