Week 2: ARIMA Models

Johnny Y -

This week, I focused on building and optimizing my baseline ARIMA (autoregressive integrated moving average) model. I first built an ARIMA model to predict continuous prices (which ARIMA tends to perform better on). I chose the parameters using the Augmented Dickey-Fuller (ADF) test together with autocorrelation function (ACF) and partial autocorrelation function (PACF) plots, and used Dogecoin data for training and testing. I experimented with different time frames to see how they affected accuracy.
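
As a minimal sketch of the ACF step only (not the code used for this project), the sample autocorrelation and the usual approximate 95% band of ±1.96/√n can be computed in plain Python. Conventionally, the ADF test guides the differencing order d, the PACF suggests p, and the ACF suggests q. The return values below are made up for illustration, and the helper names `sample_acf` and `significant_lags` are hypothetical. In practice a library such as statsmodels (`adfuller`, `plot_acf`, `plot_pacf`) would likely do this work, though the post does not say which tools were used.

```python
import math


def sample_acf(series, max_lag):
    """Return the sample autocorrelation of `series` at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Lag-0 sum of squares; every lag is normalized by this value.
    denom = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag + 1):
        num = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        acf.append(num / denom)
    return acf


def significant_lags(series, max_lag):
    """Return lags whose sample ACF lies outside the approximate 95% band."""
    band = 1.96 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [lag for lag in range(1, max_lag + 1) if abs(acf[lag]) > band]


# Made-up daily percentage returns, for illustration only (not Dogecoin data).
returns = [1.2, -0.8, 0.5, -0.3, 2.1, -1.7, 0.9, -0.4, 0.6, -1.1, 1.5, -0.9]

acf_values = sample_acf(returns, max_lag=4)
candidate_q = significant_lags(returns, max_lag=4)
print("ACF by lag:", [round(v, 3) for v in acf_values])
print("Lags outside the 95% band:", candidate_q)
```

Lags that fall outside the band are candidates for the moving-average order q; lags inside it are hard to distinguish from noise at this sample size.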

I then built a separate ARIMA model to predict only whether the price would go up or down on a given day (since that is closer to what sentiment analysis does best) and measured the accuracy using percentage returns as data. With an 80%-20% train-test split on 12 months of data and parameters (p, d, q) = (25, 0, 25), the model achieved 42.47% accuracy, worse than a random coin flip. This suggests that ARIMA may not be the best baseline for up/down predictions, though the huge swings in the crypto market (e.g. 2.5x in 1 day because of the election) certainly don’t help. Debugging my code took quite a bit of time, and for some time frames and parameter choices the model makes uniform predictions (i.e. predicting “up” every day or “down” every day); I haven’t figured out why yet. Another challenge is that manual entry is only weekly, so I really only have 1 year’s worth of data. I will be meeting with my site advisor to discuss the results.
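
The up/down evaluation described above can be sketched as follows, assuming a chronological (unshuffled) 80%-20% split and treating a zero return as "down"; both are assumptions, since the post does not specify them. The data is made up and the function names are hypothetical. The sketch also scores two constant predictors, which is a useful sanity check when a model predicts the same direction every day: such a model's accuracy is simply the share of up (or down) days in the test set.

```python
def chronological_split(values, train_fraction=0.8):
    """Split a time-ordered list into train and test parts without shuffling."""
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]


def direction(ret):
    """Map a return to 1 (up) or 0 (down); a zero return counts as down here."""
    return 1 if ret > 0 else 0


def directional_accuracy(predicted_returns, actual_returns):
    """Fraction of days on which the predicted and actual directions agree."""
    hits = sum(
        direction(p) == direction(a)
        for p, a in zip(predicted_returns, actual_returns)
    )
    return hits / len(actual_returns)


# Made-up daily percentage returns, for illustration only.
returns = [1.2, -0.8, 0.5, -0.3, 2.1, -1.7, 0.9, -0.4, 0.6, -1.1]
train, test = chronological_split(returns)

# Constant baselines: forecast a positive (or negative) return every day.
always_up = [1.0] * len(test)
always_down = [-1.0] * len(test)
up_acc = directional_accuracy(always_up, test)
down_acc = directional_accuracy(always_down, test)
print(f"always up: {up_acc:.2%}, always down: {down_acc:.2%}")
```

Comparing a model's score with these constant baselines shows whether it has learned anything beyond the up/down balance of the test window.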

Past papers generally used ARIMA as a baseline for predicting continuous prices rather than up/down movement, so I don’t expect ARIMA’s low up/down accuracy to be an issue, though I will discuss this with my site advisor.

Comments:

    tesla_l
    What is a train test? Are there other baseline models you might consider using instead?
    rob_lee
    Inverse ARIMA?
    makeen_s
    Hi Johnny. It sounds concerning that you only have around a year's worth of data. Would it make sense to investigate a broader scope of coins to get more data?
    austin_l
    How would you increase the accuracy of your model? Would you want to switch or manipulate something in your existing one?
    Anonymous
    If the ARIMA model is less accurate in predicting up/down movement, can it be used in conjunction with another model to accurately predict both up/down and continuous prices?
    johnny_y
    Thank you for your questions, Tesla! By 80%-20% train-test split I mean 80% of the data was used for training and 20% of the data was used for testing. Other potential baselines I can think of are random choice, tweet/post/comment volume, and random forest classifier, though I'm sure my site mentor can suggest more alternatives.
    johnny_y
    Mr. Lee, could you elaborate on what you mean by inverse ARIMA? Converting the data back before calculating error?
    johnny_y
    Thank you for your question, Makeen! Based on past papers, 1 year of data should be enough (especially since I have 3 coins), though it's possible I might need to expand the scope.
    johnny_y
    Thank you for your question, Austin! I'll be meeting with my mentor later this week to see how I can improve the model's performance. I may need to combine it with machine learning like LSTM or a random forest classifier.
    johnny_y
    Yes, another model might be better suited for up/down prediction. I don't anticipate issues running that model alongside ARIMA.
