Week 1: Coin Price Data Gathering and ARIMA
Johnny Y -
This week, I focused on obtaining the data for my project and studying the ARIMA model. I found that the CoinGecko API can be used to download up to 1 year of pricing data, and that CoinMarketCap can be used to enter older data manually. I had to learn some Python to write the code that imports data from the API. I also settled on Dogecoin, Shiba Inu, and Pepe as the coins I will study, and learned about the ARIMA forecasting model. Next week, I will finish importing the data (including the manual entry) and build and test the ARIMA model. I also hope to identify candidates for the classifier on Hugging Face.
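For reference, ARIMA(p, d, q) differences the series d times, then models the result with p autoregressive lags and q moving-average terms. The sketch below is a hand-rolled ARIMA(1,1,0) with no constant, fitted by least squares on made-up numbers, just to show the mechanics. It is not the project's model, and the function names and the toy price list are hypothetical. In practice a library such as statsmodels, whose `ARIMA` class takes an `order=(p, d, q)` argument, would do the fitting.

```python
def difference(series):
    """First difference: the 'I' (d=1) step of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares AR(1) coefficient with no constant: d[t] ~ phi * d[t-1]."""
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den if den else 0.0


def forecast_next(series):
    """One-step-ahead forecast from an ARIMA(1,1,0)-style model."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    diffs = difference(series)
    phi = fit_ar1(diffs)
    # Predict the next change, then add it back to the last observed level.
    return series[-1] + phi * diffs[-1]


# Toy prices (made up): each change is double the previous one, so phi = 2.
prices = [1.0, 2.0, 4.0, 8.0, 16.0]
print(forecast_next(prices))  # 32.0
```

Real price series are much noisier than this, so the fitted coefficient would rarely be this clean, and choosing p, d, and q is part of the testing still to come.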
I feel that this project is progressing on schedule, though I anticipate that I may find issues during testing. That said, I did run into some obstacles. Though I originally planned to use 3 years of data, only Dogecoin meets this requirement. After I discussed this with my site advisor, we decided that 1.5 years of data might be sufficient. To confirm this, I examined several cryptocurrency sentiment analysis papers, including Huang et al. (2021), Khan and Ihsan (2024), Raheman et al. (2022), Abraham et al. (2018), and Colianni et al. (2015). Refreshing my Python knowledge and debugging code were a major time drain (turns out pip is run from Command Prompt, not the Python shell). Lastly, cryptocurrency pricing data older than 1 year is behind a paywall on CoinGecko, and I was unable to find a free alternative (Yahoo Finance also requires a premium subscription and CoinMarketCap doesn’t allow downloads with granularity beyond a week).
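For the 1-year window that CoinGecko serves for free, the import step might look something like the sketch below. It assumes the public `market_chart` endpoint and its `prices` list of `[timestamp_ms, price]` pairs. The coin ids `dogecoin`, `shiba-inu`, and `pepe` are my assumption and should be checked against CoinGecko's coin list. The function names are hypothetical, and this is not necessarily how the project's own script is written.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Public CoinGecko endpoint (assumed here; rate limits and API keys may apply).
BASE_URL = "https://api.coingecko.com/api/v3/coins/{coin_id}/market_chart"


def build_url(coin_id, days=365, vs_currency="usd"):
    """Build the request URL for one coin's price history."""
    query = urllib.parse.urlencode({"vs_currency": vs_currency, "days": days})
    return BASE_URL.format(coin_id=coin_id) + "?" + query


def parse_prices(payload):
    """Convert [timestamp_ms, price] pairs into (ISO date, price) rows."""
    rows = []
    for ts_ms, price in payload["prices"]:
        day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).date()
        rows.append((day.isoformat(), float(price)))
    return rows


def fetch_prices(coin_id, days=365):
    """Download and parse the price history for one coin (needs network access)."""
    with urllib.request.urlopen(build_url(coin_id, days), timeout=30) as resp:
        return parse_prices(json.load(resp))
```

Calling `fetch_prices("dogecoin")` would then return a list of date and price rows that can be written to a CSV file alongside the manually entered older data.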
Some notes on the papers I read:
- Tweet volume may be a strong benchmark to compare sentiment analysis to
- Choice of model plays a significant role in results
- Comparing overall sentiment analysis with an analysis of only the posts from top “expert” accounts may be interesting
- Sentiment analysis may be less effective when prices are falling, because bot posts attempt to keep the price up
