Week 10: Classifier Complete!

Johnny Y -

This week, I finished coding my logistic regression (LR) classifier. Again, the goal was to take some pre-computed text embeddings and predict one of three labels: “down”, “no change”, or “up”. The model is a single logistic regression layer.

The data came in CSV format, with each row being a vector (basically a long list of numbers) and a label. My first job was to clean up the data so every row was the same size and could be used for training; rows of different lengths can’t be turned into a proper PyTorch tensor. Since the data for the first 30 days lacked an ARIMA prediction, I manually filled in a 0 for that term.
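A minimal sketch of that padding step, using only the standard library. The column layout here (embedding values, then an optional ARIMA prediction, then the label) and the sample values are assumptions for illustration; the actual file may be arranged differently.

```python
import csv
import io

# Hypothetical CSV text: the first row is missing its ARIMA prediction.
raw = """0.1,0.2,0.3,up
0.4,0.5,0.6,0.9,down
0.7,0.8,0.9,0.2,no change
"""

def load_rows(text, width):
    """Parse rows and pad every feature vector with zeros up to `width`."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        *values, label = row                 # last column is the label
        vec = [float(v) for v in values]
        vec += [0.0] * (width - len(vec))    # fill the missing ARIMA term with 0
        features.append(vec)
        labels.append(label)
    return features, labels

features, labels = load_rows(raw, width=4)
print(features[0])  # the padded first row
```

Once every row has the same length, a call like `torch.tensor(features)` works, because the nested list is rectangular.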

The model itself just followed the logistic regression logic I’ve described in previous blog posts. The only major hiccup was the progress bar not showing up properly, but that was easily fixed by importing tqdm from the plain tqdm package instead of from tqdm.notebook.
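For readers who want something concrete, here is a rough sketch of a three-class logistic regression in PyTorch with a plain `tqdm` progress bar. The data is random stand-in data, and the sizes, learning rate, and epoch count are all assumptions, not the values used in this project.

```python
import torch
from torch import nn
from tqdm import tqdm  # plain tqdm, not tqdm.notebook

torch.manual_seed(0)

# Hypothetical stand-in data: 60 embeddings of size 8, labels 0, 1, or 2.
X = torch.randn(60, 8)
y = torch.randint(0, 3, (60,))

# One linear layer; CrossEntropyLoss applies the softmax internally,
# which together make up multinomial logistic regression.
model = nn.Linear(8, 3)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in tqdm(range(20), desc="epochs"):
    optimizer.zero_grad()
    logits = model(X)          # shape: (60, 3)
    loss = loss_fn(logits, y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

The predicted label for each row is the index of the largest logit, for example `model(X).argmax(dim=1)`.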

At first, the model just predicted the same label over and over, which wasn’t helpful. So, I added a tiny feature: after every training round (or “epoch”), I printed the first ten predictions alongside their true answers. I then tweaked the parameters until the predictions showed sufficient variety.
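That kind of check can be sketched as a small helper. The function name `preview`, the label order (0 = down, 1 = no change, 2 = up), and the sample predictions below are hypothetical.

```python
LABELS = ["down", "no change", "up"]  # assumed index-to-label mapping

def preview(predicted, actual, n=10):
    """Pair the first n predictions with their true labels and
    count how many distinct labels the model predicted."""
    lines = [
        f"{LABELS[p]:>9} | {LABELS[a]}"
        for p, a in zip(predicted[:n], actual[:n])
    ]
    distinct = len(set(predicted[:n]))
    return lines, distinct

# A collapsed model: every prediction is "up".
lines, distinct = preview(
    [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0],
    [0, 1, 2, 2, 1, 0, 0, 1, 2, 1, 0],
)
print("\n".join(lines))
print("distinct predicted labels:", distinct)
```

A distinct count of 1 across the preview is a quick sign that the model has collapsed onto a single label.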

Current results aren’t super great, however, so I’ll be meeting with my advisor next week to see how we can improve them. The model used is pretty simple, so we may turn to more sophisticated models (adding another layer, etc.). Manually entering more training data is also a possibility.


Comments:


    makeen_s
    Hi Johnny! It's awesome you finished your LR classifier (please re-explain I beg)! What are you thinking for your final product?
    Tesla Lukow
    Hi Johnny! I agree with Makeen - another intuitive explanation would be very helpful (economics is definitely not my strong suit). Would it be beneficial to add another part to your AI that could read blog posts from influential people such as Trump?
    johnny_y
    Thank you for your question, Makeen! My final product will probably be a GitHub repository of code, with comments explaining what each part does.
    johnny_y
    Thank you for your question, Tesla! That could be interesting, though for this project I'm focusing more on the reactions of retail investors on Reddit to information like that, as measured through their posts and comments.
