Week 4 Updates

Aarshdeep Singh N

Hello everyone, and welcome back to Week 4 of my Senior Research Project on Reinforcement Learning!

This week was an exciting step forward: I finally ran my own reinforcement learning simulation using OpenAI’s Gym toolkit and trained an agent to balance a pole in the classic CartPole environment. I went through several YouTube tutorials to understand how reinforcement learning (RL) agents interact with environments and learn over time. Watching my agent slowly improve with each training iteration was both frustrating and fascinating. It was a great hands-on experience that helped me visualize how an RL agent adjusts its actions based on the rewards it receives.
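To make the agent-environment loop concrete, here is a minimal sketch in plain Python. It is not the Gym code I ran: the `step` function is a hand-written stand-in for the CartPole dynamics, using the constants commonly quoted for this task, and the "training" is a simple random search over the four weights of a linear policy, swapped in only because it fits in a few lines. The names `step`, `run_episode`, and `random_search` are my own for illustration.

```python
import math
import random

# Constants commonly used for the classic CartPole task (assumed here).
GRAVITY, CART_MASS, POLE_MASS = 9.8, 1.0, 0.1
POLE_HALF_LENGTH, FORCE, DT = 0.5, 10.0, 0.02
X_LIMIT, THETA_LIMIT = 2.4, 12 * math.pi / 180  # track edge, 12 degrees


def step(state, action):
    """Advance one time step; action 0 pushes the cart left, 1 pushes right."""
    x, x_dot, theta, theta_dot = state
    force = FORCE if action == 1 else -FORCE
    total_mass = CART_MASS + POLE_MASS
    pole_ml = POLE_MASS * POLE_HALF_LENGTH
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + pole_ml * theta_dot**2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t**2 / total_mass)
    )
    x_acc = temp - pole_ml * theta_acc * cos_t / total_mass
    # Explicit Euler integration of positions and velocities.
    new_state = (x + DT * x_dot, x_dot + DT * x_acc,
                 theta + DT * theta_dot, theta_dot + DT * theta_acc)
    # The episode ends when the cart leaves the track or the pole tips too far.
    done = abs(new_state[0]) > X_LIMIT or abs(new_state[2]) > THETA_LIMIT
    return new_state, 1.0, done  # reward of 1 for every step survived


def run_episode(weights, rng, max_steps=200):
    """Run one episode with a linear policy and return the total reward."""
    state = tuple(rng.uniform(-0.05, 0.05) for _ in range(4))
    total = 0.0
    for _ in range(max_steps):
        # Push right when the weighted sum of the state is positive.
        action = 1 if sum(w * s for w, s in zip(weights, state)) > 0 else 0
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return total


def random_search(n_candidates=50, seed=0):
    """Try random weight vectors and keep the one with the best episode."""
    rng = random.Random(seed)
    best_w, best_ret, history = None, -1.0, []
    for _ in range(n_candidates):
        w = [rng.uniform(-1, 1) for _ in range(4)]
        ret = run_episode(w, rng)
        if ret > best_ret:
            best_w, best_ret = w, ret
        history.append(best_ret)  # best reward seen so far
    return best_w, best_ret, history


if __name__ == "__main__":
    weights, best, history = random_search()
    print("best total reward:", best)
```

The `history` list is the part that mirrors what I saw in training: the best reward found so far never goes down, and it tends to climb in jumps as better policies are discovered.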

Alongside this, I worked on my first set of theoretical questions involving the Bellman equation. While I already understood the equation conceptually, solving these problems was challenging. They required applying the equation in different scenarios and carefully working through state transitions and expected rewards. It pushed me to think more deeply about how value functions are computed and reinforced my understanding of dynamic programming in RL.
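For readers who have not seen it, the Bellman optimality equation says that the value of a state is the best expected "reward now plus discounted value of the next state": V(s) = max over actions a of the sum over next states s' of P(s' | s, a) * [R(s, a, s') + gamma * V(s')]. The sketch below applies that update repeatedly (value iteration) to a tiny made-up problem. The three states, their rewards, and the `value_iteration` helper are hypothetical and chosen only so the answer can be checked by hand; they are not taken from my problem set.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-10):
    """Repeat the Bellman optimality update until the values stop changing.

    transitions[state][action] is a list of (probability, next_state, reward).
    A state with no actions is treated as terminal and keeps value 0.
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue  # terminal state
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Hypothetical toy problem: from A you can quit for 5 or move on to B;
# from B, "go" reaches the end (reward 10) 80% of the time, else stays in B.
toy_mdp = {
    "A": {"quit": [(1.0, "end", 5.0)], "go": [(1.0, "B", 0.0)]},
    "B": {"go": [(0.8, "end", 10.0), (0.2, "B", 0.0)]},
    "end": {},
}

if __name__ == "__main__":
    print(value_iteration(toy_mdp))
```

Working it by hand gives the same result: V(B) = 0.8 * 10 + 0.2 * 0.9 * V(B), so V(B) = 8 / 0.82, roughly 9.76, and V(A) = max(5, 0.9 * V(B)), roughly 8.78, which means moving on to B beats quitting.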

Looking ahead, I’ll also be diving deeper into explainable reinforcement learning, one of the most important aspects of my project. To get started, I’ll be looking for open courseware related to the research paper I studied last week on Interpretable and Explainable Logical Policies. The goal is to understand how we can make reinforcement learning more transparent and less of a black box, which is crucial for real-world applications where AI needs to be both powerful and trustworthy.

Thank you so much for reading through my Week 4 update. If you have any questions, please don’t hesitate to ask.

Until next time, I will see you later!

Comments:

    camille_bennett
    Hi Aarsh, sounds like you are doing a lot of interesting work with OpenAI. Can you explain what the CartPole problem is?
    aarshdeep_singh_n
    Hi, Mrs. Bennett. Thank you for your comment! The CartPole problem is a complicated name for something very simple. It's the same idea as trying to balance a stick on your finger: as the stick starts to fall, you move your finger underneath it so it doesn't tip over. In CartPole, the setup is two-dimensional, the pole sits on a cart, and the cart can only move left or right to keep the pole upright. Hope this helps!
