Week 2

Aarshdeep Singh N -

Hello everyone, and welcome to my Week 2 update for my senior research project. After spending my first week learning the fundamentals of reinforcement learning, this week I shifted my focus toward finding open courseware and resources to start training my own RL model. The transition from theory to implementation has been exciting: I explored various platforms and frameworks that provide environments for reinforcement learning experiments. Understanding how to structure an RL model, define states and actions, and set up reward mechanisms has been a key part of my learning process.
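To make the structure concrete, here is a minimal sketch of the pieces mentioned above: an environment with states, actions, and a reward mechanism, plus an agent interacting with it. The "corridor" environment and the random agent are entirely made up for illustration, not part of my actual project code:

```python
import random

# A tiny made-up environment: the agent stands in a 1-D corridor.
# State  = the agent's position (0 .. length-1)
# Action = -1 (step left) or +1 (step right)
# Reward = +1 for reaching the goal at the far end, 0 otherwise
class CorridorEnv:
    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Clamp the position so the agent cannot walk off the corridor.
        self.state = max(0, min(self.length - 1, self.state + action))
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done

# A purely random agent playing one episode.
random.seed(0)  # fixed seed so the run is reproducible
env = CorridorEnv()
state, total_reward, done = env.reset(), 0.0, False
while not done:
    action = random.choice([-1, 1])
    state, reward, done = env.step(action)
    total_reward += reward
print(total_reward)
```

Even this toy loop has the same reset/step/reward shape that real RL environment libraries expose, which is why defining states, actions, and rewards cleanly matters so much.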

Alongside this, I also delved into the mathematical foundations necessary for working with reinforcement learning algorithms, particularly Bellman’s Equation. This equation is fundamental in RL because it expresses the relationship between the value of a state and the expected rewards from future actions. By breaking down complex decision-making into recursive value functions, Bellman’s Equation helps optimize an agent’s long-term rewards. Understanding the mathematics behind it, including Markov Decision Processes (MDPs), dynamic programming, and value iteration, is crucial for building more efficient RL models.
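In the tabular case, Bellman's optimality equation, V(s) = max_a [ r(s, a) + γ·V(s') ], can be applied repeatedly as an update rule, which is exactly the value iteration mentioned above. Here is a minimal sketch on a tiny two-state MDP that I made up purely for illustration (the states, actions, and rewards are not from any real benchmark):

```python
# Value iteration on a made-up MDP with three states.
# States 0 and 1 are non-terminal; state 2 is terminal and pays +1 on entry.
GAMMA = 0.9  # discount factor

# transitions[state][action] = (next_state, reward); deterministic for simplicity
transitions = {
    0: {"left": (0, 0.0), "right": (1, 0.0)},
    1: {"left": (0, 0.0), "right": (2, 1.0)},
}

def value_iteration(theta=1e-6):
    V = {0: 0.0, 1: 0.0, 2: 0.0}  # terminal state keeps value 0
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            # Bellman optimality backup: V(s) = max_a [ r + gamma * V(s') ]
            best = max(r + GAMMA * V[s2] for s2, r in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:  # stop once values no longer change meaningfully
            return V

V = value_iteration()
print(V[1])  # 1.0: one step from the terminal reward
print(V[0])  # 0.9: the same reward, delayed one step and discounted by gamma
```

The recursion is visible in the numbers: state 0's value is state 1's value multiplied by the discount factor, which is precisely the "value of a state in terms of expected future rewards" relationship the equation expresses.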

This week has been all about bridging the gap between theoretical knowledge and practical implementation. As I continue my research, I am excited to begin experimenting with different algorithms and testing reinforcement learning models in various environments. Stay tuned for more updates as I make progress in developing and refining my own RL system!

Comments:

    adam_b
    Hi Aarsh! Fascinating Project! What environment or algorithm are you most excited to test first?
    aarshdeep_singh_n
    Hi Adam! Thank you so much for your kind comment. The first environment I will be testing is a simple CartPole model, in which the model tries to balance a pole on a moving cart. It's very similar to trying to balance the tip of a pencil on your finger. Although it's not the flashiest first environment, it's a very simple one to understand, and it lays the groundwork for understanding future programs.
