A Look Behind the Code (2/26/2024)

Sophia L -

Hi everyone!! Welcome back to my blog. This past week, I have continued my data analysis and mining. One difficulty I have encountered is making decisions based on what will be best for my AI model and the results I want. These decisions involve what code to use, what neural network to use, what database to use, what coding software to use, and more. Even at the start, I wanted to focus my work on Arizona-specific high school dropouts, because that is where I do the majority of my education policy work. However, the existing data surrounding Arizona is not well-suited for my project. Because of the way the Arizona Board of Education stored dropout data, the recent data was well organized by specific variables (something I needed for my project), but the older data was not. Since I needed a wide range of years to be able to make predictions, I could not continue with Arizona-specific data alone.

Regarding my code and memory networks, I also wanted to explain the thought process behind why I chose them. For my project, I decided to use Long Short-Term Memory (LSTM) networks to forecast high school dropout rates, but why? Educational data is complex and sequential, with underlying patterns that unfold across time, which makes it hard for simpler models to capture. My approach leverages LSTM's capacity for capturing long-term dependencies, making it a fitting choice for this task.

For some background, Long Short-Term Memory (LSTM) networks are a type of Recurrent Neural Network (RNN) specialized in handling sequences of data for tasks such as time series forecasting, natural language processing, and sequence prediction problems. LSTMs were introduced by Hochreiter & Schmidhuber in 1997 to overcome the limitations of traditional RNNs, primarily their difficulty in learning long-term dependencies due to the vanishing gradient problem.

While traditional RNNs theoretically can learn to retain information over long periods, in practice, they struggle due to the vanishing gradient problem, where gradients become too small for the network to learn effectively. LSTMs mitigate this issue through their gating mechanism, allowing them to make more robust predictions based on long-term historical data, like my high school dropout data!
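To make the gating idea a little more concrete, here is a simplified sketch of a single LSTM step in plain NumPy. This is just my own illustration of the standard LSTM cell equations, not the code from my actual project, and the weight names are placeholders:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W, U, b hold the stacked weights for the forget, input, and output
    gates plus the candidate cell state (so their first dimension is
    4 * hidden_size).
    """
    z = W @ x_t + U @ h_prev + b                 # combined pre-activations
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # gates squashed to (0, 1)
    g = np.tanh(g)                                # candidate new memory
    c_t = f * c_prev + i * g                      # forget some old memory, write some new
    h_t = o * np.tanh(c_t)                        # expose part of the memory as output
    return h_t, c_t
```

The forget gate `f` is what lets the cell carry information across many time steps instead of overwriting it every step, which is exactly the behavior that helps with long-term dependencies.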

LSTMs are particularly well-suited for time-series forecasting due to their ability to capture long-term dependencies in sequential data. This characteristic is crucial for educational data analysis, where trends and patterns may unfold over extended periods. The structured approach of LSTMs, with their memory cells and gates, allows for effective learning from historical data to predict future outcomes, such as dropout rates.
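As an illustration of what that looks like in practice, here is a minimal sketch of how an LSTM forecaster could be wired up in Keras. The series below uses synthetic placeholder values and a made-up five-year window, not my real dropout data; it is only meant to show the overall shape of the setup:

```python
import numpy as np
from tensorflow import keras

# Placeholder series of yearly dropout rates (synthetic values, not real data).
rates = np.linspace(0.12, 0.07, 30).astype("float32")
window = 5  # the model sees the last 5 years and predicts the next one

# Slice the series into (samples, timesteps, features) windows.
X = np.array([rates[i:i + window] for i in range(len(rates) - window)])
y = rates[window:]
X = X[..., np.newaxis]

model = keras.Sequential([
    keras.layers.LSTM(32, input_shape=(window, 1)),  # memory cells track long-term trends
    keras.layers.Dense(1),                           # next year's dropout rate
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=100, verbose=0)

# Forecast one step ahead from the most recent window of years.
next_year = model.predict(rates[-window:].reshape(1, window, 1))
```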

I hope this explanation helped you better understand all the nuances of my specific decisions and how something as small as networks or code can impact the trajectory of research.

Can’t wait to share more with you throughout the weeks!

Best,

Sophia Lin


Comments:

    ethan_f
    Hey Sophia! This was super interesting! How might the choice of using LSTM networks for analyzing high school dropout rates impact the interpretation and application of the research findings, especially when considering the long-term trends and potential policy implications?
    Anonymous
    Sophia, great post explaining the nuances of your specific decisions. Great question, Ethan!
    jana_e
    Wow Sophia, this sounds complicated! The choice to use LSTM networks seems well thought out! Just curious, are there any other RNN types you might have considered using? How would using a different network have impacted your results?
