Week 4: An Introduction to Polynomial Regression
Welcome back! This week, I have been studying polynomial regression and how I can apply it to my research.
What is Polynomial Regression?
Last week, I introduced different types of regression models, including linear and logistic regression. However, real-world data is often more complex than a simple straight-line relationship. This is where polynomial regression becomes useful.
While the formula for linear regression is:
y = β₀ + β₁x + ε,
a polynomial regression takes the following form:
y = β₀ + β₁x + β₂x² + … + βₙxⁿ + ε
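Even though the curve is no longer a straight line, the model is still linear in the coefficients β, so ordinary least squares can estimate them. Here is a minimal sketch of that idea using NumPy. The helper name `fit_polynomial` and the noise-free synthetic data are my own assumptions for illustration, not part of my research data.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Estimate beta_0 ... beta_degree by ordinary least squares."""
    # Design matrix with columns x^0, x^1, ..., x^degree
    X = np.vander(x, N=degree + 1, increasing=True)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical, noise-free data generated from y = 1 + 2x + 3x^2
x = np.linspace(-2, 2, 21)
y = 1 + 2 * x + 3 * x**2

beta = fit_polynomial(x, y, degree=2)
print(beta)  # coefficients in the order beta_0, beta_1, beta_2
```

With real data the recovered coefficients will not match any "true" values exactly, because the error term ε is never zero.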
The Challenge: Finding the Right Degree (n)
One of the most challenging aspects of polynomial regression is determining the optimal degree (n).
Avoiding Underfitting
If we choose too low a polynomial degree when the actual trend is more complex, our model will underfit the data. This means it oversimplifies the patterns and, in my project, would fail to capture important nuances in job displacement trends.
Avoiding Overfitting
On the other hand, using too high a polynomial degree can lead to overfitting, where the model fits noise instead of meaningful patterns. An overfit model is highly sensitive to small fluctuations in the data it was trained on, so its predictions are unreliable and generalize poorly to new data.
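To make the difference concrete, here is a rough sketch that fits degrees 1, 3, and 9 to the same data and compares the error on the points used for fitting with the error on held-out points. Everything here is an assumption for illustration: the data are synthetic (a cubic trend plus noise), and the 28/12 split and the three degrees are arbitrary choices.

```python
import numpy as np

# Synthetic data (an assumption for illustration): cubic trend plus noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 40)
y = 4 * x**3 - 3 * x + rng.normal(0, 0.2, size=x.size)

# Hold out 12 of the 40 points to estimate error on unseen data
idx = rng.permutation(x.size)
train, test = idx[:28], idx[28:]

def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the points (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

train_mse, test_mse = {}, {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x[train], y[train], degree)
    train_mse[degree] = mse(coeffs, x[train], y[train])
    test_mse[degree] = mse(coeffs, x[test], y[test])
    print(f"degree {degree}: train MSE {train_mse[degree]:.3f}, "
          f"test MSE {test_mse[degree]:.3f}")
```

The training error can only go down as the degree goes up, so it cannot reveal overfitting by itself. The held-out error is the number to watch: a degree that is too low does badly on both sets, while a degree that is too high may look excellent on the training points and worse on the held-out ones.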
How can we determine the optimal degree?
There are multiple ways to do this:
1. Visual Inspection:
One simple and intuitive way to determine the degree is by visually inspecting the data. You can plot the fitted curve for several degrees on top of a scatter plot of the data and pick the lowest degree that captures the general shape without chasing individual points.
2. AIC/BIC:
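AIC (Akaike information criterion) and BIC (Bayesian information criterion) score a model on how well it fits the data while penalizing it for each additional parameter. Lower values are better, so you can fit several degrees and choose the one with the lowest score. This penalty helps prevent overfitting and favors simpler models that still fit the data well. BIC's penalty grows with the sample size, so it tends to select simpler models than AIC.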
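As a sketch of how this comparison might look in practice, the code below computes AIC and BIC for degrees 1 through 6. It assumes normally distributed errors, in which case the criteria reduce (up to a constant that is the same for every model) to N·ln(RSS/N) + 2k and N·ln(RSS/N) + k·ln(N), where RSS is the residual sum of squares, N is the number of observations (called `n_obs` in the code to avoid confusion with the degree n), and k is the number of fitted coefficients. The data are synthetic and the range of degrees is an arbitrary choice.

```python
import numpy as np

# Synthetic data (an assumption for illustration): cubic trend plus noise
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 40)
y = 4 * x**3 - 3 * x + rng.normal(0, 0.2, size=x.size)
n_obs = x.size

def information_criteria(degree):
    """Return (AIC, BIC) for a least-squares polynomial fit, up to a constant."""
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
    k = degree + 1  # fitted coefficients, including the intercept
    fit_term = n_obs * np.log(rss / n_obs)
    aic = fit_term + 2 * k
    bic = fit_term + k * np.log(n_obs)
    return float(aic), float(bic)

scores = {d: information_criteria(d) for d in range(1, 7)}
best_aic = min(scores, key=lambda d: scores[d][0])
best_bic = min(scores, key=lambda d: scores[d][1])

for d, (aic, bic) in scores.items():
    print(f"degree {d}: AIC {aic:.1f}, BIC {bic:.1f}")
print("lowest AIC at degree", best_aic, "| lowest BIC at degree", best_bic)
```

Only the differences between models matter here, not the absolute values. Because BIC charges ln(N) per parameter instead of 2, it never picks a higher degree than AIC on the same set of candidates once N is larger than about 7.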
3. Adjusted R-squared:
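Adjusted R-squared modifies R-squared to account for the number of predictors. Plain R-squared never decreases when you add a term, but adjusted R-squared falls when a new term does not improve the fit enough to justify the extra complexity. Choosing the degree with the highest adjusted R-squared therefore helps balance fit and simplicity, guiding you toward a degree that captures the underlying trend without overfitting the data.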
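Here is a small sketch of that calculation, using the standard formula adjusted R² = 1 − (1 − R²)(N − 1)/(N − p − 1), where N is the number of observations and p is the number of predictors (for a polynomial of degree n, p = n, since the intercept is not counted). As before, the data are synthetic and the range of degrees is an assumption for illustration.

```python
import numpy as np

# Synthetic data (an assumption for illustration): cubic trend plus noise
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 40)
y = 4 * x**3 - 3 * x + rng.normal(0, 0.2, size=x.size)
n_obs = x.size

def r2_and_adjusted(degree):
    """Return (R-squared, adjusted R-squared) for a polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)  # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)               # total sum of squares
    r2 = 1 - rss / tss
    p = degree  # predictors x, x^2, ..., x^degree (intercept not counted)
    adjusted = 1 - (1 - r2) * (n_obs - 1) / (n_obs - p - 1)
    return float(r2), float(adjusted)

results = {d: r2_and_adjusted(d) for d in range(1, 7)}
best_degree = max(results, key=lambda d: results[d][1])

for d, (r2, adj) in results.items():
    print(f"degree {d}: R2 {r2:.4f}, adjusted R2 {adj:.4f}")
print("highest adjusted R2 at degree", best_degree)
```

Plain R-squared keeps creeping up with every added term, which is why it cannot be used to choose the degree. The adjusted version only rises when the improvement outweighs the penalty for the extra term.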
What’s next?
In the coming weeks, I will be running polynomial regressions with demographic variables as the independent variables and AI Exposure as the dependent variable. If you have any thoughts or questions about polynomial regression or my project, feel free to drop a comment below!
