Week 6: Analyzing AI Exposure Through Polynomial Regression

Akshita K - March 23, 2025 4:21 pm

Welcome back to another week of my senior project! After exploring linear regression in the previous post, this week I’ve been analyzing the relationship between various demographic factors and AI exposure using polynomial regression. The goal is to see whether a more flexible model reveals non-linear patterns that linear regression didn’t capture.

Note: I applied polynomial regression only to quantitative variables (such as income and age), not to categorical variables (like sex, race, and level of education).

In this post, I’ll walk you through how I approached polynomial regression, how I selected the right degree for the model, and then I’ll dive into the results.

Polynomial Regression: How I Chose the Right Degree

As I explained in my Week 4 post, the most difficult part of polynomial regression is selecting the optimal degree:

1. Testing Different Degrees: I tried polynomial regressions of varying degrees (1, 2, 3, and 4) for each variable to test how well they fit the data. A higher degree allows for more flexibility in the model, but it also increases the risk of overfitting, where the model fits the noise in the data rather than the actual underlying trends.

2. Evaluating Model Performance: To decide on the best degree, I compared the Mean Squared Error (MSE) on the training and test datasets. A good polynomial model has a low MSE on both sets, which indicates that it is both accurate and generalizable. A training MSE that keeps falling while the test MSE stalls or rises is a sign of overfitting. I looked for the point where increasing the degree no longer lowered the MSE significantly.

3. Choosing the Right Degree: After testing different degrees, I found that a degree of 2 (quadratic) provided the best balance between fitting the data well and avoiding overfitting. For most variables, a higher degree (3 or 4) didn’t improve the model much, so I kept things simple and used the quadratic model for all of them.
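
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind my results: the data is synthetic, and the helper names (mse, compare_degrees) and the 25% test split are hypothetical choices made for the example. It uses NumPy's polyfit and polyval to fit each degree on the training set and report the MSE on both sets.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))


def compare_degrees(x, y, degrees=(1, 2, 3, 4), test_fraction=0.25, seed=0):
    """Fit one polynomial per degree on a training split and
    return {degree: (train_mse, test_mse)}."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_test = int(len(x) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]

    results = {}
    for d in degrees:
        # Least-squares polynomial fit using the training rows only
        coeffs = np.polyfit(x[train_idx], y[train_idx], deg=d)
        results[d] = (
            mse(y[train_idx], np.polyval(coeffs, x[train_idx])),
            mse(y[test_idx], np.polyval(coeffs, x[test_idx])),
        )
    return results


# Synthetic stand-in data (NOT the project dataset): a quadratic trend plus noise
data_rng = np.random.default_rng(42)
x = data_rng.uniform(0, 10, size=400)
y = 1.0 + 0.8 * x - 0.05 * x**2 + data_rng.normal(0, 0.5, size=400)

results = compare_degrees(x, y)
for d, (train_mse, test_mse) in results.items():
    print(f"degree {d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

On data like this, the training MSE can only go down as the degree rises, so the test MSE is the number to watch: the degree to keep is the one after which the test MSE stops improving noticeably.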

Regression Results

Now, let’s go over the results of the polynomial regressions for each demographic factor.

You can find the full regression results here.

AI Exposure and Age

The positive linear and quadratic terms indicate an upward-sloping curve that is concave up. This means AI exposure increases with age, and the rate of increase is faster for older workers.
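
For readers who want the reasoning behind reading a curve's shape from the signs of its terms, here is the quadratic model written out. The symbols are my own notation for this sketch (x is the predictor, such as age, and the betas are the fitted coefficients); they are not copied from the regression output.

```latex
\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2, \qquad
\frac{d\hat{y}}{dx} = \beta_1 + 2\beta_2 x, \qquad
\frac{d^2\hat{y}}{dx^2} = 2\beta_2
```

The sign of \beta_2 sets the concavity (positive means concave up, negative means concave down), and the slope at any x is \beta_1 + 2\beta_2 x. When both \beta_1 and \beta_2 are positive, the slope is positive for every x \ge 0 and grows as x grows, which is the pattern described for age.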

AI Exposure and Wage

The positive linear and negative quadratic terms for the regression mean that although the slope of the curve is initially positive, it becomes less positive as wages rise (the curve is concave down). This means that workers in low-wage jobs have very little exposure to AI, and exposure increases significantly as wages move into the middle to upper ranges. However, as wages grow even higher, AI exposure plateaus.

AI Exposure and Duration of U.S. Residency

The negative linear and positive quadratic terms for the regression mean that although the slope of the curve is initially negative, it becomes less negative as years of residency increase (the curve is concave up). In other words, the relationship between residency and AI exposure is strongest in the first years after arrival and flattens as the curve approaches its turning point, so each additional year of residency matters less. One possible explanation is that immigrants’ occupations change most early on, as they move between sectors and into higher-skilled positions, and then settle into a steadier pattern, which produces the leveling effect.
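
To make the three sign patterns above concrete, here is a small Python sketch that computes the slope of a quadratic at a few points and the x-value where the slope reaches zero. The coefficients are hypothetical, chosen only to match the signs reported for each variable; they are not the fitted values from my regressions, and the function names are just for this example.

```python
def slope_at(x, b1, b2):
    """Slope of y = b0 + b1*x + b2*x**2 at x, i.e. the derivative b1 + 2*b2*x."""
    return b1 + 2 * b2 * x


def turning_point(b1, b2):
    """x-value where the slope is zero (the vertex of the parabola)."""
    if b2 == 0:
        raise ValueError("a straight line has no turning point")
    return -b1 / (2 * b2)


# Hypothetical coefficients (b1, b2) that only mimic the reported sign patterns
patterns = {
    "age: b1 > 0, b2 > 0": (0.5, 0.125),
    "wage: b1 > 0, b2 < 0": (0.5, -0.125),
    "residency: b1 < 0, b2 > 0": (-0.5, 0.125),
}

for label, (b1, b2) in patterns.items():
    slopes = [slope_at(x, b1, b2) for x in (0, 1, 2)]
    print(f"{label}: slopes at x=0,1,2 -> {slopes}, "
          f"turning point at x = {turning_point(b1, b2)}")
```

With these made-up numbers, the age-style curve gets steeper as x grows, the wage-style curve flattens out and reaches zero slope at x = 2, and the residency-style curve starts with a negative slope that rises to zero at x = 2. Whether a turning point falls inside the range of the actual data depends on the real coefficients, so it is worth checking before reading too much into a plateau.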

What’s Next?

In the upcoming weeks, I’ll be exploring the relationship between occupation type (white-collar vs. blue-collar) and AI exposure. I’ll also group occupations by industry (e.g., healthcare, IT, manufacturing) to uncover sector-specific patterns.

Thanks again for following along, and feel free to share any questions or thoughts in the comments below!

Comments:


camille_bennett

Hi Akshita, it's fascinating to see how the regression models allow you to analyze the data! Do you have any hypotheses as to why vulnerability to AI plateaus at a certain point with both wages and duration of U.S. residency?

March 25, 2025 at 10:42 am

akshita_k

Thank you for your question, Ms. Bennett! My hypothesis is that the plateau in vulnerability to AI at higher income levels occurs because, while higher-income jobs initially face more exposure to automation, many of these roles require creativity and decision-making that AI cannot easily replicate. Once the more automatable tasks are replaced by AI, the vulnerability levels off because the remaining tasks are harder to automate. As for the duration of U.S. residency, the impact of how long someone has lived here starts to level off over time. For example, someone who's been in the U.S. for one year will likely have very different job conditions compared to someone who's been here for five years, but the difference between 10 and 15 years is much less noticeable. Eventually, the occupation types and job conditions of immigrants become similar to those of natives, and exposure to AI plateaus. To sum it up, beyond a certain point, additional years of residency have a smaller impact.

April 2, 2025 at 11:07 am