Mahita V's Senior Project Blog

Project Title: Beyond the Diagnosis: Identifying Racial and Socioeconomic Inequalities in Breast Cancer Care
BASIS Advisor: Dr. Travis May
Internship Location: Purdue University
Onsite Mentor: Dr. Alina Arseniev-Koehler



Project Abstract

According to the CDC, 272,454 new cases of breast cancer were reported in females in 2021, and 42,211 females died from breast cancer in 2022. Breast cancer outcomes are influenced not only by biological factors but also by access to timely and effective healthcare, yet disparities in access to care and in outcomes persist. My research aims to uncover the disparities in treatment and preventive care that breast cancer patients from marginalized racial groups face. To achieve this, I will use a unique database, All of Us, which prioritizes inclusivity and diversity and offers insights into healthcare access that have not been explored in previous studies. Understanding these influences with a representative database is crucial to identifying the structural inequalities that affect patient outcomes, which in turn enables targeted interventions to make access to care more equitable. A 2022 study identified six important themes for breast cancer patients: information, psychosocial support, health insurance, financial resources, timeliness, and emotions. I plan to conduct a retrospective observational study and focus my analysis specifically on insurance and psychosocial support. I will perform bivariate analyses and fit a multivariable logistic regression with socioeconomic status, insurance status, geographic location, and age as predictors to identify which groups have increased odds of facing issues with healthcare access. My methodology could also serve as a framework for public health professionals to evaluate access to care for a diverse range of conditions.
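For readers curious what "increased odds" looks like in practice, here is a minimal sketch of a bivariate analysis: an unadjusted odds ratio from a 2x2 table with a Wald 95% confidence interval. The counts are made up for illustration and are not All of Us data, and the function name `odds_ratio` is hypothetical rather than anything from my actual analysis. A logistic regression reports the same kind of quantity, except that each odds ratio is adjusted for the other predictors in the model.

```python
import math


def odds_ratio(group_a_yes, group_a_no, group_b_yes, group_b_no, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    "yes" means the patient reported an access issue. Group B is the
    reference group. The interval is a Wald interval on the log-odds scale.
    """
    estimate = (group_a_yes * group_b_no) / (group_a_no * group_b_yes)
    # Standard error of the log odds ratio
    se = math.sqrt(
        1 / group_a_yes + 1 / group_a_no + 1 / group_b_yes + 1 / group_b_no
    )
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical counts, NOT real data:
# group A: 30 reported an issue, 70 did not
# group B: 20 reported an issue, 80 did not
estimate, lower, upper = odds_ratio(30, 70, 20, 80)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An odds ratio above 1 means group A has higher odds of reporting an issue than group B, but if the confidence interval includes 1, the difference is not statistically significant at the 5% level.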

    My Posts:

  • At the end

    Hello, and welcome back to my last senior project blog post! Thank you to everyone who has read these posts, and I hope you’ve learned something along with me during this project. For my final week, I first made a couple of adjustments to my logistic regression model by changing the variables for the poverty... Read More

  • Data Tables and Design

    Welcome back to Week 9 of my senior blog post! This week, I focused on preparing for my final presentation, which meant diving deep into both my data and my design skills. I spent the majority of my time working on creating clear, detailed slides that could effectively communicate my findings. One of the more... Read More

  • The Finishing Touches

    Welcome back to Week 8 of my senior blog posts. This week, I finished up my analysis for both of my survey themes. I also went back and conducted chi-square tests to clarify the results of my logistic regression. I used a Chi-Square Test of Independence to examine whether a patient's race was associated with... Read More

  • Ctrl C. Ctrl V.

    Hello and welcome back to my senior blog post! My post is a little shorter this week since I mostly just repeated the same methodology for my psychosocial category, and I didn’t want to bore you with repetition. This week, I focused on cleaning my data for psychosocial support. To quickly clarify, psychosocial support refers... Read More

  • Save yourself. And your work.

    Welcome back to my blog posts. This week, I aimed to modify my logistic regression (LR) to add more covariates. Initially, I was going to add education, income, and comorbidities. However, I chose to only include income for a couple of different reasons. First, I excluded comorbidities because the outcome of my project isn’t whether... Read More

  • Paying Attention to the Details: Unpacking the Numbers Behind Insurance Struggles

    Hello, and welcome back to Week 5 of my Senior Research Project! Last week, I made good progress with cleaning my data and creating my logistic regression for the insurance-related survey questions. But before I get into interpreting the results and the odds ratios, I wanted to highlight the importance of reading carefully. The first... Read More

  • Dubious Data and Statistical Analysis

    Last week, I attempted to use MICE imputation to resolve the missing values in my insurance type column. However, after some research and advice from my mentor, I decided to switch to mode imputation for the categorical columns and mean imputation for the continuous ones (excluding the race column, since the NA values there had... Read More

  • Chasing MICE: My Struggle to Fill in the Blanks

    Here we are at Week 3 of my SRP. Last week, I spent quite a bit of time on my ethics training and started to clean my data for the insurance theme. I ran into a problem with the insurance type column, as around 40% of the data was missing. I could’ve just used mode... Read More

  • The many meanings of NA: Not Available, Not Applicable, or Not Answered?

    Thanks for checking back in for Week 2! As a quick refresher, my goals last week were to complete my All of Us training and categorize the survey questions into themes: financial resources, psychosocial support, insurance, emotions, timeliness, and information. To my surprise, I completed both tasks and even began cleaning my data for the... Read More

  • (How) To be or Not to be Ethical when conducting research

    Hello and welcome to my first blog post! Since my project uses human data, I needed to complete some training courses prior to working with the All of Us database. The program strives to ensure that researchers don’t utilize too many direct identifiers in order to maintain the confidentiality of the participants. So, this week... Read More