Data Analysis

Priya V -

Hello everyone! 

This week has definitely been a frustrating one in terms of my senior project, as I am certainly no expert when it comes to statistics. I’ve mainly been focusing on data analysis for my survey data, which has involved a lot of numbers! I’m learning a lot about different types of correlations and what they’re used for, and I’ve had the toughest time deciding between Pearson’s and Spearman’s correlations for my data. 

My data doesn’t quite meet the requirements for Pearson’s (Turney): I ran a normality test, and it came back non-normal (DataTab Team). However, Pearson’s is more standard in the field and describes a linear relationship, which is generally more specific and informative, and sources disagree on how much non-normality affects the accuracy of Pearson’s correlation (Kowalski 1972). Meanwhile, Spearman’s correlation can be used with non-normal data, but it shows a monotonic relationship between the two variables: one where the variables consistently move in the same direction or in opposite directions, not necessarily along a straight line (Laerd Statistics).
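For readers who want to see what a normality check looks like in practice, here is a minimal sketch using only the Python standard library. It uses the Jarque-Bera test, which is an assumption chosen because it is easy to compute by hand; it is not necessarily the test that was run for this project. The function name `jarque_bera` and both datasets are made up for illustration, and the p-value relies on a large-sample approximation that can be unreliable for small samples.

```python
import math
import random


def jarque_bera(values):
    """Return (JB statistic, approximate p-value) for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (dividing by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # If the data were normal, JB would be approximately chi-square with
    # 2 degrees of freedom, whose upper-tail probability is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Simulated data for illustration only (not survey results).
random.seed(0)
bell_shaped = [random.gauss(50, 10) for _ in range(500)]
skewed = [random.expovariate(1.0) for _ in range(500)]

jb_bell, p_bell = jarque_bera(bell_shaped)
jb_skewed, p_skewed = jarque_bera(skewed)
print(f"bell-shaped: JB={jb_bell:.2f}, p={p_bell:.4f}")
print(f"skewed:      JB={jb_skewed:.2f}, p={p_skewed:.4f}")
```

A very small p-value, as with the skewed sample here, is what "came back non-normal" means: a deviation that large would be very unlikely if the data really came from a normal distribution.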

I’ve ultimately decided to run and report both tests, and as I learn more, I may choose one or the other, or add further nuance to the way that I discuss each correlation. The role of inferential statistics is to demonstrate to the reader and the broader scientific community what conclusions and future questions can be drawn from your data. The correlations that I’m running are meant to find relationships among the variables that I’m studying and to represent them as accurately as possible.
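To make the difference between the two correlations concrete, here is a small sketch that computes both on the same data using only the Python standard library. The helper names (`pearson`, `average_ranks`, `spearman`) and the numbers are hypothetical and chosen for illustration; they are not taken from the survey. Spearman's correlation is computed here as Pearson's correlation applied to the ranks of the values, with tied values sharing their average rank.

```python
import math


def pearson(x, y):
    """Pearson's r: strength of the linear relationship between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: y always rises with x, but along a curve, not a line.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_curved = [v ** 3 for v in x]

print(f"Pearson:  {pearson(x, y_curved):.3f}")
print(f"Spearman: {spearman(x, y_curved):.3f}")
```

On this example Spearman's correlation comes out as 1 because the relationship is perfectly monotonic, while Pearson's comes out lower (about 0.93) because the points do not fall on a straight line. Reporting both, as described above, lets a reader see whether the two measures tell the same story.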

Here’s a fun picture of my spreadsheet! It’s coming along nicely!

– Priya <3

 

Links:

(DataTab Team) – https://datatab.net/tutorial/test-of-normality 

(Laerd Statistics) – https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide-2.php 

(Kowalski 1972) – https://www.jstor.org/stable/2346598?read-now=1&seq=5#page_scan_tab_contents 

(Turney) – https://www.scribbr.com/statistics/pearson-correlation-coefficient/


Comments:


    camillebennett
    Very cool data analysis. Can you expand on what it means that the test came back "non-normal"?
    colin_k
    Hi Priya, I noticed that you use colors to break up certain areas of your spreadsheet, do they have any specific meaning like certain categories or are they just there to help break up the information?
    priya_v
    Hello Ms. Bennett! In terms of data, non-normal means that my data does not follow a "normal distribution," which appears as a bell curve! The test that I ran essentially tells me how likely it would be to see a deviation as large as the one in my data if the data really came from a normal distribution, and the result said that was very unlikely, so my data is too different from a normal distribution. Therefore, I can't use Pearson's correlation and be sure that it will correctly portray my data.
    priya_v
    Hi Colin! I use the colors to show which main scale each column belongs to. The green columns are the 4 subscales of burnout, and the blue columns are the 4 subscales of emotional labor. The purple columns are the main scale data (emotional labor, burnout, and COVID-19 anxiety), and the warm colors at the bottom are my descriptive statistics: mean, median, and standard deviation. This makes it much easier to find what I'm looking for when I'm trying to correlate two columns with each other, and it makes the whole sheet easier to look at.
