Week 10: Correlations to Conclusions

May 6, 2026

What’s up, stats specialists! Following last week, I’ve now taken the next step in my analysis: using a t-test to determine whether the correlations I observed are statistically significant or could simply be the result of chance.

Setting Up Hypotheses

Before running my calculations, I needed to define what I was testing. For each of the six comparisons in my dataset, namely airtime vs. search frequency and airtime vs. social media post frequency across three different drugs, I set the following hypotheses:

Null Hypothesis: There is no linear relationship between advertising airtime per month and the public awareness metrics.

Alternative Hypothesis: There is a linear relationship between advertising airtime per month and the public awareness metrics.

The alternative hypothesis in my study simply states that a relationship exists, without specifying whether it is positive or negative. I’ve used a two-tailed test because my earlier analysis and observations indicated that some drugs showed positive correlations while others showed negative ones. Since I can’t assume a single direction for the relationship across all three drugs, the two-tailed test is the more rigorous choice.

Pearson Correlation and T-Test

For my statistical analysis, I’ve used the Pearson correlation coefficient (r) combined with a t-test for significance. The Pearson r measures the strength and direction of the relationship between airtime and the various awareness metrics, and the t-test will determine whether that correlation is statistically significant.

The t-statistic is calculated using the formula:

t = r√(n − 2) / √(1 − r²)

In this formula, r represents the Pearson correlation coefficient, which can be found by taking the square root of the R² values from last week’s graphs and giving it the sign of the trend line’s slope. n represents the number of data points, which is 24 (one per month across 2023 and 2024), so the degrees of freedom are df = n − 2 = 22.
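For readers who want to check the arithmetic, here is a minimal Python sketch of the standard t-statistic for a Pearson correlation. The function name t_from_r and the example value r = 0.5 are illustrative assumptions, not values from my dataset.

```python
import math

def t_from_r(r: float, n: int) -> float:
    """t-statistic for testing whether a Pearson r differs from zero (df = n - 2)."""
    if n < 3:
        raise ValueError("need at least 3 data points so that df = n - 2 is positive")
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical correlation, chosen only to illustrate the calculation
example_r = 0.5
example_t = t_from_r(example_r, n=24)
print(round(example_t, 3))  # 2.708
```

A negative r gives a negative t of the same magnitude, which is why the sign of r matters when it is recovered from an R² value.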

Once the t-statistic is calculated, it is compared against a critical value from the t-distribution table. For a two-tailed test at a significance level of α = 0.05 with df = 22, the critical value is 2.074. If the absolute value of the calculated t-statistic exceeds 2.074, the result is statistically significant, and I reject the null hypothesis.

Bonferroni Correction

Since I’m running six separate t-tests at once, there is an increased risk of getting at least one false positive by chance. To control for this, I applied a Bonferroni correction, which divides the significance threshold (α = 0.05) by the number of tests being conducted (6) to get a corrected threshold of approximately 0.0083.
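To see why the correction matters, the short sketch below estimates the chance of at least one false positive across six tests and then computes the corrected threshold. The family-wise figure assumes the six tests are independent, which is a simplification for illustration and may not hold for my data.

```python
alpha = 0.05
num_tests = 6

# Chance of at least one false positive across all tests,
# ASSUMING the tests are independent (illustrative assumption).
familywise_error = 1 - (1 - alpha) ** num_tests

# Bonferroni-corrected per-test threshold
alpha_corrected = alpha / num_tests

print(round(familywise_error, 4))  # 0.2649
print(round(alpha_corrected, 4))   # 0.0083
```

Under that assumption, running six uncorrected tests gives roughly a one-in-four chance of at least one false positive, compared with one-in-twenty for a single test.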

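The closest critical value on the t-distribution table for df = 22 at this new threshold is 2.819, which is the two-tailed value for α = 0.01. Since 0.0083 is slightly stricter than 0.01, the exact critical value would be a little higher, so 2.819 is an approximation. Any result that exceeds this higher value can be considered significant even after accounting for multiple comparisons.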
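Instead of relying on the nearest table column, a two-tailed p-value can be computed directly and compared with 0.05 / 6. The sketch below does this using only the Python standard library, by integrating the t-distribution density with Simpson's rule. The function names t_pdf and two_tailed_p are my own for this illustration, and a statistics library would normally be used instead.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value: 1 minus twice the area under the density from 0 to |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area

df = 22
alpha_corrected = 0.05 / 6

# The two Ozempic t-statistics reported in the results tables
for label, t in [("Ozempic, search", 2.2656), ("Ozempic, Reddit", 2.5587)]:
    p = two_tailed_p(t, df)
    print(label, round(p, 4), "passes Bonferroni:", p < alpha_corrected)
```

Both t-statistics lie between 2.074 and 2.819, so their p-values fall between 0.01 and 0.05, and neither passes the corrected threshold.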

Airtime vs. Search Frequency Results

Drug	t (Search Frequency)	Outcome
Ozempic	2.2656	Sig at α = 0.05
SKYRIZI	-0.3339	Not Sig
DUPIXENT	-1.0630	Not Sig

For Ozempic, the t-statistic of 2.266 exceeds the critical value of 2.074, meaning that I can reject the null hypothesis at α = 0.05. There is a statistically significant positive relationship between Ozempic’s advertising airtime and search frequency. However, the result does not clear the Bonferroni-corrected threshold of 2.819, so it is not significant once multiple comparisons are taken into account.

For SKYRIZI and DUPIXENT, the absolute t-values of 0.334 and 1.063 do not exceed the critical value, meaning that I fail to reject the null hypothesis for these drugs as there is insufficient statistical evidence of any type of relationship, positive or negative, between airtime and search frequency.

Airtime vs. Reddit Post Frequency Results

Drug	t (Reddit Post Frequency)	Outcome
Ozempic	2.5587	Sig at α = 0.05
SKYRIZI	-0.9325	Not Sig
DUPIXENT	-1.3258	Not Sig

The social media post frequency analysis yields similar results. Ozempic is again the only drug with a significant result: its t-statistic of 2.559 exceeds the threshold of 2.074 and is the strongest result among all six comparisons, but it still does not clear the corrected threshold of 2.819. SKYRIZI and DUPIXENT again show no statistically significant relationship.

In both tables, the negative t-values for SKYRIZI and DUPIXENT reflect their negative Pearson correlation coefficients, meaning that in the data, months with higher airtime tended to have lower search and Reddit post activity for these drugs. Since these correlations are not statistically significant, I can only speculate, but this may suggest that Reddit discussion and search frequency for these drugs are driven more by patient discussion and word-of-mouth than by advertising exposure.
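The outcomes in both tables can be reproduced by comparing the absolute value of each reported t-statistic with the two critical values. This is a minimal sketch: the t-values and critical values are the ones reported in this post, while the dictionary layout and the outcome function are just one hypothetical way to organize them.

```python
CRITICAL_05 = 2.074          # two-tailed, alpha = 0.05, df = 22
CRITICAL_BONFERRONI = 2.819  # nearest table value for alpha = 0.05 / 6, df = 22

# t-statistics as reported in the two results tables
t_values = {
    ("Ozempic", "search"): 2.2656,
    ("SKYRIZI", "search"): -0.3339,
    ("DUPIXENT", "search"): -1.0630,
    ("Ozempic", "reddit"): 2.5587,
    ("SKYRIZI", "reddit"): -0.9325,
    ("DUPIXENT", "reddit"): -1.3258,
}

def outcome(t: float) -> str:
    """Classify a t-statistic using a two-tailed comparison on |t|."""
    magnitude = abs(t)
    if magnitude > CRITICAL_BONFERRONI:
        return "Sig after Bonferroni"
    if magnitude > CRITICAL_05:
        return "Sig at alpha = 0.05 only"
    return "Not sig"

results = {key: outcome(t) for key, t in t_values.items()}
for (drug, metric), label in results.items():
    print(f"{drug:9s} {metric:7s} {label}")
```

Using the absolute value is what makes the comparison two-tailed: a large negative t would count as significant just as a large positive one would.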

Airtime vs. Sales Results

For airtime vs. sales, I cannot conduct a statistical analysis, as there are only two data points on the graphs across the three drugs, meaning that a meaningful Pearson r cannot be calculated. Instead, the main analysis I can offer for this data is the set of observational conclusions I came to in my last post.

Conclusions on Analyses

Across all six comparisons for airtime vs. search frequency and social media post frequency, only Ozempic produced statistically significant results, and only at the uncorrected α = 0.05 level. This aligns pretty well with what I observed visually in my previous post, where I concluded that Ozempic’s positive correlations appear to show a relationship between advertising activity and measurable public engagement, while SKYRIZI and DUPIXENT’s trends are not strong enough to rule out chance as an explanation.

However, statistical significance by itself does not prove that advertising caused the increases in awareness. A range of other factors may be contributing to Ozempic’s positive relationships, such as celebrity endorsements, widespread media coverage, and broader conversations about weight loss drugs, especially as the oral version has become more popular alongside the injectable forms. The t-tests in my study indicate that the positive relationship between airtime and the awareness metrics for Ozempic is unlikely to be due to chance alone at α = 0.05, although it does not survive the Bonferroni correction, and advertising may not be the sole factor behind these correlations.

Next time, I’ll be drawing the final conclusions surrounding my study, diving deeper into whether my results were what I anticipated in comparison to past methodologies. See you in the next one!

Sources Referenced

Bobbitt, Zach. “How to Perform a T-Test for Correlation.” Statology, 14 July 2021, www.statology.org/t-test-for-correlation/.

“Bonferroni Correction Explained: Managing Multiple Testing in Statistics | Amplitude.” Amplitude, 2024, amplitude.com/explore/experiment/what-is-bonferroni-correction.

“Programming Col Sig (Correlations): Classical Student Test.” Askia Help Centre – Automating Insight, 2022, support.askia.com/hc/en-us/articles/360000410117-Programming-Col-Sig-Correlations-Classical-Student-Test.

View more of Caitlin T.'s posts.
