Week 8
May 18, 2026
This week I read a lot of interesting literature to inform my project. Below is my interpretation of the three most relevant papers I read. With the SAIGE-based associations coming next, I wanted to understand what a GWAS offers beyond a list of significant variants, since heritability, cross-trait correlation, and polygenic scoring would let me say more about the architecture of toxicity risk than any one hit on its own.
The first was Bulik-Sullivan et al. in Nature Genetics on LD Score Regression (LDSC), which regresses GWAS test statistics on LD scores, uses the slope to estimate SNP heritability, and uses the intercept to separate true polygenic signal from confounding by population stratification or cryptic relatedness. Reading it shifted how I think about inflation in test statistics. I had been treating inflation as something to fix or apologize for, and LDSC reframes the intercept as an estimable quantity that tells me whether the inflation is mostly polygenic signal or mostly population structure, which moves the diagnostic question from whether the GWAS worked to what kind of signal it is showing. The SNP heritability estimate also puts a ceiling on how much any common-variant germline predictor I build for a phenotype can ever explain, which is a more honest framing than reporting individual hits without anchoring how much of the disease they could plausibly account for.
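A minimal sketch of the slope-and-intercept idea, under simplifying assumptions: the model is E[chi2_j] = N * h2 * l_j / M + intercept, where l_j is the LD score of SNP j, N the sample size, and M the number of SNPs. The simulated data, the parameter values, and the helper name `ldsc_fit` are all hypothetical, and the fit is plain unweighted least squares, whereas the actual LDSC software uses a weighted regression with block-jackknife standard errors.

```python
import numpy as np

def ldsc_fit(chi2, ld_scores, n_samples, n_snps):
    """Unweighted OLS of chi-square statistics on LD scores (illustrative only)."""
    slope, intercept = np.polyfit(ld_scores, chi2, 1)
    h2 = slope * n_snps / n_samples  # slope = N * h2 / M
    return h2, intercept

# Hypothetical simulation settings, chosen only for illustration
rng = np.random.default_rng(0)
M, N = 200_000, 20_000
true_h2, true_intercept = 0.3, 1.05

# Made-up LD scores; real ones come from a reference panel
ld = rng.gamma(shape=2.0, scale=50.0, size=M)

# Simplification: draw z with variance equal to the expected chi-square,
# so that z**2 has the mean the LDSC model predicts
expected_chi2 = true_intercept + N * true_h2 / M * ld
z = rng.normal(0.0, np.sqrt(expected_chi2))
chi2 = z ** 2

h2_hat, intercept_hat = ldsc_fit(chi2, ld, N, M)
print(f"h2 estimate: {h2_hat:.3f}, intercept estimate: {intercept_hat:.3f}")
```

An intercept close to 1 with a clearly positive slope would point to polygenic signal, while an intercept well above 1 would point to confounding.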
The second was the companion Nature Genetics paper from the same group on cross-trait LD Score Regression and the atlas of genetic correlations across human diseases and traits, which extended the method to estimate the genetic correlation between two phenotypes from summary statistics alone, without being biased by sample overlap between the two studies. This reframed how I think about the relationship between my different toxicity phenotypes. I had been treating thyroiditis, hypophysitis, and adrenal insufficiency as separate GWAS endpoints that happen to share an organ system, and cross-trait LDSC lets me express how much genetic architecture they share as a number rather than as an impression I form from overlapping hits. It also opens a comparison I had not been planning, between my irAE phenotypes and the spontaneous autoimmune diseases they resemble, since I can ask whether checkpoint-induced thyroiditis shares its genetic basis with Graves’ or Hashimoto’s at the polygenic level rather than only at the few loci that reach significance in either scan.
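The cross-trait version can be sketched the same way: regressing the product of the two traits' z-scores on LD score gives a slope proportional to the genetic covariance, and dividing by the square root of the two heritabilities gives the genetic correlation. This is a toy simulation with no sample overlap; every number and the helper name `slope_to_param` are hypothetical, and the fits are unweighted, unlike the real software.

```python
import numpy as np

def slope_to_param(y, ld_scores, scale):
    """OLS slope of y on LD score, rescaled by M / N (illustrative only)."""
    slope, _ = np.polyfit(ld_scores, y, 1)
    return slope * scale

# Hypothetical settings: two traits, equal sample sizes, no shared samples
rng = np.random.default_rng(1)
M, N = 200_000, 20_000
h2_a, h2_b, true_rg = 0.3, 0.4, 0.5
rho_g = true_rg * np.sqrt(h2_a * h2_b)  # genetic covariance

ld = rng.gamma(shape=2.0, scale=50.0, size=M)  # made-up LD scores
var_a = 1.0 + N * h2_a / M * ld
var_b = 1.0 + N * h2_b / M * ld
cov_ab = N * rho_g / M * ld

# Draw correlated z-scores per SNP with the variances and covariance above
e1, e2 = rng.normal(size=M), rng.normal(size=M)
z_a = np.sqrt(var_a) * e1
z_b = cov_ab / np.sqrt(var_a) * e1 + np.sqrt(var_b - cov_ab**2 / var_a) * e2

h2_a_hat = slope_to_param(z_a**2, ld, M / N)
h2_b_hat = slope_to_param(z_b**2, ld, M / N)
rho_g_hat = slope_to_param(z_a * z_b, ld, M / N)
rg_hat = rho_g_hat / np.sqrt(h2_a_hat * h2_b_hat)
print(f"genetic correlation estimate: {rg_hat:.3f}")
```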
The third was Ge et al. in Nature Communications on PRS-CS, a polygenic risk score method that uses a Bayesian regression framework with a continuous shrinkage prior on SNP effect sizes and an external LD reference panel, and which improved prediction accuracy over previous methods in the simulations and Partners HealthCare Biobank evaluations they ran. This shifted what I think the deliverable of my project could look like in a clinical setting. I had been thinking in terms of a single large-effect variant, the kind of finding PREDICT-1 used for abacavir, and PRS-CS shows that for a polygenic toxicity the more realistic readout is a continuous score that aggregates many small effects across the genome. The implication for the translational end of my project is that the eventual germline test for irAE risk could end up being a polygenic score computed from the same panel data oncologists already have. That is a different kind of product from the single-allele template PREDICT-1 set, so the methodology is worth understanding in advance.
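Once per-SNP weights exist, the score itself is just a weighted sum of allele dosages. The sketch below shows only that final scoring step, not the PRS-CS shrinkage procedure; the dosage matrix, the weights (stand-ins for posterior effect sizes), and the function name `polygenic_score` are hypothetical.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages: one score per individual."""
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.shape[1] != weights.shape[0]:
        raise ValueError("need one weight per SNP column")
    return dosages @ weights

# Hypothetical data: 3 individuals x 4 SNPs, dosages of the effect allele (0, 1, 2)
dosages = np.array([
    [0, 1, 2, 0],
    [2, 1, 0, 1],
    [1, 0, 1, 2],
])
# Made-up per-SNP weights standing in for shrunken effect sizes
weights = np.array([0.02, -0.01, 0.03, 0.005])

scores = polygenic_score(dosages, weights)
# Standardize within the cohort so scores read as deviations from the mean
standardized = (scores - scores.mean()) / scores.std()
print(scores, standardized)
```

In practice the standardized score would be compared against a reference distribution from a matched ancestry group rather than against the three individuals shown here.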
