Week 6: Temporality
May 5, 2026
This week I investigated the temporality of adverse events. Most of this project relies on analyses that ask which type of patient gets adverse events, since our dataset uniquely empowers us to answer those questions. However, another unique feature of our dataset is its temporal resolution. Because our RAG aggregates each 90 days of clinical notes into one prompt, we classify each patient as adverse event (AE) positive or negative once per 90-day window, so we can estimate when a patient had an adverse event, not only whether they had one. Naturally, the first question I tried to ask was “which adverse events occur earliest?”, but answering it was not simple. Two factors confound this analysis: overall event occurrence and censoring within our cohorts.
The first analysis that comes to mind is a simple Kaplan-Meier or cumulative incidence curve. These plots show the “survival rate” of a cohort over time (the percent of the cohort that is adverse event free or adverse event positive at a given timepoint, depending on the curve). They cannot answer this question because they are dominated by overall occurrence: if one event happens more frequently than another, that alters the “shape” of a Kaplan-Meier curve far more than temporal onset patterns do. Essentially, we need an analysis that normalizes event occurrence at each timepoint to the overall occurrence of that event. In other words, we need to plot “the percent of patients who ever developed a specific adverse event who developed that event at a given timepoint”.
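To make the normalization idea concrete, here is a minimal sketch, not the exact code behind our plot. It assumes a hypothetical mapping from patient ID to the index of the first 90-day window classified positive (or `None` if the patient was never positive); the function name and the toy data are invented for illustration. This version ignores censoring entirely.

```python
from collections import Counter

WINDOW_DAYS = 90  # one classification per 90 days of notes


def onset_distribution(first_onset_window):
    """Fraction of ever-positive patients whose first event fell in each window.

    first_onset_window: hypothetical dict of patient_id -> index of the first
    90-day window classified positive, or None if never positive.
    """
    onsets = [w for w in first_onset_window.values() if w is not None]
    if not onsets:
        return {}
    counts = Counter(onsets)
    total = len(onsets)  # denominator: patients who ever had this event
    return {w: counts.get(w, 0) / total for w in range(max(onsets) + 1)}


# Toy data, invented purely for illustration.
common_event = {"p1": 0, "p2": 0, "p3": 1, "p4": 1, "p5": 2, "p6": 2, "p7": None}
rare_event = {"p1": 0, "p2": None, "p3": 1, "p4": None, "p5": 2, "p6": None, "p7": None}

for name, data in [("common", common_event), ("rare", rare_event)]:
    dist = onset_distribution(data)
    # Report each window by its starting day rather than its index.
    print(name, {w * WINDOW_DAYS: round(frac, 3) for w, frac in dist.items()})
```

In the toy data one event is twice as frequent as the other, yet both produce the same normalized onset distribution, which is the property a raw Kaplan-Meier curve lacks.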
The second problem is cohort censoring. A cohort’s “at risk” population decreases over time as patients stop treatment, start new therapies, leave trials, die, or have the event. This means that the denominator of our equation shrinks over time, and we need to account for that. The standard approach, subsetting the cohort to patients with a set minimum follow-up time, leads to selection bias, because the resulting cohort would consist of patients who did well on therapy. So our denominator needs to change over time and be a function of the population at risk for that adverse event at a given timepoint.
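A time-varying denominator can be sketched as a per-window event rate among the patients still at risk. This is one simple way to do it under stated assumptions, and the formula we actually used may differ: each patient is represented by a hypothetical `(last_window, had_event)` pair, and a patient censored in a window is counted as at risk for that whole window.

```python
def interval_hazards(patients, n_windows):
    """Per-window event rate among patients still at risk.

    patients: hypothetical list of (last_window, had_event) pairs, where
    last_window is the index of the patient's last observed 90-day window
    and had_event is True if that window was their first positive one
    (False means the patient was censored after that window).
    """
    hazards = []
    for w in range(n_windows):
        # At risk: still under observation and event free when window w starts.
        at_risk = sum(1 for last, _ in patients if last >= w)
        events = sum(1 for last, had_event in patients if last == w and had_event)
        hazards.append(events / at_risk if at_risk else None)
    return hazards


# Toy cohort, invented for illustration: one event and one censoring per window.
toy_cohort = [(0, True), (0, False), (1, True), (1, False), (2, True), (2, False)]
print(interval_hazards(toy_cohort, n_windows=3))
```

In the toy cohort the raw event count is flat at one per window, but the at-risk population shrinks from 6 to 4 to 2, so the per-window rate rises from 1/6 to 1/4 to 1/2. A fixed denominator would hide that trend, and no patient had to be excluded for short follow-up.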
We ended up with the plot below, and it shows very clear patterns. What interests me most is the ability to suggest mechanistic differences from the temporal onset of these adverse events. Because we normalized overall occurrence out of our signal, we can see that mathematical features of these curves, like concavity and the location of the maximum, differ between immunotherapy (ICI) and non-immunotherapy patients for some adverse events but not for others. My preliminary hypothesis is that these mathematical features are a function of the biological mechanism behind each adverse event. For example, the colitis curve stays the same in the ICI and non-ICI cohorts, suggesting that its mechanism is not immune related in the way that adrenal insufficiency is, whose curves are very different between the two cohorts. This looks like a promising new research direction to me, and I plan on looking into it next week.
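As a starting point for next week, curve features like these could be quantified roughly as follows. This is only a sketch: it assumes each curve is a hypothetical list with one value per 90-day window, uses the discrete second difference as a stand-in for concavity, and runs on made-up numbers rather than our results.

```python
def curve_features(curve):
    """Peak position and discrete concavity of a curve sampled once per window.

    curve: hypothetical list of per-window values (for example, normalized
    onset fractions). At least three points are needed for a second difference.
    """
    peak_window = max(range(len(curve)), key=lambda i: curve[i])
    # Discrete analogue of the second derivative at each interior point.
    second_diff = [curve[i + 1] - 2 * curve[i] + curve[i - 1]
                   for i in range(1, len(curve) - 1)]
    return {
        "peak_window": peak_window,
        "second_diff": second_diff,
        "concave_down": bool(second_diff) and all(d < 0 for d in second_diff),
    }


# Made-up curves for illustration only, not measured data.
early_onset = [0.5, 0.3, 0.15, 0.05]  # peaks in the first window, then decays
late_onset = [0.1, 0.3, 0.4, 0.2]     # rises to a later peak

print(curve_features(early_onset))
print(curve_features(late_onset))
```

Comparing these features between the ICI and non-ICI curves for the same adverse event would give a number to attach to "the curves look different", though with only a handful of 90-day windows the second differences will be noisy.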
