9. Mitigation Tools

May 9, 2026

In the last post, we talked about the detection tools. Now we’ll talk about the next part of the framework, the mitigation, and how we choose when to select those tools.

Greedy decoding is just when we directly decode the output tokens (i.e. turn the numbers into words) without any intervention.
DoLa (Decoding by Contrasting Layers) is a truth-amplification technique used when the model is somewhat uncertain. It compares the mature final layer of the model with premature early layers. This is because it is hypothesized that lower layers track surface-level things like linguistic patterns and local syntax, while higher layers evolve to represent complex factual knowledge. To find the most effective contrast where the model’s internal representation is undergoing the most significant change, the system dynamically selects a premature layer that has the highest Jensen-Shannon Divergence from the final layer (our implementation uses a simpler version). By subtracting the early layer’s probability from the late layer’s probability, DoLa amplifies the factual signal and suppresses the stylistic noise.
In tree resampling, instead of choosing one word at a time, the model generates multiple potential “branches” of a sentence (a search tree). Then, TSV and Lookback Lens score each branch. The system then prunes the branches that look like hallucinations and selects the path with the highest truthfulness score.

D-DCD is the controller that combines the sensors in the previous post into a unified formula to determine which mitigation strategy should be used:

Risk = w1(1-LB)+w2(TSV)+w3(SAE) .

Based on the Risk Score, the model chooses one of three “Tiers”:

Tier	Strategy	When it triggers
Tier 1	Greedy Decoding	Low Risk: The sensors agree the model is being truthful. This is fast and efficient.
Tier 2	DoLa (Contrastive)	Medium Risk: The sensors detect a slight conflict. We trigger the DoLa method to help.
Tier 3	Tree Resampling	High Risk: The sensors detect a major hallucination. We stop to regenerate tokens.

We can modify the tier cutoffs and weights through adjusting parameters.

View more of Anna D.'s posts.

9. Mitigation Tools

Reader Interactions

Leave a Reply Cancel reply