contrastive circuit sharpening

inspiration

analogy to human learning: when we get an answer wrong, it helps a lot to be told what we could've done differently.

procedure

for a given contrastive target, generate multiple samples.
rank them, then use a continuous transform/approximation of that ranking as a contrastive loss. the transform should be shaped so that only the single best sample, or a handful of really good ones, gets a good score.
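
a minimal sketch of one way the ranking step could be made continuous. nothing here is specified in these notes: the pairwise-sigmoid soft rank, the softmax peaking, the zero-centered weights, and the names `soft_ranks`, `peaked_weights`, `contrastive_loss`, `tau`, and `sharpness` are all assumptions chosen for illustration. `scores` stands for whatever quality measure is used to rank the samples, and `log_probs` for the model's log-probability of each sample.

```python
import math


def _sigmoid(x):
    # numerically stable logistic function
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_ranks(scores, tau=1.0):
    # continuous stand-in for a hard ranking (assumed form):
    # soft rank of sample i ~ how many other samples beat it.
    # rank near 0 means best; smaller tau approaches the hard rank.
    ranks = []
    for i, s_i in enumerate(scores):
        r = 0.0
        for j, s_j in enumerate(scores):
            if i != j:
                r += _sigmoid((s_j - s_i) / tau)
        ranks.append(r)
    return ranks


def peaked_weights(ranks, sharpness=3.0):
    # softmax over negative soft ranks: higher sharpness puts
    # almost all weight on the best one or few samples.
    logits = [-sharpness * r for r in ranks]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def contrastive_loss(log_probs, scores, tau=1.0, sharpness=3.0):
    # center the weights so they sum to zero: samples above the
    # uniform baseline are pushed up, all the others pushed down.
    weights = peaked_weights(soft_ranks(scores, tau), sharpness)
    n = len(scores)
    signed = [w - 1.0 / n for w in weights]
    return -sum(s * lp for s, lp in zip(signed, log_probs))


if __name__ == "__main__":
    scores = [0.1, 0.2, 3.0, 0.15]        # hypothetical quality scores
    log_probs = [-2.0, -1.5, -3.0, -1.8]  # hypothetical model log-probs
    print(peaked_weights(soft_ranks(scores)))
    print(contrastive_loss(log_probs, scores))
```

with these made-up scores, nearly all of the weight lands on the third sample, so minimizing the loss raises its log-probability and lowers the others. in a real training loop the same arithmetic would be done in an autodiff framework so gradients flow through `log_probs`.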