▲ | carbocation a day ago |
One thing that is confusing about this write-up is that "DeepConf-low" is only mentioned once, and only in a screenshot, yet it seems to outperform DeepConf-high on several tasks. I guess I'll need to read the underlying paper, but that seems troublesome.
▲ | swores a day ago | parent | next [-]
Copied from the paper (halfway down page 6: https://arxiv.org/pdf/2508.15260):

> "Specifically, DeepConf-low uses top η = 10% (corresponding to the 90th percentile) and DeepConf-high uses top η = 90% (corresponding to the 10th percentile) uniformly across all settings. This threshold ensures that during online generation, traces are terminated when their confidence falls below the level that retains the top η% highest-confidence traces from the warmup phase."

I'm not sure I'm parsing it right, but are they using "low" and "high" to describe the η percentage itself? That would mean DeepConf-low (η = 10%) cuts everything outside the best 10% of traces, while DeepConf-high (η = 90%) keeps the best 90% — i.e. "high" is less selective than "low".
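If that reading is right, the selection rule would be something like the sketch below (Python; the function name, `eta`, and the toy warmup numbers are mine, not from the paper):

    import numpy as np

    def stopping_threshold(warmup_confidences, eta):
        # Confidence level that retains the top eta fraction of warmup traces.
        # DeepConf-low:  eta = 0.10 -> threshold at the 90th percentile (aggressive)
        # DeepConf-high: eta = 0.90 -> threshold at the 10th percentile (permissive)
        return np.percentile(warmup_confidences, 100 * (1 - eta))

    # Toy warmup phase: one confidence score per warmup trace (made-up numbers).
    warmup = np.array([0.2, 0.5, 0.55, 0.6, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95])

    low_thr = stopping_threshold(warmup, eta=0.10)   # keep only the top 10%
    high_thr = stopping_threshold(warmup, eta=0.90)  # keep the top 90%

    # During online generation, a trace is terminated once its running
    # confidence falls below the chosen threshold.
    print(low_thr, high_thr)  # low_thr > high_thr, so "low" is more selective

So the low/high naming tracks η, not how strict the filter is.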
▲ | cubefox a day ago | parent | prev [-]
It's likely confusing because it was written by an LLM.