| ▲ | xlayn 5 hours ago | |
Fair point on the writing style. I used Claude extensively on this project, including drafting. The experiments and ideas are mine, though. On the prior art: you're right that layer duplication has been explored before. What I think is new here is the systematic sweep toolkit + validation on standard benchmarks (lm-eval BBH, GSM8K, MBPP) showing exactly which 3 layers matter for which model. The Devstral logical deduction result (0.22→0.76) was a surprise to me. If there are ComfyUI nodes that do this for image models, I'd love links. The "cognitive modes" finding (different duplication patterns lead to different capability profiles from the same weights) might be even more interesting for diffusion models. | ||
| ▲ | abhikul0 an hour ago | |
I only know of this one: https://github.com/shootthesound/comfyUI-Realtime-Lora. I haven't played with any layer manipulation, though. | ||