NitpickLawyer 7 days ago

There's prior research that finds a connection between model depth and "reasoning" ability: https://arxiv.org/abs/2503.03961

A depth of 4 is very small; this is very much a toy model. It's fine to research this, and maybe someone will try it on larger models, but IMO it's not OK to lead with a conclusion that rests only on a toy model.