This is really cool. That's the good stuff. Did you notice any pattern in why the models cluster? Is it shared training data, or just similar architecture choices?