See ML research papers from Apple. Their researchers prefered small models over LLM. So they thought researchers' effort would make up the lack of compute. Then the scale law hit them hard.