Remix clone Hacker News


	▲	aleksiy123 3 hours ago
		Yes, but I think Google was playing that strategy from essentially day 1, or at least very early in this AI race, whereas the others are there now because of their lack of access to compute. The general narrative I would read on HN and elsewhere was that Google would be able to outlast/outcompete OpenAI and Anthropic because Google had both more money and more compute, playing the game of subsidizing its most capable models to capture market share for longer than the VCs could. But instead I feel like Google opted out of that much earlier, shifting its focus to efficiency and scaling. Flash and Gemma were where Google was actually ahead of the competition while everyone else was focused on bigger, more capable models. In the last month the environment has changed: compute is constrained and costs for consumers are way higher than expected. Copilot pretty much imploded, and I'm guessing both Anthropic and OpenAI are starting to feel the squeeze. My personal opinion is that this was necessary, because integrating AI into products like AI Overview and search meant that scaling to billions of users was a requirement right out of the gate. And there's not enough money/compute, no matter who you are, to use frontier models for that.
	▲	throwaway219450 2 hours ago
		It benefits Google's bottom line to have very capable small models that can cheaply cache results for search queries, even if they're frequently wrong. But I wonder if they use Gemini for the top X% of search terms to try and get better retention? Also, the TPU vertical gives a good advantage here. I've never been super impressed with Gemini out of the box, but surely, surely, Google is best positioned here. As a consumer, 24-32 GB of VRAM is affordable ($1-2k) and that's the frontier I'm most interested in. It's very "two papers down the line". Those models are also feasible to fine-tune, unlike the O(100+B) behemoths. The 4000 Pro Blackwell has a very good TDP compared with the 300-600W gaming cards people insist on using. If I were freelancing, I would definitely consider getting a 6000 for work.
	▲	scottyah 2 hours ago
		They also just have the resources: both the $$ to spend time optimizing and people like Jeff Dean, who have been focused on AI efficiency for a long time.