hermit_dev 10 hours ago

The future of AI is specialization, not just achieving benevolent knowledge as fast as we can at the expense of everything and everyone along the way. I appreciate and applaud this approach. I am looking into a similar product myself. Good stuff.

reverius42 8 hours ago | parent | next [-]

Ironically that was also the past of AI. In 2016 it was all about specialized models (not just training data, everything including architecture and model class/type) for specific tasks and that's the way things had been for a long time.

Are you suggesting that it's an aberration that from ~2019 to ~2026 the AI field has been working on general intelligence (I assume this is what you mean by "achieving benevolent knowledge")?

Personally I think it's remarkable how much a simple transformer model can do when scaled up in size. LLMs are an incredible feat of generalization. I don't see why the trajectory should change back towards specialization now.

holoduke 8 hours ago | parent | prev [-]

I don't think that's true. Nothing points to specialized LLMs being better. General purpose LLMs are just much more useful in daily work.

hermit_dev 36 minutes ago | parent [-]

To be more specific, I think the future is local and specialized. IBM, among others, thought the same way with their giant centralized mainframes and the way people used software in the 70s; it's an interesting parallel to today's cloud if you think about it. It's just not scalable from a resource (hardware), energy, and cost perspective. I think we're living in a unique time, but it's going to change. Without continued massive funding and a pivot to something sustainable, things will (and should) change.

Don't get me wrong, general intelligence will always be important and should be part of specialist models to a degree for understanding, but it doesn't make sense to use an 800B+ parameter model to help write an email or research company trends. Hell, look at what China has been able to do. Qwen 3.5 9B exceeds Claude 3.5 Haiku and nears Sonnet 3.5 levels, and the 27B variant of Qwen 3.5 is superior to both in many ways and even rivals newer models. There is obviously an inherent lag, but we will gradually see a shift as these models become more capable.
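To put some rough numbers on the resource argument: a quick back-of-the-envelope sketch of the memory needed just to hold model weights at the parameter counts mentioned above. The helper function and byte-per-parameter figures are illustrative assumptions (weights only, ignoring KV cache and activations), not a real deployment calculation.

```python
# Approximate memory to hold model weights alone,
# ignoring KV cache, activations, and runtime overhead.
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal GB

for name, params in [("800B frontier model", 800),
                     ("27B local model", 27),
                     ("9B local model", 9)]:
    fp16 = weight_memory_gb(params, 16)
    q4 = weight_memory_gb(params, 4)
    print(f"{name}: ~{fp16:.0f} GB at fp16, ~{q4:.1f} GB at 4-bit")
# → 800B frontier model: ~1600 GB at fp16, ~400.0 GB at 4-bit
# → 27B local model: ~54 GB at fp16, ~13.5 GB at 4-bit
# → 9B local model: ~18 GB at fp16, ~4.5 GB at 4-bit
```

Even generously quantized, an 800B-class model needs a rack of accelerators, while a 4-bit 9B model fits on a single consumer GPU, which is the gap the local-and-specialized argument rests on.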

Right now we are chasing 1-2% improvements at the cost of billions. Local are already absurdly capable (more and more by the day - same with cloud ofcourse) and smarter than most people in specific areas. To do most jobs, can we honestly say it requires a PhD or higher level understanding to perform? We're chasing something that is becoming more and more not needed from a general day to day perspective. AGI is outstanding, but not practical (at least today). I think we'll get there anyway at our current trajectory (though dangerous), but I suspect things will shift.