scrollop 5 days ago
Makes you wonder whether Llama progress has stalled, and/or whether we're entering a plateau in LLM architecture development.
butlike 5 days ago | parent
The article got me thinking that there's some sort of bottleneck that makes scaling astronomically expensive, or that the value just isn't really there. The playbook looks something like:

1. Buy up top talent from others working in this space.

2. See what they produce over, say, 6 months to a year.

3. Hire a cohort of regular ICs to see what _they_ produce.

4. Open-source the model to see if any programmer at all can produce something novel with a pretty robust model.

Then observe that nothing amazing has really come out (besides a pattern-recognizing machine that placates the user to coerce them into spending more tokens on more prompts), and potentially call the hiring spree a bubble.