monsieurbanana 3 days ago

Specialized is probably not the word I'd use, because LLMs are generally useful for understanding more specialized / obscure topics. For example, I've never randomly heard people talking about the DICOM standard, yet LLMs have no trouble with it.

phil21 3 days ago | parent | next [-]

I think there is a sweet spot for the training(?) on these LLMs where there is basically only "professional" level documentation and chatter, without the layman stuff being picked up from reddit and github/etc.

I was trying to remember/figure out some obscure hardware communication protocol in order to enumerate a hardware bus on some servers. Feeding codex a few RFC URLs and other such information, plus telling it to search the internet, resulted in extremely rapid progress vs. having to wade through 500 pages of technical jargon and specification documents.

I'm sure if I was extending the spec to a 3.0 version in hardware or something it would not be useful, but for someone who just needs to understand the basics to get some quick tooling stood up it was close to magic.

i_cannot_hack 3 days ago | parent | prev | next [-]

The standard for obscurity is different for LLMs: something can be very widespread and public without the average person knowing about it. DICOM is used at practically every hospital in the world, there are whole websites dedicated to browsing the documentation, companies employ people solely for DICOM work, there are popular maintained libraries for several different languages, etc., so the LLM has an enormous amount of it in its training data.

The question relevant for LLMs would be "how many high quality results would I get if I googled something related to this", and for DICOM the answer is "many". As long as that is the case, LLMs will not have trouble answering questions about it either.

aleph_minus_one 3 days ago | parent | prev [-]

> llms are generally useful to understand more specialized / obscure topics

A very simple kind of query that in my experience causes problems for many current LLMs is:

"Write {something obscure} in the Wolfram programming language."

AlotOfReading 3 days ago | parent [-]

One tendency I've noticed is that LLMs struggle with creativity. If you give them a language with extremely powerful and expressive features, they'll often fail to use them to simplify other problems the way a good programmer does. Wolfram is a language essentially designed around that.

I wasn't able to replicate this in my own testing though. Do you know if it also fails for "mathematica" code? There's much more text online about that.

aleph_minus_one 3 days ago | parent [-]

> Do you know if it also fails for "mathematica" code?

My experience concerning using "Mathematica" instead of "Wolfram" in AI tasks is similar.