rollcat 4 hours ago

It's crazy how wildly inaccurate "top-of-the-list" LLMs are for straightforward yet slightly nuanced inquiries.

I've asked ChatGPT to summarize Go build constraints, especially in the context of CPU microarchitectures (e.g. mapping "amd64.v2" to GOARCH=amd64 GOAMD64=v2). It repeatedly smashed its head on GORISCV64, claiming all sorts of nonsense such as v1, v2; then G, IMAFD, Zicsr; only arriving at rva20u64 et al under hand-holding. Similar nonsense for GOARM64 and GOWASM. It was all right there in e.g. the docs for [cmd/go].
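
For the record, the correct mapping per the [cmd/go] docs looks roughly like this (the package name is just an example; values current as of recent Go releases):

    //go:build amd64.v2

    // Compiled only when GOARCH=amd64 and GOAMD64 is v2 or higher:
    //
    //   GOARCH=amd64 GOAMD64=v2 go build ./...
    //
    // The analogous knobs for the other architectures:
    //
    //   GOAMD64:   v1, v2, v3, v4
    //   GORISCV64: rva20u64, rva22u64 (profile names - not "v1"/"v2", not "G, IMAFD, Zicsr")
    //   GOARM64:   v8.0 through v9.x, with optional ",lse" / ",crypto" suffixes
    //   GOWASM:    satconv, signext
    package fastpath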

This is the future of computer engineering. Brace yourselves.

yomismoaqui 4 hours ago | parent | next [-]

If you're going to ask ChatGPT for some specific tidbit, it's better to force it to search the web.

Remember, an LLM is a JPG of all the text of the internet.

dgfitz an hour ago | parent [-]

Wait, what?

Isn't that the whole point, to ask it specific tidbits of information? Are we to ask it large, generic pontifications and claim success when we get large, generic pontifications back?

The narrative around these things changes weekly.

wredcoll 43 minutes ago | parent [-]

I mean, like most tools, they work when they work and don't when they don't. Sometimes I can use an LLM to find a specific datum, sometimes I use Google, and sometimes I use Bing.

You might think of it as a cache, worth checking first for speed reasons.

The big downside is not that they sometimes fail, it's that they give zero indication when they do.

simonw 3 hours ago | parent | prev | next [-]

Did you try pasting in the docs for cmd/go and asking again?

Implicated 2 hours ago | parent [-]

I mean - this is the entire problem right here.

Don't ask an LLM how to do things with a tool when it was trained on a whole bunch of different versions of that tool, with different flags and options and parameters, plus Stack Overflow questions asked and answered by people who had no idea what they were doing - likely out of date or wrong in the first place - without providing the docs for the version you're working with. _Especially_ if it's the newest version: even if the model's cutoff date is after that version was released, you have no way to know the docs were actually _included_ in the training data. (Especially for a programming language with ~2% market share.)

The contexts are so big now - feed it the docs. Just copy paste the whole damn thing into it when you prompt it.
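
A minimal sketch of what I mean - the file names and the question are made up, and "go doc cmd/go > docs.txt" is one way to dump the text first:

    // promptgen.go - hypothetical helper: prepend a docs dump to a question
    // and write out one prompt file, ready to paste into a chat window.
    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        // e.g. produced beforehand with: go doc cmd/go > docs.txt
        docs, err := os.ReadFile("docs.txt")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        question := "Map the build tag amd64.v2 to its environment settings, " +
            "and list the valid GORISCV64 values."
        prompt := fmt.Sprintf(
            "Answer using ONLY the documentation below.\n\n%s\n\nQuestion: %s\n",
            docs, question)
        if err := os.WriteFile("prompt.txt", []byte(prompt), 0o644); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    }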

pbronez 3 hours ago | parent | prev [-]

How was the LLM accessing the docs? I’m not sure what the best pattern is for this.

You can put the relevant docs in your prompt, add them to a workspace/project, deploy a docs-focused MCP server, or even fine-tune a model for a specific tool or ecosystem.

Implicated 2 hours ago | parent [-]

> I’m not sure what the best pattern is for this.

> You can put the relevant docs in your prompt

I've done a lot of experimenting with these various options for getting the LLM to reference docs. IMO it's almost always best to include them in the prompt where appropriate.

For a UI lib I use that's rather new - specifically, there's a new version the LLMs aren't aware of yet - I had the LLM write me a quick python script that crawls the docs site for the lib and feeds each page's content back into the LLM with a prompt describing what to do: generate a .md document with the specifics of that thing (a component or whatever - properties, variants, etc.) in an extremely brief manner, and also build an index.md with a short paragraph about what the library is and a list of each component/page document generated.

So in about 60 seconds it spits out a directory full of .md files. I then tell my project-specific LLM (ie: Claude Code or Opencode within the project) to review those files and update the project's CLAUDE.md to instruct that any time we're building UI elements we should refer to the library's index.md to see what components are available, and when it's appropriate to use one of them we _must_ review the corresponding document first.
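
The loop looks roughly like this - a sketch in Go rather than the python the script actually uses, where urls.txt (one doc-page URL per line) and summarize() are placeholders for the real crawl and the LLM call:

    // docsnap.go - rough sketch of the crawl-and-distill loop described above.
    package main

    import (
        "bufio"
        "fmt"
        "io"
        "net/http"
        "os"
        "path"
        "regexp"
        "strings"
    )

    // Crude tag stripper; good enough for a sketch, not for production HTML.
    var tags = regexp.MustCompile(`<[^>]+>`)

    // summarize stands in for the LLM request that turns a page dump into a
    // brief .md (properties, variants, etc.). Here it just passes text through.
    func summarize(pageText string) string {
        return pageText
    }

    func main() {
        f, err := os.Open("urls.txt")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        defer f.Close()

        if err := os.MkdirAll("docs-md", 0o755); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }

        var index strings.Builder
        index.WriteString("# Library docs index\n\n")

        sc := bufio.NewScanner(f)
        for sc.Scan() {
            url := strings.TrimSpace(sc.Text())
            if url == "" {
                continue
            }
            resp, err := http.Get(url)
            if err != nil {
                fmt.Fprintln(os.Stderr, err)
                continue
            }
            body, _ := io.ReadAll(resp.Body)
            resp.Body.Close()

            // One brief .md per page, plus an entry in the index.
            text := tags.ReplaceAllString(string(body), " ")
            name := path.Base(url) + ".md"
            os.WriteFile("docs-md/"+name, []byte(summarize(text)), 0o644)
            index.WriteString("- " + name + "\n")
        }
        os.WriteFile("docs-md/index.md", []byte(index.String()), 0o644)
    }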

Works very, very well - much better than an MCP server built specifically for that same lib (huge waste of tokens, the LLM doesn't always use it, etc). Well enough that I just copy/paste this directory of docs into my active projects that use the library - if I weren't lazy I'd package it up, but I'm too busy building stuff.