▲ | mark_l_watson 5 days ago
It is such a common pattern for LLMs to surround generated JSON with ```json … ``` that I check for this at the application level and fix it. Ten years ago I would do the same sort of sanity checks on formatting when I used LSTMs to generate synthetic data.
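A minimal sketch of that application-level fix-up (names are illustrative): strip the fence if the model added one, then parse.

```python
import json
import re

def parse_llm_json(text: str):
    """Parse LLM output, stripping a Markdown code fence if one is present."""
    text = text.strip()
    # Matches ```json ... ``` or bare ``` ... ``` wrappers around the payload.
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    return json.loads(text)
```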
▲ | mpartel 5 days ago | parent | next [-]
Some LLM APIs let you give a schema or regex for the answer. I think it works because LLMs give a probability for every possible next token, and you can filter that list by what the schema/regex allows next.
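A toy illustration of that filtering idea, using the third-party `regex` module's partial matching to decide which tokens could still lead to a valid completion (the vocabulary and probabilities here are made up):

```python
import regex  # third-party "regex" module; supports partial matching

PATTERN = regex.compile(r"\d+")  # stand-in "schema": output must be digits

def allowed(prefix: str, token: str) -> bool:
    # Keep a token if prefix+token already matches the pattern, or could
    # still be extended into a match (a "partial" match).
    return PATTERN.fullmatch(prefix + token, partial=True) is not None

# Made-up next-token distribution from the model.
probs = {"4": 0.3, "2": 0.2, "a": 0.25, "{": 0.15, "7": 0.1}
prefix = "1"

masked = {t: p for t, p in probs.items() if allowed(prefix, t)}
total = sum(masked.values())
masked = {t: p / total for t, p in masked.items()}  # renormalize
print(masked)  # only "4", "2", "7" survive, with probabilities ~0.5/0.33/0.17
```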
▲ | viridian 5 days ago | parent | prev | next [-]
I'm sure the reason is the plethora of markdown data it was trained on. I personally use ``` stuff.txt ``` extremely frequently, in a variety of places. In Slack/Teams I wrap anything someone might copy and paste, to ensure the chat client doesn't do something horrendous like replace my ASCII double quotes with the fancy Unicode ones that cause syntax errors. In readme files, any example path, code, YAML, or JSON is wrapped in code quotes. In my personal (text file) notes I also use ``` {} ``` to denote a code block I'd like to remember, just out of habit from the other two above.
▲ | fumeux_fume 5 days ago | parent | prev | next [-]
Very common struggle, but a great way to prevent it is prefilling the assistant response with "{", or with as much of the JSON output as you know ahead of time, like '{"response": ['
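For example, with Anthropic's messages API (model name illustrative), the prefill goes in as a partial assistant turn and the completion continues from it:

```python
import anthropic

client = anthropic.Anthropic()
PREFILL = '{"response": ['

resp = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model name
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "List three colors as JSON."},
        # The trailing assistant turn is the prefill; the model continues
        # generating directly from here, so no room for a leading fence.
        {"role": "assistant", "content": PREFILL},
    ],
)
# Reassemble the full document: prefill + continuation.
json_text = PREFILL + resp.content[0].text
```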
▲ | Alifatisk 5 days ago | parent | prev | next [-]
I think this is the first time I've stumbled upon someone who actually mentions LSTMs in a practical way instead of just theory. Cool! Would you like to elaborate on how the experience was? What was your approach for using it? How did you generate the synthetic data? How did it perform?
▲ | freehorse 5 days ago | parent | prev | next [-]
I had similar issues with local models. I ended up actually requesting the backticks, because it was easier that way, and parsed the output accordingly. I cached a prompt with explicit examples of how to structure the data, and reused it over and over. I have found that without examples in the prompt some LLMs are very unreliable, but with cached example prompts this becomes a non-issue.
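A sketch of that setup (prompt wording and names are illustrative): one reusable, cacheable prompt showing the exact fenced shape, plus a parser that insists on the fence.

```python
import re

FENCE = "`" * 3  # avoids nesting literal fences inside this example

# Reusable (and cacheable) instructions with an explicit example of the
# expected structure.
SYSTEM_PROMPT = (
    "Reply with JSON inside a fenced code block, exactly like:\n"
    f"{FENCE}json\n"
    '{"title": "Example", "tags": ["a", "b"]}\n'
    f"{FENCE}"
)

def extract_fenced(reply: str) -> str:
    # Take the first fenced block; fail loudly if the model skipped it.
    m = re.search(r"```(?:json)?\s*(.*?)\s*```", reply, re.DOTALL)
    if m is None:
        raise ValueError("no fenced block in model output")
    return m.group(1)
```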
▲ | Alifatisk 5 days ago | parent | prev | next [-]
I do use backticks a lot when sharing examples in different formats with LLMs, and I have instructed them to do likewise; I also upvote whenever they respond in that manner. I got this format from writing Markdown files. It's a nice way to share examples and also specify which format they're in.
▲ | mejutoco 5 days ago | parent | prev | next [-]
Funny, I do the same. Additionally, one can define a JSON schema for the output, then try to load the response as JSON and retry a number of times. If it is not valid JSON, or the schema is not followed, we discard it and retry. It also helps to have a field in the JSON for confidence, or a similar pattern, to act as a cutoff for which responses are accepted.
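A sketch of that loop, assuming a placeholder ask_llm() callable and the jsonschema package; the schema and threshold are illustrative:

```python
import json
import jsonschema  # pip install jsonschema

SCHEMA = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["answer", "confidence"],
}

def ask_with_retries(ask_llm, prompt, retries=3, min_confidence=0.7):
    """ask_llm is whatever callable hits your model (placeholder)."""
    for _ in range(retries):
        try:
            data = json.loads(ask_llm(prompt))
            jsonschema.validate(data, SCHEMA)
        except (json.JSONDecodeError, jsonschema.ValidationError):
            continue  # not valid JSON, or off-schema: discard and retry
        if data["confidence"] >= min_confidence:
            return data  # accepted
    return None  # every attempt was discarded
```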
▲ | tosh 5 days ago | parent | prev | next [-]
I think most mainstream APIs by now have a way for you to constrain the generated answer to a schema.
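For instance, OpenAI's structured outputs accept a JSON schema directly (interface as of this writing; model name illustrative, and other providers expose similar parameters):

```python
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # illustrative model name
    messages=[{"role": "user", "content": "List three colors."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "colors",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "colors": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["colors"],
                "additionalProperties": False,
            },
        },
    },
)
print(resp.choices[0].message.content)  # conforms to the schema, no fences
```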
▲ | barrell 5 days ago | parent | prev [-]
Yeah, that's infuriating. They're getting better now with structured data, but it's going to be a never-ending battle getting reliable data structures from an LLM. My issue is maybe more, maybe less insidious: the model will literally insert a random character into the middle of a word. I work with an app that supports 120+ languages, though. I give the LLM translations, transliterations, grammar features, etc., and ask it to explain them in plain English, so it's constantly switching between multiple real, and sometimes fake (transliteration), languages. I don't think most users would experience this.