llbbdd 10 days ago
Two things I would recommend trying if you're interested in exploring this further:

1. If you're not paying for a model, the results will be worse. That sucks, but the free-access models are just not very good for anything where you need to trust the output, even for basic queries.

2. More important than #1 is access to tool use. If the LLM is just producing a nutritional breakdown from its weights, it's almost always going to be wrong. If the LLM is allowed to break the problem down into deterministic steps, it will do a lot better.

In the nutritional breakdown case, an LLM with search + tool access can pretty easily break the problem down:

- Searching the web for a recipe or ingredient breakdown for the food

- Searching the web for the nutritional values of each ingredient per unit of weight or volume

- Writing and running a script (e.g. in Python) that takes in the recipe's projected serving output, the desired serving size, the amount of each ingredient, etc., scales the ingredients to match the desired serving size, and sums the nutritional values of the scaled ingredients.

I've tried this specific case with Claude and Gemini for my own purposes and they both handle it very well. The challenge currently is that the models will not always arrive at this approach when given an ambiguous prompt; sometimes they will, but sometimes they'll just vomit up a fully autocompleted response from their weights. Being more specific in the prompt, or defining a skill that details the intended approach, gets you more useful + deterministic results while still taking advantage of the fuzzy glue that LLMs can provide between steps.

Same with the classic strawberry r-counting case. IIUC LLMs have trouble with this because of how text is tokenized (the model sees tokens, not individual letters), but any LLM with tool access will have no trouble farming it out to e.g.

  > echo -n "strawberry" | grep -o "r" | wc -l
  > 3
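For what it's worth, the script in that last step can be very small. Here's a rough sketch of the kind of thing I mean, not what any particular model actually writes. The `Ingredient` class, the `nutrition_for_servings` function, and all the numbers are made up for illustration; in practice the per-100g values would come from the search step.

```python
from dataclasses import dataclass


@dataclass
class Ingredient:
    # Hypothetical record; field names are illustrative only.
    name: str
    grams: float             # amount used in the full recipe
    kcal_per_100g: float     # would come from the web search step
    protein_per_100g: float  # would come from the web search step


def nutrition_for_servings(ingredients, recipe_servings, desired_servings=1.0):
    """Scale the recipe to the desired number of servings and sum nutrition."""
    if recipe_servings <= 0:
        raise ValueError("recipe_servings must be positive")
    scale = desired_servings / recipe_servings
    totals = {"kcal": 0.0, "protein_g": 0.0}
    for ing in ingredients:
        scaled_grams = ing.grams * scale
        totals["kcal"] += scaled_grams * ing.kcal_per_100g / 100
        totals["protein_g"] += scaled_grams * ing.protein_per_100g / 100
    return totals


# Placeholder values, NOT real nutritional data.
recipe = [
    Ingredient("ingredient A", grams=200, kcal_per_100g=100, protein_per_100g=10),
    Ingredient("ingredient B", grams=100, kcal_per_100g=400, protein_per_100g=5),
]

# The recipe makes 4 servings; compute the totals for a single serving.
per_serving = nutrition_for_servings(recipe, recipe_servings=4, desired_servings=1)
print(per_serving)  # {'kcal': 150.0, 'protein_g': 6.25}
```

The point is that the arithmetic is done by the interpreter rather than by the model, so the only places left for error are the looked-up inputs, which you can check by hand.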