ijk 4 days ago:
True - though in the actual case of your examples, calcpay, process_user_input, and ProcessUserInput all encode to exactly 3 tokens with GPT-4, which is precisely the kind of information you want to know. It is very non-obvious which one will use more tokens; the Gemma tokenizer has the highest variance, with process|_|user|_|input = 5 tokens and Process|UserInput = 2 tokens. In practice, I'd expect the performance difference to be relatively minimal, as input tokens tend to get aggregated into more general concepts quickly. But that's the kind of question that's worth getting metrics on: my intuition suggests one answer, but do the numbers hold up when you actually measure?
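Checking is cheap, too. A minimal sketch with the tiktoken package (the counts depend on the tokenizer version; the Gemma checkpoint name in the comment below is an assumption, and that repo is gated on Hugging Face):

    # Count and show the token pieces GPT-4's tokenizer produces
    # for each naming style. Requires: pip install tiktoken
    import tiktoken

    enc = tiktoken.encoding_for_model("gpt-4")  # cl100k_base encoding
    for name in ["calcpay", "process_user_input", "ProcessUserInput"]:
        pieces = [enc.decode([t]) for t in enc.encode(name)]
        print(f"{name}: {len(pieces)} tokens -> {pieces}")

    # For Gemma, the Hugging Face tokenizer works similarly
    # (checkpoint name is an assumption; the repo requires access approval):
    #   from transformers import AutoTokenizer
    #   tok = AutoTokenizer.from_pretrained("google/gemma-2b")
    #   print(tok.tokenize("process_user_input"))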
quuxplusone 3 days ago:
Awesome! You should have written this blog post instead of that guy. :)