fragmede 5 hours ago:

What does it do when the model wants to return something else, and what's better or worse about doing it in llamafile versus whatever wrapper is calling it? How do I set retries? What if I want JSON and a range instead?
ekianjo 2 hours ago:
There are no retries, because none are needed. The grammar is enforced at sampling time inside llama.cpp (which llamafile is built on): at each step, tokens that would violate the grammar are masked out, so the model can only ever emit conforming output. A wrapper that validates after the fact has to retry; grammar-constrained sampling cannot produce invalid output in the first place.
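For the "JSON and a range" case, llama.cpp's GBNF grammars can express both at once. A minimal illustrative sketch (the key name `score` and the 0-100 range are made up for the example, not from any shipped grammar file):

```
root ::= "{\"score\": " num "}"
num  ::= "100" | [1-9] [0-9]? | "0"
```

This forces output like `{"score": 42}` and nothing else: the JSON structure is fixed by the string literals, and `num` only matches integers 0 through 100. Assuming your llamafile build exposes llama.cpp's grammar options, you'd pass it with something like `--grammar-file score.gbnf`.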