int_19h 2 days ago

But also, arguably, Lojban is the language you want to use for LLMs, especially for the chain of thought.

And the interesting property of Lojban is that it has an unambiguous grammar that can be syntax-checked by tools, enforced by schemas, and machine-translated back to English. I experimented with it a bit and found that large SOTA models can generate reasonably accurate translations if you give them tools like a dictionary and a parser and tell them to iterate until they get a syntactically valid translation that parses into what they meant to say. So perhaps there is a way to generate a large enough dataset to train a model on; I wish I had enough $$$ to try this on a lark.
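The iterate-until-it-parses loop is roughly the following Python sketch. Everything here is a stand-in: `translate` would be an LLM call in practice, and `parse_ok` would shell out to a real Lojban parser (e.g. camxes or jbofihe); the toy stubs below exist only to show the control flow.

```python
def refine_translation(english, translate, parse_ok, max_tries=5):
    """Ask for a Lojban translation, re-prompting with parser
    feedback until it parses or we give up."""
    feedback = None
    for _ in range(max_tries):
        lojban = translate(english, feedback)
        ok, feedback = parse_ok(lojban)
        if ok:
            return lojban
    return None  # never produced a syntactically valid translation

# Toy stand-ins for demonstration only.
attempts = iter(["mi prami do do", "mi prami do"])

def fake_translate(english, feedback):
    # A real implementation would include `feedback` in the prompt.
    return next(attempts)

def fake_parse(lojban):
    # Pretend only "mi prami do" ("I love you") parses cleanly.
    return (lojban == "mi prami do", "unexpected trailing sumti")

print(refine_translation("I love you.", fake_translate, fake_parse))
# → mi prami do
```

The key point is that the parser's error message is fed back into the next prompt, so the model is not just retrying blindly but correcting against a ground-truth grammar check.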