This is a major issue with LLMs in general, and it probably has to do with the transformer architecture. We need another breakthrough in the field for this to become reality.