This code creates a JSON intermediate representation that LLMs could probably consume. You might want to simplify it to focus on content and reduce token usage.