Remix.run Logo
smuhakg 8 hours ago

> On one hand, you have Microsoft's (awful) Copilot integration for Excel (in fairness, the Gemini integration in Google Sheets is also bad). So you can imagine financial directors trying to use it and it making a complete mess of the most simple tasks and never touching it again.

Microsoft has spent 30 years designing the most contrived XML-based format for Excel/Word/Powerpoint documents, so that it cannot be parsed except by very complicated bespoke applications with hundreds of developers involved.

Now, it's impossible to export any of those documents into plain text that an LLM can understand, and Microsoft Copilot literally doesn't work no matter how much money they throw at it. My company is now migrating Word documents to Markdown because they're seeing how powerful AI is.

This is karmic justice imo.

QuantumGood 7 hours ago | parent | next [-]

Tim Berners-Lee thought pages would become machine-readable long ago, with "obvious" benefits, and that idea partly drove XML, RDF and HTML 5. Now the benefit of doing so seems even bigger (but are they?), and the time spent making existing documents AI readable seems to keep growing.

martinald 7 hours ago | parent | prev | next [-]

Totally agree, though ironically Claude code works way better with Excel than I expected.

I even tried telling Copilot to convert each sheet to a CSV on one attempt THEN do calculations. It just ignored it and failed miserably, ironically outputting me a list of files that it should have made, along with the broken python script. I found this very amusing.

mattmanser an hour ago | parent [-]

Think why an LLM might really struggle with csvs. You're asking a chat bot to make a weird sentence with tons of commas in it.

I've read that they're supposed to be great with XML as it's so structured, better than JSON, but haven't actually found that to be the case.

irishcoffee 5 hours ago | parent | prev [-]

> Microsoft has spent 30 years designing the most contrived XML-based format for Excel/Word/Powerpoint documents, so that it cannot be parsed except by very complicated bespoke applications with hundreds of developers involved.

I had interns use c++ to unzip, parse, and repackage to json a standardized visio doc. I had no say in the standard, but specific blocks meant specific things, etc. The project was successful. The xml was parse-able... at least for our needs. The overall project died a swift death and this tidbit will probably be forgotten forever in the depths of repo heirarchy.

fragmede 3 hours ago | parent [-]

what would you have used?