Ukv a day ago

> Microsoft's Connected Experiences feature automatically gathers data from Word and Excel files to train the company's AI models. This feature is turned on by default, meaning user-generated content is included in AI training unless manually deactivated.

Not to say that Microsoft products respect privacy, but I don't see evidence that user Word/Excel files are being used for training.

The linked services agreement has had the same language (copy/transmit/etc. "to the extent necessary to provide the services") since at least 2015[0], and "connected experiences" seems to group a wide range of integrations; some like dictation/translation probably utilise ML, but that does not mean training on user content.

[0]: https://web.archive.org/web/20150608000921/https://www.micro...

itishappy a day ago

To play devil's advocate, I don't see any evidence they're NOT training on user content either. Compared to how explicitly they indicate they're not using user content for targeted advertising, this seems like a huge oversight. Given how carefully they've put together these documents, I'm doubtful it was an oversight.

cptskippy 16 hours ago

I think it's appropriate to be concerned and seek clarification. And I don't like people immediately seeking to vilify Microsoft as if they came over to their house and shot their dog in front of their kids.

ca_tech a day ago

Agreed. This was raised within our corp the other week and we read through the privacy and security documentation as it relates to Connected Experiences. Microsoft has outlined specifically what Connected Experiences covers.[1] [2] You could argue that predictive text is a product of machine learning, but there is no clause allowing for training any generalized large language models using this data. The confusion may have arisen if they read an article about Copilot. If the user had a Microsoft 365 Copilot license, then the data would be used as grounding for their personal interaction with Copilot, but it would still not be used to train any foundational LLMs. Even then, this data is still managed in compliance with Microsoft's data security and privacy agreements.

[1] https://learn.microsoft.com/en-us/microsoft-365-apps/privacy...

[2] https://learn.microsoft.com/en-us/microsoft-365-apps/privacy...