digitaltrees 4 hours ago

Because Cursor gets some of the highest-quality training data: prompts from the world's programmers, responses from the full ecosystem of model vendors, and access to active codebases. xAI wants the data.

anukin 3 hours ago | parent | next [-]

"Highest" is extremely subjective in the case of Cursor. It's not exactly used by experienced programmers and caters mainly to neophytes.

w10-1 3 hours ago | parent | next [-]

> caters mainly to neophytes

Perhaps that's where the money and strategy are: (a) stronger need; (b) if you can build systems without real expertise, you don't have to stomach experts' salaries or politics.

digitaltrees 2 hours ago | parent | prev [-]

Umm. It's real development work in real settings with real model output. That is a high-quality dataset. Pointing out that it isn't good code from elite engineers confuses what "good" means in the context of coding agents. First, a model needs to learn how to respond to a range of prompts, and for that you need diverse real-world conversations. Second, it needs to respond with good code, which comes from labeling, other data curation after the fact, or other training methods. So code quality is a downstream consideration.

bdamm 3 hours ago | parent | prev [-]

Ah hah, this is it. I was also confused - the tool isn't the thing. It's the behavior analysis capability.

scottyah 3 hours ago | parent [-]

Plus user-funded model distillation lol.