simoncion 2 days ago

> That's why I care so much about differentiating between the shady stuff that they DO and the stuff that they don't.

Ah, good. So you have solid evidence that they're NOT doing shady stuff. Great! Let's have it.

"It's unfair to require me to prove a negative!" you say? Sure, that's a fair objection... but my counter to that is "We'll only get solid evidence of dirty dealings if an insider turns stool pidgeon.". So, given that we're certainly not going to get solid evidence, we must base our evaluation on the behavior of the companies in other big projects.

Over the past few decades, Google, Facebook, and Microsoft have not demonstrated that they're dedicated to behaving ethically. (And their behavior has gotten far, far worse over the past few years.) OpenAI's CEO is plainly and obviously a manipulator and savvy political operator. (Remember how he once declared that it was vitally important that he could be fired?) Anthropic's CEO just keeps lying to the press [0] in order to keep fueling AGI hype.

[0] Oh, pardon me. He's "making a large volume of forward-looking statements that -due to ever-evolving market conditions- turn out to be inaccurate". I often get that concept confused with "lying". My bad.

simonw a day ago

So call them out for the bad stuff! Don't distract from the genuine problems by making up stuff about them ignoring robots.txt directives despite their documentation clearly explaining how those are handled.
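
For concreteness, here's a minimal sketch of what "handled" means mechanically, using Python's standard-library robotparser to evaluate directives the way a compliant crawler is supposed to. The user-agent tokens below (GPTBot, ClaudeBot) and the paths are illustrative assumptions; check each vendor's current docs for the exact strings they say they honor.

  # Minimal sketch, assuming the standard-library robotparser: it answers
  # "does this robots.txt allow this crawler to fetch this URL?"
  from urllib import robotparser

  ROBOTS_TXT = """\
  User-agent: GPTBot
  Disallow: /

  User-agent: ClaudeBot
  Disallow: /private/
  """

  rp = robotparser.RobotFileParser()
  rp.parse(ROBOTS_TXT.splitlines())

  # A compliant crawler honors these answers; whether any given crawler
  # actually does is exactly what's in dispute in this thread.
  print(rp.can_fetch("GPTBot", "https://example.com/article"))       # False
  print(rp.can_fetch("ClaudeBot", "https://example.com/private/x"))  # False
  print(rp.can_fetch("ClaudeBot", "https://example.com/article"))    # True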

simoncion a day ago

> So call them out for the bad stuff!

I am. I am also saying that -because the companies involved have demonstrated that they're either frequently willing to do things that are scummy as shit or "just" have executives who make a habit of lying to the press to keep the hype train rolling- it's very, very likely that they're quietly engaging in antisocial behavior in order to make development of their projects some combination of easier, quicker, or cheaper.

> Don't distract from the genuine problems by making up stuff...

Right back at you. You said:

  > There are definitively scrapers that ignore your robots.txt file

  Of course. But those aren't the ones that explicitly say "here is how to block us in robots.txt"

But you don't have any proof of that; it's pure speculation on your part. Given how frequently and to what degree the major companies involved in this ongoing research project engage in antisocial behavior [0], it's more likely than not that they are doing shady shit. As I mentioned, there's a ton of theoretical money on the line.

The unfortunate thing for us is that neither of us can do anything other than speculate... unless an insider turns informant.

[0] ...and given that the expected penalties for engaging in most of the antisocial behavior relevant to the AI research project fall somewhere between "absolutely nothing at all" and "maybe six to twelve months of expected revenue"...