malfist 5 hours ago

I mean, if they've consumed all of human knowledge, what's left for them to train on? This pivot isn't only because it's cheaper and a way to juice the numbers for an IPO; it's survival, because they can't improve any further.

hasteg an hour ago | parent | next [-]

IIRC, when they make a big enough architecture change to the model, they need to rerun pre-training. So it's not like they're feeding it more data (they will be, but it will be a drop in an S3 bucket compared to their dataset reserves), but rather training models with different architectures.

5 hours ago | parent | prev | next [-]
[deleted]
applicative 5 hours ago | parent | prev [-]

It did sound to me like they feel some sort of wall coming.