raincole 8 hours ago

I don't even know what it means. "Even without an invasion"? The author thinks China would destroy TSMC just because? To slow down AI progress?

> if we got to a situation where only the U.S. had the sort of AI that would give us an unassailable advantage militarily, then the optimal strategy for China would change to taking TSMC off of the board.

Lmao it's not. The author doesn't know what they're talking about at all. Let's be realistic: current TSMC technology will become accessible to China, likely via espionage; the only question is how soon. It has already happened before: China's 7nm process was developed with help from one of TSMC's highest-level former researchers[0].

[0]: https://en.wikipedia.org/wiki/Liang_Mong_Song

fc417fc802 6 hours ago | parent | next [-]

If cutting edge were a hard requirement, then given the lead times involved I think the author would be correct. However, I think there's a fundamental error in failing to account for the fact that you don't need cutting-edge chips to do AI. Sure, they make it cheaper and faster, but they're absolutely not a requirement. You could train a state-of-the-art model on a cluster of 12+ year old boxes (i.e. Intel's 22 nm process and DDR3), but if you want to get the job done in a similar timeframe you're going to pay out the ass for electricity. Your research pipeline would necessarily be narrower due to physical and monetary limitations, but that's not the end of the world.
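To put a rough number on "pay out the ass for electricity," here's a hedged back-of-envelope sketch. Every figure is an illustrative assumption, not a measurement: the FLOP budget, the efficiency gap between ~2012-era and modern silicon, and the electricity price are all placeholders chosen for order-of-magnitude plausibility.

```python
# Rough, hypothetical back-of-envelope: electricity cost of a fixed
# training FLOP budget on old vs. modern hardware.
# All numbers below are illustrative assumptions, not measurements.

FLOP_BUDGET = 1e24             # assumed total training FLOPs (frontier-model order)

# Assumed sustained efficiency, FLOP/s per watt:
MODERN_FLOPS_PER_WATT = 1e11   # e.g. a recent accelerator, ~100 GFLOP/s/W
OLD_FLOPS_PER_WATT = 1e9       # e.g. ~2012-era silicon, ~1 GFLOP/s/W

PRICE_PER_KWH = 0.10           # assumed industrial electricity price, USD

def energy_cost_usd(flops, flops_per_watt, price_per_kwh=PRICE_PER_KWH):
    """Electricity cost to execute `flops` at a given efficiency."""
    joules = flops / flops_per_watt   # FLOP / (FLOP/J) = J
    kwh = joules / 3.6e6              # 1 kWh = 3.6e6 J
    return kwh * price_per_kwh

modern = energy_cost_usd(FLOP_BUDGET, MODERN_FLOPS_PER_WATT)
old = energy_cost_usd(FLOP_BUDGET, OLD_FLOPS_PER_WATT)
print(f"modern: ~${modern:,.0f}, old: ~${old:,.0f}, ratio: {old / modern:.0f}x")
```

Under these assumptions the old-hardware run costs about 100x more in electricity alone, i.e. painful but not obviously impossible for a state actor.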

alex43578 5 hours ago | parent | next [-]

That’s like saying you could train a state of the art model by hand, and it’ll only cost you a lot of man-hours.

Realistically, to train a frontier model you’d need quite a lot of compute. GPT-4, which is already old news, was supposedly trained on 25,000 A100s.
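The scale that "25,000 A100s" implies can be roughed out. This is a hedged sketch: the A100 peak is a public spec, but the utilization figure and training duration are assumptions, so the result is only an order-of-magnitude estimate.

```python
# Illustrative arithmetic on the scale implied by "25,000 A100s".
# The A100 figure is the public dense fp16/bf16 peak; utilization and
# duration are assumptions, so treat the result as order-of-magnitude.

A100_FP16_FLOPS = 312e12   # A100 dense fp16/bf16 peak, FLOP/s (public spec)
N_GPUS = 25_000
UTILIZATION = 0.35         # assumed sustained model FLOPs utilization
DAYS = 90                  # assumed training duration

seconds = DAYS * 86_400
total_flops = A100_FP16_FLOPS * N_GPUS * UTILIZATION * seconds
print(f"~{total_flops:.1e} total training FLOPs")
```

Under these assumptions the run lands around 2e25 FLOPs, which is the kind of budget the 100x-efficiency argument above has to swallow.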

There’s just no reasonable way of catching up to modern hardware with old hardware plus time and electricity.

fc417fc802 4 hours ago | parent [-]

Training methods and architectures keep getting more efficient by leaps and bounds, and scaling up was well into the realm of diminishing returns last I checked. The necessity of exceeding 100B parameters seems questionable. Just because you can get some benefit by piling on ever more data doesn't mean you have to.

Also keep in mind we aren't talking about a small company wanting to do competitive R&D on a frontier model. We're talking about a world superpower that operates nuclear reactors and built something the size of the Three Gorges Dam deciding that a thing is strategically necessary. If it were willing to spend the money, I am absolutely certain it could pull this off.

storystarling 5 hours ago | parent | prev [-]

I suspect the bottleneck on 12+ year old hardware wouldn't be power but the interconnects. SOTA training is bound by gradient synchronization latency. Without NVLink you hit a hard wall where the compute spends most of its time waiting on PCIe or ethernet.
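A hedged sketch of why the interconnect dominates: in a ring all-reduce (the common gradient-synchronization collective), each worker moves roughly 2*(N-1)/N of the gradient bytes per step, so per-step sync time scales with model size over link bandwidth. The model size, worker count, and effective bandwidths below are all illustrative assumptions.

```python
# Back-of-envelope: per-step ring all-reduce time for gradient sync
# under assumed link bandwidths. All figures are illustrative assumptions.

PARAMS = 7e9            # assumed 7B-parameter model
BYTES_PER_GRAD = 2      # fp16 gradients
WORKERS = 64            # assumed cluster size

# Assumed effective per-link bandwidths, bytes/s:
LINKS = {
    "NVLink (~300 GB/s)": 300e9,
    "PCIe 3.0 x16 (~16 GB/s)": 16e9,
    "10 GbE (~1.25 GB/s)": 1.25e9,
}

def ring_allreduce_seconds(params, bytes_per_grad, workers, bandwidth):
    """Bandwidth-only model: ring all-reduce moves ~2*(N-1)/N of the
    gradient bytes through each link per synchronization step."""
    volume = 2 * (workers - 1) / workers * params * bytes_per_grad
    return volume / bandwidth

for name, bw in LINKS.items():
    t = ring_allreduce_seconds(PARAMS, BYTES_PER_GRAD, WORKERS, bw)
    print(f"{name}: ~{t:.2f} s per sync")
```

With these numbers, sync that takes a fraction of a second over NVLink stretches to tens of seconds over commodity ethernet, which is the "compute waiting on the wire" wall described above.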

fc417fc802 4 hours ago | parent [-]

Fair point. Though if this were actually attempted I imagine it would start with making changes to the model architecture, the physical hardware, or both.

My hypothetical is probably somewhat over the top anyway; isn't China already somewhere in the vicinity of 7 nm at present?

mytailorisrich 5 hours ago | parent | prev | next [-]

Mainland China has no interest in destroying the fabs in Taiwan, quite the opposite.

Taiwan might go scorched earth and destroy them itself, but that sounds more like a threat meant to secure US support.

For the US the threat is either destruction of the fabs or China gaining leverage over them if it gets control of the fabs.

stefan_ 7 hours ago | parent | prev [-]

Hiring people isn't espionage. Key talent leaving a dominant manufacturer for a paycheck at a struggling competitor, and bringing their knowledge with them, is just about the basis of capitalism.