gardnr 5 hours ago
> The training and deployment of LongCat-2.0 are built on large-scale clusters of tens of thousands of AI ASIC superpods. Compared to the mature Nvidia GPU ecosystem, the supporting software community is still less developed. We have therefore put significant effort into building a stable, secure, and scalable infrastructure.

This is the real news story. It looks like they may have used Huawei Ascend 910C chips: https://nitter.net/teortaxesTex/status/2071708141037781407#m
BoorishBears 2 hours ago
If they really managed everything from pre-training a 1.6T-parameter model through post-training without NVIDIA, Dwarkesh Patel got what he wanted.