If they really managed this from pre-training a 1.6 T parameter model through to post-training without NVIDIA, Dwarkesh Patel got what he wanted.
Who? What did he want?
Dwarkesh Patel has AI/ML guests on his podcast. BoorishBears may have been referring to the Jensen Huang episode where they discuss TPUs: https://youtu.be/Hrbq66XqtCo?t=982