you could probably train a GPT-2-sized model with a SOTA architecture on a 2008 supercomputer. it would take a while, though.
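rough sanity check, assuming the standard 6ND FLOPs rule of thumb for transformer training, GPT-2 XL's 1.5B parameters, a hypothetical 30B-token budget, and IBM Roadrunner (the first petaflop machine, online in 2008) at an optimistic 10% utilization:

```python
# Back-of-envelope estimate. Assumptions (not from the original post):
#   - training compute ~ 6 * N * D FLOPs (common rule of thumb)
#   - N = 1.5e9 params (GPT-2 XL), D = 30e9 tokens (hypothetical budget)
#   - Roadrunner-class peak ~1.4 PFLOP/s, 10% sustained utilization
#     (optimistic for 2008 Cell/Opteron nodes with no modern ML stack)

params = 1.5e9
tokens = 30e9
train_flops = 6 * params * tokens        # ~2.7e20 FLOPs

peak_flops = 1.4e15                      # Roadrunner peak, 2008
utilization = 0.10                       # assumed

seconds = train_flops / (peak_flops * utilization)
print(f"{train_flops:.2e} FLOPs -> ~{seconds / 86400:.0f} days")
```

at those numbers it's on the order of weeks of wall-clock time, so the compute itself is plausible; the harder part would be the software, since 2008-era frameworks weren't built for this and keeping utilization anywhere near 10% on that hardware is its own project.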