refulgentis 2 days ago
Right - I'd suggest that the claim that 128 GB of GPU RAM gives you an 8K context shows it may be worth revising priors such as "it has strong advantages for inference as well as training." As Mr. Hildebrand used to say, when you assume, you make... (Also note that the article specifically frames this speccing-out as being about training :) it's not just me suggesting it.)