measurablefunc 2 hours ago
I'm correct on the technical level as well: https://chatgpt.com/s/t_698293481e308191838b4131c1b605f1
refulgentis 2 hours ago
That math is for comparing all n-grams for all n <= N simultaneously, which isn't what was being discussed. For any fixed n-gram size, the complexity is still O(N^2), same as standard attention.
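
A rough sketch of the counting behind this distinction, assuming "comparing" means scoring every ordered pair of n-grams in a sequence of N tokens. The function names are hypothetical and this is a simplification, not anything taken from the linked chat:

```python
def pairwise_comparisons(seq_len: int, n: int) -> int:
    """Ordered pairs of n-grams of one fixed size n in a sequence of seq_len tokens.

    Assumption: a sequence of length N has N - n + 1 contiguous n-grams,
    and every n-gram is compared against every n-gram (itself included).
    """
    num_ngrams = max(seq_len - n + 1, 0)
    return num_ngrams ** 2


def all_sizes_comparisons(seq_len: int) -> int:
    """Total comparisons when every size n = 1..seq_len is handled at once."""
    return sum(pairwise_comparisons(seq_len, n) for n in range(1, seq_len + 1))


if __name__ == "__main__":
    for seq_len in (8, 64, 512):
        fixed = pairwise_comparisons(seq_len, 3)   # grows like N^2
        combined = all_sizes_comparisons(seq_len)  # grows like N^3 / 3
        print(seq_len, fixed, combined)
```

Under these assumptions a fixed n gives (N - n + 1)^2 comparisons, which is O(N^2), while summing over every n <= N gives N(N + 1)(2N + 1)/6, which is O(N^3).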