spwa4 2 days ago
Ah ... the temptation of the optimizer. It's such a simple algorithm, yet it has far more impact on back-propagation calculations than ... the actual backprop calculation, never mind details like model architecture. So tempting to work on it. But so very, very, very, very hard to make progress on it. Even at PhD level. Just don't try ...