bigdict 2 days ago
What's the point of the ReLU in the loss function? Its inputs are nonnegative anyway.
Nevermark 2 days ago
Let's try to keep things positive. | ||||||||
GolDDranks 2 days ago
I wondered the same. It seems like it would just make a V-shaped loss around zero, but abs already has that property!
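To make it concrete, a minimal PyTorch sketch (torch assumed available; the tensor values are just illustrative). ReLU is the identity on nonnegative inputs, so relu(abs(x)) reduces to abs(x):

    import torch
    import torch.nn.functional as F

    x = torch.linspace(-3, 3, 7)

    abs_loss = x.abs()             # V-shaped around zero, already nonnegative
    relu_of_abs = F.relu(x.abs())  # ReLU is a no-op on nonnegative input

    assert torch.equal(abs_loss, relu_of_abs)
    print(abs_loss)  # tensor([3., 2., 1., 0., 1., 2., 3.])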
andy_ppp 2 days ago
In reality it's probably not a ReLU; modern LLMs use GELU or something more advanced.
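For reference, the two only differ below zero (a rough sketch, again assuming PyTorch): GELU is smooth and dips slightly negative, while ReLU clips hard at zero.

    import torch
    import torch.nn.functional as F

    x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(F.relu(x))  # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000])
    print(F.gelu(x))  # roughly tensor([-0.0455, -0.1543, 0.0000, 0.3457, 1.9545])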
meindnoch 2 days ago
Sometimes a cosmic ray might hit the sign bit of the register and flip it to a negative value. So it is useful to pass it through a rectifier to ensure it's never negative, even in this rare case. | ||||||||
fancyfredbot 2 days ago
I thought the belt and braces approach was a valuable contribution to AI safety. Better safe than sorry with these troublesome negative numbers! | ||||||||