Tiny hackable CUDA language model implementation (github.com)
25 points by markusheimerl 3 days ago | 2 comments
yobbo 3 hours ago
Looks very nice, but I can't find numerical gradient checks, which are helpful for verifying that the backward pass is correct: https://github.com/markusheimerl/gpt/blob/main/transformer/a...
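
To illustrate what such a check looks like: a numerical gradient check compares the gradient produced by the backward pass against a central finite difference of the loss. The following is only a minimal sketch in Python on a toy tanh unit with squared-error loss. The names `loss`, `analytic_grad`, `numerical_grad`, and `max_rel_error` are hypothetical and are not taken from the linked repository.

```python
import math

def loss(w, x, y):
    """Toy loss (an assumption for illustration): 0.5 * (tanh(w.x) - y)^2."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 0.5 * (math.tanh(z) - y) ** 2

def analytic_grad(w, x, y):
    """Hand-derived backward pass for the toy loss above."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    a = math.tanh(z)
    dz = (a - y) * (1.0 - a * a)  # chain rule through tanh
    return [dz * xi for xi in x]

def numerical_grad(f, w, eps=1e-5):
    """Central differences: (f(w + eps*e_i) - f(w - eps*e_i)) / (2*eps)."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_plus[i] += eps
        w_minus = list(w)
        w_minus[i] -= eps
        grad.append((f(w_plus) - f(w_minus)) / (2.0 * eps))
    return grad

def max_rel_error(a, b):
    """Largest element-wise relative difference between two gradients."""
    return max(abs(p - q) / max(abs(p), abs(q), 1e-12) for p, q in zip(a, b))

# Arbitrary test point, chosen only for the example.
w = [0.3, -0.7, 0.5]
x = [1.0, 2.0, -1.5]
y = 0.25

g_analytic = analytic_grad(w, x, y)
g_numeric = numerical_grad(lambda v: loss(v, x, y), w)
rel_err = max_rel_error(g_analytic, g_numeric)
print(f"max relative error: {rel_err:.2e}")
```

In double precision a correct backward pass should give a very small relative error here, while a bug such as a missing factor shows up as an error of order one. For float32 CUDA kernels a larger `eps` and a looser tolerance are usually needed, since rounding error dominates for very small perturbations.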