cubefox 2 hours ago

This is the only paper I know of that really does this:

https://proceedings.neurips.cc/paper_files/paper/2024/hash/7...

They train directly in the 1-bit domain, with no floating-point weights at all. Instead of the classical (Newton-Leibniz) derivative of calculus, which operates on real numbers and their floating-point approximations, they introduce a binary counterpart they call "Boolean variation" and use it for gradient descent / backpropagation.
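To get an intuition for what "no real-valued gradients, just bit flips" can look like, here is a toy sketch of a single {-1, +1} neuron trained by asking, per weight, the Boolean question "would flipping this bit help on the examples we currently get wrong?" and flipping the bit with the most votes. This is my own illustrative rule, not the paper's Boolean-variation calculus; the function name and flip heuristic are assumptions for the example.

```python
import random

def toy_boolean_train(samples, n_bits, epochs=20, seed=0):
    """Toy bit-flip trainer for one {-1, +1} neuron (illustration only).

    For each misclassified sample, a weight gets a vote if flipping it
    would move the pre-activation sum toward the correct label. Each
    epoch flips the single weight with the most votes. No floating-point
    weights or real-valued derivatives are involved.
    """
    rng = random.Random(seed)
    w = [rng.choice((-1, 1)) for _ in range(n_bits)]

    def predict(x):
        s = sum(wi * xi for wi, xi in zip(w, x))
        return 1 if s >= 0 else -1

    for _ in range(epochs):
        votes = [0] * n_bits  # how often flipping bit i would help
        errors = 0
        for x, y in samples:
            if predict(x) != y:
                errors += 1
                for i, xi in enumerate(x):
                    # flipping w[i] changes the sum by -2 * w[i] * xi;
                    # that helps iff the change points toward label y
                    if (-2 * w[i] * xi) * y > 0:
                        votes[i] += 1
        if errors == 0:
            break
        best = max(range(n_bits), key=lambda i: votes[i])
        w[best] = -w[best]
    return w, predict
```

On a small realizable task (labels generated by some hidden {-1, +1} weight vector), this majority-vote flipping typically recovers a consistent classifier in a handful of flips. The actual paper goes much further, defining a proper Boolean analogue of variation and chaining it through layers the way backpropagation chains derivatives.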

I don't know why this paper didn't get more attention.