esafak 3 days ago

Great writeup. Are there any libraries that implement some of the methods described?

gdiamos 2 days ago | parent [-]

ScalarLM uses Tokenformer adaptors by default, which have learnable keys and values.

https://www.scalarlm.com/blog/tokenformer-a-scalable-transfo...