esafak 3 days ago
Great writeup. Are there any libraries that implement some of the methods described?

gdiamos 2 days ago
ScalarLM uses tokenformer adaptors by default, which have learnable key/values: https://www.scalarlm.com/blog/tokenformer-a-scalable-transfo...
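For anyone wondering what "learnable key/values" means in practice, here is a rough sketch of the general idea: the input attends over a small set of trainable key/value rows instead of over other tokens. This is not ScalarLM's API or the Tokenformer reference code. The class name `LearnableKVAdaptor`, the slot count, the plain softmax, the residual connection, and the zero-initialised values are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LearnableKVAdaptor:
    """Hypothetical adaptor: a token vector attends over learnable key/value rows."""

    def __init__(self, dim, num_slots, seed=0):
        rng = random.Random(seed)
        # Keys and values are parameters (they would be trained), not projections of the input.
        self.keys = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_slots)]
        # Assumption for this sketch: zero values, so the adaptor starts as an identity map.
        self.values = [[0.0] * dim for _ in range(num_slots)]

    def forward(self, x):
        scale = 1.0 / math.sqrt(len(x))
        # Score the input against every learnable key.
        scores = [scale * sum(xi * ki for xi, ki in zip(x, key)) for key in self.keys]
        weights = softmax(scores)
        # Weighted sum of the learnable values, added back to the input (residual).
        update = [sum(w * v[j] for w, v in zip(weights, self.values)) for j in range(len(x))]
        return [xi + ui for xi, ui in zip(x, update)]


adaptor = LearnableKVAdaptor(dim=4, num_slots=3)
x = [1.0, 2.0, 3.0, 4.0]
out = adaptor.forward(x)  # equals x while the values are still zero
```

Training would update `keys` and `values` while the base model's weights stay frozen, which is what makes this usable as an adaptor.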