i000 4 hours ago

Would it make sense to embed such a single-purpose network with fixed weights within an LLM before pre-training?