Tuna-Fish 20 hours ago

HBF is not that. The paper you linked is about using existing flash memory to boost LLM performance through all kinds of optimization tricks. HBF is about making flash memory that doesn't need any of those tricks and simply has the read throughput that inference requires.