Public Runtime for Convera for LLMs (github.com)
2 points by cjparadise 14 hours ago | 4 comments
cjparadise 14 hours ago | parent | next [-]

Don't quantize, use CONVERA. Instead of focusing only on faster hardware or larger models, it focuses on:

> Reusing work that has already been done.

In its current public form, CONVERA:

- runs LLMs locally (HuggingFace)

- executes prompts through a controlled runtime

- caches repeated prompt results

- detects reuse opportunities

- returns measurable latency improvements on repeat runs
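
To make the caching and repeat-run points concrete, here is a minimal sketch of exact-match prompt caching with a latency measurement. This is not CONVERA's actual code: `PromptCache`, `slow_fake_model`, and the `"demo-model"` name are hypothetical, and the fake model stands in for a real local HuggingFace call so the example runs without downloading anything.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class PromptCache:
    """Exact-match cache keyed on a hash of (model name, prompt).

    Hypothetical illustration only; not taken from the CONVERA repository.
    """

    def __init__(self, generate: Callable[[str], str], model_name: str = "demo-model"):
        self._generate = generate
        self._model_name = model_name
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prompt: str) -> str:
        # Include the model name so two models never share a cached answer.
        raw = f"{self._model_name}\x00{prompt}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def run(self, prompt: str) -> Tuple[str, float, bool]:
        """Return (output, latency in seconds, whether this was a cache hit)."""
        start = time.perf_counter()
        key = self._key(prompt)
        hit = key in self._store
        if hit:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._generate(prompt)
        return self._store[key], time.perf_counter() - start, hit


def slow_fake_model(prompt: str) -> str:
    # Stand-in for a real local model call (assumption: generation is the slow part).
    time.sleep(0.05)
    return prompt.upper()


if __name__ == "__main__":
    cache = PromptCache(slow_fake_model)
    for attempt in range(2):
        output, latency, hit = cache.run("hello")
        print(f"run {attempt + 1}: hit={hit} latency={latency * 1000:.2f} ms")
```

The second run skips generation entirely, so the latency gain only applies to prompts that repeat byte for byte. It also assumes deterministic decoding; with sampling enabled, a cached answer is one of many the model could have produced.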
