Generate per-session LoRA adapters in <1s for agentic inference efficiency

	▲	Generate per-session LoRA adapters in <1s for agentic inference efficiency(github.com)
		2 points by Facingsouth 12 hours ago \| 1 comment

	▲	Facingsouth 12 hours ago \| parent [-]
		Quick Start

		Generate LoRA Adapters

		From metadata (JSON string or file):

		  tessera generate \
		    --from-metadata '{"task": "classification", "domain": "medical"}' \
		    --base-model mistralai/Mistral-7B-Instruct-v0.2 \
		    --rank 16 \
		    --save ./adapter.safetensors

		From text description:

		  tessera generate \
		    --from-text "Medical diagnosis assistant" \
		    --base-model mistralai/Mistral-7B-Instruct-v0.2 \
		    --rank 16 \
		    --save ./adapter.safetensors

		From document:

		  tessera generate \
		    --from-doc ./document.txt \
		    --base-model mistralai/Mistral-7B-Instruct-v0.2 \
		    --rank 16 \
		    --save ./adapter.safetensors

		Base Model Management

		Download a base model from HuggingFace Hub:

		  tessera model pull mistralai/Mistral-7B-Instruct-v0.2
		  tessera model pull meta-llama/Llama-3.1-8B-Instruct
		  tessera model pull deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

		Start vLLM with a base model:

		  tessera model serve-model mistralai/Mistral-7B-Instruct-v0.2 --port 8000
		  tessera model serve-model mistralai/Mistral-7B-Instruct-v0.2 --gpu-memory-utilization 0.9
		  tessera model serve-model mistralai/Mistral-7B-Instruct-v0.2 --quantization awq

		List cached base models:

		  tessera model list-models

		Remove a cached model:

		  tessera model remove mistralai/Mistral-7B-Instruct-v0.2

		Start Tessera Server

		Start the hypernetwork server (with auto vLLM):

		  tessera serve --port 8080 --base-model mistralai/Mistral-7B-Instruct-v0.2

		Start the hypernetwork server (standalone):

		  tessera serve --port 8080 --host 0.0.0.0

		Check Server Health

		  tessera health --url http://localhost:8080

		List Available Models

		  tessera list
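
		If the server is started in the background, a script may need to wait for it before doing anything else. Below is a minimal sketch of a readiness poll built around the `tessera health` command from the post. It assumes `tessera health` exits with status 0 when the server is healthy and non-zero otherwise, which the post does not confirm. The helper name `wait_for_tessera` and its retry and delay parameters are hypothetical.

```shell
# Sketch: poll the health check until the server answers or attempts run out.
# ASSUMPTION: `tessera health` exits 0 when healthy, non-zero otherwise.
# `wait_for_tessera` is a hypothetical helper, not part of the tessera CLI.
wait_for_tessera() {
  url="${1:-http://localhost:8080}"   # server URL, as used in the post
  attempts="${2:-30}"                 # maximum number of health checks
  delay="${3:-1}"                     # seconds to sleep between checks
  i=1
  while [ "$i" -le "$attempts" ]; do
    if tessera health --url "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "server at $url not healthy after $attempts attempts" >&2
  return 1
}

# Example use (commented out so that sourcing this file has no side effects):
#   tessera serve --port 8080 --base-model mistralai/Mistral-7B-Instruct-v0.2 &
#   wait_for_tessera http://localhost:8080 60 1
```

		The function only wraps a command that already appears above. If the exit-code assumption turns out to be wrong, the check inside the loop would have to parse the command's output instead.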