trueno 2 days ago
when you say "same prompt", are you saying it's a similar prompt and something in the middle determines "this is basically the same question", or is it looking for someone who, for whatever reason, prompted, then copied and pasted that prompt and submitted it again word for word?
kaliades 2 days ago | parent
Exact match, word for word. agent-cache takes everything that defines an LLM request - which model you're calling (gpt-4o, Claude, etc.), the full conversation history (system prompt + user messages + assistant responses), sampling parameters like temperature, and any tool/function definitions the model has access to - serializes it all into a canonical JSON string with sorted keys, and hashes it with SHA-256. That hash is the cache key in Valkey. Same inputs down to the last character = cache hit, anything different = miss.

If you want the "basically the same question" behavior, that's our other package - @betterdb/semantic-cache. It embeds the prompt as a vector and does similarity search, so "What is the capital of France?" and "Capital city of France?" both hit. The trade-off is it needs valkey-search for the vector index, while agent-cache works on completely vanilla Valkey with no modules.

In practice, agent-cache hits its cache less often than semantic-cache would, but when it does hit, you know the result is correct - there's no chance of returning a response for a question that was similar but not actually the same.
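To make the keying scheme concrete, here's a minimal Python sketch of the idea as described above - canonical JSON with sorted keys, hashed with SHA-256. The function name and field names are illustrative, not agent-cache's actual internals:

```python
import hashlib
import json

def cache_key(model, messages, temperature=1.0, tools=None):
    # Hypothetical sketch: bundle every field that defines the request,
    # serialize to canonical JSON (sorted keys, no stray whitespace),
    # and hash the result. The hex digest becomes the Valkey key.
    payload = {
        "model": model,
        "messages": messages,      # full history: system + user + assistant turns
        "temperature": temperature,
        "tools": tools,            # tool/function definitions, if any
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

msgs = [{"role": "user", "content": "What is the capital of France?"}]
k1 = cache_key("gpt-4o", msgs, temperature=0.0)
k2 = cache_key("gpt-4o", msgs, temperature=0.0)
k3 = cache_key("gpt-4o", msgs, temperature=0.7)  # one parameter changed
print(k1 == k2)  # identical inputs -> identical key (cache hit)
print(k1 == k3)  # any difference at all -> different key (miss)
```

Canonical serialization (sorted keys, fixed separators) is what makes the hash stable: two dicts with the same content but different key order still produce the same key.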