AlphaWeaver 2 hours ago

Do you think there's a path where you can pregenerate popular paths of dialogue to avoid LLM inference costs for every player? And possibly pair it with a lightweight local LLM to slightly adapt the responses? While still shelling out to a larger model when users go "off the rails"?
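
Roughly what I'm picturing, as a sketch; every name, intent, and cache shape here is invented for illustration, not a claim about how the game actually works:

    # Hypothetical tiered dialogue router: take the cheapest path that still fits.
    from dataclasses import dataclass

    @dataclass
    class Reply:
        text: str
        source: str  # "local" (adapted pregen) or "frontier" (big model)

    # Stand-in pregenerated cache keyed by (scene, player intent).
    PREGEN = {("tavern", "ask_rumors"): "The barkeep leans in and lowers his voice..."}

    def classify_intent(scene: str, player_line: str) -> str | None:
        # Placeholder: a tiny classifier or embedding match would go here.
        return "ask_rumors" if "rumor" in player_line.lower() else None

    def local_adapt(base: str, player_line: str) -> str:
        # Placeholder for a small local LLM lightly rephrasing the canned line.
        return base

    def frontier_generate(scene: str, player_line: str) -> str:
        # Placeholder for the expensive hosted-model call.
        return "...freshly generated dialogue..."

    def respond(scene: str, player_line: str) -> Reply:
        intent = classify_intent(scene, player_line)
        if intent and (scene, intent) in PREGEN:
            # On-script: adapt a pregenerated line with the local model.
            return Reply(local_adapt(PREGEN[(scene, intent)], player_line), "local")
        # Off the rails: pay for the big model.
        return Reply(frontier_generate(scene, player_line), "frontier")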

themanmaran 2 hours ago | parent [-]

Not the founder, but having run conversational agents at decent scale, I don't think the cost actually matters much early on.

It's almost always better to pay more for the smarter model than to potentially give players a worse experience.

If they had 1M+ players there would certainly be room to optimize, but starting out you'd spend more trying to engineer the model switcher than you'd save in token costs.

tom_0 an hour ago | parent [-]

I agree; trying to save on costs early on is basically betting against models getting better and cheaper. On top of that, in almost every case people prefer the best model they can get!

Beyond that, I think our selling point is rewarding creativity with emergent behavior. Baked dialogue would turn into a traditional game with worse writing pretty quickly, and then you've got a problem. For example, the AI game at [1] does multiple-choice dialogue with a local model, and people seem a bit lukewarm about it.

We could use that approach to cache popular Q&A, but in my experience humans are insane and nobody ever says even remotely similar things to robots :)
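
If we ever did try it, I'd expect it to look like an embedding-similarity lookup, roughly like this sketch (the embed() placeholder and the 0.97 threshold are invented; a real version would call an actual embedding model):

    import hashlib

    # Hypothetical semantic cache: reuse an answer only when a new player
    # line is near-identical to one we've already paid a big model to answer.
    def embed(text: str) -> list[float]:
        # Placeholder embedding; swap in a real embedding model here.
        digest = hashlib.sha256(text.lower().encode()).digest()
        return [b / 255 for b in digest[:8]]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    CACHE: list[tuple[list[float], str]] = []  # (embedding, cached answer)

    def cached_answer(player_line: str, threshold: float = 0.97) -> str | None:
        v = embed(player_line)
        best = max(CACHE, key=lambda entry: cosine(v, entry[0]), default=None)
        if best is not None and cosine(v, best[0]) >= threshold:
            return best[1]
        return None  # miss: fall through to real generation, then append to CACHE

And given how weird players actually are, I'd expect the hit rate at any safe threshold to round to zero.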

[1] https://store.steampowered.com/app/2828650/The_Oversight_Bur...