is that because of more deterministic AI flows like LLM-as-judge, RAG rerankers, post-eval, etc.?
do you think something like LangGraph state is sufficient?