The FP math is deterministic. However, the environments in which inference is run and specifically batching make current LLM services practically non-deterministic.