pzo | 7 hours ago
They did explain a little bit:

> We’ll be able to do things like run fast models on the edge, run model pipelines on instantly-booting Workers, stream model inputs and outputs with WebRTC, etc.

The benefit to third-party developers is lower latency and a more robust AI pipeline. Instead of going back and forth with an HTTPS request at each stage of inference, you could do it all in one request, e.g. real-time, pipelined STT, text translation, some backend logic, TTS, and back to the user's mobile device.
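A rough sketch of what that single-request pipeline could look like in a Cloudflare Worker. The specific model names, the input/output field shapes, and the `env.AI.run` binding signature used here are assumptions for illustration, not a confirmed API surface:

    // worker.ts — hypothetical sketch: one edge request runs the whole
    // STT -> translation -> TTS pipeline instead of three client round trips.
    export interface Env {
      // Assumed shape of a Workers AI binding.
      AI: { run(model: string, inputs: Record<string, unknown>): Promise<any> };
    }

    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        // 1. Speech-to-text on the uploaded audio (model name assumed).
        const audio = new Uint8Array(await request.arrayBuffer());
        const stt = await env.AI.run("@cf/openai/whisper", {
          audio: Array.from(audio),
        });

        // 2. Translate the transcript (English -> French in this sketch).
        const translated = await env.AI.run("@cf/meta/m2m100-1.2b", {
          text: stt.text,
          source_lang: "en",
          target_lang: "fr",
        });

        // 3. Any backend logic (lookups, filtering, prompting) would run here.

        // 4. Text-to-speech on the translated text; the output field and
        //    encoding depend on the model and are assumed here.
        const speech = await env.AI.run("@cf/myshell-ai/melotts", {
          prompt: translated.translated_text,
        });

        return new Response(speech.audio, {
          headers: { "Content-Type": "audio/mpeg" },
        });
      },
    };

The point of the pattern is that the intermediate text never leaves the edge: the client uploads audio once and gets audio back, rather than paying a network round trip per stage.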
weird-eye-issue | 6 hours ago
You are seemingly answering something that they did not ask at all.
badmonster | 6 hours ago
Does edge inference really solve the latency issue for most use cases? How does the cost compare at scale?