Wan Streamer v0.1: End-to-End Real-Time Interactive Foundation Models (wan-streamer.com)
2 points by davedx 8 hours ago | 1 comment
davedx 8 hours ago
Wan Streamer is a native-streaming, end-to-end interactive foundation model, designed from the ground up for real-time, low-latency, full-duplex audio-visual interaction. It models language, audio, and video as both input and output within a single Transformer: the sequence is an interleaving of visual, audio, and text input tokens with visual, audio, and text output tokens, coordinated by block-causal attention for incremental streaming.
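
For anyone unfamiliar with block-causal attention, here is a rough sketch of the general idea, not Wan Streamer's actual code. It assumes each streaming step forms one block holding a single token per modality (real models would use many tokens per block), that tokens attend freely within their own block, and that they attend only to earlier blocks otherwise. The names `interleave`, `block_causal_mask`, and `MODALITIES` are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical per-step layout: input tokens first, then output tokens.
MODALITIES = ["video_in", "audio_in", "text_in",
              "video_out", "audio_out", "text_out"]


def interleave(num_steps: int) -> Tuple[List[str], List[int]]:
    """Build a toy interleaved sequence and the block index of each token."""
    tokens: List[str] = []
    block_ids: List[int] = []
    for step in range(num_steps):
        for name in MODALITIES:
            tokens.append(f"{name}@{step}")
            block_ids.append(step)  # every token of a step shares one block
    return tokens, block_ids


def block_causal_mask(block_ids: List[int]) -> List[List[bool]]:
    """mask[q][k] is True when query token q may attend to key token k.

    A token sees its whole block plus every earlier block, never a later one.
    """
    return [[key_block <= query_block for key_block in block_ids]
            for query_block in block_ids]


tokens, block_ids = interleave(num_steps=2)
mask = block_causal_mask(block_ids)

# Print the mask as a grid: '#' = may attend, '.' = masked out.
for name, row in zip(tokens, mask):
    print(f"{name:>12} " + "".join("#" if allowed else "." for allowed in row))
```

Because no block ever looks at a later one, keys and values from finished blocks can be cached and each new block processed as it arrives, which is what makes this pattern suit incremental streaming.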