| ▲ | Wan Streamer v0.1: End-to-End Real-Time Interactive Foundation Models (wan-streamer.com) | |
| 16 points by smusamashah 2 days ago | 1 comment | ||
| ▲ | kgeist a day ago | parent [-] | |
I tried a few SOTA realtime avatar systems from Chinese labs, and the actual quality was far worse than the amazing (cherry-picked) videos on their demo pages. I ran an analysis on hundreds of generated videos featuring various races/ethnicities and found that the Chinese models are overfitted on East Asian faces (predictable, though) and have trouble properly animating many European and most African faces (bad lipsync). They all had artifacts that accumulate over the long term: the video stops being stable after N seconds, for example the image gets more and more washed out. So I don't have high hopes here: everyone on the demo page is predictably East Asian, and the output quality doesn't look better than prior art. I guess the innovation here is that it's end-to-end, but we need to see if it's any good. WAN-derived image-audio-to-video systems used to be notoriously slow. Here they boast 25 FPS at 192p, but that's actually pretty slow; I managed to reach a similar FPS at 720p with prior art. | ||