▲ | tomcam 3 days ago
Incredibly, no videos linked in an article about a video newscast. I think this is an example. The AI doesn't even pronounce "AI" correctly. Interestingly, it looks slightly offscreen just the way real newsreaders do when they're on prompter. https://www.youtube.com/watch?v=Aa7Q2S7VWUk They do a lot right. There's interaction between the bots. They look kind of professional but not Los Angeles/New York quality, which is what you'd expect from a smallish market. Their movement is also kind of stiff and amateurish, which I believe is intentional.
▲ | dylan604 3 days ago
Newscast teleprompters sit directly in front of the camera lens, specifically so that newsreaders don't look away from the lens. This has been a solved problem for decades. Perhaps you're thinking of cue cards, or the teleprompters speakers use when giving a speech to a live audience?
▲ | joe_the_user 3 days ago
I could only find this video. James' arms go up and down in an alarming manner. Rose has more natural movements, but the voice you hear when her mouth moves is worse than the worst foreign film voice-over. Somehow the person and the voice are mismatched in "tone" in a way that's hard to describe.
▲ | beepbooptheory 3 days ago
Love this so much, not in the way intended. It's just so strange! I can't put my finger on it, but it feels like something Tim and Eric, or Tim Robinson, or even Alan Resnick would have a hand in. There is a kind of aesthetic immanence to the whole thing; everything is right on the surface. The voices are only just embodied "enough," their unearned confidence, their "affectations." The deadpan delivery on an absurd stage. The colors all feel like a cake that is too sweet. Like approximating a memory of a broadcast. It is hilarious and beautiful. No notes.
▲ | syncsynchalt 3 days ago
I was surprised at how game the AI was to pronounce the Hawaiian place names. It was confident enough that I assumed the pronunciation was correct. The article notes that it is butchering the place names, though. To me this illustrates a common cognitive mismatch when evaluating AI: it can be confident in a way that most humans can't, and that misleading social cue is another reason we trust its output.
▲ | yardstick 3 days ago
The first thing I thought of when I saw this is that some mid-tier dictatorships could replace a lot of their newscasters with this approach. You can always guarantee they’ll say what they need to say, and a lack of emotion is a plus, maybe? Except when the dear leader passes; then you bring out a real person for the emotions.
▲ | vbarrielle 3 days ago
Looks like they're using something like motion matching to recover fragments of the presenter's motion that match the pronounced phonemes. The actors were probably instructed to avoid almost all movement to make sure it was blendable. That would explain why the guy's hands have such erratic and unnatural movement.
▲ | 1024core 3 days ago
James' lips don't seem to move at all. The problem with such "videocasts" (as opposed to "podcasts") is that there is another channel that the AI has to control: the video. Generating convincing video is much harder than generating convincing audio.
▲ | glandium 3 days ago
Watching this, I'm left wondering why my brain doesn't want to blend the visual and the audio. I don't think it's the bad lip sync. I have this weird feeling that these people, were they real, wouldn't have these voices. But I can't quite put my finger on why. I haven't watched movie dubs in a while, but maybe that's the same kind of phenomenon that makes dubs sound bad. Or maybe we grow an intuitive sense of what a person's voice might sound like based on the appearance of their face's bone structure and muscles?
▲ | cowsandmilk 3 days ago
The male host’s hands are literally on a loop; it is disturbing. And the female host had several nonsensical sentence fragments. The script isn’t even up to par with what you would see in a college news show.
▲ | chrononaut 3 days ago
The way the mouths move is so far off from the words they're speaking that my first impression would be that they're just playing a video loop of these people talking about other random things and dubbing over it.
▲ | onemoresoop 3 days ago
What problem are they trying to solve though?
▲ | pbronez 3 days ago
Thanks for the primary source. Concur the quality is poor. Google’s NotebookLM podcast summary is way more natural sounding. The video was even worse than the audio. The lip sync was off. The girl looked like someone else’s mouth was mapped onto her face.
▲ | rightbyte 3 days ago
The guy looks like a deceased person used as a marionette and the girl speaks like a tenor. But I guess it will become better. TV will turn so soulless, even when compared to today. Imagine Rakuten Dog Does Funny Stuff channel with this added as some filler. Dystopic.
▲ | itronitron 3 days ago
>> The AI doesn't even pronounce "AI" correctly.
You can call me Al.
▲ | bpm140 2 days ago
NGL, the pronunciation of Waikiki as “Why, Kiki?” made me laugh out loud.
▲ | shepherdjerred 3 days ago
Not great, but it's surprisingly good if you can make this with just a text prompt.
▲ | mock-possum 3 days ago
Ooh wow I hate this. Totally soulless appearance and delivery - and the robotic fidgeting the dude is doing with his hands completely distracts from everything else. It’s totally normal to make that kind of movement while speaking for emphasis - but whatever he’s doing does not look normal. (The mouths look nightmarish as well.)
▲ | insane_dreamer 3 days ago
"James" arm motion in a loop.