Remix.run Logo
bpanahij 2 hours ago

You should be skeptical, and try it out. I selected 28 long conversations for our evaluation set, all unseen audio. Every turn taking model makes tradeoffs, and I tried to make the best tradeoffs for each model by adjusting and tuning the implementations. I’m certainly not in a position as the creator of Sparrow to be totally objective. However we did use unaltered real conversational audio to evaluate. I tried to find examples that would challenge Sparrow-1 with lots of variation in speaker style across the conversations.