nextworddev 7 hours ago
Can I connect this to Twilio?
kwindla 6 hours ago
One easy way to build voice agents and connect them to Twilio is the Pipecat open source framework. Pipecat supports a wide variety of network transports, including the Twilio MediaStream WebSocket protocol, so you don't have to bounce through a SIP server. Here's a getting started doc.[1] (If you do need SIP, this Asterisk project looks really great.)

Pipecat has 90 or so integrations with all the models/services people use for voice AI these days. NVIDIA, AWS, all the foundation labs, all the voice AI labs, most of the video AI labs, and lots of other people use/contribute to Pipecat. And there's lots of interesting stuff in the ecosystem, like the open source, open data, open training code Smart Turn audio turn detection model [2], and the Pipecat Flows state machine library [3].

[1] - https://docs.pipecat.ai/guides/telephony/twilio-websockets

[2] - https://github.com/pipecat-ai/smart-turn

[3] - https://github.com/pipecat-ai/pipecat-flows/

Disclaimer: I spend a lot of my time working on Pipecat. I also write about both voice AI in general and Pipecat in particular. For example: https://voiceaiandvoiceagents.com/
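For anyone curious what the Twilio MediaStream WebSocket protocol mentioned above looks like on the wire, here is a minimal sketch using only the Python standard library. It is not Pipecat's implementation. It assumes the JSON text frames Twilio documents for Media Streams (`start`, `media`, and `stop` events, with base64-encoded audio in `media.payload`). The names `handle_twilio_message`, `build_outbound_media`, and the `state` dict are hypothetical, and the WebSocket server itself is left out.

```python
import base64
import json
from typing import Optional


def handle_twilio_message(raw: str, state: dict) -> Optional[bytes]:
    """Process one JSON text frame from a Twilio Media Streams WebSocket.

    Returns the decoded audio bytes for "media" events, otherwise None.
    `state` is a hypothetical per-call dict used to remember stream details.
    """
    msg = json.loads(raw)
    event = msg.get("event")

    if event == "start":
        # The start event carries the stream ID needed to send audio back.
        state["stream_sid"] = msg["start"]["streamSid"]
        state["media_format"] = msg["start"].get("mediaFormat", {})
    elif event == "media":
        # Audio arrives base64-encoded (typically 8 kHz mu-law).
        return base64.b64decode(msg["media"]["payload"])
    elif event == "stop":
        state["stopped"] = True
    # Other events (e.g. "connected", "mark") are ignored in this sketch.
    return None


def build_outbound_media(stream_sid: str, audio: bytes) -> str:
    """Build a JSON frame that sends audio back to the caller."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(audio).decode("ascii")},
    })
```

In a real agent, the bytes returned from `handle_twilio_message` would feed the STT stage, and the TTS output would go back through `build_outbound_media`. A framework transport such as Pipecat's handles this plumbing for you.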
ldenoue 4 hours ago
I developed a stack on Cloudflare Workers where latency is super low and it is cheap to run at scale thanks to Cloudflare pricing. It runs at around 50 cents per hour using AssemblyAI or Deepgram for STT, Gemini Flash as the LLM, and InWorld.ai for TTS (for me it's on par with ElevenLabs and super fast).
VladVladikoff 6 hours ago
Technically yes, Twilio has SIP trunks.