icelancer 3 months ago

Yeah, I've had no issues sending tokens up to the context limit. I cut it off with a 10% buffer but that's just to ensure I don't run into tokenization miscounting between tiktoken and whatever tokenizer my actual LLM uses.
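A minimal sketch of that buffering approach, assuming tiktoken's `cl100k_base` encoding; the function name and the 128k context figure are placeholders, not from the original comment:

    import tiktoken

    def truncate_with_buffer(text: str, context_limit: int, buffer: float = 0.10,
                             encoding_name: str = "cl100k_base") -> str:
        """Keep at most (1 - buffer) * context_limit tokens, counted with tiktoken.

        The 10% buffer absorbs count drift between tiktoken and whatever
        tokenizer the target model actually uses.
        """
        enc = tiktoken.get_encoding(encoding_name)
        budget = int(context_limit * (1 - buffer))
        tokens = enc.encode(text)
        return text if len(tokens) <= budget else enc.decode(tokens[:budget])

    # Example: a 128k-context model, so send at most ~115k tokens.
    safe_prompt = truncate_with_buffer("...very long transcript...", context_limit=128_000)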

I have had little success with Gemini on long videos, though. My pipeline is video -> ffmpeg (strip audio) -> whisperX ASR -> Groq (L3-70b-specdec) -> gpt-4o/sonnet-3.5 for summarization. Works great.
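A rough sketch of the front half of that pipeline, assuming whisperX's Python API and a CUDA GPU; file names, model size, and batch size are illustrative placeholders (the Groq and summarization passes are left as a comment since the original doesn't detail them):

    import subprocess
    import whisperx  # https://github.com/m-bain/whisperX

    def strip_audio(video_path: str, audio_path: str = "audio.wav") -> str:
        """Extract mono 16 kHz WAV with ffmpeg, an ASR-friendly format."""
        subprocess.run(
            ["ffmpeg", "-y", "-i", video_path,
             "-vn", "-ac", "1", "-ar", "16000", audio_path],
            check=True,
        )
        return audio_path

    def transcribe(audio_path: str, device: str = "cuda") -> str:
        """Run whisperX ASR and join the segments into a flat transcript."""
        model = whisperx.load_model("large-v2", device, compute_type="float16")
        audio = whisperx.load_audio(audio_path)
        result = model.transcribe(audio, batch_size=16)
        return " ".join(seg["text"].strip() for seg in result["segments"])

    transcript = transcribe(strip_audio("talk.mp4"))
    # From here the transcript would go through an LLM cleanup pass
    # (Groq's L3-70b-specdec in the comment above) and then to
    # gpt-4o / sonnet-3.5 for summarization.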