ldenoue a year ago
My repo doesn't reprocess the audio track. Instead, it improves the raw ASR text transcript by feeding the LLM additional info (the title and description) and asking it to fix errors. It is not perfect: it sometimes replaces words with synonyms, but it is much faster and cheaper. With Gemini 1.5 Flash-8B, it costs about $1 per 500 hours of transcript.
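For illustration, the idea can be sketched roughly like this. This is not the repo's actual code: the function names, the prompt wording, and the 500-word chunk size are all assumptions, and `llm` is a placeholder for whatever client call you use (Gemini or otherwise). It only shows the shape of the approach: split the raw transcript into chunks, give the model the title and description as context, and ask it to fix errors without paraphrasing.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int = 500) -> List[str]:
    """Split a transcript into chunks of at most max_words words (assumed size)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompt(title: str, description: str, chunk: str) -> str:
    """Build a correction prompt; the wording here is hypothetical."""
    return (
        "Fix speech-recognition errors in the transcript below. "
        "Use the video title and description to correct names and technical terms. "
        "Do not paraphrase or replace words with synonyms.\n\n"
        f"Title: {title}\n"
        f"Description: {description}\n\n"
        f"Transcript:\n{chunk}"
    )


def correct_transcript(
    title: str,
    description: str,
    transcript: str,
    llm: Callable[[str], str],
    max_words: int = 500,
) -> str:
    """Send each chunk to the LLM with the extra context and join the results."""
    fixed = [
        llm(build_prompt(title, description, chunk)).strip()
        for chunk in chunk_words(transcript, max_words)
    ]
    return " ".join(fixed)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the code offline."""
    raw = prompt.split("Transcript:\n", 1)[1]
    return raw.replace("jiminy", "Gemini")


if __name__ == "__main__":
    print(correct_transcript(
        "Gemini demo", "Intro to Gemini", "we use jiminy to fix text", fake_llm
    ))
```

Passing the model in as a plain callable keeps the sketch independent of any particular SDK. The "do not replace words with synonyms" instruction is an attempt to limit the synonym problem mentioned above, with no guarantee that a model follows it.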