Just tried it with B.E.D - Walk Away [0]. Unfortunately, it lost track of the lyrics after 30 seconds (model is "large-v3"). I'll play around with it a bit more, as it would be great to have a working karaoke generator.

Some quick feedback:

  - Needs a way to skip forwards/backwards during playback to validate the result
  - Sentences seem to be recognized (the first letter is capitalized), but periods aren't added
  - Needs an option to edit results from the track analysis

Thanks for keeping it FOSS!

[0]: https://www.youtube.com/watch?v=_MFT4H3VoNE

▲ djtango 7 hours ago | parent | next [-]

Periods in song lyrics?

	▲ gaudystead 2 hours ago | parent [-]
		I'm guessing they mean punctuation in general?

▲ rzzzzru 6 hours ago | parent | prev [-]

hey mate! thanks for your feedback.

indeed, I'm running into two problems on the analyzer side: 1. the align model sliding off (especially w/ chorus/backing vocals present) 2. the transcript skipping parts of the lyrics in lyrics-heavy tracks (I tried a lot of Russian rap, lol)

happy for contributions as I'm not that experienced w/ the machine learning side of the project, mostly it was empirical "tweak the parameters and see what changes"

	▲ rzzzzru 6 hours ago | parent [-]
		also, the selected model only affects the transcript job (I need to make that clearer in the UI). For the alignment, it's a single model provided by whisperx