Does the 169M include the ~90M params for the Mimi codec? Interesting approach using FiLM for speaker conditioning.
No, the 169M doesn’t include the Mimi codec’s params.