Rochus 4 days ago

> experiment with Large Language Models (LLMs) a la GPT-4 or Llama3 in the area of music generation

LLMs are/were indeed used for music generation, but none of these results were convincing from my perspective as a practicing musician. Language is just too different from music, so the results of (symbolic) music generation based on DNNs with linguistic embeddings are only good by chance, if at all. Convincing systems like Udio instead use an architecture for music generation like the one described e.g. in this article: https://towardsdatascience.com/audio-diffusion-generative-mu.... An LLM is only used to interpret the text input and map it to musical features, not for the actual music generation.