ahofmann 2 days ago

While this might be a technical breakthrough, none of the examples sounded any good. Every aspect of the provided sounds is bad. The music sounds muffled and badly mixed. The generated beat isn't a beat that grooves, or has anything interesting in it. The barking saxophone sounded just bad. The voices sounded somewhat convincing.

In general I think that with AI generated audio it is much more noticeable how utterly bad everything that AI generates is. I already absolutely hate the two AI voices that are in a lot of YouTube videos and are a reason for me to close the video immediately most of the time.

leopoldj 2 days ago | parent | next [-]

While I agree with you, this is the release of a research paper [1] and some accompanying demos on GitHub [2]. This is not a finished product fine tuned for high quality output.

[1] https://d1qx31qr3h6wln.cloudfront.net/publications/FUGATTO.p...

[2] https://fugatto.github.io/

RobinL 2 days ago | parent | prev | next [-]

With apologies for the X link, here is an example from Suno which felt very musical to me: https://x.com/sunomusic/status/1857501332560818342

Here's another example on the Suno website: https://suno.com/song/fc991b95-e4e9-4c8f-87e8-e5e4560755e7

ben_w 2 days ago | parent | next [-]

There are things I like from Suno, but, having used it to make quite a lot, I also get vibes of something subtly wrong that I can't put my finger on, which I assume is somewhere between the audio version of bad kerning and Cronenberg fingers. Too many examples of vocoder/autotune in the training set, perhaps?

That said, I mostly prefer AI over "real" human-made recordings (pop, classical, metal, bardcore, whatever) because I tend to learn the patterns too fast to enjoy, or really even tolerate, any recording more than about 3 times* — I assume I'd like live jazz for longer, but have only been to one place that ever had it so I don't know if it breaks that pattern.

* sole exception: TV theme tunes, though the point of them isn't to listen to them

numpad0 2 days ago | parent | prev | next [-]

I don't find any problems whatsoever in those audio, but I'm not an avid music listener, so out of intuition I'm making a guess that there's the same underlying issue as in image generation: AI makes technically horrible and rage-inducing fillers that lack high-level semantic structure, but average people have neither the words nor the experience to assess and describe what's going on.

ahofmann 2 days ago | parent [-]

> I don't find any problems whatsoever in those audio

I think this is why there is no real, powerful protest against all that generated stuff. Only the people who care are able to articulate what's wrong with it. To me, all AI generated content sounds horrible. To almost everyone else, this sounds ok. So we will see and hear more of this generated stuff. We are in the middle of the enshittification of all consumable media.

ben_w 2 days ago | parent | next [-]

I think there's a lot of different reasons all simultaneously going on.

Most human musicians have very little power; that's been going away for a long time, ever since "canned music" "robots" pushed live bands out of cinemas a century ago: https://www.smithsonianmag.com/history/musicians-wage-war-ag...

Most popular music already feels, and to an extent is, fake. Not only because mere recording allows repeated takes until it's inhumanly "perfect".

When I played an MP3 of Britney Spears to my mum around the turn of the century, she thought it was a robot singing because of the autotune.

The Monkees was famously an attempt at a manufactured band whose members just happened to not feel like playing that game and did it for real, Gorillaz is even more obviously manufactured. Parasocial relationships are inherently different from "real" relationships, but the performers have to pretend that it's personal when they address a crowd or a camera.

Axis of Awesome demonstrated the similarity of most modern hits with their "4 Chords": https://youtu.be/oOlDewpCfZQ?feature=shared

Those with the power were, possibly still are, the record labels; but if the AI are trained on the works of small musicians that can't afford the copyright cases or the political influence, but also whose works are not under the umbrella of the labels who do have those resources but not the right or short term motivation to intercede on their behalf, the big labels themselves may lose the consumer market to free AI output, while professionals will dismiss both the AI output and the label's output as "just different kinds of slop but both slop" (or whatever the current insult du jour is for AI).

numpad0 2 days ago | parent | prev | next [-]

Agreed. To me AI generated images look horrible, and AI generated audio is still somewhat gut twisting but less painful. AI generated code works for HTML/CSS+JS, but not that great for others. AI generated e-commerce reviews ... on par with human reviews?

I'm starting to think that what AI might be replacing is the high end of consumption, not the low end of generation. Art has followers that are often less historically significant than genre-pioneering works. Doesn't that seem like what AI is doing?

com2kid 2 days ago | parent | prev | next [-]

People were happy with the included wired iPhone earbuds for years, even though they were terrible.

Listening on a laptop speaker Suno sounds fine. Listening on my wireless earbuds it is... ok. I am too lazy anymore to pull out any of my high quality wired headphones, and if somebody who used to care about sound quality enough to purchase multiple HQ headphones can't be arsed, then the general public really is going to think everything is just fine.

numpad0 2 days ago | parent [-]

That kind of quality limitation is not the point. Nor are extra digits in images.

Generative AI outputs trigger uncanny valley discomfort that professionals and connoisseurs are better equipped to verbalize.

The quality question is what the point of stuff like that is, and whether it is good, or even safe, for us to consume.

anonzzzies 2 days ago | parent | prev [-]

I find almost all popular music that's made in the past 20+ years quite terrible. This is not worse. For people who enjoy this chewing gum stuff, which seems most of the population of earth, this is fine. And as such, this will be all popular music in the future; upload your voice, pick a style, generate 13 songs, go on tour to make money.

codedokode 2 days ago | parent [-]

> go on tour

Why go on tour if you can send an AI singer instead and if you cannot sing as well as it anyway?

anonzzzies 2 days ago | parent [-]

They want to believe it is human; however when the robots get good enough... That's further away though maybe.

ahofmann 2 days ago | parent | prev [-]

Suno is by far the best generated AI music I've heard. That said, it is hot garbage.

I've listened to both songs on my Bose QC ultra headphones, which are far from perfect headphones. But even on them, the female voice has unbearable resonances in the higher frequencies. The male voice sounds mostly ok, but also has something that sounds like compression artifacts (like mp3 compression, not loudness compression). All instruments in these songs have these problems. They sound somewhat like the real thing, but really badly recorded. Also, the mixing isn't any good.

It is still very impressive that AI can generate that. But if I recorded my band and someone created such a mix out of it, I would fire them immediately. Heck, I would be furious that they fucked up so bad and would try to get my money back.

So the two links you provided just confirm what I said.

CraftingLinks 2 days ago | parent | next [-]

I use Suno like a producer in a music studio hires musicians to bring ideas to life. I wish more features in Suno would empower music producers. I sample pieces, re-mix doodles, get ideas to continue my tracks... I can see the future, and as an amateur, it's just liberating and a lot of fun.

snapcaster 2 days ago | parent | prev [-]

Really interesting, haven't listened to their output with high quality speakers or anything like that. Do poorly made human recordings have this problem or is this currently a signal of AI generation?

codedokode 2 days ago | parent | prev [-]

This might be because of dataset quality, because most high-quality content is in commercial music and sample libraries.

squarefoot 2 days ago | parent [-]

This. And the world isn't ready for that, including copyright laws that must be radically changed in a way that doesn't harm innovation. Suno v4 has become a complete disaster for some genres, and that could be due to the lawsuit that is forcing them to retrain the model using non-copyrighted works, which in my opinion is pure bollocks. Imagine forcing an artist to unlearn what they listened to in their younger years and what helped forge their personal style. Sorry, but I'm pessimistic. If we don't change how copyright works, pretty much every development in the field will be ruined by greedy copyright holders and their lawyers as soon as it shows any capability to produce decent music that barely resembles something else.

codedokode 2 days ago | parent | next [-]

Should not the author be able to decide if his work may be used for generative AI?

> Imagine forcing an artist to unlearn

Mathematical models cannot learn. What happens in fact is the owner of generative AI takes a bunch of copyrighted works which took a lot of effort and money to produce (instruments, mics and other equipment are super expensive), puts them into a computer and sells whatever the computer has calculated from those recordings. Do you see any learning or any creativity here?

There were cases when suno (or udio) was reproducing producer tags almost verbatim (but in lower quality) for example. This shows that the model was not simply calculating some probabilities of patterns of pitches, durations etc, but was storing the copyrighted content almost unmodified.

Also, personally I have no interest in a service that generates a song for you because it takes away all the fun. Maybe something that helps to find mistakes in composed music and help learning would be much more useful.

jojo_ 2 days ago | parent | next [-]

Lots of artists can reproduce existing content, should we get rid of them entirely, or just restrict them from publishing such content? If anything, it's the responsibility of the publisher to avoid copyright infringement.

> Also, personally I have no interest in a service that generates a song for you because it takes away all the fun. Maybe something that helps to find mistakes in composed music and help learning would be much more useful.

You are not forced to use the full raw output, you could use it sparingly in your new composition, the same way you might use ChatGPT to improve your lyrics.

All my non-musician friends were thrilled to generate music. It's already extremely fun and will keep getting better. I think it lowers the barrier of entry and will increase the total amount of performers, the "real" musicians. I am sure musicians playing instruments back in the day had the same idea about digital music: "Not playing with the physical instruments takes away all the fun. You can't touch, smell, feel it. It has a negative impact on the music and on the people. I am so smart, I am a democrat, you guys are nazis, you want to destroy humanity while I want to restrict the majority of the people from having fun.".

> lot of effort and money to produce

Mathematical formulas too, and you can't copyright them.

If a new device is invented to replay memory "almost verbatim (but in lower quality)", should its use be restricted with regard to copyrighted content? It's your memory, your unique interpretation, shouldn't it belong to you?

AI will get better and you'll be able to easily go up the tree from which content was derived (intentionally or not) based on the similarity and the publication date.

Artists don't need more protections than mathematicians.

squarefoot 2 days ago | parent | prev [-]

> Do you see any learning or any creativity here?

Of course not if we take it to the extreme, i.e. only copyrighted work reproduced almost identically, but I've used the platform with my own music and it reorganized it in a very interesting way, actually inspiring new songs and arrangements which I'll probably play with real instruments. I haven't the slightest interest in replicating top chart garbage; however, lawsuits by major labels are also ruining the creative aspect where no copyrighted work is involved. Suno is now quite likely retraining their model only on free music because of the lawsuits, and despite the hype, for some genres the last version turned out awful.

Arainach 2 days ago | parent | prev [-]

>Sorry, but I'm pessimistic. If we don't change how copyright works, pretty much every development in the field will be ruined by greedy copyright holders and their lawyers

Sorry, but I'm pessimistic. If we don't change how AI regulation works, pretty much every creative field will be ruined by greedy tech companies and their planet-burning plagiarism devices.