sroerick 6 hours ago

This is a very interesting thought. I'm not super experienced with low-level audio and basically completely ignorant of telephony.

I feel like most people doing audio in music are not working at that low level. Even if they are creating their own plugins, they are probably not integrating with the audio interface. The point of JACK or PipeWire is basically to abstract all of that away so people can focus on the instrument.

The latency budget in music is much, much tighter than in voice (telephony tolerates something like 150 ms of one-way delay, while musicians playing together notice a few tens of milliseconds), so any latency spike would render network audio completely unusable. I know Zoom has a "real time audio for musicians" feature, but outside of a few Zoom demos during lockdown, I'm not sure anybody uses it.

PipeWire supports audio channels over the network, but again I'm not entirely sure what this is for. Certainly it's useful for streaming music from device A to device B, but I'm not sure anybody uses it in a production setting.

I could see something like a "live coding symphony", where people have their own live-coding setups and the audio is generated on a central server. This is not too different from what, say, Animal Collective did. But while live coding is a beautiful medium in its own right, it does lack the muscle memory and tactile feedback you get from playing an instrument.

I would love to see these fields collaborate, as you said, but to me these are the immediate blockers that make it less practical.

qwertox an hour ago | parent | next

"Even if they are creating their own plugins, they are probably not integrating with the audio interface".

The audio interface is abstracted away in exchange for the buffer itself plus some metadata about its properties (sample format, rate, channel count), and that is true for basically everything related to audio: the buffer is the lowest level the OS offers you. Within your DSP/instrument you are still free to implement lower-level stuff, like hand-written assembly or SSE-, AVX-, or NEON-based acceleration.

You get chunks of samples in a buffer, you read them, do something with them and write the result out into another buffer.
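
To make that concrete, here is roughly what that contract looks like as a minimal JACK client (a sketch: the client and port names and the gain operation are arbitrary, and error handling is mostly omitted; PipeWire's JACK compatibility layer runs the same client unchanged):

    /* Minimal JACK client: read a buffer, process it, write a buffer. */
    #include <jack/jack.h>
    #include <stdio.h>

    static jack_port_t *in_port, *out_port;

    /* JACK calls this once per period with nframes samples. */
    static int process(jack_nframes_t nframes, void *arg) {
        float *in  = jack_port_get_buffer(in_port,  nframes);
        float *out = jack_port_get_buffer(out_port, nframes);
        for (jack_nframes_t i = 0; i < nframes; i++)
            out[i] = 0.5f * in[i];  /* "do something": here, -6 dB gain */
        return 0;
    }

    int main(void) {
        jack_client_t *c = jack_client_open("gain_demo", JackNullOption, NULL);
        if (!c) { fprintf(stderr, "is a JACK server running?\n"); return 1; }
        in_port  = jack_port_register(c, "in",  JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsInput,  0);
        out_port = jack_port_register(c, "out", JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsOutput, 0);
        jack_set_process_callback(c, process, NULL);
        jack_activate(c);
        getchar();              /* process audio until Enter is pressed */
        jack_client_close(c);
        return 0;
    }

Build with something like gcc demo.c -ljack. Everything interesting happens in process(); the rest is plumbing.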

"Pipewire supports audio channels over network" thanks for reminding me: I'm planning to stream the audio out of my Windows machine to a raspi zero to which I will then connect my bluetooth headphones. First tests worked, but the latency is really bad with shairport-sync [0] at around 400 ms. This is what I would use Pipewire for, if my workstation were Linux and not Windows.

Maybe Snapcast [1] could be interesting for you: "Snapcast is a multiroom client-server audio player, where all clients are time synchronized with the server to play perfectly synced audio. It's not a standalone player, but an extension that turns your existing audio player into a Sonos-like multiroom solution."

"I could see something like a "live coding symphony", where people have their own livecoding setups and the audio is generated on a central server." Tidal Cycles [2] might interest you, or the JavaScript port named Strudel [3]. Tidal can synchronize multiple instances via Link Synchronization. Then there's Troop [4], which "is a real-time collaborative tool that enables group live coding within the same document across multiple computers. Hypothetically Troop can talk to any interpreter that can take input as a string from the command line but it is already configured to work with live coding languages FoxDot, TidalCycles, and SuperCollider."

[0] https://github.com/mikebrady/shairport-sync

[1] https://github.com/snapcast/snapcast

[2] https://tidalcycles.org

[3] https://strudel.cc

[4] https://github.com/Qirky/Troop

harvey9 5 hours ago | parent | prev

Regarding Zoom, 1:1 online music lessons are still pretty common. I would guess this won't hold up with multiple musicians.

NikolaNovak 4 hours ago | parent

Music lessons online are common (I've been in them) because they're largely half-duplex. Student plays, teacher listens. Then teacher comments and demonstrates, student listens.

There are projects that aim to provide synced multi-player jamming, but last I checked they were all based around looping. The human ear SHOCKINGLY does not lend itself to being fooled and will notice surprisingly small sync issues; tolerances for playing in time together are commonly put at a few tens of milliseconds, roughly the acoustic delay between musicians standing ten meters apart.

I always compare it with photo editing, where you can cheat and smudge some background details with no one the wiser, whereas any regular non-audiophile will notice similar smudging or sync problems in audio.

ssl-3 2 hours ago | parent

Sonobus is a software project that tries to accomplish live, audio-only multi-player jamming over the public internet.

It's still limited to whatever latency the network has, but it can be useful for some things. If that means it's mostly useful for loops, then that's up to the musicians. :)

(I myself have used it for remote livestream participants, but only for voice. I was able to get distinct inputs into my console just like the folks in the studio had, and I gave them a mix-minus bus, sketched below, that included everyone's voice but their own, for their headphones.

It worked slick. Interaction was quick and quality was excellent. And unlike what popularly passes for teleconferencing these days, it all flowed smoothly and sounded like they were in the room with us, even though they were a thousand miles away.)
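
For anyone unfamiliar with the term, the mix-minus trick is simple enough to show in code. A hypothetical sketch of the concept (not Sonobus's or any console's actual API; the function name and buffer layout are made up):

    /* Mix-minus: each participant's return feed is the full mix
       minus their own channel, so nobody hears their own voice
       coming back with the network's delay. */
    #include <stddef.h>

    /* in[p][i]:  sample i of participant p's input channel.
       out[p][i]: sample i of the feed sent back to participant p. */
    void mix_minus(const float *const *in, float **out,
                   size_t participants, size_t nframes) {
        for (size_t i = 0; i < nframes; i++) {
            float sum = 0.0f;
            for (size_t p = 0; p < participants; p++)
                sum += in[p][i];
            for (size_t p = 0; p < participants; p++)
                out[p][i] = sum - in[p][i]; /* everyone but themselves */
        }
    }

A real console does this per bus with individual gains, but "the whole mix minus yourself" is the entire idea.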