mohsen1 9 hours ago

Fun fact: This video was made accessible to sighted people because no blind person would ever listen to voice at that speed. Honestly, if you ever observe a blind person using computers you'd be impressed by how they can listen to audio at unimaginable speeds.

asimovDev 8 hours ago | parent | next [-]

https://youtu.be/wKISPePFrIs?si=ahGfFp0U7-pTU9w6&t=43

My go-to example of this is this talk about Visual Studio by Saqib Shaikh (a blind software engineer at Microsoft). The link is timestamped.

isityettime 8 hours ago | parent | next [-]

I think it takes quite a lot of practice to reach this speed. It's not rare among blind developers, but I think it still takes a lot of work to get there. Pretty impressive!

I wish more people would watch videos like this just because having a realistic idea of how blind people do certain tasks can help you move from pity or even compassion to a more productive kind of understanding. I think sometimes when you haven't seen it, you can't really even imagine how it can be done.

Aboutplants 7 hours ago | parent | next [-]

I listen to a lot of podcasts and listen at 1.5-2.0 speed and it’s to the point that I literally cannot stand listening to 1.0 speed anymore as they go too slowly (depending on the content of course).

simondotau 7 hours ago | parent | next [-]

Same. Returning to 1x speed makes people sound (to my 2x-abused ears) drunk and slurring their words. If I want to listen to something slowly and carefully, I will just about tolerate 1.25x.

What really frustrates me is watching/listening to discussion of music, because I am forced to listen to the talking at 1x because the music sounds wrong (and is wrong) at anything other than 1x.

kevin_thibedeau 6 hours ago | parent | next [-]

The funny thing is that slow talkers sound normal at 2x speed. It's jarring when you hear their actual speech.

ghaff 6 hours ago | parent [-]

I listen to podcasts at 1x. But there are a few people I've done podcasts with that I do various audio tricks to speed up.

BurningFrog 6 hours ago | parent | prev [-]

Playing music at 1x should be a pretty simple feature to add to those apps.

Ideally it should be done while encoding.

ebiester 7 hours ago | parent | prev | next [-]

I'm so glad YouTube and other podcast players have moved to support 3.0 speed. As I get comfortable with one, I move it up some. For things like sports and "did you know" content, I can go 2.5 if I'm not multitasking. For technical content, sometimes I'm stuck at 1.0.

kevin_thibedeau 6 hours ago | parent [-]

You can get browser extensions to do it for all media controls on any site. YouTube's "Premium" for 3x is laughable when it's an internal browser function.
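
For the curious, a minimal sketch of what such an extension boils down to. The browser's `HTMLMediaElement` (every `<video>` and `<audio>` element) exposes a writable `playbackRate` property. The helper names and the 0.25 to 4 bounds below are assumptions made for this illustration, not any particular extension's code.

```typescript
// Structural type: anything with a numeric playbackRate, such as the
// browser's HTMLMediaElement (<video> and <audio> elements).
interface RateControllable {
  playbackRate: number;
}

// Hypothetical bounds for this sketch; browsers enforce their own limits.
const MIN_RATE = 0.25;
const MAX_RATE = 4;

// Keep a requested rate inside the sketch's bounds.
function clampRate(rate: number): number {
  return Math.min(MAX_RATE, Math.max(MIN_RATE, rate));
}

// Apply one rate to every element and return the rate actually used.
function setRateForAll(elements: RateControllable[], rate: number): number {
  const applied = clampRate(rate);
  for (const el of elements) {
    el.playbackRate = applied;
  }
  return applied;
}

// In a page or content script, the elements would come from the DOM, e.g.:
//   setRateForAll(Array.from(document.querySelectorAll<HTMLMediaElement>("video, audio")), 3);
```

Taking a plain array of objects rather than touching `document` directly keeps the sketch runnable outside a browser.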

webstrand 6 hours ago | parent | next [-]

Another fun thing is if you use an extension you can fast-forward through the advertisements too. For some channels I use around 3.5x playback speed.

gregoryl 4 hours ago | parent [-]

uBlock Origin blocks the ads entirely on Firefox.

satvikpendem 3 hours ago | parent [-]

They're talking about in video sponsor ads, and those can be skipped using SponsorBlock or similar.

LoganDark 4 hours ago | parent | prev | next [-]

Premium is for up to 4x, not just 3x

thrownthatway 6 hours ago | parent | prev [-]

That’s an amusing observation.

Likewise, YouTube’s “premium” feature of not displaying ads is laughable when displaying content is literally an internal browser function.

I pay anyway, because I was going to pay for an on-demand streaming music service anyway.

michaelbuckbee 7 hours ago | parent | prev | next [-]

Something that the Overcast podcast player does (and probably others) is silence removal, which in some ways is even better than the raw speedup.
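
A rough sketch of the general idea, assuming audio as an array of samples in the range -1 to 1. This is not Overcast's actual algorithm (which isn't described here). The function name, `threshold`, and `maxSilence` are hypothetical parameters chosen for illustration.

```typescript
// Shorten pauses: within each run of near-silent samples, keep only the
// first `maxSilence` samples and drop the rest. Louder samples always pass.
function trimSilence(samples: number[], threshold: number, maxSilence: number): number[] {
  const out: number[] = [];
  let quietRun = 0; // length of the current run of near-silent samples
  for (const s of samples) {
    if (Math.abs(s) < threshold) {
      quietRun += 1;
      if (quietRun <= maxSilence) {
        out.push(s);
      }
    } else {
      quietRun = 0;
      out.push(s);
    }
  }
  return out;
}
```

Because the speech itself is untouched and only the gaps shrink, pitch and pacing within each phrase stay the same, which is presumably why it can feel less tiring than a raw speedup.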

runjake 6 hours ago | parent | prev | next [-]

I am jealous. I can't listen and retain most podcasts at more than 1.0x. I even disable the podcast player functionality that eliminates pauses and silent sections.

bilater 4 hours ago | parent | prev | next [-]

Same haha. But for me 1.5x is the sweet spot. Anything more and I find myself rewinding a lot. I want to feel comfortable absorbing info and not on constant alert.

lowercased 6 hours ago | parent | prev | next [-]

I do the opposite in a few. There's some I follow weekly and it's only an hour or so each. I drop it to .7 or .8 because I want to get a bit more time with the hosts. Possibly stupid but I've sort of got used to some of these folks at that speed, and the normal speed is 'weird'. One is a political podcast, and when they play clips of Trump, he does always sound very drunk, but the hosts themselves (to me) don't sound drunk, just... measured. Some of it may be audio quality - I'm getting their microphone directly, often the audio clips are from field recordings.

thrownthatway 6 hours ago | parent | prev [-]

Except Marc Andreessen, I can’t decode his speech at 2x

Maybe it’s just a matter of practice.

miki123211 7 hours ago | parent | prev | next [-]

> It's not rare among blind developers

It's not rare among the blind in general.

Unless you're completely technologically illiterate, the kind of person who has no idea how to install an app or sign up for an online account, you're probably doing something of the sort.

gostsamo 7 hours ago | parent | prev [-]

If you are dedicated, a few weeks to a few months of usage with regular ramp-up. You should be careful with adjusting which symbols are read, though, and sometimes the programming language matters because different symbols have different significance for understanding the code.

dijit 8 hours ago | parent | prev | next [-]

Ho-ly cow. That is very impressive.

I'm not even sure what to say, but discoveries like this are why I use hackernews, I'd never have known this otherwise.

miki123211 7 hours ago | parent [-]

To be fair, the acoustics of the room that talk was given in are... not too great, to put it mildly.

I can easily understand Eloquence (the speech synthesizer he's using) at that speed, but I struggled a bit with this one.

peab 5 hours ago | parent | prev [-]

Woah, this is really cool to see

throwatdem12311 7 hours ago | parent | prev | next [-]

I did IT for a community center way back in the day and the director was blind. I was blown away by how fast his screen reader read things out to him - completely incomprehensible to me - and his efficiency with keyboard shortcuts would put even vim/emacs elitists to shame.

miki123211 6 hours ago | parent [-]

The way (Windows) screen readers handle web navigation is basically Vim in disguise.

You have two modes: "focus mode", where you can edit text in text fields and keys are passed straight to the browser, and "browse mode", where keys move a virtual cursor around the page.

In browse mode, navigating with just arrow keys all the time would be just as slow as you might imagine, so you use single-key keyboard shortcuts to move by role, e.g. to the next heading, button, table or unvisited link.

The keyboard layout is optimized for memorizability and not efficiency, you use the actual arrow keys instead of hjkl for example, but the concepts are eerily similar.

There are a couple of other approaches to solving this problem (macOS's VoiceOver is much more Emacs-like, for example), and each approach has its own pros and cons, but that's definitely one way to do it.
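
To make the browse-mode idea concrete, here is a toy model of "jump to the next element with a given role". It is a sketch under assumptions, not any screen reader's implementation: the `PageNode` type, the flattened page, and the key bindings are all hypothetical.

```typescript
// A page flattened into reading order: each element has a role and a label.
interface PageNode {
  role: "heading" | "button" | "link" | "text";
  label: string;
}

// Hypothetical single-key bindings for this sketch.
const KEY_TO_ROLE: Record<string, PageNode["role"]> = {
  h: "heading",
  b: "button",
  k: "link",
};

// Return the index of the next node whose role is bound to `key`.
// If the key is unbound or nothing matches, the cursor stays where it is.
function nextByRole(nodes: PageNode[], cursor: number, key: string): number {
  const role = KEY_TO_ROLE[key];
  if (role === undefined) {
    return cursor;
  }
  for (let i = cursor + 1; i < nodes.length; i++) {
    if (nodes[i].role === role) {
      return i;
    }
  }
  return cursor;
}
```

The point of the design is the same as a Vim motion: one keystroke skips everything you don't care about, so you never have to listen to it.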

isityettime 8 hours ago | parent | prev | next [-]

Probably because it's an advertisement, and super fast robot voices can feel extremely harsh and annoying. Even blind people who rely on them find them overstimulating sometimes, lol.

freedomben 8 hours ago | parent | prev | next [-]

Indeed, and not just fast, but often heavily robotic (which many sighted people struggle to understand even at 1.5x). I remember reading about a blind person who learned how to do echo-location using sound, and it seemed like such a cool superpower, that one of these days I'm going to take the plunge and unplug my monitor and start learning how to really use the tools. I worked with a blind person a few years back who got almost double the battery life from his laptop as the rest of us by having the screen off all the time, so that alone would be a nice feature. I may never get to the epic level of echo-location, but if I get even half-way there it would be awesome. With a bonus of being able to actually QA a11y changes.

Barbing 6 hours ago | parent | next [-]

> blind person who learned how to do echo-location using sound

RIP kid https://youtu.be/fnH7AIwhpik

thrownthatway 6 hours ago | parent [-]

I’m not gonna watch that as I’d rather stick to my head-canon that he had an altercation with a dolphin.

Barbing 5 hours ago | parent [-]

:) IIRC that video would have been fully produced/published during his lifetime (but 100% would have to avoid the comments!)

If he’d like your humor I like it too :dolphin:

thrownthatway 6 hours ago | parent | prev [-]

> echo-location

We all do that, I mean unless you’re hearing impaired.

Everyone’s familiar with dropping a coin or such and knowing exactly where it landed without looking.

That’s more passive sonar though.

Do I recall seeing videos of guys mountain biking and making a hissing sound for an active sonar style echo location?

Or am I making that memory up.

thrownthatway 6 hours ago | parent | prev | next [-]

Twenty years ago I took a level 1 tech support call from a visually impaired guy and it took about 3.2 seconds to realise his condition was no impediment to using a computer because of the screen reader tech he was using.

satvikpendem 6 hours ago | parent | prev | next [-]

I listen to a lot of podcasts and YouTube videos at 3x or 4x speed now, having slowly built up the skill over a few years. It's pretty nice now and saves time, and it's remarkable how well the human brain can adapt to such input.

Rendello 3 hours ago | parent | next [-]

I watch most talks at 2x speed or 1.5x if it's a really technical topic. Bryan Cantrill excepted!

a012 6 hours ago | parent | prev | next [-]

I’m the opposite, I can’t stand the fast speaking videos. But I also speed up 1.2x to 1.5x if the videos were too slow.

thrownthatway 6 hours ago | parent [-]

I’m struggling to understand your definition of opposite here.

Wouldn’t opposite mean you listen at sub 1x speed.

Whereas your definition seems to be “I’m the same, but less so.”

brador 6 hours ago | parent | prev [-]

You recall nothing and you know it. You're just wasting time you could use for something useful or meaningful in your life. Kids call it "Anxiety cope" but I don't agree.

ragazzina 6 hours ago | parent | next [-]

Can you recall 3 lines of dialogue from the latest movie you watched?

brador 5 hours ago | parent [-]

If you're listening to a podcast at 3x you're trying to learn something. No one is trying to learn watching a movie.

satvikpendem 4 hours ago | parent | prev | next [-]

Maybe you can't but I can recall whatever I need to.

thrownthatway 6 hours ago | parent | prev [-]

[flagged]

embedding-shape 8 hours ago | parent | prev | next [-]

> Honestly if you ever observe a blind person using computers you'd be impressed how they can listen to audio at unimaginable speeds.

Even better, fire up Orca (or whatever screenreader application your OS comes with) yourself and try to use your computer while shutting your eyes, kind of eye-opening (no pun intended) what kind of experience these sort of users typically get. And also, you quickly start to understand why they set the speech rate for their voice synthesizer to be so fast, it's almost unbearable navigating applications (and particularly lists) otherwise.

jchw 8 hours ago | parent | next [-]

When I was at Google, I'd periodically test our (internal-only) app with ChromeVox with the display off. It's not that it sounded like it would be easy, but it really is a challenge, and I can only imagine the muscle memory built up over time of trying to work around accessibility bugs and strange behaviors.

Unfortunately it seems impossible to get all that much funding for accessibility work :/ I wonder what ever happened to the Newton accessibility bus intended to supplement Wayland...

kridsdale1 6 hours ago | parent | next [-]

I’ve worked at Apple, Facebook, and Google. Apple was the only one that made a11y bugs and a face-to-face consultation with a blind developer to show you how your app sucked mandatory before you could launch.

embedding-shape 6 hours ago | parent | prev | next [-]

> I wonder what ever happened to the Newton accessibility bus intended to supplement Wayland...

Hm, never heard about it, but now I'm wondering too. I just finished implementing proper accessibility support for my native app toolkit for Linux, macOS and Windows, but I've only done it for X11 so far; I was just gonna get started with Wayland. What is the accessibility story on Wayland? Couldn't people rely on the same protocols as with X11? That was my impression, but I haven't really dug into it yet.

RobMurray 2 hours ago | parent [-]

It's still AT-SPI for Wayland; the main difference is how screen readers grab keyboard input events.[0] I don't think there is a big difference from a toolkit point of view. I don't personally have experience with Wayland because most blind people still recommend MATE as the most accessible desktop.

Thanks for considering a11y for your toolkit - it really makes a difference to those of us who are disabled. Are you implementing a11y separately for each platform? If you use accesskit[1] you only have to implement it once for all platforms. I recently vibe coded accessibility for the Swell toolkit[2] used by Reaper. I have a branch using accesskit and a branch implementing at-spi. Accesskit made things a lot easier and more performant.

Let me know if you would like a screen reader user to help with testing your toolkit.

[0] https://lwn.net/Articles/1025127/

[1] https://github.com/AccessKit/accesskit

[2] https://github.com/RDMurray/WDL/tree/accesskit

and my fork of accesskit with some features and fixes for unix: https://github.com/RDMurray/accesskit/tree/swell-fixes

miki123211 6 hours ago | parent | prev [-]

The muscle memory build-up is definitely real.

There are apps I use semi-regularly that less-experienced screen reader users thought were inaccessible, and I couldn't even explain what they were doing wrong from memory. The ways of working around accessibility issues are just so ingrained in me that all I can usually remember is "yeah I did this somehow, but it was six months ago and I have absolutely no idea which specific tricks I needed for this one."

seviu 8 hours ago | parent | prev | next [-]

That time my Mac display broke and I had to log in taught me much about how important learning accessibility is even for non blind people.

isityettime 8 hours ago | parent | prev [-]

> you quickly start to understand why they set the speech rate for their voice synthesizer to be so fast, it's almost unbearable navigating applications (and particularly lists) otherwise.

I imagine that for coding it also helps deal with the fundamental problem of an ephemeral stream rather than a persistent document that you can navigate visually in multiple dimensions. Working memory is limited, and getting more text in in a short period of time probably helps you work within that better. I also imagine that working with text via audio all the time gradually stretches and improves memory.

miki123211 6 hours ago | parent [-]

It's not the ephemeral stream that's the problem, it's the limited bandwidth.

You can show a lot more info on a screen than you can transmit through speech in a short period of time. That doesn't mean you read faster than you listen, just that sighted people essentially use their eyeballs as an "input device" to decide what information to look at.

If there's an object on the screen that you want to examine but that you don't need to click, you can just "navigate to it" with your eyeballs, without ever touching a mouse or keyboard. We don't have that luxury.

This means we need a much more efficient system for navigating what's on the screen, but that only gets you so far. Eventually, the easiest way to deal with this problem is just to increase the bandwidth of your channel, and you do that by increasing the speech rate.

dempedempe 5 hours ago | parent | prev | next [-]

The difference is that the voice in the video is a natural, human voice. It's the robotic sounding voices that always pronounce the same letters the exact same way (mostly the Eloquence family of voices) that enable blind people to listen at superhuman speeds. You can't listen to a real voice that fast.

RobMurray 8 hours ago | parent | prev | next [-]

I know plenty of blind people who have their voice speed unbearably slow and barely scratch the surface of what technology can do for them. I think an interface where you can tell your phone what to do in natural language will really help a lot of less technical people.

I'm not getting my hopes up though given Apple's history with Siri, which is truly awful.

chipotle_coyote 8 hours ago | parent | next [-]

Apple's history with accessibility is, on the whole, pretty good. I strongly suspect that the "coming soon" part of this means "after we integrate Google Gemini models into the system," so I don't think you should use the current state of Siri as a yardstick. (I actually have decent luck with the current Siri, but I don't push it very much and have sort of adapted myself to its limitations; on the flip side, I have a lot of skepticism around LLMs, but they're really a quantum leap in natural language processing capability over what came before, and the use cases they're showing here seem to be right in the LLM wheelhouse -- with the asterisk of "you're still always going to have to check its work.")

alwillis 4 hours ago | parent | next [-]

> I strongly suspect that the "coming soon" part of this means "after we integrate Google Gemini models into the system…"

I don’t think Google's tech has anything to do with these features.

This would have had to be in the works long before the Google announcement. Also, these are enhancements of existing iOS and macOS features. They don’t require an LLM anyway; these features use Apple’s Machine Learning models.

For example, creating subtitles for videos? iOS 16 introduced Live Captions for FaceTime calls in 2022 [1].

[1]: https://www.apple.com/newsroom/2022/05/apple-previews-innova...

miki123211 6 hours ago | parent | prev [-]

Coming soon very likely means iOS 27.

This has been the typical pattern for Apple for the last few years. The flashy features are announced at WWDC, accessibility has a dedicated, earlier press release. Before this practice, accessibility announcements would usually be tucked in some WWDC slide that most people wouldn't even notice.

duskwuff 4 hours ago | parent | next [-]

> accessibility has a dedicated, earlier press release

IIRC, it's timed to land around Global Accessibility Awareness Day (May 21).

https://accessibility.day/

Barbing 6 hours ago | parent | prev [-]

The thing that disappointed me about this amazing announcement was “coming later this year“. They should probably give us dates for a little while at least until we get the (<)$95 checks.

I just would not wanna promise anything. Except “available for download this Friday“ once the gold master is passing tests.

alwillis 4 hours ago | parent [-]

The "coming later this year" language is disappointing to some people, but that's just Apple propriety. Allow me to explain.

"Coming later this year" means it's part of a publicly committed release — iOS 27, macOS 27, etc. — not vaporware.

The annual pre-WWDC accessibility announcement is a tradition, and with the conference less than a month away, expect more detail then. New a11y features have a good chance of appearing in the 10am PT keynote or the Platforms State of the Union, the developer-focused follow-up at 1pm PT.

That said, things are still fluid with three weeks to go — features can be added or pulled at any time. If something gets bumped from the main presentations, there will almost certainly be a dedicated video session covering it.

As for availability: some of these features will land in the iOS 27 and macOS 27 betas, which drop during WWDC for Apple Developer Program members. The public beta follows in July, and there's a free tier of the developer program if you want early access.

Don't expect everything at once, though. Some features won't arrive until the September release candidates — and even then, a few may ship labeled "beta" or "experimental," or hold for a future dot release.

thrownthatway 6 hours ago | parent | prev | next [-]

Being able-bodied and sighted is probably the biggest disadvantage for using iOS.

Twenty years and text input & manipulation on iPhone sucks a big fat hair pair of dogs balls still.

The last time I daily drove Android was over two years ago and it was immeasurably less God-damn-I-wanna-dig-Jobs-corpse-up-n-give-the-guy-a-piece-of-my-mind, only problem is his grave is unmarked. Arsehole!

isityettime 8 hours ago | parent | prev [-]

Whenever my sister (blind) and I (visually impaired) visit my mom (blind) we secretly turn up the reading speed on her TV just a little because we can't stand how unbearably slow she keeps it, but if we turn it up quickly, she'll freak out.

After a few more years of Thanksgivings and Christmases and Mothers' Days, we'll finally train her up to a reasonable speed lmao.

kridsdale1 6 hours ago | parent [-]

This is heartwarming. The audio equivalent to the practice of sighted people fixing the bad default settings on boomers’ televisions each Thanksgiving.

ShinyLeftPad 8 hours ago | parent | prev | next [-]

Blind people can't change video speed? The control is available right there.

kochb 8 hours ago | parent | next [-]

Yes, the audio speed can be adjusted.

Whether that control you see visually is actually accessible to a blind user is a different matter entirely. Further, it maxes out at 2x, but a blind person would typically screen read at the equivalent of 3-6x.

ShinyLeftPad 7 hours ago | parent [-]

Huh, 2x is low even sometimes for sighted people.

Related, it seems like YouTube recently paywalled speed increase beyond 2x. Another way in which it's not cheap to lose sight, I guess.

the_other 7 hours ago | parent | next [-]

> Another way in which it's not cheap to lose sight, I guess.

True.

We can frame it even more strongly: "default societal practices actively discriminate against people with disabilities; they intentionally, consciously choose to make life harder for people who're disadvantaged".

thrownthatway 5 hours ago | parent [-]

[flagged]

entrope 7 hours ago | parent | prev [-]

> Another way in which it's not cheap to lose sight, I guess.

Seems like it would be a win-win to have a user setting to opt out of video in exchange for ungating that feature.

jofzar 8 hours ago | parent | prev [-]

No they are saying that the audio playing for tts would be at like 2.4x what's in the commercial.

ShinyLeftPad 8 hours ago | parent [-]

I don't get it. The speed of TTS can be adjusted, right?

Pretty sure there's enough blind people who don't listen to voice at insane speeds, because they listen in their non-native second language or for whatever other reason. What's wrong in using lowest common denominator that's 100% accessible to those people as well as people who want faster speeds? Unlike "too fast", "too slow" doesn't get entirely inaccessible, it's just boring.

Such a random reason to criticize for.

superchink 8 hours ago | parent | next [-]

I don’t think it’s meant to be criticism. It’s an interesting piece of information that gives a peek into how those with vision impairment consume content. There’s nothing wrong with it; but it was enlightening to consider the experience for those of us who have not been forced to.

ShinyLeftPad 8 hours ago | parent [-]

Seems like I brought my own negativity into this...

hombre_fatal 7 hours ago | parent [-]

I don't think you did.

Some blind people listen to things at superhuman speeds, but not all blind people. Using a normal reading speed is a sensible choice for an ad trying to appeal to blind people since you don't want to intimidate those who don't use superhuman speeds.

Going from that to "heh a sighted person made this because it's normal speed" is simply incorrect.

It was the sort of statement an HNer might make to showcase some trivia they have about some other group, but they oversold it.

isityettime 8 hours ago | parent | prev | next [-]

> Pretty sure there's enough blind people who don't listen to voice at insane speeds, because they listen in their non-native second language or for whatever other reason.

Yes, for lots of reasons. It takes practice to get up to a high speed with a given TTS. People who go blind later in life are just beginning, and it can take a long time for them to get up to really high speeds. You may also need to reset somewhat when you change from one TTS to another. And blind people's ears are subject to problems just like anyone else's; if your hearing isn't great you may need slower speeds or higher volumes or both. That's why even though most people use screenreaders at much higher speeds, the defaults when you turn on a new device are painfully slow. You have to set a conservative default so people with less experience/worse ears/whatever can get by.

Anyway I don't think it's a criticism. It's just noting that it doesn't depict how most people will end up using it, and if you're curious about what typical usage sounds like, you should look for another example.

stavros 7 hours ago | parent | prev [-]

No. It's not criticism. What they're saying is that the video was shot with a default that a sighted person could understand, because any blind person would naturally have their speed set to much higher than that.

It's like how in videos that teach people a foreign language, everyone speaks slowly and uses simple words, even though native speakers don't talk like that at all. The GP is simply saying that an actual blind person would be way more efficient at it, but they made the video with inefficient settings so sighted people could understand what was going on.

UltraSane 6 hours ago | parent | prev | next [-]

I briefly worked at a call center and I would hear supervisors listening to recorded calls at warp speed.

thrownthatway 5 hours ago | parent [-]

> boiler call center

What does this mean?

js2 2 hours ago | parent [-]

https://en.wikipedia.org/wiki/Boiler_room_(business)

bitwize 8 hours ago | parent | prev | next [-]

I've heard textual description tracks on television programs before. They come fast, but not screen-reader fast. To the untrained ear a blind person's screen reader sounds like when you somehow get the TI-99/4A's speech synthesizer to read from invalid memory.

isityettime 8 hours ago | parent | next [-]

The audio description tracks are a different genre from what screenreaders perform. They're acting, by actors, carefully written and performed to fit into the gaps in the dialogue while preserving the mood and flow of the show. I think speeding them up or making them robotic would ruin them, while both of those traits are actually desirable for screenreaders.

RobMurray 2 hours ago | parent [-]

Ideally that is what AD should be like. Too often you set the volume right for a movie so the characters can be heard, then the AD is like an insanely boomy voice that shakes the room. Plus for some reason they also turn the movie audio down, as if that would be necessary.

Barbing 6 hours ago | parent | prev [-]

How did you come across those tracks? Never have myself.

bitwize an hour ago | parent [-]

My in-laws once misconfigured their television and it came blaring through.

Sweepi 8 hours ago | parent | prev [-]

Don't you worry, as a sighted person I am also infuriated by Apple's slooow reading speed, e.g. for "Announce Notifications".

hightrix 7 hours ago | parent [-]

Also as a sighted person, this is why I hate the modern trend of using the video format to show 3-4 bullet points. Just give me the text.