jqpabc123 5 days ago

Engineering reliability is primarily achieved through redundancy.

There is none with Musk's "vision only" approach. Vision can fail for a multitude of reasons --- sunlight, rain, darkness, bad road markers, even glare from a dirty windshield. And when it fails, there is no backup plan -- the car is effectively driving blind.

Driving is a dynamic activity that involves a lot more than just vision. Safe automated driving can use all the help it can get.

Someone1234 5 days ago | parent | next [-]

I agree with everything you're saying, but even outside of Tesla, I'd just like to remind people that LIDAR as a complement to vision isn't at all straightforward. Sensor fusion adds real complexity in calibration, time sync, and modeling.

Both LIDAR and vision have edge cases where they fail. So you ideally want both, but then the challenge is reconciling disagreements with calibrated, probabilistic fusion. People seem to be under the mistaken impression that vision is dirty input and LIDAR is somehow clean, when in reality both are noisy inputs with different strengths and weaknesses.
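
To make "calibration" concrete: LIDAR-camera fusion typically starts by projecting each LIDAR point into the image using the extrinsic (LIDAR-to-camera) and intrinsic (camera) calibration, and a small error in either silently misaligns the two modalities. A minimal Python sketch, with placeholder matrices rather than real calibration values:

    import numpy as np

    # Hypothetical calibration values, for illustration only.
    K = np.array([[1000.0, 0.0, 640.0],      # camera intrinsics
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    R = np.eye(3)                             # LIDAR-to-camera rotation
    t = np.array([0.1, -0.3, 0.0])            # LIDAR-to-camera translation (metres)

    def lidar_point_to_pixel(p_lidar):
        """Project one LIDAR point into camera pixel coordinates."""
        p_cam = R @ p_lidar + t               # extrinsics: into the camera frame
        if p_cam[2] <= 0:                     # behind the camera; not visible
            return None
        uvw = K @ p_cam                       # intrinsics: onto the image plane
        return uvw[:2] / uvw[2]               # normalise homogeneous coordinates

    print(lidar_point_to_pixel(np.array([2.0, 0.5, 20.0])))  # -> [745. 370.]

Get R or t slightly wrong, or sample the two sensors a few milliseconds apart at speed, and the same obstacle lands on different pixels from each modality.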

I guess my point is: yes, 100% bring in LIDAR; I believe the future is LIDAR + vision. But when you do that, early iterations can regress significantly from vision-only until the fusion is tuned and the calibration is tight, because you have to resolve contradictory data. Ultimately the payoff is higher robustness in exchange for more R&D and development workload (i.e. more cost).

The reason Tesla needed vision-only to work (cost & timeline) is the same reason vision+LIDAR is so challenging.

ethbr1 5 days ago | parent | next [-]

The primary benefit of multiple sensor fusion from a safety standpoint isn't an absolute decrease in errors.

It's the ability to detect sensor disagreements at all.

With single modality sensors, you have no way of truly detecting failures in that modality, other than hacks like time-series normalizing (aka expected scenarios).

If multiple sensor modalities disagree, even without sensor fusion, you can at least assume something might be awry and drop into a maximum safety operation mode.
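
A toy sketch of that idea (the threshold and the safe-mode action here are invented for illustration, not a real AV policy):

    # Illustrative disagreement check between two modalities; the threshold
    # and the safe-mode response are placeholders, not a real AV policy.
    DISAGREEMENT_THRESHOLD_M = 0.5   # assumed tolerance, in metres

    def enter_safe_mode():
        print("Sensor disagreement: dropping to maximum-safety operation")

    def check_modalities(camera_estimate_m, lidar_estimate_m):
        """Compare the same quantity (e.g. distance to nearest obstacle)
        as estimated independently by two sensor modalities."""
        if abs(camera_estimate_m - lidar_estimate_m) > DISAGREEMENT_THRESHOLD_M:
            enter_safe_mode()        # e.g. slow down, widen margins, alert remote ops
        # otherwise proceed with normal planning

    check_modalities(camera_estimate_m=12.0, lidar_estimate_m=13.1)

Note this needs no fusion at all: the modalities only have to estimate one common quantity well enough to be compared.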

But you'd think that the budget configuration of the Boeing 737 MAX would have taught us that tying safety-critical systems to a single source of truth is a bad idea... (in that case, a critical modality fed by a single physical sensor)

AnIrishDuck 5 days ago | parent | next [-]

> With single modality sensors, you have no way of truly detecting failures in that modality, other than hacks like time-series normalizing (aka expected scenarios).

"A man with a watch always knows what time it is. If he gains another, he is never sure"

Most safety critical systems actually need at least three redundant sensors. Two is kinda useless: if they disagree, which is right?
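
Hence the classic fix: triple redundancy with a mid-value select, where the median reading wins and no single faulty sensor can ever steer the result. A toy sketch:

    # Mid-value select over three redundant sensors: the median is immune
    # to any single faulty reading, a guarantee two sensors cannot give you.
    def mid_value_select(a, b, c):
        return sorted([a, b, c])[1]

    print(mid_value_select(101.2, 100.9, 250.0))  # -> 101.2; the outlier loses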

EDIT:

> If multiple sensor modalities disagree, even without sensor fusion, you can at least assume something might be awry and drop into a maximum safety operation mode.

This is not always possible. You're on a two lane road. Your vision system tells you there's a pedestrian in your lane. Your LIDAR says the pedestrian is actually in the other lane. There's enough time for a lane change, but not to stop.

What do you do?

esafak 5 days ago | parent | next [-]

> Two is kinda useless: if they disagree, which is right?

They don't work by merely taking a straw poll. They effectively build the joint probability distribution, which improves accuracy with any number of sensors, including two.
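
For two sensors with Gaussian noise, the textbook result is the inverse-variance weighted mean, whose variance is strictly smaller than either sensor's alone; that is the sense in which two already beats one. A minimal sketch:

    # Fusing two noisy Gaussian measurements of the same quantity.
    # The fused variance is smaller than either input variance.
    def fuse(mean1, var1, mean2, var2):
        w1, w2 = 1.0 / var1, 1.0 / var2
        fused_mean = (w1 * mean1 + w2 * mean2) / (w1 + w2)
        fused_var = 1.0 / (w1 + w2)
        return fused_mean, fused_var

    # e.g. camera says 10.0 m (var 1.0), LIDAR says 10.6 m (var 0.25)
    print(fuse(10.0, 1.0, 10.6, 0.25))  # -> (10.48, 0.2): better than either alone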

> You're on a two lane road. Your vision system tells you there's a pedestrian in your lane. Your LIDAR says the pedestrian is actually in the other lane. There's enough time for a lane change, but not to stop.

Any realistic system would see them long before your eyes do. If you are so worried, override the AI in the moment.

AnIrishDuck 5 days ago | parent | next [-]

> They don't work by merely taking a straw poll. They effectively build the joint probability distribution, which improves accuracy with any number of sensors, including two.

Lots of safety critical systems actually do operate by "voting". The space shuttle control computers are one famous example [1], but there are plenty of others in aerospace. I have personally worked on a few such systems.

It's the simplest thing that can obviously work. Simplicity is a virtue when safety is involved.

You can of course do sensor fusion and other more complicated things, but the core problem I outlined remains.

> If you are so worried, override the AI in the moment.

This is sneakily inserting a third set of sensors (your own). It can be a valid solution to the problem, but Waymo famously does not have a steering wheel you can just hop behind.

This might seem like an edge case, but edge cases matter when failure might kill somebody.

1. https://space.stackexchange.com/questions/9827/if-the-space-...

mafuy 4 days ago | parent | next [-]

Voting is used when the systems are equivalent, e.g. 3 identical computers, where one might have a bit flip.

This is completely different from systems that cover different domains, like vision and lidar.

sfifs 4 days ago | parent | prev [-]

Isn't the historical voting pattern more of a legacy thing, dictated by the limited edge compute of the past, than necessarily a best practice?

In many domains I see a tendency to oversimplify decision-making algorithms for the convenience of human understanding (e.g. voting rather than developing a joint probability distribution in this case; supply chain and manufacturing in particular seem to love rules of thumb) rather than use the better algorithms that modern compute enables for higher performance, safety, etc.

AnIrishDuck 4 days ago | parent [-]

This is an interesting question where I do not know the answer.

I will not pretend to be an expert. I would suggest that "human understanding convenience" is pretty important in safety domains. The famous Brian Kernighan quote comes to mind:

> Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?

When it comes to obscure corner cases, it seems to me that simpler is better. But Waymo does seem to have chosen a different path! They employ a lot of smart folk, and appear to be the state of the art for autonomous driving. I wouldn't bet against them.

ImPostingOnHN 2 days ago | parent [-]

Seatbelt mechanisms are complicated, airbag timing is complicated, let's just do away with them entirely in the name of simplicity?

No, when it comes to not killing people, I'd say that safer is usually better.

Remember, the core function of the system is safety; simplicity is nice to have, but explicitly not as important.

That said, beware of calling something 'complicated' just because you don't understand it, especially if you don't have training and experience in that thing. What's more relevant is whether the people building the systems think it is too complicated.

qingcharles 5 days ago | parent | prev [-]

We're trying to build vehicles that are totally autonomous, though. How do you grab the wheel of the new Waymos without steering wheels? Especially if you're in the back seat staring at Candy Crush.

esafak 5 days ago | parent [-]

Waymos are safer and drive more defensively than humans. There is no way a Waymo is going to drive aggressively enough to get itself into the trolley problem.

terribleperson 5 days ago | parent | prev | next [-]

This situation isn't going to happen unless the vehicle was traveling at unsafe speeds to begin with.

Cars can stop in quite a short distance. The only way this could happen is if the pedestrian was obscured behind an object until the car was dangerously close. A safe system will recognize potential hiding spots and slow down preemptively - good human drivers do this.

AnIrishDuck 5 days ago | parent [-]

> Cars can stop in quite a short distance.

"Quite a short distance" is doing a lot of lifting. It's been a while since I've been to driver's school, but I remember them making a point of how long it could take to stop, and how your senses could trick you to the contrary. Especially at highway speeds.

I can personally recall a couple (fortunately low stakes) situations where I had to change lanes to avoid an obstacle that I was pretty certain I would hit if I had to stop.

terribleperson 4 days ago | parent [-]

At the driving school I attended, they had us accelerate to 50 mph and then slam on the brakes, so we'd get a feel for the stopping distance (and the sensation).

While it's true they don't stop instantaneously at highway speeds, cars shouldn't be driving at highway speeds where a pedestrian suddenly appearing in front of you is a realistic risk.

AnIrishDuck 4 days ago | parent [-]

What if the obstacle is not a person? What if something falls off a truck in front of the vehicle? What if wildlife spontaneously decides to cross the road (a common occurrence where I live)?

I don't think these problems can just be assumed away.

cameldrv 4 days ago | parent | prev | next [-]

You don't really ever have "two sensors" in the sense that it's two measurements. You have multiple measurements from each sensor every second. Then you accumulate that information over time to get a reliable picture. If the probability of failure on each frame were independent, it would be a relatively simple problem, but of course you're generally going to get a fairly high correlation from one frame to the next about whether or not there's a pedestrian in a certain location. The nice thing about having multiple sensing modalities is that the failure correlation between them is a lot lower.

For example, say you have a pedestrian that's partially obscured by a car or another object, and maybe they're wearing a hat or a mask or wearing a backpack or carrying a kid or something, it may look unusual enough that either the camera or the lidar isn't going to recognize it as a person reliably. However, since the camera is generally looking at color, texture, etc in 2D, and the Lidar is looking at 3D shapes, they'll tend to fail in different situations. If the car thinks there's a substantial probability of a human in the driving path, it's going to swerve or hit the brakes.
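
One common way to formalise that accumulation is a running log-odds update over frames, as in occupancy-grid mapping. A toy sketch (the per-frame likelihoods are made up, and real systems must discount the frame-to-frame correlation this deliberately ignores):

    import math

    # Running log-odds belief that a pedestrian occupies a given spot.
    # Per-frame likelihoods are illustrative; real systems must discount
    # frame-to-frame correlation, which this sketch ignores.
    def logit(p):
        return math.log(p / (1.0 - p))

    belief = logit(0.01)                       # prior: pedestrian unlikely
    camera_frames = [0.7, 0.8, 0.75]           # per-frame P(pedestrian | camera)
    lidar_frames = [0.6, 0.9, 0.85]            # per-frame P(pedestrian | lidar)

    for p_cam, p_lid in zip(camera_frames, lidar_frames):
        belief += logit(p_cam) + logit(p_lid)  # treat modalities as independent evidence

    prob = 1.0 / (1.0 + math.exp(-belief))
    print(f"fused P(pedestrian) ~ {prob:.3f}")  # ~0.956 after three frames

The payoff of two modalities shows up in the evidence terms: camera frames are highly correlated with each other, but much less so with the LIDAR frames, so the combined belief converges on the truth faster.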

consumer451 5 days ago | parent | prev | next [-]

> > If multiple sensor modalities disagree, even without sensor fusion, you can at least assume something might be awry and drop into a maximum safety operation mode.

> This is not always possible. You're on a two lane road. Your vision system tells you there's a pedestrian in your lane. Your LIDAR says the pedestrian is actually in the other lane. There's enough time for a lane change, but not to stop.

> What do you do?

Go into your failure mode. At least with two signals you have a check that can indicate a possible issue.

Mentlo 4 days ago | parent | next [-]

I came here to write the same comment you did. What I'd suspect (I don't work in self-driving, but I do work in AI) is that this mode of operation would trigger more often than not, because the sensors disagree in critical ways more often than you'd think. So going "safety first" every time likely diminishes UX critically.

The issue is not recognising that optimising for UX at the expense of safety here is the wrong call, likely motivated by optimism and a desire for autonomous cars more than by reasonable system design. I.e., if the sensors disagree so often that it makes the system unusable, maybe the solution is "we're not ready for this kind of technology and we should slow down" rather than "let's figure out non-UX-breaking edge-case heuristics to maintain the illusion that autonomous driving is just around the corner".

Part of this problem is not even technological: human drivers trade off safety for UX all the time. So the expectation for self-driving is unrealistic, and your system has to adopt the ethically unacceptable configuration in order to have any chance of competing.

Which is why, in my mind, it's a fool's endeavour in the personal car space, but not in the public transport space. So go Waymo, boo Tesla.

ethbr1 4 days ago | parent | prev [-]

Exactly my point. That you know the systems disagree is a benefit, compared to a single system.

People are underweighting the alternative single system hypothetical -- what does a Tesla do when its vision-only system erroneously thinks a pedestrian is one lane over?

ranger_danger 5 days ago | parent | prev | next [-]

> This is not always possible. You're on a two lane road. Your vision system tells you there's a pedestrian in your lane. Your LIDAR says the pedestrian is actually in the other lane. There's enough time for a lane change, but not to stop.

This is why good redundant systems have at least 3... in your scenario, without a tie-breaker, all you can do is guess at random which one to trust.

Someone1234 5 days ago | parent [-]

That's a good point, but people do need to keep in mind that many engineered systems with three points of reference have three identical points of reference. That's why it works so well: a common frame of reference means you can compare via simple voting.

For example, jet aircraft commonly have three pitot-static tubes, and you can just compare the data to look for the outlier. It works, and it works well.

If you tried to do that with e.g. LIDAR, vision, and radar, which share no common point of reference, solving for trust and resolving disagreements is an incredibly difficult technical challenge. Other variations (e.g. two vision + one LIDAR) don't really make it much easier either.

Tie-breaking during sensor fusion is a billion-dollar-plus problem, and it always will be.

abraae 5 days ago | parent | prev [-]

> Never go to sea with two chronometers; take one or three.

leoc 5 days ago | parent | prev [-]

> If multiple sensor modalities disagree, even without sensor fusion, you can at least assume something might be awry and drop into a maximum safety operation mode.

Also, this is probably when Waymo calls up a human assistant in a developing-country call centre.

ethbr1 4 days ago | parent [-]

Saw that happen a week ago, actually. It was a non-sensor problem, but a Waymo made a slow right turn too wide, approached a lane of cars waiting to turn left, safed itself by stopping, and then remote assistance came online and extricated it.

jqpabc123 5 days ago | parent | prev | next [-]

> The reason Tesla needed vision-only to work (cost & timeline)

But vision only hasn't worked --- not as promised, not after a decade's worth of timeline. And it probably won't any time soon either --- for valid engineering reasons.

Engineering 101 --- *needing* something to work doesn't make it possible or practical.

ra7 5 days ago | parent | prev | next [-]

The complexity argument rings hollow to me. It's a bit like saying distributed databases are complex because you have to deal with CAP guarantees. Yes, but people still develop them because they have real benefits.

It was maybe a valid argument 10 years ago, but in 2025 many companies have shown sensor fusion works just fine. I mean, Waymo has clocked 100M+ miles, so it works. The AV industry has moved on to more interesting problems, while Tesla and Musk are still stuck in the past arguing about sensor choices.

leoc 5 days ago | parent [-]

Well, it's more like sensor fusion plus extensive human remote intervention, it seems: https://www.nytimes.com/interactive/2024/09/03/technology/zo... . Mind you, if it takes both LiDAR and call-centre workers to make self-driving work in 2025 and for the foreseeable future, that makes Tesla's old ambition to achieve it with neither look all the more hopeless.

ra7 5 days ago | parent | next [-]

Tesla robotaxis have teleoperation [1], which is worse than remote assistance others use because the operators have direct control. Yet they cannot fully remove safety personnel from the car.

The old ambition is dead.

[1] https://electrek.co/2025/05/16/tesla-robotaxi-fleet-powered-...

Narciss 5 days ago | parent | prev [-]

Very interesting, thanks for sharing. I didn't know this.

microtherion 5 days ago | parent | prev | next [-]

> but then the challenge is reconciling disagreements with calibrated, probabilistic fusion

I keep reading arguments like this, but I really don't understand what the problem here is supposed to be. Yes, in a rule based system, this is a challenge, but in an end-to-end neural network, another sensor is just another input, regardless of whether it's another camera, LIDAR, or a sensor measuring the adrenaline level of the driver.

If you have enough training data, the model training will converge to a reasonable set of weights for various scenarios. In fact, training data with a richer set of sensors would also allow you to determine whether some of the sensors do not in fact contribute meaningfully to overall performance.
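
To illustrate "another sensor is just another input": in an end-to-end network, adding a modality can be as simple as giving it its own encoder and concatenating embeddings before the output head. A schematic PyTorch-style sketch with invented sizes, not a real AV architecture:

    import torch
    import torch.nn as nn

    class FusionNet(nn.Module):
        """Schematic multi-modal network: each sensor gets an encoder,
        the embeddings are concatenated, and one head drives the output.
        All sizes are illustrative, not a real AV architecture."""
        def __init__(self):
            super().__init__()
            self.camera_enc = nn.Sequential(nn.Linear(512, 128), nn.ReLU())
            self.lidar_enc = nn.Sequential(nn.Linear(256, 128), nn.ReLU())
            self.head = nn.Linear(128 + 128, 3)   # e.g. steer / brake / throttle

        def forward(self, camera_feat, lidar_feat):
            z = torch.cat([self.camera_enc(camera_feat),
                           self.lidar_enc(lidar_feat)], dim=-1)
            return self.head(z)

    net = FusionNet()
    out = net(torch.randn(1, 512), torch.randn(1, 256))
    print(out.shape)  # torch.Size([1, 3])

The "reconciling" then happens implicitly in the learned weights rather than in hand-written tie-breaking rules, which is exactly the trade the end-to-end camp is making.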

overfeed 5 days ago | parent | prev | next [-]

> cost & timeline

It's really hard to accept cost as the reason when Tesla is preparing a trillion-dollar pay package. I suppose that can be reconciled if one considers the venture to be a vehicle (ha!) for shovelling as much money as possible from investors and buyers into Elon's pockets; I imagine the prospect of being the world's first trillionaire is appealing.

Earw0rm 5 days ago | parent | prev | next [-]

There's no particular reason to use RGB for this kind of machine vision / cognition problem either.

Infra-red at a few different wavelengths, as well as the optical range, seems like it'd give a superior result?

jqpabc123 5 days ago | parent [-]

You've just described some of the rationale for using LIDAR.

atcon 5 days ago | parent | prev | next [-]

Your comments on sensor fusion seem to describe the weird results of 2 informal ADAS (lidar, vision, lidar + vision, lidar + vision + 4d imaging radar, etc.) “tournaments” conducted earlier this year. There was an earlier HN post about it <https://news.ycombinator.com/item?id=44694891> with a comment noting “there was a wide range of crash avoidance behavior even between the same car likely due to the machine learning, and that also makes explaining the differences hard. Hopefully someone with more background on ADAS systems can watch and post what they think.”

Notably, sensor confusion is also an “unsolved” problem in humans, e.g. vision and vestibular (inner ear) conflicts possibly explaining motion sickness/vertigo <https://www.nature.com/articles/s44172-025-00417-2>

The results of both tournaments: <https://carnewschina.com/2025/07/24/chinas-massive-adas-test...> Counterintuitively, vision scored best (Tesla Model X)

The videos are fascinating to watch (subtitles are available): Tournament 1 (36 cars, 6 Highway Scenarios): <https://www.youtube.com/watch?v=0xumyEf-WRI> Tournament 2 (26 cars, 9 Urban Scenarios): <https://www.youtube.com/watch?v=GcJnNbm-jUI>

Highway Scenarios: “tests...included other active vehicles nearby to increase complexity and realism”: <https://electrek.co/2025/07/26/a-chinese-real-world-self-dri...>

Urban Scenarios: “a massive, complex roundabout and another segment of road with a few unsignaled intersections and a long straight...The first four tests incorporated portions of this huge roundabout, which would be complex for human drivers, but in situations for which there is quite an obvious solution: don’t hit that car/pedestrian in front of you” <https://electrek.co/2025/07/29/another-huge-chinese-self-dri...>

maxlin 5 days ago | parent | prev | next [-]

I think you hit the nail on the head. Obviously, once Tesla has saturated the potential of vision, they should bring in LiDAR, if it can reasonably be added from a hardware point of view. Their current arguments make this clear: it would be surface-level thinking to add LiDAR and the kitchen sink now, complicating the system's evolution and axing scalability.

But we're far from plateauing on what can be done with vision. Humans can drive quite well with essentially just sight, so we're far from exhausting what can be done with it.

baby 5 days ago | parent | prev | next [-]

Sure, but if vision says there's something in front of you and LIDAR says "nope, I can see 500m ahead", then you know LIDAR is right.

anthem2025 3 days ago | parent | prev [-]

Are people actually under that impression, or are you just repeating the sort of nonsense Musk pushes about how sensor fusion is bad?

jmpman 3 days ago | parent | prev | next [-]

Tesla has redundant front-facing cameras on their cars. In my 2019 Model 3, there are three front-facing cameras, each with a different angle of view, all three behind the rear-view mirror, all encased in a small area lined with anti-reflective material. Living in an extremely hot climate, that anti-reflective fuzz has degraded, depositing a film on the window only in front of the cameras and obscuring all three cameras at the same time.

Now, my Tesla just recently started complaining when the sensors were obscured by this deposit, but that wasn't always the case. I used to be driving down the freeway with Autopilot on, and it could barely track. Eventually I looked at the saved video footage and discovered my Tesla was virtually blind while driving me down the freeway at 85 mph. At least now, with recent updates, it warns me that it can't see very well.

However, I question the resolution of the sensors. To drive legally in my state, you must have 20/40 vision. When I move my head around, I effectively have 20/40 vision all around my car. If I close one eye, I still have 20/40 vision. Does Tesla have effectively 20/40 vision in all 360 degrees? Maybe one of the front-facing cameras has optical resolution equal to 20/40, but do the rest of them? I'm skeptical, and expect I'm being driven by what's equivalent to a human who couldn't pass the vision test, or at best a human with just one eye that can pass the vision test.

This isn't even getting into redundancy in the electronics boards, connectivity from the electronics to the CPU, and redundancy in the processing. We are being asked to put our faith/lives in these non-redundant systems, but they're not designed like Class A flight-critical systems on airplanes.
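
To make the 20/40 comparison concrete: 20/20 vision resolves roughly 1 arcminute, so 20/40 corresponds to about 2 arcminutes, and a camera's rough equivalent is its field of view divided by its horizontal pixel count. A back-of-envelope sketch (the FOV and resolution figures below are hypothetical examples, not actual Tesla specs):

    # Rough camera "acuity" in arcminutes per pixel vs a 20/40 licensing bar.
    # 20/20 vision resolves ~1 arcmin, so 20/40 corresponds to ~2 arcmin.
    # The FOV/resolution figures below are hypothetical, not Tesla specs.
    def arcmin_per_pixel(fov_deg, h_pixels):
        return fov_deg * 60.0 / h_pixels

    for name, fov, px in [("narrow", 35, 1280), ("main", 50, 1280), ("wide", 120, 1280)]:
        a = arcmin_per_pixel(fov, px)
        verdict = "passes" if a <= 2.0 else "fails"
        print(f"{name}: {a:.2f} arcmin/px -> {verdict} a 20/40-style bar")

By this crude measure, only a narrow-FOV camera clears the bar, and wide-angle cameras fall well short, which is the substance of the skepticism above.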

SoftTalker 5 days ago | parent | prev | next [-]

> Vision can fail for a multitude of reasons --- sunlight, rain, darkness, bad road markers, even glare from a dirty windshield. And when it fails, there is no backup plan

So like a human driver. Problem is, automatic drivers need to be substantially better than humans to be accepted.

tarsinge 4 days ago | parent [-]

Humans have a brain, though. Current AI is nowhere near that, as every engineer knows, but common people seem to forget it with all the PR.

brandonagr2 4 days ago | parent | prev | next [-]

LIDAR is not a backup to vision. In a Waymo, both LIDAR and vision must be working, so you actually have less reliability: now you have two single points of failure.

Ocha 5 days ago | parent | prev [-]

Yep. The same mistake Boeing made by making redundancy an optional upgrade on the MAX 8.

jqpabc123 5 days ago | parent [-]

Another example of what happens when management starts making engineering decisions.