Autonomous vehicles enter highways

Waymo stands alone in deploying unsupervised autonomous vehicles to paying customers across several US geographies, in both city and highway driving.

Waymo is also adding 200 vehicles in the Bay Area, expanding around San Jose, CA.

Waymo is offering freeway rides to select riders in San Francisco, Los Angeles, and Phoenix, starting Wednesday, positioning the Alphabet company as the only fully autonomous ride-hailing service in the US taking public passengers on high-speed roads.

The move is a critical milestone for Waymo.

The robotaxi company has been testing fully autonomous rides on public freeways with employees for more than a year. Opening freeway routes to its paid service signals that the company believes its autonomous system is safe enough for public riders.

The company will increase its fleet from over 800 robotaxis to more than 1,000 cars to accommodate the expansion [in the Bay Area].

8 Likes

With all those sensors, Waymo must be overwhelmed with “sensor contention.”

Or, maybe, just maybe, Alphabet-backed Waymo has some idea of how to work with a massive amount of real-time, highly multivariate data.

Back in August, Tesla CEO Elon Musk said Waymo can’t drive on highways because of the mix of sensors on the Alphabet company’s robotaxis.

Waymo’s fifth-generation autonomous driving platform has five lidars, six radar sensors, and 29 cameras.

It’s this mix of lidar, radar, and cameras, Musk wrote on X, that leads to “sensor contention” in which the information from the lidar and radars “disagree” with the cameras.

“This sensor ambiguity causes increased, not decreased, risk,” Musk wrote. “That’s why Waymos can’t drive on highways.”

3 Likes

Musk’s objection is just silly. Lots of systems take multiple simultaneous inputs and prioritize among them. The most obvious is airliners, which (not to put too fine a point on it) have lift, drag, weight, altitude, speed, icing, and other considerations all banging away at the computer(s) at the same time (and which actually run line-by-line code from an earlier era) as they decide what to do: whether to override the human controls, warn the cockpit crew, or release control back to the humans.

[If you enjoy this sort of thing, as I do, check out Mentour Pilot, an event-by-event description of aircraft disasters and near misses, along with some of the engineering that goes into modern avionics.]

To name just one or two more: sit down with a virtual reality headset. It can take inputs from your head movements, your hand movements, or sometimes spoken commands, all at the same time, and prioritize which to follow depending on context. Rockets use accelerometers, gyroscopes, GPS readings, and more, and can decide to alter thrust, redirect engines, change fuel flow, or adjust a host of other functions depending on what’s needed (again, based on old-time line-by-line programming).
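
A toy sketch of that kind of input arbitration (the channel names are invented; real systems weight and blend inputs rather than just rank them, but the principle is the same):

```python
# Toy input arbitration: several simultaneous channels, a fixed priority
# order deciding which one drives the response. Channel names are invented.
INPUT_PRIORITY = ["voice_command", "hand_gesture", "head_pose"]

def arbitrate(inputs):
    """Return (channel, value) for the highest-priority active input."""
    for channel in INPUT_PRIORITY:
        value = inputs.get(channel)
        if value is not None:
            return channel, value
    return "none", "idle"

# A head movement and a gesture arrive in the same frame; the gesture wins.
print(arbitrate({"head_pose": "look_left", "hand_gesture": "point"}))
# -> ('hand_gesture', 'point')
```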

Saying that an automobile using both “vision” and “lidar” will overwhelm its ability to make decisions is just idiotic. Full stop.

6 Likes

I think the point in this context is that the different sources are all providing input on “what is out there” and that there is significant potential for differences exactly because the sources of the information are so different. It is easy to imagine contexts in which resolving those differences could be very difficult.

1 Like

It’s even easier to imagine contexts in which one input gets it wrong while the other gets it right.

Distance, for instance, where lidar is superior. Vision is pretty bad at detecting a dark overpass against a dark sky, but lidar handles it quite well. Why would you insist on having one but not the other?

There are lots of scenarios where having both is a good idea. If I’m hurtling down a road at 50 mph, I’d rather have two things doing what they do best, and if somebody came up with a third, I’d want that too.
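
A back-of-the-envelope illustration of why: if the two sensors’ errors are even partly independent, combining them beats either one alone. Here is a minimal naive-Bayes fusion sketch with made-up probabilities (it assumes fully independent sensor errors, which real fusion stacks cannot):

```python
def fuse(prior, p_given_obstacle, p_given_clear):
    """Naive-Bayes fusion: combine independent sensor readings into one
    posterior probability that an obstacle is present."""
    num, den = prior, 1.0 - prior
    for p_obs, p_clr in zip(p_given_obstacle, p_given_clear):
        num *= p_obs   # how likely this reading is if an obstacle IS there
        den *= p_clr   # how likely this reading is if the road is clear
    return num / (num + den)

# Dark overpass at dusk: the camera is unsure, the lidar is confident.
print(fuse(0.01, [0.30], [0.10]))              # camera alone: ~0.03
print(fuse(0.01, [0.95], [0.01]))              # lidar alone:  ~0.49
print(fuse(0.01, [0.30, 0.95], [0.10, 0.01]))  # together:     ~0.74
```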

2 Likes

Of course, the problem with having 2 or 3 visions of what is out there is knowing which one is right.

Try a search on discriminant function analysis. Then look at how modern ML extends that kind of approach, including to highly multivariate data such as sensor data.
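
For anyone who wants to kick the tires, here is a toy version using scikit-learn’s LinearDiscriminantAnalysis on fabricated “sensor” features (all feature names and numbers are invented for illustration):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# 3 made-up features per sample: camera contrast, lidar return, radar cross-section
clear = rng.normal([0.2, 0.1, 0.0], 0.1, size=(100, 3))     # clear road
obstacle = rng.normal([0.5, 0.8, 0.6], 0.1, size=(100, 3))  # obstacle ahead
X = np.vstack([clear, obstacle])
y = np.array([0] * 100 + [1] * 100)  # 0 = clear, 1 = obstacle

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict_proba([[0.45, 0.75, 0.5]]))  # [P(clear), P(obstacle)]
```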

2 Likes

Good for you for staying committed to the cause.

There’s a group that thinks color video has too much sensor contention so they still watch black and white.

Things I have mostly given up on:

  • option pricing explanation (thank me later)
  • why higher dimensional data contains more information than lower dimensional data
  • machine learning (including neural networks) just uses formulas and functions

Don’t really need to search … my dissertation was a new form of multivariate analysis. The point being that here one will have two versions of what is being looked at - pole/no pole, for example - one presumably accurate and the other not. One can imagine some rules for deciding between them, but it seems like a valid problem.

1 Like

A multivariate classifier can combine ALL of the DIFFERENT sensor data to estimate the relative probability of different outcomes (pole/no pole/tree/skinny person) and select the ONE best (most likely) outcome based on ML training.

Those “rules” are multivariate classifiers.
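
A minimal sketch of that claim, with a fabricated training set and invented feature names (this is not Waymo’s actual stack, just the shape of the idea: fused features in, one most-likely label out):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

classes = ["no object", "pole", "tree", "pedestrian"]
# 4 fused features per detection: height, width, lidar density, camera score
centers = np.array([[0.0, 0.0, 0.0, 0.1],   # no object
                    [2.0, 0.2, 0.9, 0.4],   # pole
                    [4.0, 1.5, 0.8, 0.5],   # tree
                    [1.7, 0.5, 0.7, 0.9]])  # pedestrian
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.2, size=(50, 4)) for c in centers])
y = np.repeat(np.arange(4), 50)

clf = LogisticRegression(max_iter=1000).fit(X, y)
probs = clf.predict_proba([[1.8, 0.45, 0.72, 0.85]])[0]
print({c: round(float(p), 3) for c, p in zip(classes, probs)})
print("best guess:", classes[int(np.argmax(probs))])  # likely "pedestrian"
```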

Do you believe that a company like Waymo is incapable of developing such a ML tool for its AI driver?

Also, I give up.

I’m sure it’s a valid problem and I’m just as sure it’s not an insolvable one. I say that because it’s been faced before and solved.

I keep referring to the airline industry because I had a few years’ exposure to it, but this problem is faced and solved tens of thousands of times every single day, on every flight of every airliner that takes to the sky.

Just for fun: every airliner has three flight computers, separately powered (for redundancy). Each takes inputs from everything from airspeed to pitch angle, icing indicators, data from every control surface on the wings and tail, radar indications from other aircraft and ground proximity, fuel levels, cabin pressure, and more. (Government regulations require a minimum of 88 data inputs; some aircraft manufacturers include up to 1,000.)

And each computer sifts those inputs into usable data to communicate with the pilot - although it’s the autopilot controlling the flight more than 90% of the time (often 99%) once you’re in the air.

Technically speaking, the autopilot can fly and land without human assistance, using all that simultaneous data and deciding what to do with it - so yes, it’s a tough problem. (It would be trivial to add “take off,” but there’s a fair amount of taxiing and following verbal instructions from ground controllers, so that doesn’t exist. Yet.) But yes, it’s a “problem” already solved.
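
The classic version of that three-computer redundancy is “mid-value select” voting: take the median of the three channels, so a single faulty channel can never win the vote. A toy sketch with illustrative numbers (not real avionics code):

```python
def mid_value_select(a, b, c):
    """Return the median of three redundant readings; one bad channel
    is automatically outvoted by the two that agree."""
    return sorted([a, b, c])[1]

# Two airspeed channels agree near 252 knots; one pitot tube has iced over.
print(mid_value_select(251.8, 252.1, 94.0))  # -> 251.8, the bad channel loses
```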

Oh, and for the record, other things where AI or algorithmic systems use multiple simultaneous inputs include predictive maintenance in industrial processes, air traffic control, route logistics (including weather, congestion, etc.), high-speed market trading, and I’m sure I could think of a few others if I put my mind to it.

4 Likes

You state that like it is a bad thing.

I recently watched A House of Dynamite on Netflix and was reminded that the United States uses MULTIPLE sources of detection to VERIFY an incoming missile.

How foolish would we be to rely on just one? Why should cars be any different?

4 Likes

The problem here is that the proposed counterexamples are not actually comparable. One issue is that in the proposed examples the outputs of the multiple devices are simple numbers, which obviously can be subjected to some kind of equation or comparison. Whereas here the output is an image. I.e., no simple numbers to put in a multivariate classifier … not that things like multivariate classifiers have any place in this kind of image processing. Likewise, the sort of thing in airplanes where one has three measurements of X, possibly using different technologies, and then “votes” on the correct value, just makes no sense when it is a rapidly changing pair of images one is trying to compare. As for scanning for incoming missiles, sure, one looks with different technologies, but one is focused on a point source, not the image as a whole.

1 Like

I don’t understand what you are saying here. The different tech would be used to determine, for example, whether that “obstacle” in the road is a shredded tire or just a discoloration of the pavement. How the system “votes” could decrease accidents, whereas relying on just one input might mean judging that obstacle incorrectly.

To bring it back to the movie (and factual history), the Soviet Union once shot down a passenger jet because they incorrectly identified it as a hostile military aircraft. The Soviets also came close to ordering a nuclear attack on the US when sunlight reflecting off clouds triggered missile warnings, but a human, Stanislav Petrov, thankfully dismissed the warning(s) as false alarms.

Seems like having multiple ways to detect and correctly identify danger would be wiser than relying on just one.

3 Likes

Let’s start at step 1.

All inputs into machine learning models become numbers. Text, images, video, lidar, radar, audio.

That’s how such models work, because they are just a bunch of mathematical formulas that take in numbers and return numbers.

Bullet 3 from above: “machine learning (including neural networks) just uses formulas and functions.”
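
To make that concrete: a camera frame, a lidar sweep, and a radar return all land in memory as arrays of numbers before any model sees them. (Shapes and values below are invented; real sensor formats differ.)

```python
import numpy as np

camera = np.zeros((720, 1280, 3), dtype=np.uint8)  # RGB pixel intensities, 0-255
lidar = np.zeros((30_000, 4), dtype=np.float32)    # x, y, z, reflectance per point
radar = np.array([[42.0, -3.1, 18.5]])             # range (m), azimuth (deg), closing speed (m/s)

for name, arr in [("camera", camera), ("lidar", lidar), ("radar", radar)]:
    print(name, arr.shape, arr.dtype)  # different physics, same substrate: numbers
```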

1 Like

Right, I get why it seems desirable … the issue is whether it is feasible. In your Petrov story there seems to be one image source and a human doing the interpretation.

Understand too that it isn’t as if someone pinned up two images, one from the camera and one from the LIDAR, each with a marker circle around something and a note that said “shredded tire?”, and one had to compare them and decide. For starters, it is questionable whether you could easily see the tire in the LIDAR image, since it consists of distances, not something visual. Plus, a shredded tire would return a confusing mix of distances, since there is no solid surface. About the best one is likely to get is that something is returning pings in the region corresponding to that shredded thing in the camera image, so there is probably something there.

Sure … and are you going to plug the numbers corresponding to the digitized photo into some kind of multivariate analysis?

Yes.

That’s how it works.

That classifier may be a neural network or something else, but it is multivariate and a classifier.

Search for “multilabel image classifier.”
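
A bare-bones sketch of the multilabel idea: each label gets its own independent probability (a sigmoid per label, rather than one softmax over all labels), so a single detection can score high on “pole” and “occluded” at once. The weights below are random stand-ins for a trained network, and the label set is invented:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

labels = ["vehicle", "pedestrian", "pole", "occluded"]
rng = np.random.default_rng(2)
W = rng.normal(size=(4, 128))      # one weight row per label (random stand-in)
b = np.zeros(4)

features = rng.normal(size=128)    # embedding of a fused sensor crop
probs = sigmoid(W @ features + b)  # independent probability per label
print({l: round(float(p), 2) for l, p in zip(labels, probs)})
```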

2 Likes

Yes. That’s how that works. With numbers. The image is going to go into one layer of code that detects objects. Trees, cars, people, signs, whatever. Each of those objects will get assigned a number (an enumeration of an object type) along with other information, such as location. All that information gets fed into the next layer, and so on.

It’s not like a raw image gets fed into an algorithm and out pops “turn slightly right” or whatever. It’s layers of code, each one doing something different, spitting out a much smaller, abstracted data set to the next layer, until finally you get a small set of outputs that directs what to do with throttle, brake, and steering. And it’s numbers at every step along the way.
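
A skeletal sketch of that layering (every name and threshold below is invented, and real stacks are far deeper, but the shape of the data flow is the point: each layer hands a smaller, more abstract set of numbers to the next):

```python
from dataclasses import dataclass

@dataclass
class Detection:
    kind: int    # enumerated object type: 0 = car, 1 = person, 2 = sign, ...
    x: float     # lateral offset, meters
    dist: float  # distance ahead, meters

def detect_objects(frame):
    """Layer 1: perception. Stand-in for a neural detector over the raw image."""
    return [Detection(kind=1, x=0.4, dist=22.0)]  # pretend we saw a person

def plan(detections):
    """Layer 2: planning. Turns the abstract scene into control targets."""
    nearest = min(detections, key=lambda d: d.dist, default=None)
    if nearest is not None and nearest.dist < 30.0:
        return {"throttle": 0.0, "brake": 0.6, "steer": 0.0}
    return {"throttle": 0.3, "brake": 0.0, "steer": 0.0}

frame = object()  # stand-in for a camera image tensor
print(plan(detect_objects(frame)))  # numbers in, numbers out at every layer
```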

3 Likes

You keep pretending it doesn’t already exist.

Waymo’s self driving taxis use both video and Lidar.

There are self-driving taxicabs in Beijing (Baidu). Will you be surprised to find that they use both video and Lidar?

There are also self-driving taxicabs in Singapore (nuTonomy). Guess what! They use both video and Lidar.

Zoox, which is operating in Las Vegas and San Francisco, uses a combination of radar, Lidar, and visual cameras.

In Amsterdam the company is Elmo, operating driverless mini-shuttles. Yes, they use Lidar, as well as a suite of other sensing devices like high-speed stop-motion video and traditional cameras.

In Shanghai the company pioneering self-driving is AutoX, and it uses multiple sensing devices, including Lidar, to control the vehicle.

The initiative in Abu Dhabi is a joint venture of Bayanat (a local Emirati company) with the Chinese operators WeRide and Apollo Go. By now you are tired of me saying it, but it’s true: they use Lidar as well as several other technologies to build 3D maps that guide the car’s speed and direction.

In Hamburg they’re using VW ID. Buzz vans for automated self-driving. As you might expect, the Germans are a bit heavy-handed: the vehicles have NINE Lidar sensors in addition to a full complement of other sight and distance measurements to ensure safety. The service is called MOIA Ride Sharing.

This is about half the list, although several of the others not included are in China and presumably use the same tech platforms as the Beijing or Shanghai operations. I don’t know, I got tired of researching them to prove a point.

But the point is IT EXISTS. It exists in multiple cities on multiple platforms and it is working. It is not theoretical. It is not having problems deciding which input to listen to. Sometimes you just need to let go of a prejudice, and in this case, the sooner the better.

Caveat: maybe it’s possible to do it without Lidar. Maybe someday it will be. At the moment every other tech group that is working on the solution seems to think Lidar and multiple inputs is the way to go, at least for now.

Just because Musk says “it won’t work” doesn’t mean it won’t work, especially because it’s already working.

5 Likes