Advanced driver-assistance systems (ADASs) can use forward-looking radars, lidars, cameras, and smart control software to anticipate and avoid potential collisions with other vehicles, pedestrians, cyclists, animals and debris. Increasingly they can also detect and identify roadway markings, lanes, road boundaries, and barriers—as well as read traffic signs and traffic lights. Today’s ADAS technology makes possible adaptive cruise controls and emergency braking functions; tomorrow’s will permit hands-free, semi-autonomous highway driving capabilities.
The onboard sensors’ street view can also be augmented with geo-location data from GPS units and high-resolution road maps, and soon, updates from wireless vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) networks.
The Boston Consulting Group predicted earlier this year that by 2025 the market for self-driving cars will have grown to $42 billion. ADAS research expertise currently resides in smaller specialists like Nvidia and Canada’s QNX, in Tier 1 auto suppliers such as Continental, Delphi, and Bosch, and even in Google.
A different viewpoint
One competitor that’s taking a different stance in the budding ADAS sales scrum is Mobileye NV, an Israeli software firm that was co-founded by Amnon Shashua, Sachs Professor of Computing Science at the Hebrew University of Jerusalem. A leading artificial intelligence researcher and Mobileye’s chief technology officer, Shashua thinks that autonomous cars won’t need radars, lidars, or even the usual binocular cameras to safeguard themselves. He believes that the job can be done with a cheap, monocular camera and a system-on-a-chip running deep-learning neural networks.
Even though Mobileye went public just last year, the company has been around since 1999. And after an extended product incubation process, its EyeQ line of camera/system-on-a-chip (SOC) driver-assist units today seems well positioned to grab the lion’s share of the OEM market during the next few years. The company has already raised its annual chip shipments from a million in 2012 to an estimated 5 million this year.
The collision-avoidance capabilities of Tesla’s new Model S electric sedan, for example, are based on Mobileye’s EyeQ3 system. By mid-2016 the company’s EyeQ4 unit will enable “highway-to-highway driving,” in which the car can navigate the connecting off- and on-ramps without driver intervention, Shashua predicted.
“When we started back in 1999, the camera wasn’t the primary forward-looking sensor,” Shashua recalled. “Radar was primary; it protected the vehicle with ranging, adaptive cruise controls, and emergency braking. Cameras were useful only for detecting lane markings.”
“Then in 1998,” he continued, “Subaru came out with a stereo camera that did what a radar did—ranging—using parallax and other cues.” At the time Shashua was selling a 3D camera-based system to carmakers for quality-assurance inspection and surface mapping.
The Subaru binocular system came up during a visit to a car manufacturer, he said. “They asked me whether I, as an expert in 3D vision, might want to consult with them on developing vision-based safety systems. I agreed, but I didn’t tell the company my initial thought, which was: Why two cameras?”
People’s reasons for using stereo vision are often somewhat mistaken, the 3D vision expert explained. The main goal of two cameras is system redundancy; the secondary is to provide depth perception for gauging object size in space. “But it turns out that triangulation using parallax is useful only for manipulating objects that are within reach.”
Binocular disparity, the difference in image location of an object seen by the left and right eyes, results from the eyes’ separation. The brain uses binocular disparity to extract depth information from the 2D retinal images. “But disparity is inversely related to the depth, and the error is proportional to the square of the distance,” Shashua observed. Since the baseline will never be large enough to have an effect, binocular vision is not useful for objects out at around 90 to 100 meters, he asserted. It turns out that the human vision system does depth perception at distance using other cues that are more linearly dependent on range.
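The quadratic error growth Shashua describes follows from the textbook pinhole stereo relation, in which depth equals focal length times baseline divided by disparity. The short Python sketch below illustrates that relation only. The focal length, baseline, and disparity-error values are hypothetical numbers chosen for illustration; they do not come from Mobileye or Subaru.

```python
# Textbook pinhole stereo model (illustrative values, not from any real system).
# depth Z = f * B / d, so a small disparity error dd gives a depth error of
# roughly dZ = Z**2 / (f * B) * dd, which grows with the square of the distance.

def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth in meters from disparity in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def stereo_depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth uncertainty caused by a disparity measurement error."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical camera: 1000-pixel focal length, 0.3 m baseline, 0.25-pixel error.
FOCAL_PX = 1000.0
BASELINE_M = 0.3
DISPARITY_ERROR_PX = 0.25

for depth_m in (10.0, 50.0, 100.0):
    error_m = stereo_depth_error(FOCAL_PX, BASELINE_M, depth_m, DISPARITY_ERROR_PX)
    print(f"depth {depth_m:6.1f} m -> approx. error {error_m:6.2f} m")
```

Under these assumed numbers, a tenfold increase in distance produces a hundredfold increase in depth uncertainty, which is the effect the quote refers to.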
“I told the car company this, but they didn’t believe me,” he remembered. “But I told my business partner: ‘This will be big someday; eventually the camera will be the primary sensor.’ Cameras are cheap, so regulators will favor them; plus, they provide the highest density of information. Nothing can beat a camera, not today or in the future.”
“The industry stayed with two cameras, or a camera and a radar, for forward collision warning and lane departure, which allowed us to pursue monocular systems for 8 years under the radar,” Mobileye’s CTO reported. “But we felt that we can do both with a single camera that detects patterns and recognizes them, judging depth using perspective and other cues very much like humans do.”
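As an illustration of how a single camera can judge range from perspective, the sketch below uses a flat-road pinhole model: a point where an object touches the road appears lower in the image the closer it is. This is a generic textbook cue, not a description of Mobileye’s algorithm, and the function name, camera height, and focal length are assumptions made for the example.

```python
# Flat-road perspective cue for a single camera (a generic sketch; the
# function name and all numbers are hypothetical, not Mobileye's method).
# For a camera mounted camera_height_m above a flat road and looking at the
# horizon, a road contact point seen y pixels below the horizon line lies at
# range Z = f * H / y.

def ground_plane_range(focal_px, camera_height_m, rows_below_horizon_px):
    """Range in meters to a road contact point seen below the horizon."""
    if rows_below_horizon_px <= 0:
        raise ValueError("contact point must lie below the horizon line")
    return focal_px * camera_height_m / rows_below_horizon_px


# Hypothetical setup: 1000-pixel focal length, camera 1.2 m above the road.
for rows in (120.0, 24.0, 12.0):
    range_m = ground_plane_range(1000.0, 1.2, rows)
    print(f"{rows:5.0f} px below horizon -> about {range_m:5.1f} m away")
```

The model assumes a level road and a known horizon row; hills, vehicle pitch, and calibration error would all need separate handling in a real system.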
Understanding the road
Mobileye’s EyeQ technology relies on its smarts to avoid collisions: its ability to discern objects and understand the road automatically. “What’s needed is to divide up the images of a road scene by classifying the important objects: cars, trucks, pedestrians, traffic signs, traffic lights,” Shashua explained. “Over time we’ve added layers of discrimination that have increased the fidelity of object detection, making the process more and more accurate.”
“The second family of capabilities you need relates to understanding the scene, and that’s a lot harder,” Shashua continued. “We’re not yet where we need to be; today’s limit is reading lane markers to stay in lane. Object detection still needs improvement, but that shouldn’t require some big research leap. It’s in understanding the scene that the big leap is needed. To handle city traffic and the variability of the road environment, the system needs to see things in context. For instance, it has to learn and recognize where the free space lies—those areas that are open to it.”
Deep learning, which emerged some three years ago, offers great possibilities, he stated. “Neural networks are sort of a caricature of how neurons work,” Shashua said. Like neurons, the nodes take inputs from the layers of nodes below in the sensory hierarchy. Although the technology’s not new, it turns out that the layers can stack up to perform certain tasks well. Recently, new levels of computing power and the availability of huge quantities of training data (for instance, the millions of images available online) have made neural nets much more effective in certain narrow tasks. Using neural nets, the success rate for face recognition, for instance, now surpasses that of humans.
Based on finely tuned statistical analysis of huge numbers of example road images, a neural net autonomously figures out a representation of the object classification based on recurrent, diagnostic features that it measures, he explained. “Then other neural nodes provide certain basic levels of reasoning that let the system recognize these features and objects reliably to provide end-to-end processing from raw image to output recognition.”
“A human has no problem in interpreting a road scene with no lane markers because humans consider the entire context,” the AI researcher said. “A machine-learning system doesn’t look locally for the lane markers; its deep networks look for the entire safe path through, taking everything into consideration.”
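The layered structure described above, in which each node takes its inputs from the layer below, can be sketched as a tiny feed-forward pass. The weights here are made-up numbers purely for illustration; a real network learns millions of such values from training images, and nothing in this sketch reflects Mobileye’s actual networks.

```python
# Minimal feed-forward pass through stacked layers (illustrative only; the
# weights are invented, whereas real networks learn them from training data).

def relu(values):
    """Pass positive activations through and clamp negative ones to zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Each output node sums its weighted inputs from the layer below."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs upward through every (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


# Two stacked layers: 2 inputs -> 2 hidden nodes -> 1 output node.
example_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-1.0]),
]
print(forward([2.0, 1.0], example_layers))
```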
“We want the machine to decipher the image of the road at a level that is similar to human perception,” he said. “Our goal is to reach (and then surpass) human perception in the auto environment over the next five years. And today all indications look positive.”