By using artificial intelligence to detect pedestrians, other cars and possible impediments, an autonomous vehicle is able to navigate city streets and other less crowded areas. Artificial neural networks, trained to “see” the environment around the automobile and imitate the human visual perception system, are used to do this.
But unlike humans, cars utilizing artificial neural networks have no recollection of the past and are in a permanent state of experiencing the world for the first time — no matter how many times they have gone down a specific route before. This is especially problematic when the automobile cannot safely rely on its sensors in bad weather.
To overcome this limitation by giving the car the capacity to "memorize" prior experiences and use them in future navigation, researchers from Cornell's Ann S. Bowers College of Computing and Information Science and the College of Engineering have published three papers.
Yurong You is a doctoral student and lead author of the paper "HINDSIGHT is 20/20: Leveraging Past Traversals to Aid 3D Perception," which You presented virtually in April at the International Conference on Learning Representations (ICLR 2022). "Learning representations" refers to deep learning, a subset of machine learning.
"The fundamental question is: can we learn from repeated traversals? For example, a car may mistake a weirdly shaped tree for a pedestrian the first time its laser scanner perceives it from a distance, but once it is close enough, the object category will become clear. So the second time you drive past the very same tree, even in fog or snow, you would hope that the car has now learned to recognize it correctly."
Kilian Weinberger, Study Senior Author and Professor, Computer Science, Bowers College of Computing and Information Science, Cornell University
Katie Luo, a doctoral student in the research team and co-author of the study added, “In reality, you rarely drive a route for the very first time. Either you yourself or someone else has driven it before recently, so it seems only natural to collect that experience and utilize it.”
A team led by doctoral student Carlos Diaz-Ruiz created the dataset by driving a car equipped with LiDAR (Light Detection and Ranging) sensors around a 15-kilometer loop in and near Ithaca, 40 times over an 18-month period. The traversals capture varied environments (highway, urban, campus), weather conditions (sunny, rainy, snowy), and times of day.
The resulting dataset, known as Ithaca365 by the group and the focus of one of the other two studies, has more than 600,000 scenes.
It deliberately exposes one of the key challenges in self-driving cars: poor weather conditions. If the street is covered by snow, humans can rely on memories, but without memories a neural network is heavily disadvantaged.
Carlos Diaz-Ruiz, Study Co-Author and Doctoral Student, Cornell University
HINDSIGHT is a method that uses neural networks to compute descriptors of objects as the car passes them. These descriptors, which the team has named SQuaSH (Spatial-Quantized Sparse History) features, are then compressed and stored on a digital map, much as a human brain stores memories.
By querying the local SQuaSH database of all the LiDAR points along the route, the self-driving car can "remember" what it learned the last time it drove through the same area. The database is shared across cars and updated regularly, enriching the information available for recognition.
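To make the idea of a spatially quantized feature store concrete, here is a minimal sketch in Python. It is not the authors' implementation; the voxel size, the averaging-based "compression," and all names (`SquashMap`, `voxel_key`) are illustrative assumptions. It only shows the core mechanism: features observed on one traversal are keyed by the grid cell they fall in, so a later traversal can look them up by location.

```python
import numpy as np

VOXEL = 2.0  # assumed voxel edge length in meters (hypothetical choice)

def voxel_key(xyz, voxel=VOXEL):
    """Quantize a 3D point to the integer grid cell containing it."""
    return tuple(np.floor(np.asarray(xyz, dtype=float) / voxel).astype(int))

class SquashMap:
    """Toy spatially quantized feature store (illustrative, not the paper's code)."""

    def __init__(self):
        # voxel key -> (running-average feature vector, observation count)
        self.cells = {}

    def add(self, xyz, feature):
        # Average all features falling in the same cell: a crude stand-in
        # for compressing per-point descriptors into a sparse history.
        key = voxel_key(xyz)
        feat, n = self.cells.get(key, (np.zeros_like(feature), 0))
        self.cells[key] = ((feat * n + feature) / (n + 1), n + 1)

    def query(self, xyz):
        # On a later traversal, retrieve the stored feature for this location.
        entry = self.cells.get(voxel_key(xyz))
        return None if entry is None else entry[0]
```

On a first drive, the car would call `add` for each LiDAR point's descriptor; on a repeat drive, `query` returns the remembered feature for the same cell even if the new scan is degraded by fog or snow, and returns `None` for never-visited locations.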
“This information can be added as features to any LiDAR-based 3D object detector. Both the detector and the SQuaSH representation can be trained jointly without any additional supervision, or human annotation, which is time- and labor-intensive,” You stated.
Whereas HINDSIGHT still assumes the artificial neural network has already been trained to detect objects and augments it with the capacity to form memories, the third paper, MODEST (Mobile Object Detection with Ephemerality and Self-Training), goes even further.
Here, the authors let the car learn the entire perception pipeline from scratch. The artificial neural network onboard the car was never exposed to any streets or objects beforehand. By repeatedly traversing the same route, it can discover which elements of the environment are stationary and which are moving, and it gradually learns what counts as another traffic participant and what can be safely ignored.
As a result, the system can reliably detect these objects even on roads that were not part of the initial repeated traversals.
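The "ephemerality" cue behind this idea can be sketched simply: a point that shows up in the same place on every traversal is probably static background, while a point with no counterpart in past scans is likely a mobile object. The function below is a toy illustration of that persistence test under assumed parameters (the 1-meter matching radius and all names are hypothetical, not taken from the MODEST paper).

```python
import numpy as np

def ephemerality(current_pts, past_scans, radius=1.0):
    """For each point in the current scan, return the fraction of past
    traversals that had NO point within `radius` meters of it.
    Score near 0: persistent background. Score near 1: likely mobile object.
    (Toy stand-in for MODEST's persistence cue, not the authors' code.)"""
    scores = []
    for p in np.asarray(current_pts, dtype=float):
        missing = 0
        for scan in past_scans:
            dists = np.linalg.norm(np.asarray(scan, dtype=float) - p, axis=1)
            if dists.min() > radius:
                missing += 1
        scores.append(missing / len(past_scans))
    return np.array(scores)
```

High-scoring point clusters could then serve as pseudo-labels of "mobile objects" for self-training a detector, with no human annotation in the loop.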
The researchers believe that these strategies might significantly lower the development costs of autonomous cars (which still rely largely on expensive human-annotated data) and increase the efficiency of such vehicles by teaching them to navigate the areas where they are utilized the most.
Ithaca365 and MODEST will both be discussed at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2022), which will take place in New Orleans from June 19th–24th, 2022.
Other authors include Wei-Lun Chao, a former postdoctoral researcher who is now an assistant professor of computer science and engineering at Ohio State, Mark Campbell, the John A. Mellowes ‘60 Professor in Mechanical Engineering in the Sibley School of Mechanical and Aerospace Engineering, assistant professors Bharath Hariharan and Wen Sun from Bowers CIS, and doctoral students Cheng Perng Phoo, Xiangyu Chen, and Junan Chen.
Grants from the National Science Foundation, the Office of Naval Research, and the Semiconductor Research Corporation helped to fund the research for all three articles.
What do self-driving cars dream of?
Cornell researchers led by Kilian Weinberger, professor of computer science, have produced three recent papers on the ability of autonomous vehicles to use past traversals to “learn the way” to familiar destinations. Image Credit: Ryan Young/Cornell University.