When a robot is deployed to locate workers trapped in a collapsed mine, it must act quickly - mapping unfamiliar terrain and pinpointing its own position in real time. But today’s best machine-learning models for navigation can only process a handful of images at once, which doesn’t cut it in disaster zones where seconds matter and thousands of images may need to be analyzed.
To address this, MIT researchers have created a new system that dramatically speeds up the process.
Drawing from both modern AI vision models and classic computer vision techniques, their method can generate accurate 3D maps of complex environments - like a cluttered office corridor - in just seconds, using only images from a robot’s onboard camera.
The system works by incrementally building small submaps as the robot moves. These submaps are then aligned and stitched together into a full 3D reconstruction, all while estimating the robot’s location in real time. Unlike many current approaches, this method doesn’t require calibrated cameras or expert fine-tuning. Its simplicity, combined with fast and high-quality results, makes it more practical for real-world deployment.
In addition to aiding search-and-rescue missions, the technique could power applications in extended reality (XR) for devices like VR headsets, or help warehouse robots quickly locate and move items.
For robots to accomplish increasingly complex tasks, they need much more complex map representations of the world around them. But at the same time, we don’t want to make it harder to implement these maps in practice. We’ve shown that it is possible to generate an accurate 3D reconstruction in a matter of seconds with a tool that works out of the box.
Dominic Maggio, Study Lead Author and Graduate Student, Massachusetts Institute of Technology
Maggio collaborated with postdoc Hyungtae Lim and senior author Luca Carlone, associate professor in MIT’s Department of Aeronautics and Astronautics, principal investigator at the Laboratory for Information and Decision Systems (LIDS), and director of the MIT SPARK Lab. The research will be presented at the Conference on Neural Information Processing Systems.
Revisiting SLAM with a Fresh Perspective
The core challenge the team tackled is a well-known robotics problem: simultaneous localization and mapping, or SLAM. SLAM allows a robot to map an unknown environment while keeping track of its own position within it.
Traditional optimization-based SLAM methods often struggle in visually complex scenes or require pre-calibrated cameras. In contrast, machine-learning-based methods, while easier to implement, are limited by memory and can only process about 60 images at a time - far too few to map a large, complex environment quickly.
The MIT team’s solution sidesteps this limitation by having the robot create many small submaps instead of one large map. These smaller chunks can be processed quickly, then combined into a full 3D reconstruction.
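To make the idea concrete, the sketch below outlines the general submap-and-stitch pattern in Python. It is not the team’s system: the reconstruction step is a stand-in that simply fabricates points so the script runs, the window sizes are assumed values, and consecutive submaps are chained together with a plain rigid (rotation-plus-translation) alignment - the naive approach that, as described below, turned out not to be enough on its own.

```python
import numpy as np

rng = np.random.default_rng(0)
POINTS_PER_FRAME = 100   # assumed density of the dummy reconstruction

def reconstruct_submap(frames):
    """Stand-in for a learned multi-view reconstruction model. Here it just
    fabricates a local point cloud so the pipeline runs end to end; in a real
    system this is where a vision model would turn a short burst of images
    into 3D points and camera poses."""
    return rng.normal(size=(len(frames) * POINTS_PER_FRAME, 3))

def rigid_align(src, dst):
    """Kabsch/Procrustes: the R, t minimizing ||R @ src_i + t - dst_i||."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, mu_d - R @ mu_s

def stitch(frames, chunk=30, overlap=5):
    """Build a small submap from each window of frames, register each new
    submap to its predecessor through the frames they share, and accumulate
    everything into one global point cloud."""
    step = chunk - overlap
    n_shared = overlap * POINTS_PER_FRAME
    global_cloud, prev_tail = [], None
    R_acc, t_acc = np.eye(3), np.zeros(3)           # current submap's pose in the global frame
    for start in range(0, len(frames), step):
        pts = reconstruct_submap(frames[start:start + chunk])
        if prev_tail is not None:
            # Points reconstructed from the shared frames should coincide, so
            # use them to register the new submap to the previous one. (With
            # the dummy reconstruction above the correspondence is meaningless;
            # only the control flow matters here.)
            R, t = rigid_align(pts[:n_shared], prev_tail)
            R_acc, t_acc = R_acc @ R, R_acc @ t + t_acc
        global_cloud.append(pts @ R_acc.T + t_acc)
        prev_tail = pts[-n_shared:]
    return np.vstack(global_cloud)

# Example: 300 placeholder "frames" stitched into one cloud.
print(stitch(list(range(300))).shape)
```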
“This seemed like a very simple solution, but when I first tried it, I was surprised that it didn’t work that well,” says Maggio.
While searching for a solution, Maggio revisited computer vision research from the 1980s and ’90s. That deep dive revealed a key issue: machine-learning models often introduce subtle distortions when processing images, which makes aligning submaps much trickier than expected.
Traditional approaches rely on basic geometric operations like rotating and translating submaps until they line up. But with modern models, that’s often not enough. A submap might show one side of a room with slightly warped or stretched walls, meaning standard alignment techniques fall short. These small deformations introduce ambiguity that simple transformations can't resolve.
We need to make sure all the submaps are deformed in a consistent way so we can align them well with each other.
Luca Carlone, Associate Professor, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology
A More Flexible Alignment Strategy
To solve this, the researchers developed a more adaptable mathematical method that models how submaps might be warped. By applying these transformations, the system can reliably align even slightly distorted submaps.
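As one concrete illustration of relaxing rigid alignment, the sketch below estimates a similarity transform - rotation, translation, plus a per-submap scale - between corresponding points, using the classic Umeyama closed-form solution. This is only a simple example of a more flexible transform family; the researchers’ actual deformation model is not spelled out here and is presumably more general than a single uniform scale.

```python
import numpy as np

def similarity_align(src, dst):
    """Umeyama (1991) closed form for the s, R, t minimizing
    sum_i || s * R @ src_i + t - dst_i ||^2."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)                          # cross-covariance of the two clouds
    U, S, Vt = np.linalg.svd(cov)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])   # guard against reflections
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()    # optimal uniform scale
    t = mu_d - s * R @ mu_s
    return s, R, t

# Synthetic check: a submap that came out 7% too small, rotated, and shifted
# is still snapped exactly into place, which rotation + translation alone
# could not achieve.
rng = np.random.default_rng(1)
src = rng.normal(size=(200, 3))                               # points in the new submap's frame
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
dst = 0.93 * src @ R_true.T + np.array([0.4, -0.2, 1.0])      # where those points belong globally
s, R, t = similarity_align(src, dst)
print(round(s, 3), np.abs(s * src @ R.T + t - dst).max())     # ~0.93 and ~1e-15
```

The broader point of the example is that moving beyond rotation-plus-translation is a small, well-understood mathematical step drawn from classical geometry, which is what lets the system absorb the subtle distortions the learned models introduce.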
Using a stream of input images, the system produces a 3D reconstruction of the environment along with estimates of the camera’s positions - crucial data that enables the robot to localize itself within the scene as it navigates.
“Once Dominic had the intuition to bridge these two worlds – learning-based approaches and traditional optimization methods – the implementation was fairly straightforward. Coming up with something this effective and simple has potential for a lot of applications,” Carlone added.
Their system outperformed other methods in both speed and accuracy, all without the need for specialized cameras or extra processing tools. In tests, the researchers were able to generate near-real-time 3D reconstructions of intricate environments, such as the interior of the MIT Chapel, using nothing more than short cell phone videos.
The results were impressively precise, with an average reconstruction error of less than 5 centimeters.
Looking ahead, the team aims to further improve the system’s reliability in especially complex or cluttered environments and eventually deploy it on real robots operating in demanding, real-world conditions.
Knowing about traditional geometry pays off. If you understand deeply what is going on in the model, you can get much better results and make things much more scalable.
Luca Carlone, Associate Professor, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology