Duke University researchers have developed WildFusion, a novel framework that combines vision, vibration, and touch to let robots “sense” complex outdoor environments much as humans do. The study has been accepted to the IEEE International Conference on Robotics and Automation (ICRA 2025), which will be held in Atlanta, Georgia, from May 19 to 23, 2025.
“WildFusion” uses a combination of sight, touch, sound and balance to help four-legged robots better navigate difficult terrain like dense forests. Image Credit: Duke University
Our senses provide a wealth of information that allows the brain to navigate the world around us. Touch, smell, hearing, and a good sense of balance are essential even in seemingly simple situations, such as a peaceful hike on a Saturday morning.
An intuitive awareness of the canopy overhead helps us judge where the trail leads. The sharp snap of a branch or the soft cushion of moss tells us about the stability of our footing. The crash of a falling tree or limbs thrashing in high winds alerts us to potential hazards.
Robots, in contrast, have traditionally relied almost entirely on visual input, such as cameras or LiDAR, to navigate the world. Outside of Hollywood, multisensory navigation has long been a challenge for machines. The forest, with its gorgeous mess of dense foliage, fallen logs, and ever-changing terrain, is a maze of uncertainty for conventional robots.
WildFusion opens a new chapter in robotic navigation and 3D mapping. It helps robots to operate more confidently in unstructured, unpredictable environments like forests, disaster zones and off-road terrain.
Boyuan Chen, Dickinson Family Assistant Professor, Mechanical Engineering and Materials Science, Electrical and Computer Engineering, and Computer Science, Duke University
“Typical robots rely heavily on vision or LiDAR alone, which often falter without clear paths or predictable landmarks. Even advanced 3D mapping methods struggle to reconstruct a continuous map when sensor data is sparse, noisy or incomplete, which is a frequent problem in unstructured outdoor environments. That is exactly the challenge WildFusion was designed to solve,” added Yanbaihui Liu, Study Lead Student Author and Second-Year Ph.D. Student in Chen’s General Robotics Lab.
WildFusion, built on a quadruped robot, integrates multiple sensing modalities: an RGB camera, LiDAR, inertial sensors, and, most significantly, contact microphones and tactile sensors. As in previous approaches, the camera and LiDAR capture the environment’s geometry, color, distance, and other visual attributes. What sets WildFusion apart is its use of acoustic vibrations and touch.
As the robot walks, contact microphones record the distinct vibrations produced by each footstep, capturing subtle differences such as the crunch of dry leaves versus the soft squish of mud.
Meanwhile, tactile sensors measure the force applied to each foot, letting the robot perceive stability or slipperiness in real time. These senses are complemented by an inertial sensor that records acceleration data to gauge how much the robot is swaying, pitching, or rolling as it crosses difficult terrain.
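To make the sensing side concrete, below is a minimal, hypothetical sketch of how raw footstep audio, foot-force readings, and inertial data might each be reduced to a compact feature vector before any learning happens. The function names, window lengths, and feature choices are illustrative assumptions, not the published WildFusion pipeline.

```python
import numpy as np

# Hypothetical per-footstep feature extraction (not the published WildFusion code).
# Each function turns one sensor's raw window into a small descriptor.

def audio_features(mic_window: np.ndarray, sample_rate: int = 16000) -> np.ndarray:
    """Spectral summary of a contact-microphone window for one footstep."""
    spectrum = np.abs(np.fft.rfft(mic_window))
    freqs = np.fft.rfftfreq(mic_window.size, d=1.0 / sample_rate)
    energy = float(np.sum(spectrum ** 2))
    centroid = float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-8))
    return np.array([energy, centroid])  # e.g. crunchy leaves vs. soft mud

def tactile_features(foot_forces: np.ndarray) -> np.ndarray:
    """Mean and variability of the force measured at one foot during contact."""
    return np.array([foot_forces.mean(), foot_forces.std()])

def imu_features(accel: np.ndarray, gyro: np.ndarray) -> np.ndarray:
    """How much the body is shaking and rotating (sway/pitch/roll proxies)."""
    return np.array([
        np.linalg.norm(accel, axis=1).std(),  # acceleration jitter
        np.abs(gyro).mean(axis=0).sum(),      # overall angular motion
    ])

# Example usage with synthetic data standing in for one footstep window.
rng = np.random.default_rng(0)
step_feature = np.concatenate([
    audio_features(rng.standard_normal(1600)),           # 0.1 s of audio
    tactile_features(rng.uniform(0, 30, size=50)),        # contact forces in N
    imu_features(rng.standard_normal((50, 3)),            # accelerometer
                 rng.standard_normal((50, 3))),           # gyroscope
])
print(step_feature.shape)  # one compact descriptor per footstep
```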
Each sensory stream is then processed by a specialized encoder and fused into a single, comprehensive representation. At the heart of WildFusion is a deep learning model based on implicit neural representations.
Unlike traditional methods, which treat the environment as a collection of discrete points, this approach models complex surfaces and features continuously, allowing the robot to make smarter, more intuitive decisions about where to step, even when its vision is obscured or ambiguous.
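As a rough illustration of that design, the sketch below shows one way per-modality embeddings could be fused and then queried at arbitrary 3D coordinates to predict continuous occupancy and traversability values. The layer sizes, modality list, and output heads are assumptions chosen for clarity, not the authors’ actual architecture.

```python
import torch
import torch.nn as nn

class ImplicitTerrainModel(nn.Module):
    """Toy multimodal implicit representation: maps fused sensor embeddings
    plus a queried 3D point to occupancy and traversability estimates."""

    def __init__(self, feat_dims=(64, 64, 16, 16), latent_dim=128):
        super().__init__()
        # One small encoder per modality (e.g. vision, LiDAR, audio, tactile/IMU).
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 64))
            for d in feat_dims
        )
        self.fuse = nn.Linear(64 * len(feat_dims), latent_dim)
        # Decoder queried at continuous 3D coordinates (x, y, z).
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 3, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 2),  # [occupancy logit, traversability score]
        )

    def forward(self, modality_feats, query_xyz):
        embeddings = [enc(f) for enc, f in zip(self.encoders, modality_feats)]
        latent = torch.relu(self.fuse(torch.cat(embeddings, dim=-1)))
        return self.decoder(torch.cat([latent, query_xyz], dim=-1))

# Query the model at an arbitrary point, even where sensor coverage is sparse.
model = ImplicitTerrainModel()
feats = [torch.randn(1, d) for d in (64, 64, 16, 16)]  # per-modality features
point = torch.randn(1, 3)                               # a 3D query location
occupancy_logit, traversability = model(feats, point).unbind(dim=-1)
```

Because such a decoder can be evaluated at any coordinate, the resulting map stays smooth and continuous even where the raw sensor data is sparse or noisy, which is the property the researchers highlight.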
Chen added, “Think of it like solving a puzzle where some pieces are missing, yet you are able to intuitively imagine the complete picture. WildFusion’s multimodal approach lets the robot ‘fill in the blanks’ when sensor data is sparse or noisy, much like what humans do.”
WildFusion was tested at Eno River State Park in North Carolina, near Duke’s campus, where it successfully helped the robot navigate dense woodland, grassland, and gravel paths.
Liu added, “Watching the robot confidently navigate terrain was incredibly rewarding. These real-world tests proved WildFusion’s remarkable ability to accurately predict traversability, significantly improving the robot’s decision-making on safe paths through challenging terrain.”
Looking ahead, the team plans to expand the system with additional sensors, such as thermal or humidity sensors, to further improve a robot’s ability to understand and adapt to complex environments.
WildFusion’s modular design supports a wide range of applications beyond forest trails, including disaster response over unpredictable terrain, remote infrastructure inspection, and autonomous exploration.
Chen concluded, “One of the key challenges for robotics today is developing systems that not only perform well in the lab but that reliably function in real-world settings. That means robots that can adapt, make decisions and keep moving even when the world gets messy.”
This research was funded by DARPA (HR00112490419, HR00112490372) and the Army Research Laboratory (W911NF2320182, W911NF2220113).
WildFusion: Multimodal Implicit 3D Reconstructions in the Wild
Video Credit: Duke University