Can AI Make Rehabilitation Robots Feel More Natural?

Researchers have demonstrated that training a humanoid controller on real human gait data can significantly reduce torque variability and produce more natural, stable robotic movement for rehabilitation.


In an article published in Scientific Reports (a Nature Portfolio journal), researchers developed an intelligent humanoid controller for lower limb rehabilitation robots. By using real human gait data to plan anthropomorphic trajectories and combining deep reinforcement learning (DRL) with a proportional-derivative (PD) controller, they created an adaptive trajectory-tracking system grounded in learned gait characteristics.

Simulation results indicate that this approach enhances robot intelligence, promotes more human-like motion, and improves patient comfort during training.

Why This Matters

As the number of patients with lower limb impairments, particularly following stroke, continues to rise, robot-assisted rehabilitation has become increasingly important for restoring motor function. However, controlling rehabilitation robots remains challenging. Their dynamics are highly nonlinear, which makes precise mathematical modeling difficult.

Traditional control strategies such as PD control, sliding mode control, and fuzzy control are widely used, but they often struggle to adapt to individual patient variability. PD control, in particular, is valued for its simplicity, yet it lacks the flexibility needed for complex, human-centered movement.

Recent progress in DRL has shown promise in handling nonlinear, high-dimensional control problems. Still, most systems rely either on learning-based methods or traditional controllers alone. This study brings the two together, using DRL for intelligent adaptation while retaining PD control for stable, precise tracking.

How the Controller Works

At the core of the system is the deep deterministic policy gradient (DDPG) algorithm, chosen for its effectiveness in continuous control tasks. DDPG uses an actor–critic structure:

  • The actor network generates control actions.
  • The critic network evaluates those actions and guides learning.

This setup allows the system to learn directly from experience without requiring a fully defined environmental model.
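To make the actor–critic pairing concrete, the sketch below defines minimal DDPG-style networks in PyTorch. The layer widths, state dimension, and torque bound are illustrative assumptions; the paper's exact network architecture is not reproduced here.

    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        """Maps a state observation to a bounded continuous action."""
        def __init__(self, state_dim, action_dim, max_torque):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 256), nn.ReLU(),
                nn.Linear(256, 256), nn.ReLU(),
                nn.Linear(256, action_dim), nn.Tanh(),  # output in [-1, 1]
            )
            self.max_torque = max_torque

        def forward(self, state):
            return self.max_torque * self.net(state)  # scale to torque limits

    class Critic(nn.Module):
        """Scores a (state, action) pair with an estimated Q-value."""
        def __init__(self, state_dim, action_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
                nn.Linear(256, 256), nn.ReLU(),
                nn.Linear(256, 1),
            )

        def forward(self, state, action):
            return self.net(torch.cat([state, action], dim=-1))

    # Hypothetical dimensions for a two-joint (hip, knee) system.
    actor = Actor(state_dim=10, action_dim=2, max_torque=60.0)
    critic = Critic(state_dim=10, action_dim=2)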

The overall control architecture is layered. The PD controller operates at a low level, ensuring accurate tracking of joint angles. Above it, the DRL agent acts as a high-level decision-maker, learning human gait characteristics and adapting control signals accordingly. The system focuses on two degrees of freedom (hip and knee joints) in a fully actuated configuration.
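The paper does not spell out the exact coupling between the two layers. One plausible realization, sketched below, is a residual scheme in which the trained policy adds a corrective torque on top of the PD tracking term; the gains and joint ordering here are placeholder assumptions.

    import numpy as np

    # Placeholder PD gains for the two actuated joints (hip, knee);
    # the paper does not publish its gain values.
    KP = np.array([120.0, 90.0])  # proportional gains, N·m/rad
    KD = np.array([8.0, 6.0])     # derivative gains, N·m·s/rad

    def layered_control(q, qd, q_ref, qd_ref, drl_action):
        """Low-level PD tracking plus a high-level learned correction.

        q, qd         : measured joint angles (rad) and velocities (rad/s)
        q_ref, qd_ref : reference trajectory planned from human gait data
        drl_action    : corrective torque from the trained DRL policy
        """
        pd_torque = KP * (q_ref - q) + KD * (qd_ref - qd)
        return pd_torque + drl_action  # one plausible residual coupling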

A central strength of the study lies in its reward function design. Rather than simply encouraging forward motion, the reward incorporates multiple safety and comfort factors. It promotes:

  • Stable walking speed
  • Upright trunk posture
  • Consistent hip height
  • Adequate foot clearance
  • Reduced joint torque

A specific penalty term, p_effort, discourages excessive torque output to limit discomfort during rehabilitation. The agent is also penalized for deviating from reference human joint angles, reinforcing natural gait patterns.
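One way such a multi-term reward could be assembled is sketched below. The weights, functional forms, and foot-clearance threshold are assumptions for illustration; the paper's actual coefficients are not reproduced here.

    import numpy as np

    def gait_reward(v, v_target, trunk_pitch, hip_height, hip_height_ref,
                    foot_clearance, tau, q, q_ref):
        # All weights below are illustrative placeholders.
        r_speed   = -1.0 * (v - v_target) ** 2                 # stable speed
        r_posture = -0.5 * trunk_pitch ** 2                    # upright trunk
        r_height  = -0.5 * (hip_height - hip_height_ref) ** 2  # hip height
        r_clear   = 0.3 * min(foot_clearance, 0.05)            # foot clearance
        p_effort  = -1e-3 * np.sum(np.square(tau))             # torque penalty
        r_track   = -2.0 * np.sum(np.square(q - q_ref))        # track human gait
        return r_speed + r_posture + r_height + r_clear + p_effort + r_track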

Training was conducted using MATLAB's Reinforcement Learning Toolbox over 10,000 episodes. Key hyperparameters included a learning rate of 1 × 10⁻⁴, a discount factor of 0.99, and a replay buffer size of 10⁶. A Lyapunov stability analysis further confirmed the theoretical stability of the combined DRL-PD control system.
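For reference, the reported settings can be gathered into a small configuration object. The original training ran in MATLAB, so this Python rendering is purely illustrative; unreported settings such as batch size and exploration noise are omitted.

    from dataclasses import dataclass

    @dataclass
    class DDPGConfig:
        episodes: int = 10_000        # training episodes, as reported
        learning_rate: float = 1e-4   # reported rate (actor/critic split not stated)
        gamma: float = 0.99           # reported discount factor
        buffer_size: int = 1_000_000  # reported replay buffer capacity

    print(DDPGConfig())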

Data Collection and Experimental Setup

To ground the controller in realistic movement, researchers collected gait data from healthy subjects using a Nokov motion capture system. Reflective markers were placed on anatomical landmarks to capture joint motion as participants walked at self-selected speeds. A synchronized three-dimensional force plate recorded ground reaction forces.

All procedures were ethically approved, and participants provided informed consent.

What the Results Showed

After training, the DRL-PD controller produced stable, periodic joint motions resembling human gait. The hip, knee, and ankle joints moved within realistic ranges:

  • Hip: −0.3 to 0.3 rad
  • Knee: −0.3 to 0.4 rad
  • Ankle: 0 to 0.1 rad

Closed-loop limit cycles confirmed system stability.

Tracking accuracy was strong. Errors converged rapidly to below 0.1 rad. Across 10 stable gait cycles, root mean square errors were:

  • 0.028 ± 0.003 rad (hip)
  • 0.035 ± 0.004 rad (knee)

Joint torques remained within controlled ranges of 0 to 50 N·m for the hip and −10 to 60 N·m for the knee.

The most notable improvement appeared in motion smoothness. Measured using the root mean square of torque derivatives, the DRL-PD controller achieved:

  • 5.3 ± 0.3 N·m/s (hip)
  • 6.6 ± 0.4 N·m/s (knee)

By comparison, conventional PD controllers recorded 12.5 and 15.8 N·m/s, respectively. This represents roughly a 60 % reduction in torque variability, a marked improvement that directly supports patient comfort during rehabilitation exercises.
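Both metrics are easy to reproduce from logged trajectories. The sketch below computes the tracking RMSE and the torque-derivative RMS on synthetic signals; the signal shapes and the 100 Hz sampling rate are assumptions.

    import numpy as np

    def tracking_rmse(q, q_ref):
        """Root mean square tracking error over a trajectory, in rad."""
        return np.sqrt(np.mean((q - q_ref) ** 2))

    def torque_smoothness(tau, dt):
        """RMS of the torque time-derivative, in N·m/s; lower is smoother."""
        dtau = np.diff(tau) / dt
        return np.sqrt(np.mean(dtau ** 2))

    # Synthetic one-cycle-per-second signals at a hypothetical 100 Hz.
    t = np.arange(0.0, 10.0, 0.01)
    q_ref = 0.3 * np.sin(2 * np.pi * t)         # reference hip angle
    q = q_ref + 0.02 * np.random.randn(t.size)  # noisy measured angle
    tau = 25.0 + 20.0 * np.sin(2 * np.pi * t)   # synthetic hip torque
    print(tracking_rmse(q, q_ref), torque_smoothness(tau, dt=0.01))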

Robustness testing further strengthened the findings. With ±5 % mass variation, tracking errors increased by less than 10 %, indicating stability under parameter uncertainty.

Although training required approximately 12 hours on an RTX 3060 GPU, the trained policy was able to run efficiently once deployed. The authors note that future work will focus on optimizing real-time implementation.

Conclusion

This study demonstrates that combining DRL with traditional PD control can produce more natural, stable, and comfortable movement in lower limb rehabilitation robots. By grounding the controller in real human gait data and embedding comfort directly into the reward structure, the researchers created a system that balances adaptability with reliability.

While the results are currently based on simulations and controlled testing, the next critical step will be clinical validation with patients. If successful, this approach could strengthen the role of intelligent control systems in personalized robot-assisted therapy.

Journal Reference

Jin, Y., Zhang, J., Li, W., Yu, J., Wang, Z., & Sun, S. (2026). A humanoid control strategy based on deep reinforcement learning for enhanced comfort in lower limb rehabilitation robots. Scientific Reports. DOI: 10.1038/s41598-026-39011-7. https://www.nature.com/articles/s41598-026-39011-7


