New Framework Uses AI to Simplify Robot Control

Researchers have introduced a new framework that integrates large language model (LLM)-based AI agents with a robot operating system (ROS), enabling more flexible and user-friendly robot programming through natural language.

Study: A robot operating system framework for using large language models in embodied AI. Image Credit: GarryKillian/Shutterstock.com

In an article published in Nature Machine Intelligence, researchers present a framework that brings together LLM-based AI agents and ROS for embodied AI. The system combines automatic behavior execution, multiple execution modes, imitation learning, action optimization, and feedback-driven reflection.

Experiments show that it performs reliably across tasks such as long-horizon planning and tabletop rearrangement, using only open-source models.

Background

Traditional robotic development depends on expert engineers to decompose tasks into atomic actions and assemble them into behaviors. While this method works well, it is inherently rigid and struggles to adapt to dynamic environments like homes or healthcare settings, where non-experts often need to update capabilities quickly.

Frameworks such as ROS offer a modular foundation, but they still require expert input for defining tasks and expanding skills. At the same time, advances in LLMs have made natural language interaction with robots increasingly practical.

Even with these developments, a gap remains.

Current systems do not fully support intuitive task composition by non-experts, nor do they make it easy to extend action libraries through demonstration or refine behaviors iteratively. This work addresses that gap by combining LLM-based reasoning with ROS, alongside imitation learning and feedback-driven adaptation.

System Architecture

To bridge this divide, the proposed framework allows non-expert users to program robots through natural language interaction. It separates responsibilities between:

  • Experts, who define an initial library of pre-trained atomic actions (e.g., “pick,” “navigate”)
  • Non-experts, who interact with the robot through a chat interface without needing technical expertise

At its core, the system is organized around four tightly connected components.

First, the atomic action library serves as the foundation, storing primitive robot skills along with their textual descriptions and executable code. Building on this, the imitation learning module allows non-experts to expand the library by physically guiding the robot or demonstrating tasks, removing the need for manual coding.
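The atomic action library described above can be pictured as a registry that pairs each skill's natural-language description (which the LLM reads) with its executable code (which the robot runs). The sketch below is illustrative, not the paper's implementation; the action names, behaviors, and the `register` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class AtomicAction:
    """One primitive skill: a textual description the LLM can read,
    plus the executable code that performs it."""
    name: str
    description: str
    execute: Callable[..., str]

# Expert-defined seed library (names and behaviors are illustrative).
library: Dict[str, AtomicAction] = {}

def register(action: AtomicAction) -> None:
    library[action.name] = action

register(AtomicAction("pick", "Pick up the named object.",
                      lambda obj: f"picked {obj}"))
register(AtomicAction("navigate", "Drive the base to a named location.",
                      lambda loc: f"at {loc}"))

# A demonstration could later add a new skill without manual coding;
# here we mimic that by registering a "stir" action at runtime.
register(AtomicAction("stir", "Stir the contents of a container.",
                      lambda obj: f"stirred {obj}"))

# The agent sees only names and descriptions when choosing actions.
catalog = {a.name: a.description for a in library.values()}
```

Because new skills are plain entries in the registry, a demonstration captured by the imitation learning module can extend the library without touching any existing code.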

Once actions are defined, the atomic action optimizer refines them. It uses LLMs to identify key parameters within the action code and improves them through Bayesian optimization, enhancing performance without requiring expert intervention.
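The optimization step can be sketched as a loop that searches over the LLM-identified parameters for the settings that maximize a task score. For simplicity, plain random search stands in here for the paper's Bayesian optimizer, and the parameter names, bounds, and toy scoring function are invented for illustration.

```python
import random

def optimize_parameters(evaluate, bounds, n_trials=30, seed=0):
    """Search the LLM-identified parameters for the best task score.
    Random search stands in here for Bayesian optimization."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Sample each parameter uniformly within its bounds.
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy action score: success peaks at speed=0.6, angle=0.2 (made-up values).
def hit_success(p):
    return 1.0 - (p["speed"] - 0.6) ** 2 - (p["angle"] - 0.2) ** 2

best, score = optimize_parameters(
    hit_success, {"speed": (0.0, 1.0), "angle": (0.0, 1.0)})
```

A real Bayesian optimizer would fit a surrogate model (typically a Gaussian process) to past evaluations and pick the next trial to balance exploration and exploitation, which matters when each trial means a physical robot run.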

Tying everything together is the AI agent, which acts as the decision-making core. It processes user instructions alongside environmental observations, converted into text, and selects appropriate actions. Depending on the task, it can operate in different modes: executing single actions in dynamic settings, chaining sequences for multi-step tasks, running custom code, or using behavior trees for more complex logic.

Throughout this process, prompts are carefully structured to include task goals, available actions, and user feedback.
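The prompt structure described above can be sketched as a simple template that stitches together the goal, the action catalog, textual observations, and any accumulated feedback. The field names and wording below are assumptions, not the paper's actual prompt format.

```python
def build_prompt(goal, actions, observations, feedback=None):
    """Assemble the structured prompt the agent reasons over:
    task goal, available atomic actions, current observations,
    and any user feedback from earlier attempts (all illustrative)."""
    lines = [f"Goal: {goal}", "Available actions:"]
    lines += [f"  - {name}: {desc}" for name, desc in actions.items()]
    lines.append(f"Observations: {observations}")
    if feedback:
        lines.append(f"User feedback: {feedback}")
    lines.append("Respond with one action name and its arguments.")
    return "\n".join(lines)

prompt = build_prompt(
    goal="make coffee",
    actions={"pick": "Pick up the named object.",
             "navigate": "Drive to a named location."},
    observations="mug on counter, kettle off",
    feedback="verify object locations before grasping",
)
```

Listing only action names and descriptions, rather than their code, keeps the prompt compact and also explains the failure mode noted later: the model can emit an action name that sounds plausible but is not in the catalog.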

Rather than relying on continuous retraining, the system improves iteratively through interaction, allowing non-experts to shape robot behavior over time. While the framework supports both open-source and commercial LLMs, only open-source models were used in these experiments.

Experimental Validation

The researchers evaluated the framework across a range of robots and task environments, with results that highlight both its flexibility and consistency.

In a kitchen setup using a UR5 arm, the robot completed a 12-step coffee-making task from a single natural language prompt, demonstrating strong long-horizon planning without human intervention. Building on this, non-experts introduced new actions like stirring and pouring through demonstration, which the system then used to carry out a “cook me pasta” task.

This progression illustrates how the framework naturally expands its capabilities through imitation.

As tasks became more complex, the role of feedback became clearer. In tabletop rearrangement experiments, performance dropped when relying solely on the language model.

However, when human corrections were incorporated, success rates remained consistently high. Importantly, the system was able to reuse feedback. For example, after being instructed to verify object locations before grasping, it applied that correction independently in later trials.
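One way to picture this feedback reuse is a small memory of past corrections that gets replayed into later prompts when a related task comes up. The keyword-overlap retrieval below is a naive illustrative stand-in, not the paper's mechanism.

```python
class FeedbackMemory:
    """Store human corrections and replay relevant ones into later
    prompts, so a correction given once is applied in future trials."""
    def __init__(self):
        self._notes = []

    def add(self, task, correction):
        self._notes.append((task, correction))

    def recall(self, task):
        # Naive retrieval: any shared word counts as "related".
        words = set(task.lower().split())
        return [c for t, c in self._notes
                if words & set(t.lower().split())]

memory = FeedbackMemory()
memory.add("grasp the mug", "verify object locations before grasping")

# A later, related task retrieves the earlier correction automatically,
# so the agent applies it without being told again.
hints = memory.recall("grasp the red block")
```

In the framework, retrieved corrections would simply be appended to the prompt's feedback section, which is how a one-time instruction carries over to subsequent trials.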

The framework also proved effective in distributed settings. An operator in Europe successfully controlled a robot in Asia using natural language, completing pick-and-place tasks despite a 2–3 second delay. In a laboratory scenario, the system interpreted unstructured textbook instructions to perform a pH test, showing its ability to work with less formal inputs.

Further experiments reinforced these findings.

Bayesian optimization improved air hockey performance from 30% to 52%, while a quadruped robot demonstrated real-time failure recovery by resolving issues such as gripper obstructions in an office environment.

Insights and Challenges

The results show that LLMs can make robotic systems far more accessible, allowing non-experts to define complex tasks through natural language. The framework performs particularly well in long-horizon planning and modular action sequencing.

At the same time, several challenges affect reliability. Performance is highly sensitive to prompt wording, with even small phrasing changes sometimes leading to failure. The model can also be misled by examples, attempting actions involving objects mentioned only in passing. In some cases, it generates actions that do not exist in the library, although few-shot prompting helps reduce this behavior.

Despite these issues, the system remains effective when paired with clear, actionable human feedback, especially as task complexity increases. Still, there is no single correction strategy that works across all scenarios, and the framework continues to rely on careful prompt design and human-in-the-loop guidance for more demanding tasks.

Conclusion

Overall, this work presents a cohesive framework that integrates LLM-based AI agents with ROS, making robot programming more accessible through natural language. By combining imitation learning, action optimization, and iterative feedback, the system supports flexible and adaptive task execution across a wide range of applications.

While challenges such as prompt sensitivity and hallucinated actions remain, the framework marks steady progress toward more usable and adaptable robotic systems without yet fully achieving general-purpose autonomy.

Journal Reference

Mower et al. (2026). A robot operating system framework for using large language models in embodied AI. Nature Machine Intelligence. DOI:10.1038/s42256-026-01186-z. https://www.nature.com/articles/s42256-026-01186-z


Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Nandi, Soham. (2026, March 24). New Framework Uses AI to Simplify Robot Control. AZoRobotics. Retrieved on March 24, 2026 from https://www.azorobotics.com/News.aspx?newsID=16364.


