This project will use large language models (LLMs) in several roles to improve robotics, aiming to harness their potential in practical general-purpose deployment scenarios and to exploit their advanced reasoning capabilities for planning complex tasks. As pre-trained language models have grown in both model size and data volume, current LLMs can now serve as a versatile solution for language-related tasks that involve comprehending instructions, applying internalized knowledge to solve practical problems, and engaging in coherent, accurate, and human-aligned dialogue. Robotics research can benefit significantly from LLMs: their natural language understanding and commonsense reasoning capabilities can enhance a robot’s ability to comprehend context and execute commands. Through conversation, natural language instructions given as text prompts can be translated into machine-understandable code that triggers the corresponding actions, making robots more adaptive and flexible. The FOMO-HODOR project will advance robotic task planning grounded in large language models, aiming to develop a computational framework for large-model-based robot control that can address long-duration, highly complex tasks, such as those associated with the General-Purpose Service Robot (GPSR) test in RoboCup@Home. We argue that such a framework can constitute a FOundation MOdel for HumanOid DOmestic RObots (FOMO-HODOR), which could be reused across multiple robot deployments.
