Natural Language Processing (NLP) is a well-established area of multidisciplinary research that draws on Linguistics, Cognitive Science, Artificial Intelligence, Psychology, and other fields. Robotics is likewise a well-established and active research area. Yet until recently, the two areas have developed largely independently of each other. Robots are now increasingly capable in both navigation and perception, and are finding applications in areas that require humans and robots to work together in teams. NLP can give robots the ability to interact more naturally with their human teammates. Conversely, robots, being embodied agents in the real world, offer a way to ground language and reasoning. This intersection of interests has seen increased activity recently in both NLP and Robotics. Recent advances in Large Language Models (LLMs) have facilitated the development of natural language interfaces, but much work remains before human-robot natural language interfaces become effective.
A recent project in CSIRO Data61’s Robotics and Autonomous Systems group began exploring the development of a spoken language interface for human-robot interaction. The system was built around the capabilities provided by the robots employed in the 2021 DARPA Subterranean Challenge. Small neural networks handled both the speech-to-text and text-to-speech tasks. Text was generated using ChatGPT with careful prompting, resulting in a system that allowed the human operator to “converse” with the robot and to instruct it to perform tasks naturally using spoken language. This very promising early result warrants further research in this area, which we will now undertake in collaboration with the CSIRO Data61 Language Technology team.