Introduction to Cognitive Robotics

Site:	Integrated E-Learning 4 Cognitive Robotics
Course:	Introduction to Cognitive Robotics
Book:	Introduction to Cognitive Robotics


1. Introduction
- 1.1. Why cognition-enabled robots?
- 1.2. Robot agents
2. The challenge of realizing robot agents
3. A glimpse at modeling robot agents

1. Introduction

Consider the robot depicted in Figure 1. This robot has a camera that can be pointed in different directions and orientations, a wheeled navigation system, and two movable arms with grippers that can be opened and closed. If you had a game controller with which you could navigate the robot, orient the camera, and control the arms and grippers, you could remote-control the robot to accomplish various manipulation tasks such as setting and cleaning the table, loading and unloading the dishwasher, making popcorn, assembling a toy airplane, or replenishing shelves in a retail store, to name only a few.

Fig. 1: PR2 body parts and sensors

In this interactive textbook and the accompanying learning environment we want to acquire the competence to understand, design, and even implement control programs that enable robots to accomplish such tasks autonomously.

Such a control program has to solve the body motion problem: given a task request such as "set the table", determine a body motion of the robot that accomplishes the desired effects and avoids unwanted side effects. Desired effects are that all items needed are intact, placed on the table, and arranged in the proper way. Unwanted side effects are that the robot breaks objects, damages the environment, or spills anything.

To reach the competence level of humans, the control programs have to solve the body motion problem under challenging conditions.

First, the body motions generated by the control program must be perception-guided. The program has only incomplete, inaccurate, and possibly faulty knowledge about the environment. Therefore, it has to search for and find the objects it needs to manipulate. The execution of body motions is inaccurate and partly nondeterministic, and as a consequence the program is uncertain about the physical effects that the body motions cause. To act robustly despite these complications, the robot has to perceive the environment and monitor its actions in order to fill in knowledge gaps and revise its beliefs given new perceptual evidence. The control program has to specify how the robot is to respond to what it perceives in order to accomplish its tasks.
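Revising beliefs given new perceptual evidence can be illustrated with a small Bayes update. The following is only a minimal sketch under stated assumptions: the candidate locations, the function name `update_belief`, and the detector rates `hit_rate` and `false_alarm_rate` are hypothetical and not taken from any particular robot system.

```python
def update_belief(belief, looked_at, detected, hit_rate=0.9, false_alarm_rate=0.1):
    """Revise a belief over object locations after one (possibly faulty) observation.

    belief:    dict mapping location name -> probability that the object is there
    looked_at: the location the camera was pointed at
    detected:  whether the (assumed) object detector reported the object
    """
    posterior = {}
    for location, prior in belief.items():
        if location == looked_at:
            # The object really is where the robot looked.
            likelihood = hit_rate if detected else 1.0 - hit_rate
        else:
            # The object is elsewhere; a detection would be a false alarm.
            likelihood = false_alarm_rate if detected else 1.0 - false_alarm_rate
        posterior[location] = prior * likelihood
    total = sum(posterior.values())
    return {location: p / total for location, p in posterior.items()}


# The robot initially has no idea where the bowl with corn is stored.
belief = {"drawer": 1 / 3, "cupboard": 1 / 3, "counter": 1 / 3}

# It looks into the drawer and sees nothing: the drawer becomes unlikely.
belief = update_belief(belief, looked_at="drawer", detected=False)

# It looks into the cupboard and the detector fires: the cupboard becomes likely.
belief = update_belief(belief, looked_at="cupboard", detected=True)

most_likely = max(belief, key=belief.get)
print(most_likely)
```

Because the detector is assumed to be imperfect, a single observation never makes the robot certain; it only shifts probability mass, which is why the robot keeps monitoring while it acts.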

Second, the action categories such as pouring are general and can be applied in many contexts. We can appreciate this generality by looking at the breadth and depth of skill with which humans accomplish tasks, such as pouring substances: humans can pour water out of a pot and pancake mix into a pan; they can separate egg yolk from the egg white, extinguish fire, neutralize acid, and pour beer into a glass, to name only a few variations. These pouring tasks involve different substances being poured, different containers, and different tools. They serve different purposes and have different effects. Each variation of the pouring task requires its own specific behavior patterns.

Third, performing actions at the competence level of humans is knowledge intensive and the knowledge has to be turned into body motions.

Fourth, to achieve good performance it is often necessary to learn and be prospective.

Fifth, we require the robot control system to "understand" what it is doing, by which we mean that it can answer questions about what it is doing, why, how, what it expects to happen, what might go wrong, etc.

1.1. Why cognition-enabled robots?

Tasks, environments and robots
Cambrian evolution to robotics
Tackling grand research challenges requires cognition-enabled robot agents
Contributing to social challenges

1.2. Robot agents

In this section we consider how robot agents accomplish different kinds of tasks.

... making popcorn

As an example of a task we consider a sequence of pick and place actions, e.g., making popcorn. To make popcorn, the robot needs to:

Pick up an empty pot and place it on the hotplate.
Turn the hot plate on.
Pick up the bowl with corn, put the corn into the empty pot, and place the bowl back on the table.
Pick up the lid and place it on the pot.
Shake the pot after some time so the popcorn doesn't burn.
Wait until the popcorn is done.
Pick up the warm pot, put the finished popcorn into a bowl, and place the pot on a cold area of the hotplate.

Fig. 1: Grasping a pot.

While all these actions are pick and place actions, they depend on their context and vary in how exactly they have to be executed. For instance, the bowl and the pot need to be grasped in different ways, and they need to be placed in different orientations and locations. The location and the state of each individual object have an impact on how the pick and place action has to be performed to be successful. For example, the robot can grasp a pot that is empty and cold in many ways, but when the popcorn is done and the pot is still hot, the robot needs to know how to grasp it by the handles.
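The dependence of a grasp on the state of the object can be sketched in a few lines. This is only an illustration under stated assumptions: the class `ObjectState`, the grasp names, and the temperature threshold `HOT_THRESHOLD_C` are hypothetical and chosen for the example.

```python
from dataclasses import dataclass

# Assumed threshold above which a surface counts as too hot to touch.
HOT_THRESHOLD_C = 50.0


@dataclass(frozen=True)
class ObjectState:
    name: str
    has_handles: bool
    temperature_c: float


def admissible_grasps(obj):
    """Return the grasp types that are acceptable for the object in its current state."""
    if obj.temperature_c >= HOT_THRESHOLD_C:
        # A hot object may only be grasped by its handles, if it has any.
        return ["handles"] if obj.has_handles else []
    # A cold object can be grasped in many ways.
    grasps = ["rim", "side"]
    if obj.has_handles:
        grasps.append("handles")
    return grasps


cold_pot = ObjectState("pot", has_handles=True, temperature_c=20.0)
hot_pot = ObjectState("pot", has_handles=True, temperature_c=180.0)

print(admissible_grasps(cold_pot))
print(admissible_grasps(hot_pot))
```

The same pick action therefore resolves to different motions at the beginning and at the end of the popcorn task, although the object is the same pot.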

A robot that is able to cope with context-dependent actions in a dynamic environment needs to be implemented as a so-called cognition-enabled agent. A cognition-enabled agent has the skills to decide which actions it has to perform and also how it should perform them in the given context. In addition, the agent is aware at any time of which action needs to be done to reach the goal and can also explain why it is performing those actions. To enable the robot to be cognitive, a complex software system is required that splits the activity into pick and place actions and generates the specific movements required for each of them. The system needs to handle a fast and reactive interplay of hardware and software components to let the robot cope with the context-dependent actions in a complex task such as popcorn making.

To make our discussions more tangible we will take a home assistance robot as our running example, which is supposed to prepare popcorn in a kitchen environment. In order to translate an instruction such as "make popcorn" into action, a robot has to infer missing information, including missing action steps and their appropriate order. To get a better intuition about the complexity and details of such manipulation tasks, consider the snapshots of the cooking activity depicted in Figure 2. The accompanying video shows a robot performing the complete popcorn preparation task.

Fig. 2: Action steps for popcorn making: (1) putting the cooking pot on the stove, (2) opening the drawer, (3) pouring the corn into the pot, (4) switching on the stove, (5) grasping the lid, (6) putting the lid on the pot, (7) distributing the corn evenly in the pot, (8) pouring the popcorn onto the plate, (9) salting the popcorn.

Competently making popcorn requires robots to solve a number of challenging reasoning tasks. The robot has to infer implicit goals and unmentioned subactions: in order to cook popcorn, the unpopped corn has to be put into a cooking pot, the pot has to be placed on a stove top, and the knob for the respective stove has to be turned in order to switch the stove on. The robot has to wait for the popcorn to be done, and the lid has to be removed from the pot before the popcorn can be poured onto a plate.

Other kinds of needed knowledge might be that a pot with a lid on top is heavier than one without it and that a filled pot should be held upright. When pouring, the pot should not be held too high above the plate to avoid spilling. Also, a hot pot should only be grasped by its handles. For grasping objects, the robot should choose the grasp type and position that is most reliable, efficient, and controllable. Additional task knowledge includes knowledge about where the popcorn is stored, that the salt grinder has to be turned upside down, and that the drawers open outwards.

... mastering household chores

... working in a retail store

... accomplishing tasks together with humans

... assisting in laboratories

... being ocean scientists

2. The challenge of realizing robot agents

The challenges of realizing cognition-enabled robot agents that are able to master complex tasks include:

perception
control
planning
decision making
reasoning
uncertainty
environment dynamics
... and many more

3. A glimpse at modeling robot agents

We begin our discourse about robot agency by introducing a framework of concepts that enables us to formalize the interaction between robots and the environment they act in, how the goals and objectives of robots can be stated and their achievement through robot actions measured, and how robots should select their course of action in order to maximize the impact of their actions.

Control system model of robot agents

Fig. 1: Top-level model of robot agents.

In our considerations we divide the robot agent and the environment in which it accomplishes its tasks into the controlling system and the controlled system. The controlling system is the control program that receives the data from the robot sensors and outputs motion specifications for the articulated robot body. The controlled system comprises the motors of the robot's joints, the robot body, the environment in which the robot is acting, and the sensor system that generates the sensor data the controlling system works with.
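The division into controlling and controlled system can be made concrete with a toy loop. This is a sketch under strong simplifying assumptions, not the model used in this book: the controlled system is reduced to a single position value, the sensor is noise-free, and the class names, the proportional rule, and the parameters `gain`, `steps`, and `dt` are all hypothetical.

```python
class ControlledSystem:
    """Motors, body, environment, and sensors, reduced to one position value (toy assumption)."""

    def __init__(self, position=0.0):
        self.position = position

    def apply(self, velocity, dt):
        # The motors execute the motion specification for a short time step.
        self.position += velocity * dt

    def sense(self):
        # The sensor system reports the current position to the controlling system.
        return self.position


class ControllingSystem:
    """Control program: receives sensor data and outputs motion specifications."""

    def __init__(self, goal, gain=1.0):
        self.goal = goal
        self.gain = gain

    def step(self, sensor_data):
        # Simple proportional rule: move faster the further away the goal is.
        return self.gain * (self.goal - sensor_data)


def run(controller, plant, steps=50, dt=0.1):
    """Alternate between sensing and acting, then return the final position."""
    for _ in range(steps):
        command = controller.step(plant.sense())
        plant.apply(command, dt)
    return plant.position


final_position = run(ControllingSystem(goal=1.0), ControlledSystem(position=0.0))
print(final_position)
```

The only coupling between the two classes is the pair of sensor data and motion specification, which mirrors the interface between the controlling and the controlled system described above.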