Getting Started With AI Planning

Open Table of Contents

Introduction
Theoretical Concepts
- Transition System
- Classical Planning
Practical Concepts
- PDDL
- Planner
Examples
- Blocks World
References

Introduction

Planning involves choosing actions ahead of time by predicting their likely outcomes. The skill of creating clear plans for problem-solving is seen as a key part of smart agents.

Intuitively, the importance of planning in intelligence becomes apparent. In his book “Thinking, Fast and Slow,” Nobel laureate Daniel Kahneman introduces the concept of two systems of thinking: System 1 and System 2. These systems delineate two distinct cognitive approaches individuals employ to make decisions and tackle problems. System 1, also referred to as fast thinking, functions automatically, utilizing intuition and heuristics for decision-making without conscious effort. In contrast, System 2, also referred to as slow thinking, involves a methodical and analytical reasoning process that necessitates conscious effort to process information. We can view planning as the System 2 within an artificial intelligence agent.

The aim of this blog post is to introduce the theory that enables reasoning about planning problems, illustrate the emergence of different types of planning problems, and demonstrate how to solve the simplest category of planning problems using Python.

Theoretical Concepts

A good understanding of the theoretical concepts of planning not only deepens comprehension of the tools employed for practical problem-solving but also also provides a framework for better understanding the various presentations and contexts in which planning problems manifest.

Transition System

Planning involves the task of predicting the potential outcomes of actions that an agent may wish to undertake. A transition system serves as a valuable model for illustrating the dynamics of a system as it undergoes state changes. By providing a framework to illustrate how actions influence the state of the world, a transition system serves as a valuable construct in this context.

A transition system is formally defined by a 3-tuple $\Sigma = (S, A, \gamma)$ , where:

The set $S = {s_1, \ldots, s_n}$ encompasses all possible states that $\Sigma$ can assume.
$A = {a_1, \ldots, a_k}$ denotes the set encompassing all possible actions that an agent could execute within this environment.
$\gamma: S \times A \to 2^S$ represents a transition function; when provided with a state and an action, it yields the state resulting from the execution of that specific action, starting from the given state.

To illustrate the concept, let’s consider the following scenario: We have three blocks on a table that can be stacked or unstacked. In this context, let’s define a transition system. The set of possible actions $A$ includes stacking one block on top of another or unstacking them. The set of possible states $S$ encompasses various configurations such as having all blocks on the table, one block on top of another, all three blocks stacked, and so forth. The transition function $\gamma$ informs you about the outcome when an action is applied to a particular state. For instance, if we start with three blocks on the table and perform the action of stacking one block on top of another, $\gamma$ would yield a new state where one block is now positioned on top of another.

Classical Planning

Until this point, we have exclusively explored the description of states and transitions resulting from actions. We will now introduce the classical planning problem, which arises from assuming the following properties of the transition system:

Deterministic: A deterministic transition system is one in which, given a state and a action, the transition leads to only one other state. More formally, for $\gamma(s, a)$ , it generates a single subsequent state, denoted as $s'$ .
Static: A static transition system is characterized by its reluctance to undergo transitions unless prompted by the execution of an action by the agent.
Fully Observable: The transition system is considered fully observable when the agent possesses complete knowledge of the state of $\Sigma$ .
Finite: A transition system is considered finite when it has a finite set $A$ .

A classical planning problem is defined by a 3-tuple $(\Sigma, s_i, s_g)$ , where $\Sigma$ represents a transition system with the aforementioned characteristics, $s_i$ is the initial state, and $s_g$ is a goal state. The goal in a classical planning problem is to determine a sequence of actions, denoted as $P = <a_1, \ldots, a_k>$ , such that when applied to the transition system $\Sigma$ according to the transition function $\gamma$ , the system transitions from the initial state $s_i$ to the goal state $s_g$ .

Again, to exemplify a classical planning problem, consider the example of blocks. In the initial state $s_i$ , two blocks are on a table, and the goal state $s_g$ is achieved when the first block is on top of the second block. A solution to this classical planning problem would be the plan $P$ to take block one and stack it on top of block two.

Practical Concepts

In this section, we will delve into writing planning problems in PDDL and solving them using Python. We will explore an example of a problem that adhere to the classical planning problem setting and solve it using these tools.

PDDL

PDDL, or Planning Domain Definition Language, is a domain-specific language designed to standardize the representation of planning problems. This language encodes problems using two files: the domain file and the problem file.

The domain file serves as a representation of the world. If we were to map this to our theory section, it would be akin to a description of the transition system $\Sigma$ . This file is composed of the following components:

Object Types: An object type is a category of objects that share common properties. Each object defined in the domain can have an object instance associated with it in the problem file. These types also aid in defining predicates and actions in the domain file.
Predicates: Predicates are logical expressions that represent properties or relationships between objects in a planning problem. They play a role in defining the initial and goal states of the planning problem, as well as expressing conditions and effects of actions.
Actions: Actions represent possible changes in the world state, outlining preconditions required for their execution and specifying effects that result from post-execution state changes.

Conversely, the problem file functions as a representation of the remaining planning problem components beyond the scope of the world model. The elements in the problem file include:

Object Instances: These instances delineate the objects present in the problem, identified by assigning them a name and associating them with a type from the defined object types within the domain. These instances are instrumental in defining both the initial state and the goal state.
Initial State: This configuration signifies the starting state of the world within a planning problem. It is articulated through the utilization of predicates derived from the domain file and the instantiation of object instances. This initial state adheres to the “world close” assumption, signifying that all predicates defined here are considered true, while any not explicitly specified are presumed to be false. This ensures that the initial state is both unique and fully specified.
Goal State: The goal state outlines the desired configuration of the world that the planner seeks to attain through the execution of actions. It is characterized by predicates that must be satisfied for the problem to be deemed successfully solved. Unlike the initial state, the goal state does not adhere to the closed world assumption. In this context, only explicitly defined predicates are considered true, while everything else remains undetermined as either true or false.

Planner

A planner is a software designed to take a specification of a PDDL problem and generate a plan that satisfies the given planning problem. There are several planner options available, and in this post, our emphasis will be on utilizing a planner that seamlessly integrates with other software. Consequently, I have opted for Pyperplan as the chosen planner due to its Python compatibility and straightforward interfaces, making it user-friendly. While Pyperplan may not match the speed of other planners such as FastDownward, the primary focus of this post is not centered around achieving optimal speed with the planner.

Examples

This section will provide an example of classical planning, elucidating how the concepts discussed in the preceding sections are applied in practice.

Blocks World

In the blocks world planning problem, various blocks are positioned on a table, and they have the potential to be stacked atop one another. The objective is to employ a robotic hand to reposition the blocks from an initial configuration to achieve a specified goal configuration.

PDDL Domain

Let’s start crafting the PDDL domain file for this planning problem. We’ll begin with defining the types, and in this domain, there is only one type: “block.”

(:types block)

Next, we can define the predicates necessary to describe the possible states of the world. The predicates for this domain are as follows:

on: This predicate involves two blocks, representing the situation where the first block is positioned on top of the second block.
ontable: This predicate involves a single block and signifies the state of the world where the block is positioned on top of the table.
clear: This predicate indicates that a block has no other block on top, allowing another block to be placed on it.
handempty: This predicate represents the state where the robotic hand is carrying nothing. This is crucial as the hand can only grab a block when it is empty.
holding: This predicate involves a block and describes the state when the hand is holding that specific block.

The following code implements these predicates in PDDL:

(:predicates (on ?x - block ?y - block)
             (ontable ?x - block)
             (clear ?x - block)
             (handempty)
             (holding ?x - block))

Next, we need to code the actions available in this environment. Let’s begin with the pick-up action, where the robotic hand grabs a block. This action makes the robotic arm grab a block, with the conditions that the block is clear, on the table, and the hand is empty. Upon execution of this action, the new state of the world will be one where the block is no longer on the table, the hand is no longer clear, and the hand is now holding the block. The following PDDL code represents this action:

(:action pick-up
     :parameters (?x - block)
     :precondition (and (clear ?x) (ontable ?x) (handempty))
     :effect
     (and (not (ontable ?x))
	      (not (clear ?x))
	      (not (handempty))
	      (holding ?x)))

A important point to observe in the code for the actions is that the effects and preconditions are articulated using the predicates previously defined.

The second action to be defined is the “put down” action, which allows the robotic arm to release a block it is currently holding onto the table. The precondition for this action is that the arm must be holding the block it intends to put down. The effects of this action include the hand no longer holding the block, rendering the hand clear, and the block is then positioned on the table.

(:action put-down
     :parameters (?x - block)
     :precondition (holding ?x)
     :effect
     (and (not (holding ?x))
          (clear ?x)
          (handempty)
          (ontable ?x)))

Another action that needs to be defined is the stack action, enabling the robotic hand to position one block on top of another. The precondition for this action is that the hand must be holding a block, and the hand itself should be clear. Upon executing this action, the effects include the hand no longer holding the block, making the hand clear; the block being placed on top becomes clear; the block beneath the one being positioned on top becomes not clear; the hand is then empty; and, lastly, the initial block is designated as positioned on top of the second one.

(:action stack
     :parameters (?x - block ?y - block)
     :precondition (and (holding ?x) (clear ?y))
     :effect
     (and (not (holding ?x))
       (not (clear ?y))
       (clear ?x)
       (handempty)
       (on ?x ?y)))

Finally, the unstack action must be defined. Preconditions for executing the unstack action include having one block positioned on top of another, the block to be unstacked being clear, and the hand being empty. Upon execution, the effects are as follows: the hand now holds the unstacked block, the block beneath the unstacked one is now clear, the hand is no longer empty, the unstacked block is no longer clear, and the blocks are no longer stacked atop each other.

(:action unstack
     :parameters (?x - block ?y - block)
     :precondition (and (on ?x ?y) (clear ?x) (handempty))
     :effect
     (and (holding ?x)
       (clear ?y)
       (not (clear ?x))
       (not (handempty))
       (not (on ?x ?y)))))

To bring it all together, a couple of small details need to be added. Firstly, assign a name to the domain; in this case, we’ll name it “BLOCKS.” Additionally, include the requirements, specifying the type of problems that can be described within this domain. Importing “strips” indicates that you are defining a classical planning problem.

(define (domain BLOCKS)
  (:requirements :strips :typing)
  (:types block)
  (:predicates (on ?x - block ?y - block)
	       (ontable ?x - block)
	       (clear ?x - block)
	       (handempty)
	       (holding ?x - block))

  (:action pick-up
	     :parameters (?x - block)
	     :precondition (and (clear ?x) (ontable ?x) (handempty))
	     :effect
	     (and (not (ontable ?x))
		      (not (clear ?x))
		      (not (handempty))
		      (holding ?x)))

  (:action put-down
	     :parameters (?x - block)
	     :precondition (holding ?x)
	     :effect
	     (and (not (holding ?x))
		      (clear ?x)
		      (handempty)
		      (ontable ?x)))

  (:action stack
	     :parameters (?x - block ?y - block)
	     :precondition (and (holding ?x) (clear ?y))
	     :effect
	     (and (not (holding ?x))
		      (not (clear ?y))
		      (clear ?x)
		      (handempty)
		      (on ?x ?y)))

  (:action unstack
	     :parameters (?x - block ?y - block)
	     :precondition (and (on ?x ?y) (clear ?x) (handempty))
	     :effect
	     (and (holding ?x)
		      (clear ?y)
		      (not (clear ?x))
		      (not (handempty))
		      (not (on ?x ?y)))))

PDDL Problem

Now, we can proceed to define the domain PDDL file. An important aspect to note is that multiple problem files can be associated with a single domain file. This file aligns with the 3-tuple outlined in the “classical planning” section of this blog, where $\Sigma$ represents the domain file together with the object instances, $s_i$ signifies the initial state defined in this file, and $s_g$ is represented by the goal defined within the same file.

(:objects D B A C - block)

Let’s set the initial state by using the conditions outlined in the PDDL domain file. We rely on the predicates specified in the file and the objects we’ve defined earlier. In this specific initial state, blocks $A, B, C, D$ are placed on the table and are not stacked on each other. Additionally, the robotic hand is currently empty.

(:init (clear C)
       (clear A)
       (clear B)
       (clear D)
       (ontable C)
       (ontable A)
       (ontable B)
       (ontable D)
       (handempty))

Let’s emphasize that the initial state follows the close world assumption. This implies that anything not explicitly mentioned in this state is considered false by default. For instance, if block A is on top of block B in the initial state, you don’t need to explicitly state that block B is not clear. By omitting it from the initial state, we understand that it is false.

Lastly, let’s establish the goal state. Our objective is to have block $D$ positioned on top of block $C$ , block $C$ on top of block $B$ , and block $B$ on top of block $A.

(:goal (and (on D C)
            (on C B)
            (on B A)))

This goal state operates under the open world assumption, implying that predicates not explicitly stated may be either true or false. Notably, there’s no need to specify clear predicates or on-table conditions in this context. Consequently, there isn’t a single definitive state marking the end of the plan; multiple states can satisfy the specified conditions. Consider the hypothetical scenario of adopting a closed-world assumption for the goal—what states could we achieve with this combination of predicates and assumption?

To put everything together into a single file, we need to include a problem name and incorporate the domain’s name into the file.

(define (problem BLOCKS-4-0)
  (:domain BLOCKS)
  (:objects D B A C - block)
  (:init (clear C)
         (clear A)
         (clear B)
         (clear D)
         (ontable C)
         (ontable A)
         (ontable B)
         (ontable D)
         (handempty))
  (:goal (and (on D C)
              (on C B)
              (on B A))))

Planning

We will utilize the pyperplan planner to solve the problem by providing it with both the domain PDDL file and the problem PDDL file.

from pyperplan import planner
from pyperplan.search import breadth_first_search

plan = planner.search_plan("./domain.pddl", "./problem.pddl", breadth_first_search, None)

print(plan)

The plan generated by the planner suggests that the gripper should pick up block $B$ , stack it on top of block $A$ , then pick up block $C$ and stack it on top of block $B$ , followed by picking up block $D$ and placing it on top of block $C$ .

[<Op (pick-up b)>,
 <Op (stack b a)>,
 <Op (pick-up c)>,
 <Op (stack c b)>,
 <Op (pick-up d)>,
 <Op (stack d c)>]

Pyperplan offers various search algorithms and heuristics that you can explore on your own. Given the simplicity of this domain and problem, I chose not to extensively tune these parameters, and note that BFS (Breadth-First Search) can be slow. If you plan to tackle more complex scenarios, I recommend exploring other search algorithms and heuristics for optimization.

References

Kahneman, D. (2011). Thinking, fast and slow. ISBN:9780141033570
Ghallab, M., Nau, D. S., & Traverso, P. (2004). Automated Planning: theory and practice. In HAL (Le Centre pour la Communication Scientifique Directe). https://hal.archives-ouvertes.fr/hal-01982019
Alkhazraji, Y., Frorath, M., Grützner, M., Helmert, M., Liebetraut, T., Mattmüller, R., … Wülfing, J. (2020). Pyperplan. doi:10.5281/zenodo.3700819