🤖 Basic Pick and Place - Kinematics and Spatial Algebra

The lecture covers foundational techniques for robot manipulation, especially pick-and-place tasks, with a focus on kinematics and spatial algebra.

Course Link

I. Introduction to Robotic Manipulation and the Pick and Place Problem

A core challenge in robotic manipulation is commanding a robot to move an object (e.g., a red brick) from an initial to a goal pose. This “pick and place” task requires geometric, kinematic, and control insights.

📌 Assumptions: Perception is deferred — the system assumes perfect knowledge of the object’s pose.

The Recipe:

Kinematic Frames and Spatial Algebra
Gripper Frame Plan
Forward Kinematics
Differential Inverse Kinematics

II. Monogram Notation and Spatial Algebra: The Language of Robotics

A. Key Concepts and Notation

Point: $( p^A )$ = position of point A
Relative Position:
$( {}^A p^C )$ = position of point C measured from A
Expressed-in Frame:
$( {}^A p^C_F )$ = position of C from A, expressed in frame F

“Expressed-in frame is an implementation detail” (Ch. 3)
World Frame $( W )$:
Uses vehicle coordinates (x-forward, y-left, z-up)
Body Frame $( B_i )$:
Attached to each body in simulation

Shorthands:

$( {}^F p^A \equiv {}^F p^A_F )$
$( p^A \equiv {}^W p^A_W )$
Rotation:
$( {}^B R^A )$ = orientation of frame A measured from B
Pose:
$( {}^B X^A )$ = pose of frame A measured from B

“Pose is a noun, transform is a verb” — they are equivalent.

Code Notation:

$( {}^B p^A_C \rightarrow \texttt{p_BA_C} )$
$( {}^B R^A \rightarrow \texttt{R_BA} )$
$( {}^B X^A \rightarrow \texttt{X_BA} )$

B. Rules of Spatial Algebra

Position Addition:
\[{}^A p^B_F + {}^B p^C_F = {}^A p^C_F\] \[{}^A p^B_F = - {}^B p^A_F\]
Frame Change:
\[{}^A p^B_G = {}^G R^F \cdot {}^A p^B_F\]
Rotation Composition:
\[{}^A R^B \cdot {}^B R^C = {}^A R^C\] \[\left[ {}^A R^B \right]^{-1} = {}^B R^A = \left( {}^A R^B \right)^T\]
Transform Composition:
\[{}^A X^B \cdot {}^B X^C = {}^A X^C\] \[\left[ {}^A X^B \right]^{-1} = {}^B X^A\]

C. Representations for 3D Rotation

Rotation Matrices: 3×3, orthonormal
Euler Angles (RPY): Order matters, susceptible to gimbal lock
Axis-Angle: Rotation vector
Unit Quaternions:
Represented by $( (w, x, y, z) )$, with:
\[w^2 + x^2 + y^2 + z^2 = 1\]

III. Gripper Frame Plan “Sketch” and Trajectories

Keyframes:

Sequence of gripper poses:

Initial
Pre-pick
Pick
Post-pick
Pre-place
Place
Post-place

Pre-Grasp Pose:

Safely positions gripper above object for simple vertical motion.

Grasp Specification:

Pose of object relative to gripper during grasp.

Example:

🎯 Grasp Pose Definition in Pick and Place Tasks

When determining the pose of an object relative to the gripper during a grasp, the object frame is aligned with the gripper frame through a specific combination of position and orientation to ensure a successful and stable grasp. This combined position and orientation is referred to as a spatial pose or transform.

For the pick and place task involving a red foam brick and a gripper, the desired pose of the object in the gripper frame — denoted as $( {}^{G_{\text{grasp}}}X^O )$ — is explicitly defined. This definition includes both translational and rotational components:

alt_text

📍 Position

The desired position (in meters) of the object in the gripper frame is:

\[{}^{G_{\text{grasp}}}p^O = \begin{bmatrix} 0 \\ 0.11 \\ 0 \end{bmatrix}\]

This value represents the “happy place for the center of mass of the brick” within the gripper’s coordinate system.

🔄 Orientation (Rotation)

The desired orientation of the object relative to the gripper, denoted as $( {}^{G_{\text{grasp}}}R^O )$, is achieved by applying two successive elementary rotations:

MakeXRotation(π/2) @ MakeZRotation(π/2)

This rotation scheme ensures:

The positive z-axis of the object aligns with the negative y-axis of the gripper frame.
The positive x-axis of the object aligns with the positive z-axis of the gripper frame.

The rotation uses the right-hand rule to determine the direction and result of each axis transformation.

✋ Pre-Grasp Pose

The pre-grasp pose — an intermediate step used to approach the object without collision — is defined to have the same orientation as the grasp pose. This

f MakeGripperFrames(X_WG, X_WO):
    """
    Takes a partial specification with X_G["initial"] and X_O["initial"] and
    X_0["goal"], and returns a X_G and times with all of the pick and place
    frames populated.
    """
    # Define (again) the gripper pose relative to the object when in grasp.
    p_GgraspO = [0, 0.11, 0]
    R_GgraspO = RotationMatrix.MakeXRotation(
    np.pi / 2.0
    ) @ RotationMatrix.MakeZRotation(np.pi / 2.0)
    X_GgraspO = RigidTransform(R_GgraspO, p_GgraspO)

Trajectory Generation

Timing:
Conservative speeds (e.g., 10 cm/s)
Position Interpolation:
Linear interpolation using PiecewisePolynomial
Rotation Interpolation:
Use SLERP (spherical linear interpolation), via PiecewiseQuaternionSlerp
Pose Trajectory:
Combined in PiecewisePose
Gripper Command Trajectory:
Controls open/close timing

📐 Why Rotations Cannot Be Linearly Interpolated

When dealing with trajectory rotations, it is crucial to understand that you cannot simply average rotation matrices or Euler angles to interpolate between orientations. This is because rotations do not behave linearly in the same way as positions.

Rotations Live on a Manifold:
Unlike positions, which exist in Euclidean space (ℝ³), 3D rotations reside on a mathematical manifold called the Special Orthogonal Group, denoted as $SO(3)$. In 2D, this is analogous to the unit circle $S^1$.
- If you take two rotation matrices and average them component-wise, the result is not guaranteed to be orthonormal, which is a requirement for a valid rotation matrix.
- Example in 2D:
  Consider two 2D rotations:
  - One at 0 degrees,
  - One at 90 degrees ($\pi/2$ radians).
    A straight average will point through the circle, not along it, resulting in an invalid or incorrect rotation.
Singularities in Representations:
Euler angles (e.g., roll-pitch-yaw) are a common way to represent 3D rotations, but they suffer from singularities.
- At certain angles (like pitch = $\pi/2$), different combinations of roll and yaw can produce the same orientation.
- This causes gimbal lock, a loss of one degree of freedom and potentially ambiguous interpolations.

✅ The Solution: Spherical Linear Interpolation (Slerp)

Slerp stands for Spherical Linear Interpolation. It operates on unit quaternions, which represent 3D rotations in a smooth and non-singular way.
Why Quaternions?
- Quaternions are 4D vectors $(w, x, y, z)$ satisfying: $w^2 + x^2 + y^2 + z^2 = 1$
- They do not suffer from gimbal lock, unlike Euler angles.
- However, there is a “double covering” issue: both $q$ and $-q$ represent the same rotation.
How Slerp Works:
- Given two unit quaternions $q_0$ and $q_1$, slerp interpolates along the shortest arc on the unit 4D sphere (hypersphere).
- It linearly interpolates rotation angle, not quaternion components.
- This keeps the interpolation on the manifold of valid rotations.
In Drake:
Drake provides a class called PiecewiseQuaternionSlerp to handle this exact case:
- You define keyframes as quaternions at specific times.
- The class generates a smooth interpolation between them.

🔄 Position + Orientation: Pose Interpolation

For interpolating full poses (position + orientation), Drake offers:

PiecewisePose: Combines both:
- Linear interpolation for position (via PiecewisePolynomial)
- Slerp for orientation (via PiecewiseQuaternionSlerp)

This approach ensures smooth, valid, and physically consistent motion of the robot end-effector across keyframes in pick-and-place or other manipulation tasks.

IV. Kinematics: Mapping Joint Angles to End-Effector Pose

A. Forward Kinematics

Defines mapping:

\[X^G = f_{\text{kin}}^G(q)\]

$( q )$: joint angles (configuration)
$( X^G )$: gripper pose

If you know the robot’s joint angles (q), forward kinematics tells you where the gripper is in 3D space.

Input: joint positions
Output: gripper’s pose (position + orientation)

This is typically done using:

Denavit–Hartenberg parameters (classical method)
Matrix multiplication of local transforms
Modern tools: KDL, MoveIt!, Drake, etc.

🔧 Joint Configuration: $( q )$

\[q \in \mathbb{R}^n\]
Where $( n )$ is the number of joints (e.g., revolute or prismatic).
Each element of $( q )$ corresponds to one degree of freedom (DoF).

Example:
For a 2-joint robot arm with revolute joints:

\[q = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1.0 \\ 2.0 \end{bmatrix}\]

This means joint 1 is at 1 radian, joint 2 is at 2 radians.

🤖 Gripper Pose: $( X^G )$

\[X^G \in \mathbb{R}^6 \quad \text{or} \quad SE(3)\]
Describes the pose of the end-effector (position + orientation)

If represented as a 6D vector:

\[X^G = \begin{bmatrix} x \\ y \\ z \\ \alpha \\ \beta \\ \gamma \end{bmatrix}\]

$( (x, y, z) )$: Position in 3D space
$( (\alpha, \beta, \gamma) )$: Orientation in Euler angles or axis-angle

If represented in SE(3):

\[X^G = T = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}\]

Where:

$( R \in SO(3) )$: 3×3 rotation matrix
$( t \in \mathbb{R}^3 )$: 3D translation vector

Example (in 6D vector form):

\[X^G = \begin{bmatrix} 0.5 \\ 0.2 \\ 0.1 \\ 0.0 \\ 1.57 \\ 0.0 \end{bmatrix}\]

→ The gripper is at position (0.5, 0.2, 0.1) with a pitch of 90° (1.57 rad).

Quantity	Symbol	Dimension	Description
Joint config	$( q )$	$( \mathbb{R}^n )$	Joint angles or positions
Gripper pose	$( X^G )$	$( \mathbb{R}^6 )$ or $( SE(3) )$	Position + Orientation in space

MultibodyPlant Structure:

Tree of bodies and joints
Each joint defines a transform dependent on joint angle

“Joints are constraints” — they reduce DoF added by bodies

Revolute Joint:
Removes 5 DoF, leaves 1 rotational DoF

B. The Kinematic Tree in Robotics

The kinematic tree is a fundamental concept in robotics, especially for multi-body systems. It organizes all the bodies in a simulated world into a tree topology for efficient computations.

Purpose and Structure

The MultibodyPlant in Drake organizes all bodies in the world into this tree topology.
Every body in the system, except for the world body, has a parent body.
Bodies connect to their parents via either a Joint or a floating base.
The world frame ($W$) serves as the canonical root for the tree—a fixed reference point in space.
Every body in the physics engine has a unique frame attached to it called a body frame.

Joints and Degrees of Freedom

Joints are transforms that depend on the joint angle $q$ and define how to go from one link’s coordinate frame to the next.
For example, a revolute joint defines a pure rotation between frames depending on $q$.
Important clarification:
Adding a joint does not add degrees of freedom (DoF).
Instead:
- Adding a body typically adds many DoF (e.g., 6 for a free-floating body).
- Joints act as constraints that remove DoF by specifying fixed or limited motion relationships between bodies.
- For example, welding a robot’s base to the world frame removes all floating DoF.
Robot description files (URDF, SDF) specify different joint types (revolute, prismatic, planar) with corresponding mathematical transforms as functions of $q$.

Role in Forward Kinematics

The kinematic tree is essential for forward kinematics:
Computing the pose (position and orientation) of any frame in the world (e.g., the gripper frame) as a function of joint positions $q$.
This is done by recursively composing spatial transforms through joint-dependent transforms along the chain.
Drake’s MultibodyPlant automates this spatial algebra, often with caching, efficiently returning body poses in the world.
Floating bodies (e.g., foam bricks, floating hands) also have DoF contributing to the full configuration vector $q$.
- A floating body’s pose is represented by 7 numbers:
  - 3 for position
  - 4 for orientation as a unit quaternion (to avoid singularities like gimbal lock).

Examples and Visualization

The KUKA iiwa robot is a serial manipulator, so its kinematic tree resembles a vine.
Dexterous robot hands have more complex branching tree structures.
Free objects like a foam brick appear as branches connected directly to the world root node.
Robot visualizations often display coordinate frames with colors:
- Red for X-axis
- Green for Y-axis
- Blue for Z-axis
  These frames illustrate transforms between bodies.

C. Degrees of Freedom in the Example System (Allegro Hand + Foam Brick)