🤖 Introduction to Robot Learning (CMU 16-831)

I. 📘 Course Overview and Core Concepts

The “16-831: Introduction to Robot Learning” course, taught by Professor Guanya Shi at Carnegie Mellon University (CMU), focuses on the fundamental principles and applications of robot learning.

Theme: “Learning to make sequential decisions in the physical world”


This concept is broken down into three components:

🔍 Learning

Emphasizes data-driven approaches and continuous improvement through data.
Contrasts with traditional methods that operate “W/o learning & data”, such as:
- “Search & planning”
- “Classic control”
- “Optimal control”

🔁 Sequential Decisions

“The current action/decision influences the next state and the next action.”
Distinguishes from non-sequential problems like:
- “Bandits”
- “Standard supervised learning”
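
The difference between the two settings can be sketched in a few lines. This is a toy illustration, not course material: the arm payoffs, the 1-D dynamics, and the names `bandit_round`, `sequential_step`, and `rollout` are all assumptions made for the example. In the bandit, each round is independent; in the sequential problem, the action changes the state that the next decision has to deal with.

```python
import random

def bandit_round(action, rng):
    """Non-sequential: the reward depends only on the current action."""
    win_prob = {0: 0.3, 1: 0.7}[action]    # hypothetical arm payoffs
    return 1.0 if rng.random() < win_prob else 0.0

def sequential_step(state, action):
    """Sequential: the action changes the state the next decision will face."""
    next_state = state + action             # toy 1-D dynamics, action in {-1, +1}
    reward = -abs(next_state)               # reward for staying near the origin
    return next_state, reward

def rollout(policy, start_state, horizon):
    """Run a policy for `horizon` steps; return visited states and total reward."""
    state, total, states = start_state, 0.0, [start_state]
    for _ in range(horizon):
        action = policy(state)
        state, reward = sequential_step(state, action)
        total += reward
        states.append(state)
    return states, total

def toward_origin(state):
    return -1 if state > 0 else 1

def always_right(state):
    return 1

states_good, return_good = rollout(toward_origin, start_state=3, horizon=5)
states_bad, return_bad = rollout(always_right, start_state=3, horizon=5)
```

With `always_right`, every step makes the next state worse, so early decisions compound over the horizon. Nothing comparable happens in `bandit_round`, where no state is carried between rounds.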

🌍 Physical World

Requires closed-loop interaction with the physical world, also called embodied intelligence.
Contrasts with virtual domains (e.g., “RL for games”, “LLMs”) that don’t involve physical interaction.

II. ⚠️ Uniqueness and Challenges of Robot Learning

Robot learning presents unique challenges compared with LLMs/GPTs and with deep reinforcement learning (DRL) in games such as AlphaGo.

A. 🧠 Contrasting with LLMs/GPTs

LLMs rely on:

Architecture: Transformer
Data: Web text, books, wikis
Loss: Next-token prediction
Optimization: SGD
Generation: Autoregressive
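
Two items of this recipe, the next-token prediction loss and autoregressive generation, can be illustrated with a toy example. This is only a sketch under loud assumptions: the "model" is a hand-written bigram logit table (`LOGITS`) over a made-up three-token vocabulary, not a Transformer, and no SGD training is shown.

```python
import math

# Hypothetical toy "model": fixed bigram logits over a 3-token vocabulary.
VOCAB = ["a", "b", "<eos>"]
LOGITS = {
    "a": [0.0, 2.0, 0.0],       # after "a", "b" is most likely
    "b": [0.0, 0.0, 2.0],       # after "b", "<eos>" is most likely
    "<eos>": [0.0, 0.0, 0.0],
}

def softmax(logits):
    m = max(logits)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def next_token_loss(tokens):
    """Average cross-entropy of each token given the previous token."""
    losses = []
    for prev, target in zip(tokens[:-1], tokens[1:]):
        probs = softmax(LOGITS[prev])
        losses.append(-math.log(probs[VOCAB.index(target)]))
    return sum(losses) / len(losses)

def generate(prompt, max_new_tokens=5):
    """Greedy autoregressive decoding: each output is fed back as the next input."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(LOGITS[tokens[-1]])
        nxt = VOCAB[probs.index(max(probs))]
        tokens.append(nxt)
        if nxt == "<eos>":
            break
    return tokens

loss_likely = next_token_loss(["a", "b", "<eos>"])
loss_unlikely = next_token_loss(["a", "a", "a"])
generated = generate(["a"])
```

The loss is low on sequences the table considers likely and high on unlikely ones. In a real LLM, SGD adjusts the model parameters to push this loss down on web-scale text, which is exactly the kind of data robotics lacks.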

➡️ Challenges in Robot Learning:

❓ Where does the data come from?
📦 How to use physical-world data?
Data collection is costly, slow, and task-specific.

B. 🎮 Contrasting with DRL in Games (e.g., AlphaGo, DQN)

Aspect	Games 🕹️	Robotics 🤖
Environment Dynamics	Known & static	Unknown & dynamic
Task Scope	One specific task	Many diverse tasks
Goal Definition	Clear (reward function)	Often unclear or implicit
Learning Mode	Offline suffices	Requires online adaptation
Action Speed	Less constrained	Real-time (e.g., 50Hz)
Failure Tolerance	Failures are tolerable	Physics doesn’t forgive 💥
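
To make the real-time row concrete: at 50 Hz the policy has 20 ms per tick to read the state, compute an action, and send it. A minimal fixed-rate loop might look like the sketch below. The callbacks `read_state`, `policy`, and `send_action` are hypothetical placeholders for the example, not a real robot API.

```python
import time

def run_control_loop(read_state, policy, send_action, rate_hz=50.0, n_steps=10):
    """Call the policy at a fixed rate and count missed deadlines."""
    period = 1.0 / rate_hz                   # 0.02 s per tick at 50 Hz
    overruns = 0
    next_deadline = time.monotonic() + period
    for _ in range(n_steps):
        state = read_state()
        send_action(policy(state))
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)            # wait out the rest of the tick
        else:
            overruns += 1                    # the policy was too slow for this tick
        next_deadline += period
    return overruns

# Usage with stand-in callbacks: a constant sensor reading and a proportional policy.
sent = []
start = time.monotonic()
overruns = run_control_loop(lambda: 0.5, lambda s: -s, sent.append,
                            rate_hz=50.0, n_steps=5)
elapsed = time.monotonic() - start
```

A policy that takes longer than one period shows up as an overrun here. On a physical robot, a late command can mean a fall or a collision, which is why inference speed is a hard constraint rather than a convenience.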

III. 🎯 Goal and Current State of Robot Learning

A. 🧠 Ultimate Goal: General-Purpose Embodied Intelligence

“Build general-purpose embodied intelligence by learning to make sequential decisions in the physical world.”
Vision: Robots that can do “thousands of tasks in thousands of environments”
Requires synergy in:
- 📊 Algorithm & Data
- 🧮 Computation
- 🦾 Hardware

B. 📉 Current Progress and Gaps

There has been progress in domain-specific intelligence, but the field is “still far from general-purpose embodied intelligence!”

C. ⚙️ Role of Learning vs. Non-Learning Methods

Examples of Non-Learning Success:

🚀 Apollo 11: Optimal + Robust Control
🦿 Boston Dynamics: Trajectory Optimization + MPC
🚜 Offroad Autonomy: Sampling-based MPC
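
For intuition about what sampling-based MPC does, here is a minimal random-shooting sketch. It is not the method used by any of the systems named above. The double-integrator `dynamics`, the quadratic `cost`, and all parameter values are assumptions chosen for the example. At every step the controller samples action sequences, rolls each one out through the model, applies the first action of the cheapest sequence, and then replans.

```python
import random

def dynamics(state, action, dt=0.1):
    """Toy double integrator: state = (position, velocity), action = acceleration."""
    pos, vel = state
    return (pos + vel * dt, vel + action * dt)

def cost(state, action):
    pos, vel = state
    return pos ** 2 + 0.1 * vel ** 2 + 0.01 * action ** 2

def sampling_mpc(state, rng, horizon=15, n_samples=200, max_accel=2.0):
    """Sample action sequences, roll them out, return the best first action."""
    best_cost, best_first = float("inf"), 0.0
    for _ in range(n_samples):
        actions = [rng.uniform(-max_accel, max_accel) for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            total += cost(s, a)
            s = dynamics(s, a)
        total += cost(s, 0.0)                # terminal cost on the final state
        if total < best_cost:
            best_cost, best_first = total, actions[0]
    return best_first

# Closed loop: replan at every step, apply only the first action.
rng = random.Random(0)
state = (1.0, 0.0)
for _ in range(100):
    action = sampling_mpc(state, rng)
    state = dynamics(state, action)
final_position = state[0]
```

Note that the controller depends entirely on `dynamics` being an accurate model, which is the first item on the list of reasons learning is needed.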

Why Learning is Needed:

📉 Modeling is hard
🔁 Tasks/environments change
🧠 Policy space may be limited
❌ Optimizer could be wrong
🤯 Assumptions may not hold

Learning-based pipelines are tightly integrated and adaptive, while traditional pipelines are modular and brittle.

IV. 📚 Course Structure and Topics


📌 Topics Overview:

Intro to Robot Learning (Lectures 1–2)
Machine/Deep Learning Refresher (Lectures 3–4)
Imitation Learning (Lectures 5–6)
Model-Free RL (Lectures 7–12)
- Q-Learning, Policy Gradient, etc.
Model-Based RL (Lectures 13–16)
Bandits & Exploration (Lectures 17–18)
Offline RL (Lecture 19)
Special Topics (Lectures 20, 22–24):
- Inverse RL
- Sim2Real
- Safe RL
- Multi-task & Adaptive RL
Challenges & Opportunities (Lecture 25)
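
As a preview of the model-free RL block, tabular Q-learning fits in a short sketch. This is an illustration under assumptions, not the course's assignment code: the five-state chain environment, the reward of 1 at the right end, and the hyperparameters are all made up for the example.

```python
import random

N_STATES = 5           # chain of states 0..4; reward only on reaching state 4
ACTIONS = (-1, +1)     # action index 0 moves left, index 1 moves right

def step(state, action_index):
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy exploration, breaking ties between equal values randomly.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[state])
                a = rng.choice([i for i in (0, 1) if q[state][i] == best])
            next_state, reward, done = step(state, a)
            # Q-learning update: bootstrap from the best next action (zero if terminal).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = q_learning()
greedy_policy = [max((0, 1), key=lambda i: q_table[s][i]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right in every non-terminal state, and the learned values decay by a factor of `gamma` per step away from the goal. The update uses only sampled transitions and never queries a model of `step`, which is what "model-free" refers to.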
