1
00:00:00,000 --> 00:00:02,490
The following content is
provided under a Creative

2
00:00:02,490 --> 00:00:03,940
Commons license.

3
00:00:03,940 --> 00:00:06,330
Your support will help
MIT OpenCourseWare

4
00:00:06,330 --> 00:00:10,660
continue to offer high quality
educational resources for free.

5
00:00:10,660 --> 00:00:13,320
To make a donation or
view additional materials

6
00:00:13,320 --> 00:00:17,160
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,160 --> 00:00:18,252
at ocw.mit.edu.

8
00:00:21,580 --> 00:00:24,970
PROFESSOR: OK, so we
added another tool,

9
00:00:24,970 --> 00:00:28,930
quickly I admit, last
time to our arsenal,

10
00:00:28,930 --> 00:00:34,660
saying that if you've
got a system where you're

11
00:00:34,660 --> 00:00:37,690
trying to do a
trajectory plan for

12
00:00:37,690 --> 00:00:41,350
and your trajectory
optimizers are failing you,

13
00:00:41,350 --> 00:00:44,560
because they're only guaranteed
to be local, locally good,

14
00:00:44,560 --> 00:00:48,250
then there's a class
of more global, more

15
00:00:48,250 --> 00:00:51,110
complete algorithms that are
guaranteed to find a solution,

16
00:00:51,110 --> 00:00:54,310
if it exists, based on
feasible motion planning.

17
00:00:54,310 --> 00:00:56,920
So we talked about RRTs mostly.

18
00:00:56,920 --> 00:01:00,250
I talked a little bit about
also the more discrete planning,

19
00:01:00,250 --> 00:01:02,800
A* and things like that.

20
00:01:02,800 --> 00:01:04,660
OK, so, so far, our
methods are still

21
00:01:04,660 --> 00:01:08,080
clumped in very distinct bins.

22
00:01:14,950 --> 00:01:17,680
We still have our value
iteration type methods,

23
00:01:17,680 --> 00:01:24,890
our dynamic programming
methods, which I love,

24
00:01:24,890 --> 00:01:30,610
which give us policies over
the entire state space, so

25
00:01:30,610 --> 00:01:31,420
global policies.

26
00:01:40,240 --> 00:01:42,280
But they're stuck by--

27
00:01:42,280 --> 00:01:46,780
they're cursed by the
curse of dimensionality.

28
00:01:46,780 --> 00:01:48,880
So it only works for
low dimensional systems.
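
To make the scaling concrete, here is a minimal sketch assuming value iteration runs on a uniform grid over the state space. The function name `grid_states` and the choice of 100 bins per dimension are illustrative assumptions, not something from the lecture.

```python
def grid_states(bins_per_dim: int, dims: int) -> int:
    """Number of cells in a uniform grid over a `dims`-dimensional state space."""
    return bins_per_dim ** dims


# The table that value iteration must sweep grows exponentially with dimension.
for dims in (2, 4, 6, 12):
    print(f"{dims:2d} dimensions -> {grid_states(100, dims):,} grid cells")
```

Each sweep of value iteration touches every cell (times every action), so the count above is a lower bound on the work per sweep.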

29
00:01:59,500 --> 00:02:02,290
OK, we've been
talking also about--

30
00:02:02,290 --> 00:02:04,828
and we've been talking about
policy search in general.

31
00:02:04,828 --> 00:02:06,370
And I'm going to,
later in the class,

32
00:02:06,370 --> 00:02:07,745
make a point that
that's not just

33
00:02:07,745 --> 00:02:09,280
about designing trajectories.

34
00:02:09,280 --> 00:02:10,150
I made it initially.

35
00:02:10,150 --> 00:02:12,100
We'll make it more
compelling later.

36
00:02:12,100 --> 00:02:14,500
But mostly what we've been
talking about other than that

37
00:02:14,500 --> 00:02:22,750
has been falling under the
class of trajectory planning

38
00:02:22,750 --> 00:02:24,345
and/or optimization.

39
00:02:34,050 --> 00:02:34,890
OK.

40
00:02:34,890 --> 00:02:40,620
And this is only locally
good but scales very nicely

41
00:02:40,620 --> 00:02:42,210
to higher dimensional systems.

42
00:02:54,830 --> 00:02:59,500
So you might ask, how
well does this scale?

43
00:02:59,500 --> 00:03:02,810
I don't really think
there's a good limit.

44
00:03:02,810 --> 00:03:07,060
I mean, it just depends on the
complexity of your problem.

45
00:03:07,060 --> 00:03:10,900
People have used RRTs very
effectively five years ago

46
00:03:10,900 --> 00:03:13,360
on 32-dimensional robots.

47
00:03:13,360 --> 00:03:15,960
That's pretty darn good, right?

48
00:03:15,960 --> 00:03:19,270
If I have a system where
the start and the goal

49
00:03:19,270 --> 00:03:24,460
can be easily found,
Alex says we can do it

50
00:03:24,460 --> 00:03:26,540
in thousands of dimensions.

51
00:03:26,540 --> 00:03:29,230
If I have a system where
the only hope of-- if I have

52
00:03:29,230 --> 00:03:31,555
a six-dimensional system
where the only hope of finding

53
00:03:31,555 --> 00:03:33,070
my way from the
start to the goal

54
00:03:33,070 --> 00:03:35,170
is by going through
this little channel,

55
00:03:35,170 --> 00:03:37,712
then I told you that's going to
fail, even in low dimensions.

56
00:03:37,712 --> 00:03:41,890
So it's a hard question
for me to specifically say

57
00:03:41,890 --> 00:03:46,480
what class of system should
you expect this to work for.

58
00:03:46,480 --> 00:03:48,640
But I think they're
the best tools

59
00:03:48,640 --> 00:03:50,470
we have for higher
dimensional systems, OK.

60
00:03:53,860 --> 00:03:57,070
So the big question I
want to address today

61
00:03:57,070 --> 00:04:00,490
is whether these ideas,
which seem very local--

62
00:04:00,490 --> 00:04:03,220
we were talking about
single trajectory planning--

63
00:04:03,220 --> 00:04:06,910
can be used to design a
feedback policy that's

64
00:04:06,910 --> 00:04:10,450
more broadly general, that's
valid over lots of areas

65
00:04:10,450 --> 00:04:12,610
of the state space, OK.

66
00:04:15,605 --> 00:04:17,230
Does that make sense,
what I'm saying?

67
00:04:17,230 --> 00:04:17,980
Yeah?

68
00:04:17,980 --> 00:04:20,050
I'm saying I could design
a single trajectory,

69
00:04:20,050 --> 00:04:23,530
but that's really only relevant
very close to the trajectory.

70
00:04:23,530 --> 00:04:28,870
So let's make our
favorite picture.

71
00:04:28,870 --> 00:04:31,150
Let's say I've designed
for the simple pendulum

72
00:04:31,150 --> 00:04:35,733
a nice trajectory, which goes
up and gets me to the goal.

73
00:04:35,733 --> 00:04:36,400
And that's good.

74
00:04:36,400 --> 00:04:39,790
If I start here, I know
exactly what to do.

75
00:04:39,790 --> 00:04:44,620
We talked about stabilizing
it with LTV LQR.

76
00:04:44,620 --> 00:04:48,770
So that means if I start here or
here, I'm in pretty good shape.

77
00:04:48,770 --> 00:04:51,520
If I'm smart enough to
index into the closest

78
00:04:51,520 --> 00:04:53,830
point on the trajectory, then
maybe even starting here,

79
00:04:53,830 --> 00:04:54,330
it's fine.

80
00:04:54,330 --> 00:04:57,130
I'll just execute the second
half of that trajectory.

81
00:04:57,130 --> 00:05:01,570
But what happens if I start over
here or if I start over here?

82
00:05:01,570 --> 00:05:06,478
Probably, the controller
based on the linearization

83
00:05:06,478 --> 00:05:08,770
is not going to have a lot
to say about the points that

84
00:05:08,770 --> 00:05:10,970
are far from my trajectory.

85
00:05:10,970 --> 00:05:14,440
So the goal today is to take
these methods that we've

86
00:05:14,440 --> 00:05:17,020
been pretty happy
with for designing

87
00:05:17,020 --> 00:05:19,510
trajectories, and even
stabilizing trajectories,

88
00:05:19,510 --> 00:05:22,690
and see if we can make them
useful throughout the state

89
00:05:22,690 --> 00:05:26,950
space, just to see how
well that can work.

90
00:05:26,950 --> 00:05:32,920
OK, there's a couple of
ideas that I want to get to,

91
00:05:32,920 --> 00:05:42,190
but first, I want to make
sure I say that there's

92
00:05:42,190 --> 00:05:43,390
no hope of getting--

93
00:05:43,390 --> 00:05:46,120
there's no magic bullet here.

94
00:05:46,120 --> 00:05:53,200
So there's no hope of me
finding global optimal policies,

95
00:05:53,200 --> 00:05:56,800
unless I'm willing to look
at every state/action pair.

96
00:05:56,800 --> 00:05:58,420
I'm not going to
tell you that I can

97
00:05:58,420 --> 00:06:01,247
use these trajectories
to just magically do

98
00:06:01,247 --> 00:06:03,080
what value iteration
did in high dimensions.

99
00:06:03,080 --> 00:06:05,290
That's not what I'm saying.

100
00:06:05,290 --> 00:06:07,270
Unless you have some
analytical insight which

101
00:06:07,270 --> 00:06:10,197
turns the problem into a linear
problem or something like that,

102
00:06:10,197 --> 00:06:12,280
I'm not saying that I'm
going to give you globally

103
00:06:12,280 --> 00:06:14,560
optimal policies.

104
00:06:14,560 --> 00:06:17,488
What I'm trying to say is we
can get good enough policies,

105
00:06:17,488 --> 00:06:19,030
potentially, using
these methods, OK.

106
00:06:19,030 --> 00:06:23,170
So I just want to make sure I
make the point that we really

107
00:06:23,170 --> 00:06:42,070
can't expect globally
optimal policies

108
00:06:42,070 --> 00:06:56,512
unless we explore every
state/action pair, or maybe

109
00:06:56,512 --> 00:06:57,970
if we have some
analytical insight.

110
00:07:11,492 --> 00:07:14,830
OK, so the curse of
dimensionality is real.

111
00:07:14,830 --> 00:07:17,080
It's not that some-- the
value iteration algorithm

112
00:07:17,080 --> 00:07:18,100
is a little quirky.

113
00:07:18,100 --> 00:07:19,850
It's got this problem
of dimensionality.

114
00:07:19,850 --> 00:07:21,630
It's really not that at all.

115
00:07:21,630 --> 00:07:23,380
It's not that somebody
hasn't just come up

116
00:07:23,380 --> 00:07:25,510
with the right algorithm.

117
00:07:25,510 --> 00:07:30,520
The problem is you can't know if
there's a better way unless you

118
00:07:30,520 --> 00:07:31,720
look at every possible way.

119
00:07:31,720 --> 00:07:32,810
That's the real problem.

120
00:07:32,810 --> 00:07:35,440
I mean, so it might
be that I want

121
00:07:35,440 --> 00:07:37,900
to find my way from the start
of the-- front of the room

122
00:07:37,900 --> 00:07:39,275
to the back of
the room, and I've

123
00:07:39,275 --> 00:07:41,168
got some cost function
which penalizes

124
00:07:41,168 --> 00:07:42,460
for the number of steps I take.

125
00:07:42,460 --> 00:07:44,500
But unless I go
down that third row,

126
00:07:44,500 --> 00:07:47,040
I didn't know that
there was actually a--

127
00:07:47,040 --> 00:07:48,790
see if I can say
something not ridiculous,

128
00:07:48,790 --> 00:07:50,200
but some pot of
gold or something

129
00:07:50,200 --> 00:07:51,640
in the middle of the third row.

130
00:07:51,640 --> 00:07:53,710
And I just didn't
see it, and I'm never

131
00:07:53,710 --> 00:07:56,440
going to see it unless
I go down the third row.

132
00:07:56,440 --> 00:07:59,140
So you really can't
get around that.

133
00:08:02,630 --> 00:08:04,540
So the goal is to really--

134
00:08:04,540 --> 00:08:08,500
maybe we can efficiently
get good enough policies.

135
00:08:08,500 --> 00:08:10,690
And I don't care about
optimality, per se.

136
00:08:10,690 --> 00:08:13,030
I've said that before.

137
00:08:13,030 --> 00:08:15,893
I just care about using
optimal control and the like

138
00:08:15,893 --> 00:08:17,935
to turn these things into
computational problems.

139
00:08:37,000 --> 00:08:37,500
OK?

140
00:08:41,730 --> 00:08:44,330
So there's a couple ideas
out there that are relevant.

141
00:08:55,700 --> 00:09:03,500
The first one sounds
a little silly,

142
00:09:03,500 --> 00:09:06,280
but it's increasingly plausible.

143
00:09:10,880 --> 00:09:13,700
Let's say my trajectory
optimizers or my planning

144
00:09:13,700 --> 00:09:16,640
algorithms got so
fast, or maybe just

145
00:09:16,640 --> 00:09:19,580
computers got so
fast that I didn't

146
00:09:19,580 --> 00:09:21,680
have to do any work
in the algorithms,

147
00:09:21,680 --> 00:09:24,080
that it takes me a
hundredth of a second

148
00:09:24,080 --> 00:09:28,850
to design a trajectory from
the start to the goal here.

149
00:09:28,850 --> 00:09:31,460
I've got a real time
execution task here.

150
00:09:31,460 --> 00:09:33,887
Every, let's say,
hundredth of a second,

151
00:09:33,887 --> 00:09:35,720
my control system's
asking me for a decision

152
00:09:35,720 --> 00:09:37,020
about what to do.

153
00:09:37,020 --> 00:09:39,620
But if I can plan fast
enough, and I find myself

154
00:09:39,620 --> 00:09:42,145
in this state, then you
could just plan again.

155
00:09:42,145 --> 00:09:44,270
You could really just,
every time you find yourself

156
00:09:44,270 --> 00:09:46,340
in a new state, plan
a trajectory that's

157
00:09:46,340 --> 00:09:48,800
going to get me to the goal.

158
00:09:48,800 --> 00:09:49,900
If I find myself--

159
00:09:49,900 --> 00:09:52,173
so if I'm executing this
trajectory and I get

160
00:09:52,173 --> 00:09:53,840
pushed off on a
disturbance, no problem.

161
00:09:53,840 --> 00:09:58,340
Every step, I'm just planning
a trajectory to the goal.

162
00:09:58,340 --> 00:09:59,278
If you can plan--

163
00:09:59,278 --> 00:10:01,070
if we teach the course
again in five years,

164
00:10:01,070 --> 00:10:02,278
maybe that's the only answer.

165
00:10:02,278 --> 00:10:02,823
I don't know.

166
00:10:02,823 --> 00:10:04,490
If you can plan fast
enough, that really

167
00:10:04,490 --> 00:10:05,570
is a beautiful answer.

168
00:10:08,570 --> 00:10:10,820
For the most part, the
problems we've looked at so far

169
00:10:10,820 --> 00:10:13,760
are not that easy that
you can plan that fast,

170
00:10:13,760 --> 00:10:16,270
but there's a middle ground.

171
00:10:16,270 --> 00:10:22,280
So this was basically
plan every dt.

172
00:10:37,070 --> 00:10:40,670
There's a middle ground
that people use today a lot.

173
00:10:40,670 --> 00:10:43,280
I mentioned it once before.

174
00:10:43,280 --> 00:10:47,030
But a lot of times what we do
to make real time-- to make

175
00:10:47,030 --> 00:10:50,100
the planning fast enough
to execute in real time

176
00:10:50,100 --> 00:10:52,100
is a lot of times we'll
do some sort of receding

177
00:10:52,100 --> 00:10:53,015
horizon problem.

178
00:11:09,073 --> 00:11:10,240
So how's that going to work?

179
00:11:13,420 --> 00:11:17,320
The simplest answer is,
for receding horizon,

180
00:11:17,320 --> 00:11:19,750
I've got some long-term
cost function,

181
00:11:19,750 --> 00:11:27,970
and I've got my total
cost function is from t

182
00:11:27,970 --> 00:11:36,615
equals 0 to some t
final of g of x, u dt.

183
00:11:36,615 --> 00:11:37,990
I could-- I did
it discrete time.

184
00:11:37,990 --> 00:11:39,190
That's fine.

185
00:11:39,190 --> 00:11:43,990
So n to capital N
for discrete time.

186
00:11:47,310 --> 00:11:50,820
And let's say it takes me too
long to plan N steps ahead,

187
00:11:50,820 --> 00:11:55,920
but I know I can plan three
steps ahead really fast.

188
00:11:55,920 --> 00:11:59,790
So a lot of times people will
actually approximate that

189
00:11:59,790 --> 00:12:05,430
with the problem of just looking
some finite receding horizon

190
00:12:05,430 --> 00:12:06,960
step ahead.

191
00:12:06,960 --> 00:12:08,940
And if you can--

192
00:12:08,940 --> 00:12:11,460
if you're doing it at
every ti-- if at time 2,

193
00:12:11,460 --> 00:12:13,350
you're asking for the
receding horizon plan,

194
00:12:13,350 --> 00:12:15,390
then you can just
look from time 2.

195
00:12:15,390 --> 00:12:25,320
So let's say my current time to
my current time plus 3, of g of x, u.

196
00:12:28,550 --> 00:12:32,780
That could be an arbitrarily bad
estimate of my long-term cost,

197
00:12:32,780 --> 00:12:34,350
of course.

198
00:12:34,350 --> 00:12:39,230
If you're clever enough to have
a guess at the long-term cost,

199
00:12:39,230 --> 00:12:42,080
then you can put in
some sort of estimate

200
00:12:42,080 --> 00:12:49,430
of what J of x from t plus 3 might
be, and that's going to help.
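
The receding-horizon idea can be sketched as follows, under loudly stated assumptions: a toy integer-position system, a brute-force search over a small discrete action set in place of a real trajectory optimizer, and a made-up estimate of the long-term cost. All names here (`step`, `stage_cost`, `cost_to_go_guess`, `ACTIONS`) are hypothetical. The point is only the structure: sum the stage cost g(x, u) over a short horizon, add a guess at J from the state reached at the end of the horizon, apply the first action, and replan at the next step.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # hypothetical discrete action set


def step(x, u):
    """Toy dynamics (an assumption): integer position on a line."""
    return x + u


def stage_cost(x, u):
    """g(x, u): penalize distance from the goal at 0, plus control effort."""
    return abs(x) + 0.1 * abs(u)


def cost_to_go_guess(x):
    """Rough estimate of the long-term cost J from state x (an assumption)."""
    return 5.0 * abs(x)


def receding_horizon_action(x, horizon=3):
    """Search every `horizon`-step plan and return the first action of the best one."""
    best_cost, best_first = float("inf"), None
    for plan in product(ACTIONS, repeat=horizon):
        state, cost = x, 0.0
        for u in plan:
            cost += stage_cost(state, u)
            state = step(state, u)
        cost += cost_to_go_guess(state)  # tail estimate beyond the horizon
        if cost < best_cost:
            best_cost, best_first = cost, plan[0]
    return best_first


def simulate(x0, n_steps):
    """Replan at every time step and apply only the first planned action."""
    x = x0
    for _ in range(n_steps):
        x = step(x, receding_horizon_action(x))
    return x


print(simulate(5, 10))
```

Without the `cost_to_go_guess` term, the three-step sum alone can be an arbitrarily bad stand-in for the long-term cost, which is the caveat raised above.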

201
00:12:58,700 --> 00:13:02,020
So for instance, let's
say I find myself

202
00:13:02,020 --> 00:13:06,250
off my trajectory
somewhere over here,

203
00:13:06,250 --> 00:13:08,980
and I'm willing to say
my planner's fast enough.

204
00:13:08,980 --> 00:13:10,830
My controller's
running at 100 hertz.

205
00:13:10,830 --> 00:13:12,550
And in a hundredth
of a second, I

206
00:13:12,550 --> 00:13:16,480
can about solve
an optimal control

207
00:13:16,480 --> 00:13:19,600
problem that's of half a
second in duration, let's say.

208
00:13:19,600 --> 00:13:21,490
That's a reasonable thing.

209
00:13:21,490 --> 00:13:24,303
Half a second along puts
me-- it would put me here.

210
00:13:24,303 --> 00:13:25,720
So let's say I'm
going to design--

211
00:13:25,720 --> 00:13:27,262
I'm going to use my
planner to design

212
00:13:27,262 --> 00:13:31,308
a trajectory that gets me
back to this in half a second.

213
00:13:31,308 --> 00:13:33,100
And then I use my cost-to-go
that I already

214
00:13:33,100 --> 00:13:36,070
knew from this design
to get me to the goal.

215
00:13:36,070 --> 00:13:40,690
That's one way to implement
what I just said, OK.

216
00:13:40,690 --> 00:13:46,006
And it's not just talk.

217
00:13:46,006 --> 00:13:49,640
I can show you a
good example of it.

218
00:13:49,640 --> 00:13:51,790
So I showed you guys
this once before,

219
00:13:51,790 --> 00:13:54,260
but let's just look
at it again quickly.

220
00:13:58,340 --> 00:14:02,840
This is Pieter Abbeel's
and Andrew Ng's work

221
00:14:02,840 --> 00:14:05,300
on the autonomous
helicopters, OK.

222
00:14:05,300 --> 00:14:11,330
So they execute these
comically cool trajectories

223
00:14:11,330 --> 00:14:13,940
with their helicopter.

224
00:14:13,940 --> 00:14:16,910
The way they do it is actually,
they get a desired trajectory

225
00:14:16,910 --> 00:14:18,920
from a human pilot,
and then they

226
00:14:18,920 --> 00:14:20,660
stabilize that in real time.

227
00:14:23,420 --> 00:14:27,537
They do-- he calls it
DDP, but it's actually

228
00:14:27,537 --> 00:14:29,120
what we've been
calling iterative LQR.

229
00:14:29,120 --> 00:14:32,300
I told you that a lot of people
blur the lines, unfortunately,

230
00:14:32,300 --> 00:14:33,590
between those two.

231
00:14:33,590 --> 00:14:36,213
So they do an iterative
LQR controller design,

232
00:14:36,213 --> 00:14:38,630
and they decided that it's
fast enough that they can do it

233
00:14:38,630 --> 00:14:41,010
three seconds into the future.

234
00:14:41,010 --> 00:14:43,760
So they're doing exactly
these receding-- every dt

235
00:14:43,760 --> 00:14:46,760
for that control
for that helicopter,

236
00:14:46,760 --> 00:14:50,150
they're doing
iterative LQR to design

237
00:14:50,150 --> 00:14:51,950
a trajectory that's
going to get me back

238
00:14:51,950 --> 00:14:55,910
to my pilot's trajectory.

239
00:14:55,910 --> 00:14:58,650
And they're running it every dt,
thinking three seconds ahead,

240
00:14:58,650 --> 00:15:02,100
and they say, that's
comparable to the time of--

241
00:15:02,100 --> 00:15:04,970
the dynamics of instability
for their helicopter.

242
00:15:04,970 --> 00:15:06,630
Yeah.

243
00:15:06,630 --> 00:15:09,690
Put it all together, and
you get this thing tracking

244
00:15:09,690 --> 00:15:12,270
pretty cool trajectories, OK.

245
00:15:12,270 --> 00:15:14,097
It took a lot of
good engineering

246
00:15:14,097 --> 00:15:15,930
behind that, too, of
getting the model right

247
00:15:15,930 --> 00:15:17,800
and getting the
helicopter right,

248
00:15:17,800 --> 00:15:20,217
but it's pretty impressive.

249
00:15:37,270 --> 00:15:40,628
OK, so if you can plan fast
enough-- and like I said,

250
00:15:40,628 --> 00:15:42,920
in a few years, maybe the
planning algorithms are going

251
00:15:42,920 --> 00:15:43,582
to be--

252
00:15:43,582 --> 00:15:45,290
and the computers are
going to be so fast

253
00:15:45,290 --> 00:15:46,100
and the planning
algorithms are going

254
00:15:46,100 --> 00:15:47,990
to be so fast that we never
do value iteration anymore,

255
00:15:47,990 --> 00:15:49,080
but I kind of doubt it.

256
00:15:49,080 --> 00:15:50,600
I think that
there's always going

257
00:15:50,600 --> 00:15:55,580
to be reasons to do
more global methods.

258
00:15:55,580 --> 00:15:57,830
If you can plan fast
enough, even a little bit

259
00:15:57,830 --> 00:16:00,620
into the future, that
it might be good enough

260
00:16:00,620 --> 00:16:02,570
to just turn your
planner immediately

261
00:16:02,570 --> 00:16:05,640
into a feedback policy.

262
00:16:05,640 --> 00:16:06,140
OK.

263
00:16:09,560 --> 00:16:11,580
We don't do that so
much in my group.

264
00:16:11,580 --> 00:16:14,550
I think it's a good
idea, and it makes sense.

265
00:16:14,550 --> 00:16:16,610
But I do think there's a
lot of other good ideas

266
00:16:16,610 --> 00:16:23,270
out there on how to turn
your planners into policies.

267
00:16:23,270 --> 00:16:44,540
OK, the next one is
multi-query planning.

268
00:16:44,540 --> 00:16:46,600
Anybody know what
I mean by that?

269
00:16:51,560 --> 00:16:54,540
AUDIENCE: [INAUDIBLE]

270
00:16:54,540 --> 00:16:57,540
PROFESSOR: No, that's
not what I mean.

271
00:16:57,540 --> 00:17:00,300
You can imagine doing something
like that and it meaning this,

272
00:17:00,300 --> 00:17:00,800
but--

273
00:17:04,500 --> 00:17:06,750
so I spent relatively
little time on the RRTs,

274
00:17:06,750 --> 00:17:08,849
but actually, it's one
of the tools we think

275
00:17:08,849 --> 00:17:10,500
a lot about in my group now.

276
00:17:10,500 --> 00:17:13,530
It's actually-- the only reason
I spend little time on it

277
00:17:13,530 --> 00:17:17,605
is I think that seeing the
big idea in class is enough,

278
00:17:17,605 --> 00:17:19,230
that the ideas are
so simple, that when

279
00:17:19,230 --> 00:17:20,980
you do your problem
set and make it work,

280
00:17:20,980 --> 00:17:23,450
that's the best way for
you to learn about it, OK.

281
00:17:23,450 --> 00:17:30,450
So it's such a simple idea,
and it just works very well.

282
00:17:30,450 --> 00:17:33,000
OK, so let's say we've got
these RRTs that we like,
these ROTs that we like,

283
00:17:33,000 --> 00:17:34,200
we know and love.

284
00:17:34,200 --> 00:17:35,940
And for the pendulum,
I showed you

285
00:17:35,940 --> 00:17:40,560
a plot of the RRT trying to
find its way to the goal.

286
00:17:40,560 --> 00:17:43,955
It started splintering
off lots of--

287
00:17:43,955 --> 00:17:46,290
eventually, it'll find
some trajectory that

288
00:17:46,290 --> 00:17:48,330
will find its way there,
but along the way,

289
00:17:48,330 --> 00:17:50,490
it's generated
lots of trees that

290
00:17:50,490 --> 00:17:53,550
do random things and
lots of paths that

291
00:17:53,550 --> 00:17:54,885
didn't turn out to be useful.

292
00:18:00,010 --> 00:18:04,650
And what you have is a web.

293
00:18:04,650 --> 00:18:05,985
In this case, it's a tree.

294
00:18:05,985 --> 00:18:08,910
If you run the RRT
once, you have a tree

295
00:18:08,910 --> 00:18:11,160
of feasible trajectories
that you could

296
00:18:11,160 --> 00:18:14,850
execute on the real robot.

297
00:18:14,850 --> 00:18:17,160
It happens that one of
them got me from the start

298
00:18:17,160 --> 00:18:21,090
to the goal in my initial
problem formulation.

299
00:18:21,090 --> 00:18:24,990
OK, but instead of throwing all
that computation out and just

300
00:18:24,990 --> 00:18:29,580
keeping the nominal trajectory,
I might as well store it.

301
00:18:29,580 --> 00:18:31,650
If I get a new
problem, which is,

302
00:18:31,650 --> 00:18:35,010
let's say I wanted to start
from here, like I said

303
00:18:35,010 --> 00:18:38,010
and get to the
goal, then really,

304
00:18:38,010 --> 00:18:41,460
all I need to do in my new
time, in my new planning problem

305
00:18:41,460 --> 00:18:44,040
is connect back to
my old solution.

306
00:18:44,040 --> 00:18:46,650
If I can find a new plan
that gets me back here,

307
00:18:46,650 --> 00:18:51,930
then I can just ride the rest
of the solution into the goal.

308
00:18:51,930 --> 00:18:54,750
Simultaneously, if
someone were to tell me

309
00:18:54,750 --> 00:18:56,220
I want to get to
a different goal--

310
00:18:56,220 --> 00:18:58,560
let's say I want to get
the system to the upright

311
00:18:58,560 --> 00:18:59,910
with some velocity--

312
00:18:59,910 --> 00:19:02,310
all I really need
to do is find a way

313
00:19:02,310 --> 00:19:05,220
to connect from my old
plan to the new goal.

314
00:19:08,830 --> 00:19:12,100
So as you design
these, the first start

315
00:19:12,100 --> 00:19:13,637
to the goal planning
problem, where

316
00:19:13,637 --> 00:19:15,220
you're designing
trees that try to get

317
00:19:15,220 --> 00:19:17,230
as-- cover all over
the place, could

318
00:19:17,230 --> 00:19:19,590
be potentially very painful.

319
00:19:19,590 --> 00:19:23,650
It might take a long time
to find your way from start

320
00:19:23,650 --> 00:19:24,428
to goal.

321
00:19:24,428 --> 00:19:26,470
But if I want to solve a
new problem which is not

322
00:19:26,470 --> 00:19:29,410
so different, then it could
be actually very efficient

323
00:19:29,410 --> 00:19:32,890
to reuse your old
computation and do

324
00:19:32,890 --> 00:19:39,700
a multi-query-- this is a
multi-query planning idea, OK.

325
00:19:39,700 --> 00:19:42,220
And I think that idea is so
good that it's actually--

326
00:19:46,165 --> 00:19:50,580
if you do this again
and again, you're

327
00:19:50,580 --> 00:19:55,860
going to slowly end up with this
web of feasible trajectories

328
00:19:55,860 --> 00:19:58,553
that you could execute.

329
00:19:58,553 --> 00:19:59,595
People call it a roadmap.

330
00:20:03,330 --> 00:20:06,390
When you have some network,
some graph of these feasible

331
00:20:06,390 --> 00:20:08,985
trajectories, people
call it a roadmap.

332
00:20:22,475 --> 00:20:24,350
If all you care about
is getting to the goal,

333
00:20:24,350 --> 00:20:26,475
then all you need to do is
connect to your existing

334
00:20:26,475 --> 00:20:28,727
roadmap and ride
it to the goal.

335
00:20:28,727 --> 00:20:31,310
If your roadmap is so rich that
once I connect to the roadmap,

336
00:20:31,310 --> 00:20:33,227
there's actually a bunch
of different options,

337
00:20:33,227 --> 00:20:35,810
bunch of different paths I could
take through the graph to get

338
00:20:35,810 --> 00:20:38,210
there, well, then at least
you've got a discrete planning

339
00:20:38,210 --> 00:20:42,470
problem, and you can do A*
on it or something like this.

340
00:20:42,470 --> 00:20:46,430
And effectively,
the trajectories

341
00:20:46,430 --> 00:20:50,455
I've already generated
will turn this back

342
00:20:50,455 --> 00:20:51,830
into a discrete
planning problem.

343
00:21:34,620 --> 00:21:37,050
That idea is so good
that some people

344
00:21:37,050 --> 00:21:40,500
believe it's the only
thing you need to do, OK.

345
00:21:40,500 --> 00:21:42,870
There's a camp out there
that does these probabilistic

346
00:21:42,870 --> 00:21:44,370
roadmaps that's--

347
00:21:44,370 --> 00:21:46,770
Jean-Claude Latombe,
I think, is the head

348
00:21:46,770 --> 00:21:51,060
of the camp,
started these ideas.

349
00:21:51,060 --> 00:21:54,600
And they believe that you should
address a complicated motion

350
00:21:54,600 --> 00:21:57,525
planning problem in two steps.

351
00:22:03,380 --> 00:22:09,128
First you'll construct some
dense enough graph, a roadmap,

352
00:22:09,128 --> 00:22:10,170
I guess I should call it.

353
00:22:16,950 --> 00:22:23,310
And then once you've got it, you
just do your query phase, OK.

354
00:22:23,310 --> 00:22:27,270
So let's think about that
in a configuration space.

355
00:22:27,270 --> 00:22:37,840
I've got a bunch
of obstacles, and I

356
00:22:37,840 --> 00:22:42,480
want to get myself from
some start to some goal.

357
00:22:42,480 --> 00:22:44,230
All right, if I know
I'm going to be doing

358
00:22:44,230 --> 00:22:46,900
a lot of these things,
then it actually

359
00:22:46,900 --> 00:22:50,290
makes a lot of sense for
me to go ahead and build

360
00:22:50,290 --> 00:22:52,660
a pretty good graph.

361
00:22:52,660 --> 00:22:55,480
So before I even start to
solve the first problem,

362
00:22:55,480 --> 00:22:57,550
let's just drop in a
lot of random samples

363
00:22:57,550 --> 00:23:04,810
throughout the space, choose
uniformly, OK, at the space.

364
00:23:04,810 --> 00:23:12,340
Every time I add a point in
the configuration space world,

365
00:23:12,340 --> 00:23:15,490
they try to connect that
new point to the N

366
00:23:15,490 --> 00:23:18,400
closest points with
simple strategies.

367
00:23:31,960 --> 00:23:34,020
So I'll pick a point
at random, I'll

368
00:23:34,020 --> 00:23:36,420
try to find the guys
that are close to it,

369
00:23:36,420 --> 00:23:40,200
and I'll connect with it.

370
00:23:40,200 --> 00:23:41,637
Pick a new point at random.

371
00:23:41,637 --> 00:23:43,470
Oh, there's really only
one guy close to it.

372
00:23:43,470 --> 00:23:45,240
I'll connect to it.

373
00:23:45,240 --> 00:23:46,300
Pick another point.

374
00:23:46,300 --> 00:23:48,150
Maybe these guys are connected.

375
00:23:48,150 --> 00:23:49,550
And that's it.

376
00:23:49,550 --> 00:23:52,050
And if I do it
enough, and then I

377
00:23:52,050 --> 00:23:53,950
come up with a
pretty good roadmap--

378
00:23:53,950 --> 00:23:59,310
maybe this guy was the one
that connects to everybody--

379
00:23:59,310 --> 00:24:03,360
that when the query
phase comes along, again,

380
00:24:03,360 --> 00:24:07,710
all you need to do is
connect to your roadmap.

381
00:24:07,710 --> 00:24:09,600
I got a new query.

382
00:24:09,600 --> 00:24:11,340
I just connect to my roadmap.

383
00:24:11,340 --> 00:24:14,370
I do whatever my discrete
searching problem may--

384
00:24:14,370 --> 00:24:16,920
A* or whatever to
find a path from the start

385
00:24:16,920 --> 00:24:18,360
to the goal, OK.

386
00:24:22,628 --> 00:24:24,420
I actually think it's
a very beautiful idea

387
00:24:24,420 --> 00:24:29,730
to have this web of possible
trajectories covering the state

388
00:24:29,730 --> 00:24:32,110
space.

389
00:24:32,110 --> 00:24:34,050
And then all it takes
at execution time

390
00:24:34,050 --> 00:24:39,630
is connecting and then
executing your trajectory.

391
00:24:39,630 --> 00:24:41,310
Now the probabilistic
roadmaps, again,

392
00:24:41,310 --> 00:24:46,500
this step of connecting
nearby points

393
00:24:46,500 --> 00:24:49,320
in underactuated
systems might be hard.

394
00:24:49,320 --> 00:24:51,390
Might be as hard
as finding the path

395
00:24:51,390 --> 00:24:53,500
from the start to the goal.

396
00:24:53,500 --> 00:24:55,110
So maybe what you
do here is actually

397
00:24:55,110 --> 00:24:58,020
do a dircol or
something to find that path,

398
00:24:58,020 --> 00:25:00,300
or you do an RRT, or one
of any of the other methods

399
00:25:00,300 --> 00:25:03,940
we've done to make these
initial connections.

400
00:25:03,940 --> 00:25:05,970
And maybe to make them
feasible to execute,

401
00:25:05,970 --> 00:25:11,250
you've got to do some trajectory
stabilization to get on that.

402
00:25:11,250 --> 00:25:15,160
But if you can solve some
local planning problems,

403
00:25:15,160 --> 00:25:17,520
then you can use these big
roadmap ideas to maybe do

404
00:25:17,520 --> 00:25:20,490
more global behaviors, OK.

405
00:25:20,490 --> 00:25:24,720
So again, I think multi-query
planning is a nice way

406
00:25:24,720 --> 00:25:30,420
to go from local policies to
more globally valid policies.

407
00:25:30,420 --> 00:25:32,367
Yeah.

408
00:25:32,367 --> 00:25:34,200
AUDIENCE: I can see
that working pretty well

409
00:25:34,200 --> 00:25:36,043
with a static obstacle field.

410
00:25:36,043 --> 00:25:36,710
PROFESSOR: Good.

411
00:25:36,710 --> 00:25:38,320
AUDIENCE: Could it move
[? moving ?] obstacles,

412
00:25:38,320 --> 00:25:38,995
and might--

413
00:25:38,995 --> 00:25:42,200
the roadmap might change?

414
00:25:42,200 --> 00:25:44,450
PROFESSOR: Well, I
don't really know

415
00:25:44,450 --> 00:25:47,100
what the proponents would say.

416
00:25:47,100 --> 00:25:52,940
But if you know where
the obstacles are, then--

417
00:25:52,940 --> 00:25:55,130
or if you even sense
where the obstacles are

418
00:25:55,130 --> 00:25:57,470
going to be in a
receding horizon quickly,

419
00:25:57,470 --> 00:25:59,180
then you could--

420
00:25:59,180 --> 00:26:01,898
maybe this one's blocked, and
I can just take another path.

421
00:26:01,898 --> 00:26:03,440
But if I have a rich
enough roadmap,

422
00:26:03,440 --> 00:26:05,035
hopefully you can
get around that.

423
00:26:05,035 --> 00:26:06,410
And the other
thing is, if I have

424
00:26:06,410 --> 00:26:08,250
a model of how those
obstacles are changing,

425
00:26:08,250 --> 00:26:12,650
then naively, that just adds one
dimension in time, let's say,

426
00:26:12,650 --> 00:26:15,610
to my plan, and I just have to
do a higher dimensional plan.

427
00:26:15,610 --> 00:26:17,360
But I think the case
you're thinking about

428
00:26:17,360 --> 00:26:19,360
is if these things are
just moving on their own.

429
00:26:19,360 --> 00:26:20,690
I don't have any good model.

430
00:26:20,690 --> 00:26:22,880
I suddenly find
that I'm obstructed.

431
00:26:22,880 --> 00:26:26,540
Then, again, you could
dynamically replan

432
00:26:26,540 --> 00:26:27,622
or you could--

433
00:26:27,622 --> 00:26:30,080
by either taking a different
path here or making a new edge

434
00:26:30,080 --> 00:26:31,430
if you had to.

435
00:26:31,430 --> 00:26:36,257
I don't think it breaks
the fundamental goal.

436
00:26:36,257 --> 00:26:38,840
You could almost think of this
as having-- in a dynamic sense,

437
00:26:38,840 --> 00:26:40,460
you could almost think
of this as having

438
00:26:40,460 --> 00:26:42,920
a bunch of repertoires, a bunch
of things I know how to do.

439
00:26:42,920 --> 00:26:45,110
So maybe if it's
a walking robot,

440
00:26:45,110 --> 00:26:47,120
maybe I know how to
take a step here.

441
00:26:47,120 --> 00:26:48,230
That's one of my edges.

442
00:26:48,230 --> 00:26:49,040
I know how to execute that.

443
00:26:49,040 --> 00:26:50,290
I know how to take a big step.

444
00:26:50,290 --> 00:26:51,680
I know how to take a small step.

445
00:26:51,680 --> 00:26:56,150
It's a repertoire of local
skills, local trajectories

446
00:26:56,150 --> 00:26:57,380
in this case.

447
00:26:57,380 --> 00:27:01,200
Then I just got to stitch them
together in the right way.

448
00:27:01,200 --> 00:27:03,950
So that's a fairly
robust thing, even if--

449
00:27:03,950 --> 00:27:05,757
yeah.

450
00:27:05,757 --> 00:27:07,590
AUDIENCE: Given a rich
enough roadmap, would

451
00:27:07,590 --> 00:27:09,800
you have problems finding--

452
00:27:09,800 --> 00:27:13,856
choosing the best path
among those path nodes,

453
00:27:13,856 --> 00:27:15,175
like discrete search?

454
00:27:15,175 --> 00:27:16,550
PROFESSOR: The
discrete search, I

455
00:27:16,550 --> 00:27:20,120
think in general, you should
think of as being unlimitedly--

456
00:27:20,120 --> 00:27:21,050
basically unlimited.

457
00:27:21,050 --> 00:27:23,660
I mean, compared to all these
continuous time methods,

458
00:27:23,660 --> 00:27:25,610
it's very, very efficient.

459
00:27:25,610 --> 00:27:28,445
People do it on huge
collections of nodes very

460
00:27:28,445 --> 00:27:30,320
efficiently, especially
if you can do A*,

461
00:27:30,320 --> 00:27:32,090
if you find a good heuristic.

462
00:27:32,090 --> 00:27:35,572
I mean, this is
how you can go to--

463
00:27:35,572 --> 00:27:38,030
to maybe overplay the title,
this is how you go to MapQuest

464
00:27:38,030 --> 00:27:39,988
and you ask it to go from
Boston to California,

465
00:27:39,988 --> 00:27:42,420
and it just happens.

466
00:27:42,420 --> 00:27:45,695
These things are very fast,
even with a lot of nodes,

467
00:27:45,695 --> 00:27:46,320
a lot of roads.
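
A minimal sketch of the query step on a roadmap: once the nodes and edges exist, finding a path is ordinary graph search. The node names, coordinates, and edge costs below are made up for illustration. Edges are directed, since connections in a dynamic system may only work one way, and the straight-line heuristic assumes every edge costs at least the straight-line distance it covers.

```python
import heapq
import math


def a_star(nodes, edges, start, goal):
    """A* over a directed roadmap.

    nodes: dict mapping node id -> (x, y) position
    edges: dict mapping node id -> list of (neighbor id, edge cost)
    Returns (path as list of node ids, total cost), or (None, inf).
    """
    def h(n):
        # Straight-line distance to the goal (assumed admissible here).
        return math.dist(nodes[n], nodes[goal])

    frontier = [(h(start), 0.0, start)]   # entries are (g + h, g, node)
    best_cost = {start: 0.0}
    parent = {start: None}
    while frontier:
        _, g, n = heapq.heappop(frontier)
        if n == goal:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1], g
        if g > best_cost[n]:
            continue                      # stale queue entry, skip it
        for m, cost in edges.get(n, []):
            g_new = g + cost
            if g_new < best_cost.get(m, math.inf):
                best_cost[m] = g_new
                parent[m] = n
                heapq.heappush(frontier, (g_new + h(m), g_new, m))
    return None, math.inf


# Hypothetical four-node roadmap with one-way edges.
nodes = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
edges = {"A": [("B", 1.0), ("C", 2.5)], "B": [("C", 1.0)], "C": [("D", 1.0)]}
path, cost = a_star(nodes, edges, "A", "D")
print(path, cost)
```

If an edge turns out to be blocked, deleting it from `edges` and calling `a_star` again is one simple way to replan on the same roadmap.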

468
00:27:46,320 --> 00:27:48,730
Yeah.

469
00:27:48,730 --> 00:27:49,870
Yeah.

470
00:27:49,870 --> 00:27:52,298
AUDIENCE: How did it compare
to [INAUDIBLE] discretize

471
00:27:52,298 --> 00:27:53,373
in state space?

472
00:27:53,373 --> 00:27:54,040
PROFESSOR: Good.

473
00:27:54,040 --> 00:27:54,370
AUDIENCE: [INAUDIBLE]

474
00:27:54,370 --> 00:27:56,160
PROFESSOR: So this--
very good question.

475
00:27:56,160 --> 00:27:58,930
Let me answer that first,
and then I'll-- yeah.

476
00:27:58,930 --> 00:28:01,800
So I almost talked about
this last time right

477
00:28:01,800 --> 00:28:05,040
after I said, what happens
when you turn this state

478
00:28:05,040 --> 00:28:09,600
space into buckets, how it's
a reasonable thing to try

479
00:28:09,600 --> 00:28:11,640
but not very elegant.

480
00:28:11,640 --> 00:28:15,030
I think these guys would have
put this topic immediately

481
00:28:15,030 --> 00:28:17,130
after that, saying,
instead of discretizing

482
00:28:17,130 --> 00:28:20,250
in some unnatural
grid maneuver, we're

483
00:28:20,250 --> 00:28:23,430
discretizing here by
sampling randomly.

484
00:28:23,430 --> 00:28:28,080
That has the benefit that
you could, for instance--

485
00:28:28,080 --> 00:28:30,330
you can actually-- you don't
have to sample uniformly.

486
00:28:30,330 --> 00:28:32,310
Maybe you care more about
things in this area.

487
00:28:32,310 --> 00:28:33,850
You can bias your sampling
distribution, same way

488
00:28:33,850 --> 00:28:36,040
you can add more grid cells
or something like that.

489
00:28:36,040 --> 00:28:39,940
But the real benefit is that
it's a more continuous process.

490
00:28:39,940 --> 00:28:42,710
It's not stuck in some
very discrete bins.
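
A small sketch of biased sampling, under assumptions: the state bounds, the focus point, and the mixing parameters below are hypothetical. With probability `focus_prob` a sample is drawn from a Gaussian around a region of interest (clipped to the bounds), and otherwise uniformly over the whole space, so the roadmap still covers everything but gets denser where it matters.

```python
import random


def sample_state(bounds, focus, focus_prob=0.5, focus_std=0.2, rng=random):
    """Draw one state from a mixture of uniform and focused sampling.

    bounds: list of (lo, hi) per state dimension
    focus:  point to sample more densely around (one value per dimension)
    """
    if rng.random() < focus_prob:
        # Focused sample: Gaussian around `focus`, clipped to stay in bounds.
        return tuple(
            min(hi, max(lo, rng.gauss(c, focus_std)))
            for (lo, hi), c in zip(bounds, focus)
        )
    # Otherwise a uniform sample over the whole box.
    return tuple(rng.uniform(lo, hi) for lo, hi in bounds)


# Hypothetical pendulum-like state box: angle in [-pi, pi], velocity in [-8, 8].
bounds = [(-3.14159, 3.14159), (-8.0, 8.0)]
rng = random.Random(0)
samples = [sample_state(bounds, focus=(0.0, 0.0), rng=rng) for _ in range(2000)]
```

Setting `focus_prob=0.0` recovers plain uniform sampling, which makes it easy to compare the two.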

491
00:28:42,710 --> 00:28:43,350
OK.

492
00:28:43,350 --> 00:28:45,520
Sorry, second part
of your question.

493
00:28:45,520 --> 00:28:47,020
AUDIENCE: Well, now
it would follow,

494
00:28:47,020 --> 00:28:49,425
like if we have a discrete
[? board ?] and we can run any

495
00:28:49,425 --> 00:28:52,670
of those algorithms on
[? value ?] [? iteration ?]

496
00:28:52,670 --> 00:28:54,420
on top of it, [? we can ?]
[? find out? ?]

497
00:28:54,420 --> 00:28:55,420
PROFESSOR: Yes, exactly.

498
00:28:55,420 --> 00:28:57,172
So why did I say A*?

499
00:28:57,172 --> 00:28:59,130
I should have said value
iteration on the-- right?

500
00:28:59,130 --> 00:28:59,850
Yeah.

501
00:28:59,850 --> 00:29:01,577
Right.

502
00:29:01,577 --> 00:29:03,660
I mean, A* can be
faster than value iteration.

503
00:29:03,660 --> 00:29:06,080
If you have a good heuristic,
you don't have to--

504
00:29:06,080 --> 00:29:07,316
yeah.

505
00:29:07,316 --> 00:29:09,440
AUDIENCE: So when you're
doing the sampling here--

506
00:29:09,440 --> 00:29:09,980
PROFESSOR: Good.

507
00:29:09,980 --> 00:29:10,480
Yeah.

508
00:29:10,480 --> 00:29:11,450
AUDIENCE: --you were--

509
00:29:11,450 --> 00:29:13,910
RRTs do this very
uniform sampling.

510
00:29:13,910 --> 00:29:15,695
And you say you can
bias the sample.

511
00:29:15,695 --> 00:29:16,320
PROFESSOR: Yes.

512
00:29:16,320 --> 00:29:17,737
AUDIENCE: But if
you're doing this

513
00:29:17,737 --> 00:29:20,990
before you even do
your first path,

514
00:29:20,990 --> 00:29:23,688
why don't you actually choose
[? optimal ?] [INAUDIBLE]??

515
00:29:23,688 --> 00:29:25,980
Can't you do some kind of
[INAUDIBLE] diagram with this

516
00:29:25,980 --> 00:29:29,565
and just say, I'm going to
test using the best I can find,

517
00:29:29,565 --> 00:29:31,190
given that I have a
model of the world,

518
00:29:31,190 --> 00:29:34,190
and then [? get them ?]
[? with sampling, ?] [? your ?]

519
00:29:34,190 --> 00:29:38,030
[? subcontinuous ?]
time, and then you find--

520
00:29:38,030 --> 00:29:39,975
why is the sampling
[? still ?] part of this--

521
00:29:39,975 --> 00:29:41,100
PROFESSOR: Excellent point.

522
00:29:41,100 --> 00:29:45,027
I think if the problem
permits that, then you

523
00:29:45,027 --> 00:29:46,110
should absolutely do that.

524
00:29:46,110 --> 00:29:46,575
AUDIENCE: OK.

525
00:29:46,575 --> 00:29:47,040
So then--

526
00:29:47,040 --> 00:29:48,290
PROFESSOR: I think
even for the pendulum,

527
00:29:48,290 --> 00:29:49,915
though, I wouldn't
know how to tell you

528
00:29:49,915 --> 00:29:52,850
what the optimal sampling is,
because the way these things

529
00:29:52,850 --> 00:29:54,028
connect are non-trivial.

530
00:29:54,028 --> 00:29:55,820
They're subject to the
dynamic constraints.

531
00:29:55,820 --> 00:29:56,490
AUDIENCE: Right.

532
00:29:56,490 --> 00:29:58,910
PROFESSOR: Right.

533
00:29:58,910 --> 00:30:00,800
So if you could formulate
that and solve it

534
00:30:00,800 --> 00:30:02,120
for this-- and maybe you can.

535
00:30:02,120 --> 00:30:02,960
Maybe people have.

536
00:30:02,960 --> 00:30:03,800
I don't know that.

537
00:30:03,800 --> 00:30:04,917
I haven't seen that.

538
00:30:04,917 --> 00:30:07,250
But then that sounds like a
very reasonable thing to do.

539
00:30:07,250 --> 00:30:08,510
AUDIENCE: But it doesn't
have to be quick, right?

540
00:30:08,510 --> 00:30:09,560
PROFESSOR: It doesn't
have to be quick--

541
00:30:09,560 --> 00:30:09,770
AUDIENCE: [INAUDIBLE]
first time--

542
00:30:09,770 --> 00:30:10,070
PROFESSOR: No.

543
00:30:10,070 --> 00:30:11,270
AUDIENCE: --can be
as slow as possible--

544
00:30:11,270 --> 00:30:11,590
PROFESSOR: It could--

545
00:30:11,590 --> 00:30:12,080
AUDIENCE: --because
you want it-- well--

546
00:30:12,080 --> 00:30:13,402
PROFESSOR: Well, right.

547
00:30:13,402 --> 00:30:15,860
AUDIENCE: --times the universe
can explode with that, but--

548
00:30:15,860 --> 00:30:16,120
PROFESSOR: Right.

549
00:30:16,120 --> 00:30:16,790
AUDIENCE: OK.

550
00:30:16,790 --> 00:30:17,540
PROFESSOR: Right.

551
00:30:17,540 --> 00:30:21,110
Take a chance to-- this
is, explore your system.

552
00:30:21,110 --> 00:30:23,450
Build things that are good
in random places, and then

553
00:30:23,450 --> 00:30:25,305
worry about
connecting them later.

554
00:30:25,305 --> 00:30:25,805
Mm-hmm.

555
00:30:31,030 --> 00:30:31,580
Really good.

556
00:30:31,580 --> 00:30:33,700
OK.

557
00:30:33,700 --> 00:30:40,060
So again, making
these connections

558
00:30:40,060 --> 00:30:42,070
in underactuated
systems is more subtle.

559
00:30:42,070 --> 00:30:44,320
It might be that there's a
lot of one-way connections,

560
00:30:44,320 --> 00:30:45,195
but we can still do--

561
00:30:45,195 --> 00:30:46,750
we know how to do
graph search, OK.

562
00:30:46,750 --> 00:30:48,910
But these are
generally good tools,

563
00:30:48,910 --> 00:30:51,940
and they've been used a
lot in robotics lately.

564
00:30:51,940 --> 00:30:56,200
These are-- the other ones, the
Rapidly-Exploring Random

565
00:30:56,200 --> 00:30:57,220
Trees, go by RRTs.

566
00:30:57,220 --> 00:31:02,500
These go by PRMs,
Probabilistic Roadmaps.

567
00:31:02,500 --> 00:31:05,260
A lot of people seem to think
that they're competitors,

568
00:31:05,260 --> 00:31:07,185
intellectual
competitors with RRTs,

569
00:31:07,185 --> 00:31:08,810
and I don't think
that they are really.

570
00:31:08,810 --> 00:31:11,420
I think the RRT
guys would just say,

571
00:31:11,420 --> 00:31:14,020
well, you just use an RRT
to make the connections,

572
00:31:14,020 --> 00:31:16,150
and the roadmap is
still a very good idea.

573
00:31:16,150 --> 00:31:20,605
And I think RRTs
effectively make roadmaps.

574
00:31:20,605 --> 00:31:23,620
So I think they're
very harmonious ideas.

575
00:31:26,810 --> 00:31:27,830
Excellent.

576
00:31:27,830 --> 00:31:31,820
So that's at least two ideas
to take these local trajectory

577
00:31:31,820 --> 00:31:34,940
optimizers and turn them into
more of a feedback policy.

578
00:31:34,940 --> 00:31:37,670
But there's a big one,
big one that I like a lot,

579
00:31:37,670 --> 00:31:40,520
that I haven't said, OK.

580
00:31:40,520 --> 00:31:43,330
So big I'm going to
go back to the left.

581
00:31:43,330 --> 00:31:47,512
OK, let's say it's
idea number three--

582
00:31:47,512 --> 00:31:49,220
these things aren't
perfectly orthogonal,

583
00:31:49,220 --> 00:31:52,690
but this was the breakdown
I was most happy with, OK.

584
00:31:56,900 --> 00:31:58,790
Feedback motion planning.

585
00:32:09,220 --> 00:32:11,410
OK.

586
00:32:11,410 --> 00:32:16,120
So, so far, we've talked
about building some trajectory

587
00:32:16,120 --> 00:32:20,740
that we thought was good,
and then afterwards,

588
00:32:20,740 --> 00:32:23,800
go through and stabilize
it with feedback.

589
00:32:23,800 --> 00:32:25,660
That's not always
the best recipe,

590
00:32:25,660 --> 00:32:27,400
because you could
imagine, for instance,

591
00:32:27,400 --> 00:32:29,710
designing a controller that
locally looked very good

592
00:32:29,710 --> 00:32:32,590
but was completely
unstabilizable.

593
00:32:32,590 --> 00:32:33,308
I go to then--

594
00:32:33,308 --> 00:32:34,100
I'm done with this.

595
00:32:34,100 --> 00:32:37,750
I say, perfect, my first
stage of my control design

596
00:32:37,750 --> 00:32:39,160
picked this trajectory.

597
00:32:39,160 --> 00:32:42,743
Now I'm going to run LTV
LQR on it to stabilize it.

598
00:32:42,743 --> 00:32:44,410
And then I find out,
whoops, right there

599
00:32:44,410 --> 00:32:46,280
it's not controllable
or something

600
00:32:46,280 --> 00:32:49,390
and that my cost-to-go
function blows up.

601
00:32:49,390 --> 00:32:52,810
Maybe my open loop
trajectory optimizer

602
00:32:52,810 --> 00:32:55,773
told me to walk along
the side of a cliff

603
00:32:55,773 --> 00:32:57,190
and wasn't really
paying attention

604
00:32:57,190 --> 00:33:00,400
to the fact that
stabilizing that's hard.

605
00:33:00,400 --> 00:33:02,530
Or maybe it was
saturating my actuators

606
00:33:02,530 --> 00:33:04,990
the entire time-- that's
a very real possibility--

607
00:33:04,990 --> 00:33:09,865
and left me no margin of control
to go back and stabilize it.

608
00:33:09,865 --> 00:33:11,845
OK.

609
00:33:11,845 --> 00:33:15,640
AUDIENCE: [INAUDIBLE]
[? putting ?] those edges down,

610
00:33:15,640 --> 00:33:19,143
that it is actually feasible
to go from A to B, so--

611
00:33:19,143 --> 00:33:19,810
PROFESSOR: Yeah.

612
00:33:19,810 --> 00:33:22,630
It's definitely feasible
to go from A to B,

613
00:33:22,630 --> 00:33:24,183
but it doesn't say that--

614
00:33:24,183 --> 00:33:25,600
nothing thought
about whether if I

615
00:33:25,600 --> 00:33:29,521
get disturbed epsilon from
this, whether I can recover.

616
00:33:29,521 --> 00:33:32,170
AUDIENCE: So you're
worried about the noise.

617
00:33:32,170 --> 00:33:34,360
PROFESSOR: I'm worried
about noise, right.

618
00:33:34,360 --> 00:33:37,760
So it's feasible for me to
walk along the side of a cliff,

619
00:33:37,760 --> 00:33:40,130
but I wouldn't
want to be bumped.

620
00:33:40,130 --> 00:33:41,690
If I know I'm
going to be bumped,

621
00:33:41,690 --> 00:33:44,360
then I pick a
different path, OK.

622
00:33:44,360 --> 00:33:46,610
So you can imagine-- for
maybe each of those examples,

623
00:33:46,610 --> 00:33:49,910
you could imagine ways to try
to make the planning process

624
00:33:49,910 --> 00:33:51,260
more--

625
00:33:51,260 --> 00:33:53,850
let's say, OK, well, don't
use your full torque limits.

626
00:33:53,850 --> 00:33:55,838
Use 90% of your torque limits.

627
00:33:55,838 --> 00:33:56,630
That's a good idea.

628
00:33:56,630 --> 00:33:57,172
That'll help.

629
00:33:57,172 --> 00:34:00,150
But there's a more general
philosophy out there,

630
00:34:00,150 --> 00:34:03,080
which is that you shouldn't
just do trajectory planning

631
00:34:03,080 --> 00:34:04,100
and then stabilize it.

632
00:34:04,100 --> 00:34:06,560
You should really be
planning with feedback,

633
00:34:06,560 --> 00:34:07,802
if that makes any sense, OK.

634
00:34:07,802 --> 00:34:09,260
Well, it'll make
sense in a minute.

635
00:34:14,757 --> 00:34:16,340
There's a lot of
ways to present this.

636
00:34:16,340 --> 00:34:18,507
I thought the best way would
be to start with a case

637
00:34:18,507 --> 00:34:19,820
study, someone who--

638
00:34:19,820 --> 00:34:22,130
a problem where people
really use this, OK.

639
00:34:30,485 --> 00:34:31,860
There's been a
lot of people that

640
00:34:31,860 --> 00:34:34,010
have been interested in
making robots juggle.

641
00:34:34,010 --> 00:34:37,670
One of them's been
sitting in the room here.

642
00:34:37,670 --> 00:34:40,819
The ones that did a lot of the
work I'm talking about here

643
00:34:40,819 --> 00:34:42,260
is Dan Koditschek's camp.

644
00:34:47,880 --> 00:34:48,380
OK.

645
00:34:52,159 --> 00:34:56,105
So it's actually
very, very harmonious

646
00:34:56,105 --> 00:34:58,220
with John's lecture
on running, and that's

647
00:34:58,220 --> 00:35:00,085
why Koditschek's done
both, for instance.

648
00:35:00,085 --> 00:35:01,460
Now let's think
about the problem

649
00:35:01,460 --> 00:35:02,780
with making a robot juggle, OK.

650
00:35:02,780 --> 00:35:04,677
So the first thing you
need to think about--

651
00:35:04,677 --> 00:35:09,110
and let's make a
one-dimensional juggler, OK.

652
00:35:09,110 --> 00:35:11,120
So we've got a paddle
here, constrained

653
00:35:11,120 --> 00:35:15,830
to live in this plane, and we've
got a ball, also constrained

654
00:35:15,830 --> 00:35:17,990
to live in that plane.

655
00:35:17,990 --> 00:35:19,430
Yeah?

656
00:35:19,430 --> 00:35:21,440
And your goal is to--

657
00:35:21,440 --> 00:35:26,510
if this thing is in a rail,
it can only move vertically,

658
00:35:26,510 --> 00:35:29,180
your goal is just
to move that paddle

659
00:35:29,180 --> 00:35:32,357
to, say, stabilize
a bouncing height.

660
00:35:32,357 --> 00:35:33,940
Let's say you've got
a desired height.

661
00:35:40,100 --> 00:35:40,600
OK.

662
00:35:43,270 --> 00:35:45,190
This is the 1-D juggler.

663
00:35:45,190 --> 00:35:51,130
I think they call it the
line juggler by Martin

664
00:35:51,130 --> 00:35:53,140
Buehler and Dan Koditschek.

665
00:35:53,140 --> 00:35:58,480
Martin went on to build BigDog
at BDI, and now he's at iRobot.

666
00:35:58,480 --> 00:36:02,020
So these are famous guys, OK.

667
00:36:02,020 --> 00:36:04,970
So the dynamics are pretty
simple to write down.

668
00:36:04,970 --> 00:36:06,190
You have a mass of the ball.

669
00:36:06,190 --> 00:36:08,087
You have some dynamics
of your paddle.

670
00:36:08,087 --> 00:36:09,670
You assume that the
mass of the paddle

671
00:36:09,670 --> 00:36:12,130
is much, much bigger
than the ball.

672
00:36:12,130 --> 00:36:13,790
That simplifies some things.

673
00:36:13,790 --> 00:36:19,930
And so now the dynamics are just
ballistic flight of the ball.

674
00:36:31,390 --> 00:36:32,680
You need some trajectory.

675
00:36:32,680 --> 00:36:35,410
Your control is to design
some trajectory of the paddle,

676
00:36:35,410 --> 00:36:45,040
and then you have an impact
dynamics, which these guys use

677
00:36:45,040 --> 00:36:48,390
an elastic model--

678
00:36:48,390 --> 00:36:51,880
model it as an instantaneous
elastic collision

679
00:36:51,880 --> 00:36:55,540
with a coefficient
of restitution.

680
00:36:55,540 --> 00:36:57,310
That's a reasonable
collision model

681
00:36:57,310 --> 00:36:58,762
if your energy is conserved.

682
00:37:03,390 --> 00:37:06,520
And again, they assume that
when the collision happens,

683
00:37:06,520 --> 00:37:10,180
the ball changes direction
and keeps 90% of its energy,

684
00:37:10,180 --> 00:37:12,250
and the paddle was unaffected.

685
00:37:12,250 --> 00:37:17,427
Relative to the mass of the
paddle, the ball is negligible.
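
Written out, the model just described might look like the following. The symbols are assumptions chosen here, not notation from the board: b is the ball height, r the paddle height, g gravity, e the coefficient of restitution, and superscripts minus and plus mark the instants just before and just after impact.

```latex
% Flight: the ball is ballistic.
\ddot b = -g
% Impact (when b = r): relative velocity reverses and is scaled by e;
% the paddle velocity \dot r is unchanged because the paddle is much heavier.
\dot b^{+} - \dot r = -e\,\bigl(\dot b^{-} - \dot r\bigr)
\quad\Longrightarrow\quad
\dot b^{+} = -e\,\dot b^{-} + (1 + e)\,\dot r
```

Against a stationary paddle the ball keeps a fraction e^2 of its kinetic energy, so keeping 90% of the energy corresponds to e of roughly 0.95.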

686
00:37:17,427 --> 00:37:20,010
AUDIENCE: [INAUDIBLE] juggling
the balls are almost completely

687
00:37:20,010 --> 00:37:22,252
[INAUDIBLE].

688
00:37:22,252 --> 00:37:23,210
PROFESSOR: That's true.

689
00:37:23,210 --> 00:37:25,640
These are, I guess, not--

690
00:37:25,640 --> 00:37:29,180
[? Philipp's ?] are completely
almost as hard as possible.

691
00:37:29,180 --> 00:37:30,830
In his project, he
said he spent lots

692
00:37:30,830 --> 00:37:32,747
of time trying to find
the perfect ball, which

693
00:37:32,747 --> 00:37:38,720
was the perfectly machined,
very hard precision ball, yeah.

694
00:37:38,720 --> 00:37:42,440
Compliant juggling, maybe
that's our next challenge

695
00:37:42,440 --> 00:37:44,840
for robotics, squishy balls.

696
00:37:49,490 --> 00:37:50,060
OK, good.

697
00:37:50,060 --> 00:37:54,590
So it turns out they do a
really nice control design.

698
00:37:54,590 --> 00:37:57,830
It turns out to be
very natural to--

699
00:37:57,830 --> 00:38:10,030
the controller that they
come up with for the paddle

700
00:38:10,030 --> 00:38:11,030
uses a mirror law.

701
00:38:17,420 --> 00:38:20,390
Turns out if you can sense
the state of the ball

702
00:38:20,390 --> 00:38:25,310
and you just do a distorted
mirror image of that ball,

703
00:38:25,310 --> 00:38:27,050
then everything
gets really easy.

704
00:38:27,050 --> 00:38:29,300
Your impacts always happen at 0.

705
00:38:29,300 --> 00:38:31,170
It's at the same place.

706
00:38:31,170 --> 00:38:33,890
And you can, just by
changing the velocity here,

707
00:38:33,890 --> 00:38:37,200
you can roughly affect
the impact height.

708
00:38:37,200 --> 00:38:41,760
So what they do is they can
nominally stabilize some limit

709
00:38:41,760 --> 00:38:44,150
cycle with just
mirroring the ball,

710
00:38:44,150 --> 00:38:46,850
and they add an extra term to
stabilize the energy to get it

711
00:38:46,850 --> 00:38:49,700
to whatever height they want.

712
00:38:49,700 --> 00:39:09,710
So they do a distorted mirror
image of ball trajectory plus--

713
00:39:09,710 --> 00:39:14,390
the distortion is scaled by
some energy correcting term.
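
One common way to write a mirror law of this kind is sketched below. The symbols and the exact form are assumptions for illustration and may differ from the published controller: b is the ball height measured from the nominal impact height, r the commanded paddle height, E the ball's energy per unit mass, E_des the energy of the desired bounce, and kappa_0, kappa_1 are gains.

```latex
r = -\bigl(\kappa_0 + \kappa_1\,(E_{\mathrm{des}} - E)\bigr)\, b,
\qquad
E = \tfrac{1}{2}\,\dot b^{2} + g\,b
```

Because r is proportional to b, the paddle and ball meet at b = r = 0, which is why the impacts always happen at the same place. The kappa_0 term is the plain distorted mirror, and the kappa_1 term stretches or shrinks it depending on whether the ball has too little or too much energy.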

714
00:39:26,530 --> 00:39:28,600
It's a beautiful thing.

715
00:39:28,600 --> 00:39:31,120
Very, very simple controller.

716
00:39:31,120 --> 00:39:35,260
Has a nice, very
stable solution.

717
00:39:35,260 --> 00:39:38,050
In fact, I think they prove
it's globally stable for--

718
00:39:38,050 --> 00:39:38,960
you can tell me if--

719
00:39:38,960 --> 00:39:41,020
is it globally stable
in the 1-D case?

720
00:39:41,020 --> 00:39:42,190
I think it probably is.

721
00:39:46,600 --> 00:39:48,215
OK.

722
00:39:48,215 --> 00:39:49,840
How do they prove
it's globally stable?

723
00:39:49,840 --> 00:39:54,700
They do an apex to
apex return map.

724
00:40:01,330 --> 00:40:04,660
And the same way we did for
the hopping models and all

725
00:40:04,660 --> 00:40:08,680
the other models, these guys
were pushing the unimodal maps

726
00:40:08,680 --> 00:40:10,930
and getting some global
stability results out of that.

727
00:40:10,930 --> 00:40:14,870
That's why I think that they
had a global result, OK.
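
A rough sketch of an apex-to-apex return map for the 1-D juggler, under simplifying assumptions that are not taken from the lecture: impacts happen at height zero, the paddle velocity at impact is minus kappa times the ball velocity (a mirror-law paddle), kappa is a constant gain plus an energy-error term, and the restitution and gain values are made up. It only illustrates how iterating the map shows local convergence to the desired height. It is not the global stability proof.

```python
import math

G = 9.81  # gravity, m/s^2


def apex_return_map(h, h_des, e=0.95, k1=0.02):
    """Map one apex height to the next under an assumed mirror-law paddle.

    h:     current apex height above the impact point (m)
    h_des: desired apex height (m)
    e:     coefficient of restitution (assumed value)
    k1:    energy-correction gain (assumed value)
    """
    s = math.sqrt(2.0 * G * h)               # ball speed just before impact
    k0 = (1.0 - e) / (1.0 + e)               # gain that exactly replaces the impact loss
    kappa = k0 + k1 * (G * h_des - G * h)    # energy per unit mass at apex is G * h
    paddle_v = kappa * s                     # paddle moves up as the ball comes down
    s_next = e * s + (1.0 + e) * paddle_v    # impact law with a heavy paddle
    return s_next ** 2 / (2.0 * G)           # next apex height


# Iterate the map from a low start toward a desired apex of 1 m.
h = 0.3
for _ in range(40):
    h = apex_return_map(h, h_des=1.0)
print(h)
```

With the energy term switched off (`k1=0`), every height is a fixed point of this map: the plain mirror only replaces what the impact loses, and the extra term is what pulls the bounce to the chosen height.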

728
00:40:14,870 --> 00:40:17,950
So it's actually exactly
like a hopping robot.

729
00:40:17,950 --> 00:40:21,770
Just the ball's moving
instead of the robot.

730
00:40:24,899 --> 00:40:28,630
OK, so they got a pretty good
controller for 1-D juggling,

731
00:40:28,630 --> 00:40:31,272
and then they started
doing 2-D juggling.

732
00:40:31,272 --> 00:40:32,230
I think I have the vi--

733
00:40:32,230 --> 00:40:33,940
I don't have the video for
the 1-D juggling somehow,

734
00:40:33,940 --> 00:40:35,815
but I do have the video
for the 2-D juggling.

735
00:40:41,446 --> 00:40:47,940
Yeah, so here's your
2-D case showing off

736
00:40:47,940 --> 00:40:49,890
doing two balls
at once, since all

737
00:40:49,890 --> 00:40:52,810
that matters is the state of the
robot when the impact occurs.

738
00:40:52,810 --> 00:40:54,400
So you might as well
do something else

739
00:40:54,400 --> 00:40:56,525
during the other time, like
stabilize another ball.

740
00:40:58,688 --> 00:41:00,480
And you can see that
actually, it turns out

741
00:41:00,480 --> 00:41:06,210
to be pretty easy to get
stability in this plane,

742
00:41:06,210 --> 00:41:08,325
just because if you tend to--

743
00:41:08,325 --> 00:41:09,930
if you're too far
to this side, you

744
00:41:09,930 --> 00:41:13,590
tend to get hit earlier, which
causes you to go out more

745
00:41:13,590 --> 00:41:14,390
and vice versa.

746
00:41:14,390 --> 00:41:16,057
So that stability
almost comes for free.

747
00:41:19,190 --> 00:41:21,020
AUDIENCE: [INAUDIBLE]

748
00:41:21,020 --> 00:41:23,300
PROFESSOR: This is, I think,
vision off to the side.

749
00:41:23,300 --> 00:41:24,758
I know it's vision
off to the side,

750
00:41:24,758 --> 00:41:27,170
where they're tracking
the bright yellow balls.

751
00:41:27,170 --> 00:41:28,608
If it had been
dark gray balls, it

752
00:41:28,608 --> 00:41:29,900
might have been something else.

753
00:41:29,900 --> 00:41:33,450
But the bright yellow tennis
balls suggests vision.

754
00:41:33,450 --> 00:41:33,950
Yeah.

755
00:41:36,890 --> 00:41:42,150
And they went on to
do the 3-D juggling.

756
00:41:42,150 --> 00:41:46,310
This one was, I remember, in the
basement of the Michigan AI lab

757
00:41:46,310 --> 00:41:49,970
when I was there,
behind that curtain.

758
00:41:56,710 --> 00:42:03,660
It's always using the vision
sensing for the balls.

759
00:42:03,660 --> 00:42:05,160
You could do
pretty good things.

760
00:42:13,510 --> 00:42:15,310
And then they got
so good that they

761
00:42:15,310 --> 00:42:20,230
started doing other maneuvers
like catching and palming

762
00:42:20,230 --> 00:42:22,722
and things like this, OK.

763
00:42:22,722 --> 00:42:25,180
It actually turned out to be
the same, pretty much, control

764
00:42:25,180 --> 00:42:25,680
derivation.

765
00:42:25,680 --> 00:42:28,840
They just set the
desired energy to 0,

766
00:42:28,840 --> 00:42:31,308
and suddenly they have
a catching controller.

767
00:42:40,100 --> 00:42:45,770
And then this is palming when
they're doing their thing, OK.

768
00:42:45,770 --> 00:42:49,010
And then they can get it back up
to catching with the same sort

769
00:42:49,010 --> 00:42:50,240
of energy shaping.

770
00:42:50,240 --> 00:42:55,920
And I should show you, you don't
actually need all that feedback

771
00:42:55,920 --> 00:42:56,420
to do it.

772
00:42:56,420 --> 00:42:57,795
You don't need to
sense the ball.

773
00:43:01,610 --> 00:43:02,120
Here he is.

774
00:43:04,640 --> 00:43:06,770
This is [? Phillips. ?]
We'll show the one

775
00:43:06,770 --> 00:43:10,250
where he's pushing it so
you can tell who it is here.

776
00:43:10,250 --> 00:43:12,300
Blind juggler.

777
00:43:12,300 --> 00:43:14,120
So this is open loop
stable juggling.

778
00:43:20,930 --> 00:43:23,240
You can see-- actually, do
you see the ball up there?

779
00:43:23,240 --> 00:43:24,740
Yeah, it's going
to a stable height,

780
00:43:24,740 --> 00:43:26,120
and he's moving it around.

781
00:43:26,120 --> 00:43:28,748
He's got just an itty
little bit of concavity

782
00:43:28,748 --> 00:43:31,040
in that plate, which gives
it all the passive stability

783
00:43:31,040 --> 00:43:31,915
properties.

784
00:43:31,915 --> 00:43:34,040
And you've got versions
where it's doing things off

785
00:43:34,040 --> 00:43:36,140
to the side in 3-D
or in 2-D and--

786
00:43:39,346 --> 00:43:41,850
yeah, so that's open loop stable.

787
00:43:41,850 --> 00:43:45,425
So juggling is actually a really
cool problem for robotics.

788
00:43:45,425 --> 00:43:50,633
It's led to a lot of nice
dynamic insights and party

789
00:43:50,633 --> 00:43:51,300
tricks, I guess.

790
00:43:56,030 --> 00:43:56,530
Yeah.

791
00:44:00,510 --> 00:44:06,080
OK, so these guys said, we
got pretty good at juggling.

792
00:44:06,080 --> 00:44:09,620
We can do a mirror law
to stabilize whatever

793
00:44:09,620 --> 00:44:11,510
juggling height we want.

794
00:44:11,510 --> 00:44:17,798
We've got a catching
controller also,

795
00:44:17,798 --> 00:44:19,340
which has roughly
set the energy to 0

796
00:44:19,340 --> 00:44:21,650
and just sort of does this step.

797
00:44:21,650 --> 00:44:25,197
And they also had a
palming controller,

798
00:44:25,197 --> 00:44:27,530
which was when the dynamics
were actually on the paddle,

799
00:44:27,530 --> 00:44:29,238
they did a little bit
of different things

800
00:44:29,238 --> 00:44:31,820
to be able to move it around
without it falling off

801
00:44:31,820 --> 00:44:33,817
the paddle.

802
00:44:33,817 --> 00:44:35,650
What they were left
with was this challenge.

803
00:44:35,650 --> 00:44:39,620
And we've got these controllers,
which are good locally.

804
00:44:39,620 --> 00:44:42,060
What do we do to make them
do more interesting things?

805
00:44:42,060 --> 00:44:45,200
So they actually--
so if they want

806
00:44:45,200 --> 00:44:47,750
to transition, for instance,
between the catching

807
00:44:47,750 --> 00:44:49,952
and the palming-- the
bouncing and the palming,

808
00:44:49,952 --> 00:44:51,410
they use their
catching controller.

809
00:44:51,410 --> 00:44:54,438
Maybe they want to
avoid moving obstacles.

810
00:44:54,438 --> 00:44:55,730
They want to do multiple balls.

811
00:44:55,730 --> 00:44:57,830
They want to do
all these things.

812
00:44:57,830 --> 00:45:01,040
They introduced a really
nice, beautiful picture

813
00:45:01,040 --> 00:45:06,245
of feedback motion
planning using funnels.

814
00:45:28,080 --> 00:45:34,310
OK, so every one of
those controllers

815
00:45:34,310 --> 00:45:37,640
had the property that it
would take initial conditions

816
00:45:37,640 --> 00:45:42,780
in state space and move them
to some more desirable state.

817
00:45:42,780 --> 00:45:44,210
So for instance,
the ball hopping,

818
00:45:44,210 --> 00:45:47,120
the ball could be anywhere here.

819
00:45:47,120 --> 00:45:49,612
By applying this controller
for some finite amount of time,

820
00:45:49,612 --> 00:45:52,070
when it's done, it's going to
be closer to its apex height.

821
00:45:54,860 --> 00:45:58,645
In many cases, not
in the juggling case,

822
00:45:58,645 --> 00:46:00,270
not in the experimental
juggling case--

823
00:46:00,270 --> 00:46:02,570
and even in the model
one, I guess they do.

824
00:46:02,570 --> 00:46:05,420
But in many cases, you actually
have Lyapunov functions

825
00:46:05,420 --> 00:46:09,050
which describe the way that
convergence happens, OK,

826
00:46:09,050 --> 00:46:11,760
but it's not strictly necessary.

827
00:46:11,760 --> 00:46:14,840
So the idea is, let's think
about this thing as a funnel,

828
00:46:14,840 --> 00:46:15,830
OK.

829
00:46:15,830 --> 00:46:20,250
It takes lots of states in.

830
00:46:20,250 --> 00:46:23,300
So this is initial states.

831
00:46:29,720 --> 00:46:36,410
And after applying it for
some finite amount of time,

832
00:46:36,410 --> 00:46:37,970
I'm going to be some--

833
00:46:37,970 --> 00:46:39,305
you get some new final states.

834
00:46:43,130 --> 00:46:45,710
And if my controller
was any good, then

835
00:46:45,710 --> 00:46:48,620
hopefully the final states
are a smaller region

836
00:46:48,620 --> 00:46:49,640
than the initial states.

837
00:46:55,480 --> 00:47:04,270
So in some sense, this
is a geometric cartoon

838
00:47:04,270 --> 00:47:05,440
for a Lyapunov function.

839
00:47:17,490 --> 00:47:22,840
Lyapunov functions take my
state in, and descend down,

840
00:47:22,840 --> 00:47:27,600
and will put me in
some other state, OK.

841
00:47:27,600 --> 00:47:30,000
Experimentally, you can
also find these things,

842
00:47:30,000 --> 00:47:32,700
even if you can't do
the Lyapunov function.

843
00:47:32,700 --> 00:47:36,450
Experimentally, this input
is basically the basin

844
00:47:36,450 --> 00:47:38,610
of attraction of my controller.

845
00:47:49,892 --> 00:47:51,600
So if it was really
a basin of attraction

846
00:47:51,600 --> 00:47:53,823
and it stabilized some fixed
point, then if I ran it

847
00:47:53,823 --> 00:47:55,740
long enough that it was
asymptotically stable,

848
00:47:55,740 --> 00:47:57,180
I'd call it a basin
of attraction.

849
00:47:57,180 --> 00:47:59,310
Here I'm just going to run
it for some finite time.

850
00:47:59,310 --> 00:48:02,190
So you have to be a little
careful calling it a basin,

851
00:48:02,190 --> 00:48:04,680
but I think it's still
intuitive that this is the--

852
00:48:04,680 --> 00:48:07,200
there's lots of names for this.

853
00:48:07,200 --> 00:48:10,660
Another name for it is pre-image
in the motion planning world.

854
00:48:10,660 --> 00:48:15,100
A lot of people call this the
pre-image of our action.

855
00:48:15,100 --> 00:48:16,865
This is, I guess,
the post-image, yeah.

856
00:48:16,865 --> 00:48:18,240
These are the set
of states where

857
00:48:18,240 --> 00:48:21,000
my controller is applicable.

858
00:48:21,000 --> 00:48:25,805
I'm going to have a
funnel for the mirror law.

859
00:48:25,805 --> 00:48:27,930
I'm going to have a funnel
for my catching controller.

860
00:48:27,930 --> 00:48:30,060
That takes a different
set of initial conditions

861
00:48:30,060 --> 00:48:31,560
and gets me where I want to be.

862
00:48:31,560 --> 00:48:34,980
And I have a funnel that
can allow my palming

863
00:48:34,980 --> 00:48:37,200
to do different things, OK.

864
00:48:37,200 --> 00:48:39,800
And I might even
have lots of funnels.

865
00:48:39,800 --> 00:48:42,690
So I might have a different
funnel given the mirror law

866
00:48:42,690 --> 00:48:47,190
where my desired energy is
4, versus the mirror law

867
00:48:47,190 --> 00:48:48,442
where my desired energy is 6.

868
00:48:48,442 --> 00:48:50,400
Maybe those should look
like different funnels.

869
00:48:53,280 --> 00:48:57,450
So the picture that
these guys gave us--

870
00:49:07,060 --> 00:49:10,645
this is Burridge, Rizzi,
and Koditschek, the guys.

871
00:49:14,380 --> 00:49:16,360
Al Rizzi's at Boston
Dynamics also--

872
00:49:23,560 --> 00:49:27,550
is that you can do
feedback motion planning

873
00:49:27,550 --> 00:49:31,770
as a sequential composition
of funnels, yeah?

874
00:50:03,260 --> 00:50:07,562
So if I want to get from
one state to another state,

875
00:50:07,562 --> 00:50:09,770
and I don't have a single
controller that will get me

876
00:50:09,770 --> 00:50:14,780
there, all I need to do is
reason about a set of these--

877
00:50:14,780 --> 00:50:19,520
a sequence of these funnels for
which the first funnel takes me

878
00:50:19,520 --> 00:50:23,330
from my initial conditions
into a domain where

879
00:50:23,330 --> 00:50:26,300
my second funnel is applicable.

880
00:50:26,300 --> 00:50:28,820
And then I can use
my second funnel

881
00:50:28,820 --> 00:50:32,360
to get me somewhere else, and
then my third funnel maybe

882
00:50:32,360 --> 00:50:34,520
will get me to my target.

883
00:50:37,160 --> 00:50:41,540
So if my goal state is somewhere
abstractly here in state space,

884
00:50:41,540 --> 00:50:44,660
that's not accessible
from any one--

885
00:50:44,660 --> 00:50:46,880
I can't get from my initial
condition to my goal

886
00:50:46,880 --> 00:50:49,130
with any one of my controllers.

887
00:50:49,130 --> 00:50:51,860
I can sequence
these controllers,

888
00:50:51,860 --> 00:50:55,550
just making sure that
the output of one funnel

889
00:50:55,550 --> 00:50:57,380
is covered, completely
covered by the input

890
00:50:57,380 --> 00:50:59,270
of the next funnel.

891
00:50:59,270 --> 00:51:01,520
Then that's enough,
then, to turn this again,

892
00:51:01,520 --> 00:51:03,890
to use these funnels
as an abstraction

893
00:51:03,890 --> 00:51:05,720
to take away the
continuous problem

894
00:51:05,720 --> 00:51:08,953
and give me a discrete planning
problem, which just says I just

895
00:51:08,953 --> 00:51:11,120
need to go through this
funnel, through this funnel,

896
00:51:11,120 --> 00:51:15,380
through this funnel, and
I can get to the goal, OK.
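
To make the discrete-planning view concrete, here is a minimal sketch, assuming each funnel is summarized by a one-dimensional inlet interval and outlet interval. The `Funnel` type, the interval numbers, and the controller names are hypothetical and are not taken from the Burridge, Rizzi, and Koditschek paper. One funnel may follow another only when its inlet completely covers the previous outlet.

```python
from collections import deque
from typing import NamedTuple


class Funnel(NamedTuple):
    name: str
    inlet: tuple   # (lo, hi): states where this controller is applicable
    outlet: tuple  # (lo, hi): states it delivers after running


def contains(outer, inner):
    """True when interval `inner` lies completely inside interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def plan_funnels(funnels, start, goal):
    """Breadth-first search for a chain of funnels from `start` into `goal`."""
    queue = deque([f] for f in funnels if f.inlet[0] <= start <= f.inlet[1])
    seen = {path[0].name for path in queue}
    while queue:
        path = queue.popleft()
        last = path[-1]
        if contains(goal, last.outlet):
            return [f.name for f in path]
        for nxt in funnels:
            # Sequencing rule: the next inlet must cover the whole outlet.
            if nxt.name not in seen and contains(nxt.inlet, last.outlet):
                seen.add(nxt.name)
                queue.append(path + [nxt])
    return None


# Hypothetical funnels on an abstract one-dimensional state.
funnels = [
    Funnel("mirror_low", (0.0, 4.0), (1.0, 2.0)),
    Funnel("catch", (0.5, 2.5), (5.0, 6.0)),
    Funnel("palm", (4.5, 6.5), (8.0, 9.0)),
    Funnel("mirror_high", (7.5, 10.0), (9.4, 9.6)),
]
plan = plan_funnels(funnels, start=3.0, goal=(9.0, 10.0))
print(plan)
```

Once the inlets and outlets are known, the search itself is ordinary graph search; the continuous dynamics are hidden inside each funnel.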

897
00:51:15,380 --> 00:51:19,640
So in this case, they
did tasks like there

898
00:51:19,640 --> 00:51:23,298
was a beam here that was--

899
00:51:23,298 --> 00:51:24,590
they were bouncing on one side.

900
00:51:24,590 --> 00:51:26,625
They wanted to be bouncing
on the other side.

901
00:51:26,625 --> 00:51:28,250
So I think they had
one controller that

902
00:51:28,250 --> 00:51:29,000
went over the top.

903
00:51:29,000 --> 00:51:30,140
Then the beam got taller.

904
00:51:30,140 --> 00:51:32,690
They had another one where it
caught it, brought it under,

905
00:51:32,690 --> 00:51:34,010
started paddling again.

906
00:51:34,010 --> 00:51:36,650
And these things just
fall naturally out.

907
00:51:36,650 --> 00:51:38,690
This could be the catching.

908
00:51:38,690 --> 00:51:40,250
Or this could be the--

909
00:51:40,250 --> 00:51:41,780
yeah, catching.

910
00:51:41,780 --> 00:51:46,120
This could be the palm, and
this could be my mirror again.

911
00:51:48,950 --> 00:51:52,670
And I'm right back to
where I want to be, OK.

912
00:51:52,670 --> 00:51:53,900
Very, very beautiful idea.

913
00:51:56,420 --> 00:51:59,690
As far as I could tell,
everybody who read that paper

914
00:51:59,690 --> 00:52:02,960
was enamored by it, and nobody's
really used it that much,

915
00:52:02,960 --> 00:52:05,510
because there was
one critical problem.

916
00:52:08,600 --> 00:52:12,830
Figuring out what those funnels
looked like is really hard.

917
00:52:12,830 --> 00:52:19,250
So really, the only
issue, I think,

918
00:52:19,250 --> 00:52:26,000
is that describing the basins
of attraction, let's say--

919
00:52:42,470 --> 00:52:45,920
so if you read the Burridge,
Rizzi and Koditschek paper,

920
00:52:45,920 --> 00:52:48,830
you'll see a ridiculous
number of scatter plots

921
00:52:48,830 --> 00:52:50,900
where they put the
ball in this location,

922
00:52:50,900 --> 00:52:52,610
they ran their
controller for a while,

923
00:52:52,610 --> 00:52:53,900
and they determined
experimentally

924
00:52:53,900 --> 00:52:56,442
whether it was in the basin of
attraction of this controller.

925
00:52:56,442 --> 00:52:57,320
Yeah?

926
00:52:57,320 --> 00:53:00,050
Ouch, right?

927
00:53:00,050 --> 00:53:04,580
That's not what I want
to do with my time.

928
00:53:04,580 --> 00:53:07,550
So if you're willing to do that,
then it's a workable method.

929
00:53:07,550 --> 00:53:11,090
But I think today we've
got a better way to do it.

930
00:53:15,350 --> 00:53:20,120
And my group, we've been
working on an implementation

931
00:53:20,120 --> 00:53:22,880
of this feedback motion
planning idea, which

932
00:53:22,880 --> 00:53:27,440
is very much in line with the
things we've been talking about

933
00:53:27,440 --> 00:53:40,990
so far, which we've been
calling the LQR trees, OK.

934
00:53:47,160 --> 00:53:52,650
And the big idea that
happened is that these guys

935
00:53:52,650 --> 00:53:54,930
in LIDS, Pablo Parrilo--

936
00:53:54,930 --> 00:53:57,120
anybody know Pablo?

937
00:53:57,120 --> 00:54:01,800
And Alex Megretski's the
one who taught me about this--

938
00:54:01,800 --> 00:54:05,910
have figured out
new effective ways

939
00:54:05,910 --> 00:54:08,250
to computationally
estimate basins

940
00:54:08,250 --> 00:54:12,570
of attraction of some
classes of controls, OK.

941
00:54:32,865 --> 00:54:34,830
OK, so this is a new thing.

942
00:54:34,830 --> 00:54:37,730
People have been
doing algorithms

943
00:54:37,730 --> 00:54:43,132
to design Lyapunov functions
for at least a decade.

944
00:54:43,132 --> 00:54:45,590
But I think they got really
practical a couple of years ago

945
00:54:45,590 --> 00:54:47,120
in Pablo's thesis, actually.

946
00:54:55,300 --> 00:54:56,300
Think it's two Ls, yeah.

947
00:54:59,750 --> 00:55:06,200
What Pablo did in his thesis
is he promoted this sums

948
00:55:06,200 --> 00:55:08,066
of squares programming.

949
00:55:21,210 --> 00:55:25,460
In fact, you can even
download SOSTOOLS from his--

950
00:55:25,460 --> 00:55:31,070
as a MATLAB package from
his website to do this.

951
00:55:31,070 --> 00:55:35,300
Sums of squares programs
are efficient ways

952
00:55:35,300 --> 00:55:41,150
to check whether a polynomial
function is negative definite,

953
00:55:41,150 --> 00:56:11,280
OK, potentially with free
parameters, and so on.

954
00:56:11,280 --> 00:56:12,870
These can be made--

955
00:56:12,870 --> 00:56:14,670
and these can be
vector variables,

956
00:56:14,670 --> 00:56:16,710
can be made uniformly
negative definite

957
00:56:16,710 --> 00:56:22,830
or semidefinite, OK,
or trivially positive

958
00:56:22,830 --> 00:56:25,260
semidefinite.

959
00:56:25,260 --> 00:56:26,250
Seems like a little--

960
00:56:28,960 --> 00:56:32,790
you can see how it
might be relevant, OK.

961
00:56:32,790 --> 00:56:34,770
So this is just a
mathematical idea

962
00:56:34,770 --> 00:56:39,240
to turn the problem of checking
the positive definiteness

963
00:56:39,240 --> 00:56:43,680
of a polynomial
into a linear matrix

964
00:56:43,680 --> 00:56:47,247
inequality and then a
convex optimization problem.
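
As a rough illustration of the matrix side of this idea, the sketch below checks a sum-of-squares certificate by hand. It assumes a polynomial p and a symmetric matrix Q, both chosen here for illustration, such that p equals z'Qz for the monomial vector z = [x^2, y^2, xy]. If Q is positive semidefinite, p is a sum of squares and so can never be negative. A real sums of squares program would search for Q with a semidefinite solver; this sketch only verifies a given Q.

```python
from itertools import combinations

# Hypothetical example: p(x, y) = 2x^4 + 2x^3 y - x^2 y^2 + 5y^4
# with a candidate Gram matrix over the monomials z = [x^2, y^2, x*y].
Q = [[2, -3, 1],
     [-3, 5, 0],
     [1, 0, 5]]


def p(x, y):
    return 2 * x**4 + 2 * x**3 * y - x**2 * y**2 + 5 * y**4


def gram_form(x, y):
    """Evaluate z' Q z for z = [x^2, y^2, x*y]."""
    z = [x * x, y * y, x * y]
    return sum(z[i] * Q[i][j] * z[j] for i in range(3) for j in range(3))


def det(m):
    """Determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def is_psd(m):
    """A symmetric matrix is PSD iff every principal minor is nonnegative."""
    n = len(m)
    return all(det([[m[i][j] for j in idx] for i in idx]) >= 0
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))


# The Gram form reproduces p on an integer grid, and Q is PSD.
matches = all(p(x, y) == gram_form(x, y)
              for x in range(-3, 4) for y in range(-3, 4))
print(matches, is_psd(Q))
```

The constraint "Q is positive semidefinite and z'Qz matches the coefficients of p" is linear in Q, which is what makes the search a linear matrix inequality.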

965
00:56:47,247 --> 00:56:49,080
So I'm not going to go
into all the details,

966
00:56:49,080 --> 00:56:52,470
but know that there's
these tools out there that

967
00:56:52,470 --> 00:56:59,213
use convex optimization to check
that property of a problem, OK.

968
00:56:59,213 --> 00:57:00,130
And you can read more.

969
00:57:00,130 --> 00:57:04,060
I've got links if you want
to read more about that.

970
00:57:04,060 --> 00:57:09,490
What that allows us to do now,
at least in the case of the LQR

971
00:57:09,490 --> 00:57:15,550
design we've worked it out,
it's possible to now check

972
00:57:15,550 --> 00:57:19,330
whether a function,
a polynomial function

973
00:57:19,330 --> 00:57:21,550
is a Lyapunov function
for the system.

974
00:57:21,550 --> 00:57:25,840
Lyapunov functions have to have
their derivatives going down

975
00:57:25,840 --> 00:57:27,880
over time, yeah.

976
00:57:27,880 --> 00:57:30,430
In order for a good function
to be a Lyapunov function,

977
00:57:30,430 --> 00:57:35,980
its value had better be
going down at all times.

978
00:57:35,980 --> 00:57:38,140
If your candidate
Lyapunov function is even

979
00:57:38,140 --> 00:57:40,870
a vector polynomial
function, then you

980
00:57:40,870 --> 00:57:44,350
can use this to check
whether it's a valid Lyapunov

981
00:57:44,350 --> 00:57:46,840
function for your system, OK.

982
00:57:46,840 --> 00:57:47,710
So we can now--

983
00:57:57,470 --> 00:58:01,374
AUDIENCE: [SNEEZE]

984
00:58:01,374 --> 00:58:02,350
PROFESSOR: Bless you.

985
00:58:27,760 --> 00:58:30,340
So I threw this one in
without saying it before.

986
00:58:30,340 --> 00:58:34,510
The only caveat is you have
to take your nonlinear system

987
00:58:34,510 --> 00:58:39,100
and make a polynomial
approximation of it,

988
00:58:39,100 --> 00:58:40,263
a Taylor expansion of it.

989
00:58:40,263 --> 00:58:41,680
It doesn't have
to be first order.

990
00:58:41,680 --> 00:58:43,570
That's the linear system.

991
00:58:43,570 --> 00:58:49,420
But it has to be polynomial, OK.

992
00:58:49,420 --> 00:59:00,010
So suddenly, it turns out
that for the LQR systems--

993
00:59:00,010 --> 00:59:03,310
remember, our value function.

994
00:59:03,310 --> 00:59:05,950
Let's just think of the LTI LQR.

995
00:59:05,950 --> 00:59:10,960
The value function turns out
to be this quadratic form.

996
00:59:10,960 --> 00:59:12,160
It's the optimal cost to go.

997
00:59:14,950 --> 00:59:16,660
That's a Lyapunov function.

998
00:59:16,660 --> 00:59:20,290
J of x for the linear system
is a Lyapunov function.

999
00:59:20,290 --> 00:59:23,320
As I take control
actions, my cost to go

1000
00:59:23,320 --> 00:59:25,400
is only going to go down.

1001
00:59:25,400 --> 00:59:29,200
It had better, otherwise it's
not the optimal cost to go.
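
The claim can be checked in a few lines for the LTI case. This is a sketch under assumed notation that the lecture does not spell out here: dynamics \dot{x} = Ax + Bu, cost weights Q and R (both positive definite), Riccati solution S, and feedback gain K.

```latex
\begin{aligned}
J^*(x) &= x^\top S x, \qquad u = -Kx, \qquad K = R^{-1} B^\top S,\\
0 &= A^\top S + S A - S B R^{-1} B^\top S + Q,\\
\dot{J}^*(x) &= x^\top \left[ (A - BK)^\top S + S (A - BK) \right] x\\
             &= x^\top \left[ A^\top S + S A - 2\, S B R^{-1} B^\top S \right] x\\
             &= -\, x^\top \left( Q + S B R^{-1} B^\top S \right) x \;<\; 0
             \quad \text{for } x \neq 0 .
\end{aligned}
```

The last step substitutes the Riccati equation for A^\top S + S A. So along closed-loop trajectories of the linear system the cost to go strictly decreases, which is the Lyapunov condition.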

1002
00:59:29,200 --> 00:59:57,490
So OK.

1003
00:59:57,490 --> 01:00:01,820
If I have a nonlinear
system, where

1004
01:00:01,820 --> 01:00:06,680
I've linearized it
and done LQR control,

1005
01:00:06,680 --> 01:00:09,140
then I expect that to be--

1006
01:00:09,140 --> 01:00:14,000
this function to be a Lyapunov
function over some domain

1007
01:00:14,000 --> 01:00:16,640
where the
linearization was good,

1008
01:00:16,640 --> 01:00:18,110
and eventually,
to no longer have

1009
01:00:18,110 --> 01:00:21,650
this nice negative
definiteness property.

1010
01:00:21,650 --> 01:00:23,930
Does that make sense?

1011
01:00:23,930 --> 01:00:28,280
The optimal cost to go from
LQR isn't a Lyapunov function

1012
01:00:28,280 --> 01:00:29,240
for the entire state.

1013
01:00:29,240 --> 01:00:32,600
It's always going to
descend for the entire--

1014
01:00:32,600 --> 01:00:36,240
for any initial conditions
for the linear system.

1015
01:00:36,240 --> 01:00:47,300
But when I've
linearized the system,

1016
01:00:47,300 --> 01:00:54,560
I expect this to
be, J star to be,

1017
01:00:54,560 --> 01:01:03,960
a valid Lyapunov function
near the linearization.

1018
01:01:16,137 --> 01:01:17,720
You guys should stop
and ask questions

1019
01:01:17,720 --> 01:01:20,520
now if you have any questions.

1020
01:01:20,520 --> 01:01:21,950
Does that make sense?

1021
01:01:21,950 --> 01:01:24,050
I never actually
said before that you

1022
01:01:24,050 --> 01:01:26,955
can think of these costs
to go as Lyapunov

1023
01:01:26,955 --> 01:01:28,580
functions, but that's
a nice connection

1024
01:01:28,580 --> 01:01:31,700
between the optimal control
and the stability theory, OK.

1025
01:01:34,480 --> 01:01:39,670
But the cost to go actually
is a Lyapunov function.

1026
01:01:39,670 --> 01:01:42,607
Remember, when we're taking
a linearization doing LQR,

1027
01:01:42,607 --> 01:01:44,440
we already know that
the basin of attraction

1028
01:01:44,440 --> 01:01:45,773
is going to be something finite.

1029
01:01:45,773 --> 01:01:47,860
We talked about
that at the Acrobot.

1030
01:01:47,860 --> 01:01:49,810
I do a linearization
around the top.

1031
01:01:49,810 --> 01:01:52,360
I know if I'm near
that point, it's

1032
01:01:52,360 --> 01:01:54,640
got some small finite
basin of attraction.

1033
01:01:54,640 --> 01:01:57,580
If I'm inside that region,
it'll go to the goal.

1034
01:01:57,580 --> 01:01:59,987
If I'm outside that,
then the linear design,

1035
01:01:59,987 --> 01:02:02,320
controller design, isn't valid
for the nonlinear system.

1036
01:02:02,320 --> 01:02:04,240
Eventually, you're going to get
far enough away that it's not

1037
01:02:04,240 --> 01:02:04,823
going to work.

1038
01:02:04,823 --> 01:02:05,553
It's going to--

1039
01:02:05,553 --> 01:02:06,970
I think I even
showed a simulation

1040
01:02:06,970 --> 01:02:08,797
of it doing something crazy.

1041
01:02:08,797 --> 01:02:11,380
So what we did was we designed
a controller that got up there,

1042
01:02:11,380 --> 01:02:13,030
and then we turned on
the linear controller.

1043
01:02:13,030 --> 01:02:13,590
All was good.

1044
01:02:16,510 --> 01:02:20,440
So a different way to say
that exact same thing is

1045
01:02:20,440 --> 01:02:27,100
that at some point, when I get
too far from the fixed point,

1046
01:02:27,100 --> 01:02:30,550
if I evaluate this function
and look at the time

1047
01:02:30,550 --> 01:02:33,510
derivative of this
function, it's

1048
01:02:33,510 --> 01:02:35,045
no longer going
to-- my cost is not

1049
01:02:35,045 --> 01:02:36,420
going to go down
as time goes up.

1050
01:02:39,097 --> 01:02:40,930
And in the case for the
Acrobot, if I'm here

1051
01:02:40,930 --> 01:02:42,100
and I start going
this way, then I'm

1052
01:02:42,100 --> 01:02:43,308
getting further from my goal.

1053
01:02:43,308 --> 01:02:44,380
My cost is going up.

1054
01:02:44,380 --> 01:02:47,860
At some point, for
the nonlinear system,

1055
01:02:47,860 --> 01:02:50,860
this function is not
going to be a Lyapunov

1056
01:02:50,860 --> 01:02:53,620
function for that system, OK.

1057
01:02:53,620 --> 01:02:56,710
So what we've got, thanks
to Pablo and Sasha

1058
01:02:56,710 --> 01:03:03,080
Megretski, is a way
to figure out exactly a--

1059
01:03:03,080 --> 01:03:05,230
well, not exactly--
to estimate the place

1060
01:03:05,230 --> 01:03:09,070
where that transition
happens using

1061
01:03:09,070 --> 01:03:10,705
these sums of squares programs.

1062
01:03:13,022 --> 01:03:14,980
My goal here is to tell
you about the existence

1063
01:03:14,980 --> 01:03:18,820
of these things, and
I'm happy to push you

1064
01:03:18,820 --> 01:03:21,490
more in that direction if you're
interested, for your project

1065
01:03:21,490 --> 01:03:23,710
or for whatever.

1066
01:03:23,710 --> 01:03:27,908
But we're going to use this
to do the feedback motion

1067
01:03:27,908 --> 01:03:28,450
planning, OK.

1068
01:03:28,450 --> 01:03:33,685
So it turns out J is a scalar.

1069
01:03:37,340 --> 01:03:40,460
My cost to go is a scalar.

1070
01:03:40,460 --> 01:03:45,500
It turns out I can very
succinctly describe the place

1071
01:03:45,500 --> 01:03:48,373
where this-- a boundary
of this function--

1072
01:03:48,373 --> 01:03:50,581
let me just write it, and
then I'll say it carefully.

1073
01:04:01,890 --> 01:04:05,700
I can describe a
region of my system

1074
01:04:05,700 --> 01:04:08,010
just by looking at the
height of my cost to go.

1075
01:04:10,770 --> 01:04:13,470
This is a quadratic function.

1076
01:04:13,470 --> 01:04:16,950
It's going to look like
ellipsoids going out.

1077
01:04:16,950 --> 01:04:20,010
If you were to draw
this landscape,

1078
01:04:20,010 --> 01:04:24,570
it's going to look like
an ellipse, a parabola

1079
01:04:24,570 --> 01:04:27,180
in high dimensions, yeah.

1080
01:04:27,180 --> 01:04:29,820
At some point, as I move
farther from my fixed point,

1081
01:04:29,820 --> 01:04:32,580
the cost is going to get
higher and higher, OK.

1082
01:04:32,580 --> 01:04:36,120
And at some point, it crosses
some scalar value rho.

1083
01:04:36,120 --> 01:04:38,400
So the way I want
to design, I want

1084
01:04:38,400 --> 01:04:40,950
to call my basin of
attraction for this system

1085
01:04:40,950 --> 01:04:45,060
the place where my
cost to go reaches rho.

1086
01:04:45,060 --> 01:04:49,140
And we've got a program, thanks
to Sasha and Pablo,

1087
01:04:49,140 --> 01:04:53,520
which will try to estimate this
scalar value, rho, as a scalar

1088
01:04:53,520 --> 01:04:57,990
representative of the basin
of attraction of my system.

1089
01:04:57,990 --> 01:05:00,120
AUDIENCE: Why would you
use a particular cost

1090
01:05:00,120 --> 01:05:05,300
to go rather than looking
at what the variation is

1091
01:05:05,300 --> 01:05:07,532
in the linearization?

1092
01:05:07,532 --> 01:05:09,740
PROFESSOR: That is-- so
we're going to determine this

1093
01:05:09,740 --> 01:05:15,290
by looking at the variation
based on the linearization.

1094
01:05:15,290 --> 01:05:20,960
So I could do this in a
lot of different ways.

1095
01:05:20,960 --> 01:05:23,330
I could look at boxes
around my fixed point

1096
01:05:23,330 --> 01:05:25,718
and try to design some geometry.

1097
01:05:25,718 --> 01:05:27,260
The real basin of
attraction is going

1098
01:05:27,260 --> 01:05:28,718
to be some complicated
thing, which

1099
01:05:28,718 --> 01:05:31,460
depends on my LQR controller
design and the way

1100
01:05:31,460 --> 01:05:33,770
the non-linearity affects.

1101
01:05:33,770 --> 01:05:37,550
AUDIENCE: Right, but wouldn't--
so I guess what would seem more

1102
01:05:37,550 --> 01:05:40,760
intuitive to me would be to say,
look at the next highest order

1103
01:05:40,760 --> 01:05:45,730
term in the expansion and then
see how that's varying and use

1104
01:05:45,730 --> 01:05:47,060
that to--

1105
01:05:47,060 --> 01:05:50,270
PROFESSOR: That's exactly how
we're going to verify it, OK.

1106
01:05:50,270 --> 01:05:51,440
So there's two questions.

1107
01:05:51,440 --> 01:05:54,140
There's a question
of what shapes are

1108
01:05:54,140 --> 01:05:56,570
we going to try to verify, OK.

1109
01:05:56,570 --> 01:06:00,230
The choice here is
to verify contours

1110
01:06:00,230 --> 01:06:01,355
of the cost to go function.

1111
01:06:01,355 --> 01:06:03,688
I'm going to try to find the
biggest contour of the cost

1112
01:06:03,688 --> 01:06:05,420
to go function for
the linear system.

1113
01:06:05,420 --> 01:06:07,712
You're asking if I could
choose a different shape based

1114
01:06:07,712 --> 01:06:08,637
on the contours.

1115
01:06:08,637 --> 01:06:10,220
What we've elected
to do-- and I think

1116
01:06:10,220 --> 01:06:13,100
it's a tighter version, maybe,
than what you're saying,

1117
01:06:13,100 --> 01:06:14,600
but I could be wrong.

1118
01:06:14,600 --> 01:06:15,860
There could be better ways--

1119
01:06:15,860 --> 01:06:19,430
is to find the biggest contour
such that the next higher order

1120
01:06:19,430 --> 01:06:22,250
terms of the
linearization don't break

1121
01:06:22,250 --> 01:06:23,600
the negative definiteness.

1122
01:06:23,600 --> 01:06:25,610
AUDIENCE: OK, so this
is just for purposes

1123
01:06:25,610 --> 01:06:26,450
of choosing a shape.

1124
01:06:26,450 --> 01:06:30,800
PROFESSOR: This is
choosing my shape, OK.

1125
01:06:30,800 --> 01:06:34,700
So I'm going to make this
all concrete right now

1126
01:06:34,700 --> 01:06:37,880
by trying to show
you an example here.

1127
01:06:40,460 --> 01:06:44,827
OK, here's a simple pendulum,
which we know and love, OK.

1128
01:06:44,827 --> 01:06:46,910
This is the phase portrait
of the simple pendulum.

1129
01:06:53,350 --> 01:06:56,400
OK, and the green is at
0, 0, which, in this case,

1130
01:06:56,400 --> 01:06:58,200
is my downward fixed point.

1131
01:06:58,200 --> 01:07:01,260
The top is my
unstable fixed point.

1132
01:07:01,260 --> 01:07:06,240
My goal is to use these local
trajectory ideas in order

1133
01:07:06,240 --> 01:07:07,800
to cover--

1134
01:07:07,800 --> 01:07:11,250
to make all states
go to them, OK.

1135
01:07:11,250 --> 01:07:15,000
Now all I told you so far is I
know how to take an LQR problem

1136
01:07:15,000 --> 01:07:17,490
and try to estimate the
basin of attraction, OK.

1137
01:07:17,490 --> 01:07:22,470
So that's step one, is take a
linearization around my goal

1138
01:07:22,470 --> 01:07:27,030
state, estimate-- design
an LQR controller,

1139
01:07:27,030 --> 01:07:29,880
and estimate its basin
of attraction, OK.

1140
01:07:29,880 --> 01:07:31,920
And that looks like this, OK.

1141
01:07:31,920 --> 01:07:35,770
We've seen cost to go
functions for the ellipsoids,

1142
01:07:35,770 --> 01:07:40,950
or ellipsoids around
the fixed point.

1143
01:07:40,950 --> 01:07:42,870
I do a sums of
squares optimization

1144
01:07:42,870 --> 01:07:46,110
to verify that this function
is negative definite, which

1145
01:07:46,110 --> 01:07:48,000
involves a higher order
polynomial expansion

1146
01:07:48,000 --> 01:07:50,580
of the dynamics in this form.

1147
01:07:50,580 --> 01:07:54,540
And I try to find the biggest
contour for which that system

1148
01:07:54,540 --> 01:07:56,730
is still negative definite.

1149
01:07:56,730 --> 01:07:58,158
That's all the detail.

1150
01:07:58,158 --> 01:08:00,450
All you really need to know
is that I can estimate now,

1151
01:08:00,450 --> 01:08:03,630
with convex optimization,
the basin of attraction

1152
01:08:03,630 --> 01:08:05,610
of that system.
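
For intuition only, here is a sampling-based stand-in for that step. It is not the sums of squares program and it certifies nothing. It assumes a pendulum with the angle measured from the upright, hypothetical parameter and weight values, a torque limit, and a closed-form Riccati solution that holds for this particular two-state, one-input structure. It scans a grid for states where the cost to go fails to decrease and reports the smallest cost at which that happens as an estimate of rho.

```python
import math

G_OVER_L, DAMPING, U_MAX = 9.8, 0.1, 3.0   # hypothetical pendulum parameters
Q1, Q2, R = 10.0, 1.0, 1.0                 # hypothetical LQR weights

# Closed-form Riccati solution for A = [[0, 1], [a, -b]], B = [0, 1]^T,
# Q = diag(Q1, Q2): the linearization about the upright fixed point.
a, b = G_OVER_L, DAMPING
S12 = R * (a + math.sqrt(a * a + Q1 / R))
S22 = R * (-b + math.sqrt(b * b + (2.0 * S12 + Q2) / R))
S11 = S12 * S22 / R + b * S12 - a * S22


def cost_to_go(theta, omega):
    """Quadratic LQR cost to go, J = x' S x."""
    return S11 * theta**2 + 2.0 * S12 * theta * omega + S22 * omega**2


def cost_to_go_rate(theta, omega):
    """Time derivative of J along the nonlinear, torque-limited closed loop."""
    u = -(S12 * theta + S22 * omega) / R          # LQR feedback
    u = max(-U_MAX, min(U_MAX, u))                # torque limit
    accel = a * math.sin(theta) - b * omega + u   # nonlinear dynamics
    return (2.0 * (S11 * theta + S12 * omega) * omega
            + 2.0 * (S12 * theta + S22 * omega) * accel)


def estimate_rho(n=50, omega_max=8.0):
    """Smallest J over grid points where J is not decreasing."""
    rho = math.inf
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if i == 0 and j == 0:
                continue  # skip the fixed point itself
            theta = math.pi * i / n
            omega = omega_max * j / n
            if cost_to_go_rate(theta, omega) >= 0.0:
                rho = min(rho, cost_to_go(theta, omega))
    return rho


rho = estimate_rho()
print(rho)
```

Every sampled state with cost below the returned value has a decreasing cost to go, but points between grid samples are unchecked. That gap is what the convex optimization closes.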

1153
01:08:05,610 --> 01:08:08,580
This is Koditschek's
funnel at the top.

1154
01:08:08,580 --> 01:08:11,580
That blue region is the
beginning of the funnel.

1155
01:08:11,580 --> 01:08:13,800
In this case, I'm going
to run it infinitely long,

1156
01:08:13,800 --> 01:08:15,842
so it's going to eventually
get to the red point.

1157
01:08:15,842 --> 01:08:17,490
That's the output.

1158
01:08:17,490 --> 01:08:22,189
OK, now how do we design funnels
that try to fill this space?

1159
01:08:22,189 --> 01:08:26,220
OK, my proposition is
that we should do roughly

1160
01:08:26,220 --> 01:08:30,870
what the RRTs are doing
and start growing out

1161
01:08:30,870 --> 01:08:33,620
to try to cover the space in
lots of different directions,

1162
01:08:33,620 --> 01:08:34,620
OK.

1163
01:08:34,620 --> 01:08:36,090
The only difference
is, every time

1164
01:08:36,090 --> 01:08:39,390
we grow out in
random directions,

1165
01:08:39,390 --> 01:08:44,250
I'm going to stabilize that
trajectory with an LTV feedback

1166
01:08:44,250 --> 01:08:47,460
and compute the basin
of attraction on it, OK.

1167
01:08:47,460 --> 01:08:48,270
So here we go.

1168
01:08:48,270 --> 01:08:50,672
Pick a point at random, OK.

1169
01:08:50,672 --> 01:08:51,630
And actually, I'm not--

1170
01:08:51,630 --> 01:08:53,047
I don't always
play the RRT trick.

1171
01:08:53,047 --> 01:08:56,710
So I could do lots of RRTs to
try to get back to that point,

1172
01:08:56,710 --> 01:08:59,100
but I'm actually going to
just use this as my goal

1173
01:08:59,100 --> 01:09:02,040
and do DIRCOL to get me
there, to design a trajectory

1174
01:09:02,040 --> 01:09:02,729
to get me there.

1175
01:09:02,729 --> 01:09:04,892
If that works, that's
perfectly fine.

1176
01:09:04,892 --> 01:09:05,850
So that's a trajectory.

1177
01:09:05,850 --> 01:09:06,630
I didn't draw it nicely.

1178
01:09:06,630 --> 01:09:08,340
It actually starts
here, goes this way,

1179
01:09:08,340 --> 01:09:12,569
wraps around, and comes
to that red point, OK.

1180
01:09:12,569 --> 01:09:15,060
From DIRCOL, it quickly
designs that trajectory.

1181
01:09:15,060 --> 01:09:18,359
Now let's back up and
start computing the cost

1182
01:09:18,359 --> 01:09:20,460
to go function, the
Riccati equation,

1183
01:09:20,460 --> 01:09:22,770
backwards to stabilize
that trajectory.

1184
01:09:22,770 --> 01:09:25,050
And as we go, we'll
compute the basin

1185
01:09:25,050 --> 01:09:32,430
of attraction of that
controller, which has exactly--

1186
01:09:32,430 --> 01:09:35,340
I drew it in finite
segments, but I

1187
01:09:35,340 --> 01:09:40,439
hope you can see that's
exactly the funnels, yeah?

1188
01:09:40,439 --> 01:09:44,080
If I start the system inside
any of that blue region,

1189
01:09:44,080 --> 01:09:46,500
and I execute the
trajectory, the LQR,

1190
01:09:46,500 --> 01:09:49,140
the trajectory stabilizer
along that trajectory,

1191
01:09:49,140 --> 01:09:52,229
it's going to take me
around and get me to my goal

1192
01:09:52,229 --> 01:09:54,840
and stay there, OK.

1193
01:09:54,840 --> 01:09:59,220
So this is feedback
motion planning happening.

1194
01:09:59,220 --> 01:10:02,025
Now the cool thing is, I told
you about the multi-query idea.

1195
01:10:02,025 --> 01:10:03,400
I told you about
all these ideas,

1196
01:10:03,400 --> 01:10:05,108
talked about making
very dense trees that

1197
01:10:05,108 --> 01:10:07,992
handle all these situations.

1198
01:10:07,992 --> 01:10:10,200
If you know the basins of
attraction of your existing

1199
01:10:10,200 --> 01:10:13,770
controller, you don't have
to build a very dense tree.

1200
01:10:13,770 --> 01:10:16,020
I know, if I were to pick
another random point that

1201
01:10:16,020 --> 01:10:17,622
was already inside
my blue region,

1202
01:10:17,622 --> 01:10:19,080
I'm not going to
get a lot of value

1203
01:10:19,080 --> 01:10:22,002
out of adding nodes
inside that blue region.

1204
01:10:22,002 --> 01:10:23,460
So let's pick
another random point,

1205
01:10:23,460 --> 01:10:25,960
and if it's inside the blue
region, we'll throw it away.

1206
01:10:25,960 --> 01:10:29,370
If it's outside, I'll keep it,
and I'll try to grow to it, OK.

1207
01:10:29,370 --> 01:10:32,450
So I get another random
point, which is here.

1208
01:10:32,450 --> 01:10:34,830
Going to pick the closest
point in my current tree, which

1209
01:10:34,830 --> 01:10:39,540
was just, in this case, a
trajectory, connect that back.

1210
01:10:39,540 --> 01:10:43,120
Now in this one, my
dynamic distance metric,

1211
01:10:43,120 --> 01:10:45,690
which was that LQR
distance metric,

1212
01:10:45,690 --> 01:10:47,872
connected and said this
was the closest point.

1213
01:10:47,872 --> 01:10:49,830
Looks a little surprising,
but maybe the torque

1214
01:10:49,830 --> 01:10:51,150
limits said that one
couldn't get there,

1215
01:10:51,150 --> 01:10:53,070
or maybe my distance
metric just wasn't perfect.

1216
01:10:53,070 --> 01:10:53,987
But that's reasonable.

1217
01:10:53,987 --> 01:10:57,210
It tries to go from here,
add a little bit more torque,

1218
01:10:57,210 --> 01:10:58,260
and drive out.

1219
01:10:58,260 --> 01:11:01,880
And I stabilize that
with the funnel, yeah?

1220
01:11:01,880 --> 01:11:03,960
OK, now I have two
trajectories, and I've

1221
01:11:03,960 --> 01:11:06,430
got a pretty good coverage
of the space already.

1222
01:11:06,430 --> 01:11:09,420
You can imagine I design a
handful more trajectories,

1223
01:11:09,420 --> 01:11:14,640
picking the state as
I go, and I can really

1224
01:11:14,640 --> 01:11:22,270
quickly and efficiently fill
that state space with funnels

1225
01:11:22,270 --> 01:11:26,340
which take me to the goal.
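
The growth loop just described can be sketched in a toy form. Everything here is an assumption made for illustration: the state space is the unit square, a "trajectory" is a straight segment to the nearest existing node under plain Euclidean distance, and a "funnel" is a fixed-radius tube around that segment. The real method uses trajectory optimization, time-varying LQR, and verified basins instead. The stopping rule, a run of consecutive samples that all land inside existing funnels, is also an assumption of this sketch.

```python
import math
import random


def dist_point_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    t = 0.0 if seg_len2 == 0.0 else ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def covered(funnels, p, radius):
    """True when p lies inside the tube around any existing segment."""
    return any(dist_point_segment(p, a, b) <= radius for a, b in funnels)


def grow_funnel_tree(goal, radius, max_quiet=200, seed=0):
    """Add funnels until max_quiet samples in a row are already covered."""
    rng = random.Random(seed)
    nodes = [goal]
    funnels = [(goal, goal)]  # the goal's own funnel is a degenerate segment
    quiet = 0
    draws = 0
    while quiet < max_quiet:
        draws += 1
        sample = (rng.random(), rng.random())
        if covered(funnels, sample, radius):
            quiet += 1        # already handled: throw the sample away
            continue
        quiet = 0
        nearest = min(nodes, key=lambda n: math.dist(n, sample))
        funnels.append((sample, nearest))  # new funnel leading toward the tree
        nodes.append(sample)
    return funnels, draws


funnels, draws = grow_funnel_tree(goal=(0.5, 0.5), radius=0.15)
print(len(funnels), draws)
```

Because covered samples are discarded, the number of funnels stays far below the number of samples drawn, which is the point made above about not needing a dense tree.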

1226
01:11:26,340 --> 01:11:28,710
Does that make sense?

1227
01:11:28,710 --> 01:11:30,460
AUDIENCE: So when you
do your multi-query,

1228
01:11:30,460 --> 01:11:32,848
how do you choose
which funnel you're in?

1229
01:11:32,848 --> 01:11:33,640
PROFESSOR: Awesome.

1230
01:11:33,640 --> 01:11:42,280
Well, first, even-- so if I
just want to execute this, yeah.

1231
01:11:42,280 --> 01:11:43,725
And I want to get
to that goal, it

1232
01:11:43,725 --> 01:11:45,100
might be that I
don't really have

1233
01:11:45,100 --> 01:11:47,080
to do the-- so you could think
of this as being every time

1234
01:11:47,080 --> 01:11:48,163
being a multi-query thing.

1235
01:11:48,163 --> 01:11:50,320
So every time I--
if I start, if I

1236
01:11:50,320 --> 01:11:52,750
pick a point that isn't in
any basin of attraction,

1237
01:11:52,750 --> 01:11:55,810
then I'll try to connect
and grow a tree there.

1238
01:11:55,810 --> 01:11:57,430
If I, however, pick a point--

1239
01:11:57,430 --> 01:11:59,440
if it's execution
time, I say the robot's

1240
01:11:59,440 --> 01:12:01,220
got to run from
here, I pick a point.

1241
01:12:01,220 --> 01:12:02,845
It's already in the
basin of attraction

1242
01:12:02,845 --> 01:12:04,552
so I just execute
that trajectory.

1243
01:12:04,552 --> 01:12:06,010
If, I think what
you're alluding to

1244
01:12:06,010 --> 01:12:08,860
is that it's in the basin of
attraction of multiple points,

1245
01:12:08,860 --> 01:12:10,848
then I pick the one with
the lowest cost to go,

1246
01:12:10,848 --> 01:12:12,640
because those are all
estimates of the cost

1247
01:12:12,640 --> 01:12:17,530
to go that are centered
around that trajectory, OK.
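
A minimal sketch of that selection rule, assuming each funnel is stored as a name, a 2x2 cost matrix S, a center point on its trajectory, and a scalar rho. The tuple layout and the numbers are hypothetical, and the sketch ignores time variation along the trajectory. A state belongs to a funnel when its quadratic cost is at most rho, and among the funnels that contain it the lowest cost wins.

```python
def quad_cost(S, center, x):
    """Quadratic cost (x - center)' S (x - center) for a symmetric 2x2 S."""
    d0, d1 = x[0] - center[0], x[1] - center[1]
    return S[0][0] * d0 * d0 + 2.0 * S[0][1] * d0 * d1 + S[1][1] * d1 * d1


def pick_funnel(funnels, x):
    """Return (name, cost) of the cheapest funnel containing x, or None."""
    best = None
    for name, S, center, rho in funnels:
        cost = quad_cost(S, center, x)
        if cost <= rho and (best is None or cost < best[1]):
            best = (name, cost)
    return best


# Two hypothetical overlapping funnels.
funnels = [
    ("A", [[1.0, 0.0], [0.0, 1.0]], (0.0, 0.0), 1.0),
    ("B", [[4.0, 0.0], [0.0, 4.0]], (0.5, 0.0), 2.0),
]
choice = pick_funnel(funnels, (0.4, 0.0))
print(choice)
```

A state outside every funnel returns `None`, which at planning time is the cue to grow the tree toward it.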

1248
01:12:17,530 --> 01:12:20,680
So for the simple pendulum,
with damping and torque limits

1249
01:12:20,680 --> 01:12:22,735
and everything set
the way it was,

1250
01:12:22,735 --> 01:12:27,400
this little randomized
algorithm can fill the space

1251
01:12:27,400 --> 01:12:31,158
with basins of attraction with
just a handful of trajectories.

1252
01:12:35,062 --> 01:12:35,987
AUDIENCE: [INAUDIBLE]

1253
01:12:35,987 --> 01:12:37,820
PROFESSOR: I've never
said it with so much--

1254
01:12:37,820 --> 01:12:38,960
[LAUGHTER]

1255
01:12:38,960 --> 01:12:42,350
--such dramatic force.

1256
01:12:42,350 --> 01:12:44,206
OK?

1257
01:12:44,206 --> 01:12:47,390
[CHUCKLES] It's the highlight
of the class right here.

1258
01:12:47,390 --> 01:12:52,510
OK, so this is exactly the
feedback motion planning idea

1259
01:12:52,510 --> 01:12:56,160
that I'm most excited
about right now.

1260
01:12:56,160 --> 01:12:58,280
Because we can suddenly--

1261
01:12:58,280 --> 01:13:01,250
for LQR controller,
it depended on--

1262
01:13:01,250 --> 01:13:03,770
the thing we've worked out
is, if the cost to go function

1263
01:13:03,770 --> 01:13:06,140
is this, or in the time
varying case, this,

1264
01:13:06,140 --> 01:13:09,790
then I can come up with a
very nice representation

1265
01:13:09,790 --> 01:13:13,910
of the basin of attraction
based on just a scalar value.

1266
01:13:13,910 --> 01:13:18,320
And I could just start designing
funnels through my state space.

1267
01:13:18,320 --> 01:13:20,990
And the vision is, if you
can think about the funnels

1268
01:13:20,990 --> 01:13:22,672
as you build them,
then you actually

1269
01:13:22,672 --> 01:13:24,380
don't have to build
too many trajectories

1270
01:13:24,380 --> 01:13:26,951
to start filling
the state space.

1271
01:13:26,951 --> 01:13:29,240
AUDIENCE: Why do you call
it feedback motion planning?

1272
01:13:29,240 --> 01:13:30,350
PROFESSOR: Yeah.

1273
01:13:30,350 --> 01:13:33,080
It's because I'm thinking about
the feedback control, which

1274
01:13:33,080 --> 01:13:34,900
is the funnel, as I'm
doing the planning.

1275
01:13:34,900 --> 01:13:35,400
Yeah.

1276
01:13:38,917 --> 01:13:41,250
Do you agree why Koditschek's
version is feedback motion

1277
01:13:41,250 --> 01:13:43,883
planning, or do you not like
that being feedback motion

1278
01:13:43,883 --> 01:13:44,913
planning?

1279
01:13:44,913 --> 01:13:45,955
AUDIENCE: It makes sense.

1280
01:13:45,955 --> 01:13:48,292
I guess I'm used to
different funnels.

1281
01:13:48,292 --> 01:13:49,250
PROFESSOR: That's true.

1282
01:13:49,250 --> 01:13:50,060
You are, yeah.

1283
01:13:50,060 --> 01:13:50,490
AUDIENCE: [CHUCKLES]

1284
01:13:50,490 --> 01:13:50,900
PROFESSOR: OK?

1285
01:13:50,900 --> 01:13:52,192
AUDIENCE: [? Think ?] [? so. ?]

1286
01:13:52,192 --> 01:13:53,810
PROFESSOR: So these
are very much--

1287
01:13:53,810 --> 01:13:57,230
in Koditschek's case,
there's no debate

1288
01:13:57,230 --> 01:14:00,710
that each funnel is a
feedback controller.

1289
01:14:00,710 --> 01:14:02,060
I think of this as the same way.

1290
01:14:02,060 --> 01:14:03,602
You could argue with
it, because it's

1291
01:14:03,602 --> 01:14:06,055
centered around trajectory
design, which his is not.

1292
01:14:06,055 --> 01:14:07,430
So this one has
a little bit more

1293
01:14:07,430 --> 01:14:12,590
of a feel of conventional
motion planning.

1294
01:14:12,590 --> 01:14:15,207
But by virtue of thinking
about the feedback

1295
01:14:15,207 --> 01:14:16,790
as I design the
trajectories, it means

1296
01:14:16,790 --> 01:14:19,252
I have to build fewer
trajectories, yeah.

1297
01:14:19,252 --> 01:14:21,710
So it'd be nice to actually
have the conversation about how

1298
01:14:21,710 --> 01:14:23,666
these are related
to flow tubes.

1299
01:14:23,666 --> 01:14:24,950
Mm-hmm.

1300
01:14:24,950 --> 01:14:27,180
It's pretty similar
in some ways.

1301
01:14:27,180 --> 01:14:30,970
But these are very
effective to compute

1302
01:14:30,970 --> 01:14:34,170
the stable-- the
basins of attraction,

1303
01:14:34,170 --> 01:14:35,280
so I think it's relevant.

1304
01:14:35,280 --> 01:14:37,010
Yeah.

1305
01:14:37,010 --> 01:14:40,310
AUDIENCE: Could you
factor in actuator limits

1306
01:14:40,310 --> 01:14:42,025
into the Lyapunov function?

1307
01:14:42,025 --> 01:14:42,650
PROFESSOR: Yes.

1308
01:14:42,650 --> 01:14:48,600
So OK, actuator limits in the
Lyapunov function are harder.

1309
01:14:48,600 --> 01:14:50,150
So what you do is
you-- everything

1310
01:14:50,150 --> 01:14:54,230
is based on a Taylor
expansion of the dynamics

1311
01:14:54,230 --> 01:14:56,240
around the nominal.

1312
01:14:56,240 --> 01:15:00,050
So a hard limit, if I linearize
and I don't see that limit,

1313
01:15:00,050 --> 01:15:03,800
then I'm not going
to know about it.

1314
01:15:03,800 --> 01:15:05,508
So there's a couple
things you could try.

1315
01:15:05,508 --> 01:15:07,550
And actually, I recommended
to Mike earlier today

1316
01:15:07,550 --> 01:15:09,500
that he should do this
for his final project,

1317
01:15:09,500 --> 01:15:11,690
is to do that, the case
where there are actuator limits.

1318
01:15:11,690 --> 01:15:13,870
So you could imagine
making a soft limit,

1319
01:15:13,870 --> 01:15:16,970
some sigmoidal limit,
and having the gradients

1320
01:15:16,970 --> 01:15:21,410
of that visible from
your linearization point.

1321
01:15:21,410 --> 01:15:23,780
Or you could imagine
the LQR design

1322
01:15:23,780 --> 01:15:26,780
that actually does both the
quadratic cost and the bang

1323
01:15:26,780 --> 01:15:28,250
bang simultaneously.

1324
01:15:28,250 --> 01:15:30,569
Haven't done it yet, but
I think that's consistent.

1325
01:15:37,420 --> 01:15:46,360
OK, so I told you a lot about
local trajectory optimizers.

1326
01:15:46,360 --> 01:15:48,700
And today we said there were
at least three good ways,

1327
01:15:48,700 --> 01:15:51,850
I think, to make those
trajectory optimizers

1328
01:15:51,850 --> 01:15:55,330
into a more feedback plan.

1329
01:15:55,330 --> 01:15:58,870
So the first idea was
real time planning.

1330
01:16:02,412 --> 01:16:04,120
And if it's fast
enough, well, then we're

1331
01:16:04,120 --> 01:16:08,380
all out of jobs, because
we could just do that.

1332
01:16:08,380 --> 01:16:12,100
The second idea was
building these trees

1333
01:16:12,100 --> 01:16:17,860
and doing multi-query,
keeping your tree around

1334
01:16:17,860 --> 01:16:21,310
and just finding your way to
the closest point of the tree

1335
01:16:21,310 --> 01:16:23,530
every time you execute.

1336
01:16:23,530 --> 01:16:25,975
And that has the nice feature
that every time I execute,

1337
01:16:25,975 --> 01:16:27,350
my tree gets a
little bit bigger,

1338
01:16:27,350 --> 01:16:31,120
and I know a little bit more
about my robot and myself,

1339
01:16:31,120 --> 01:16:32,350
yeah.

1340
01:16:32,350 --> 01:16:36,160
And the last one was this
feedback motion planning,

1341
01:16:36,160 --> 01:16:41,860
which there are only
a handful of ideas

1342
01:16:41,860 --> 01:16:44,500
out there, I think, about
feedback motion planning

1343
01:16:44,500 --> 01:16:45,850
that people use.

1344
01:16:45,850 --> 01:16:48,425
Koditschek's funnels are
definitely the most prominent.

1345
01:16:56,030 --> 01:16:59,480
And actually, I think that
the funnels should probably be

1346
01:16:59,480 --> 01:17:00,770
on my list, but I haven't--

1347
01:17:00,770 --> 01:17:02,750
sorry, the flow tubes should
probably be more on my list,

1348
01:17:02,750 --> 01:17:03,470
but I don't--

1349
01:17:03,470 --> 01:17:07,670
I've never made a strong
enough connection.

1350
01:17:07,670 --> 01:17:10,666
We should make that a goal for
the rest of the class, yeah.

1351
01:17:10,666 --> 01:17:15,600
AUDIENCE: [INAUDIBLE]
flow tube or--

1352
01:17:15,600 --> 01:17:18,432
PROFESSOR: So there's
definitely differences,

1353
01:17:18,432 --> 01:17:19,890
but we should really
figure it out.

1354
01:17:19,890 --> 01:17:23,840
So [? Brian ?] [? Williams' ?]
group does planning with flow

1355
01:17:23,840 --> 01:17:28,840
tubes that are, in spirit,
similar to these funnels.

1356
01:17:28,840 --> 01:17:30,110
Yeah.

1357
01:17:30,110 --> 01:17:31,940
And so we should talk
about whether you

1358
01:17:31,940 --> 01:17:34,400
can design the flow tubes
for the class of systems

1359
01:17:34,400 --> 01:17:37,091
that I care about in
the class and stuff.

1360
01:17:37,091 --> 01:17:39,380
Mm-hmm.

1361
01:17:39,380 --> 01:17:40,340
Excellent.

1362
01:17:40,340 --> 01:17:42,990
OK, so you saw the email
about the projects.

1363
01:17:42,990 --> 01:17:45,578
If you have any questions
about your projects,

1364
01:17:45,578 --> 01:17:47,120
we could talk for
a minute right now,

1365
01:17:47,120 --> 01:17:49,370
or we could schedule a
meeting before Thursday.

1366
01:17:52,130 --> 01:17:55,010
There's a few ideas on the
email we sent in the PDF

1367
01:17:55,010 --> 01:17:55,915
that we sent out.

1368
01:17:55,915 --> 01:17:57,290
If you're looking
for more ideas,

1369
01:17:57,290 --> 01:18:01,455
I've got a list of other
ideas that I'm happy to share.

1370
01:18:01,455 --> 01:18:03,830
It's going to work best if
you find a problem that you're

1371
01:18:03,830 --> 01:18:05,510
passionate about,
something that you

1372
01:18:05,510 --> 01:18:08,510
got excited about in
class or from your work,

1373
01:18:08,510 --> 01:18:11,330
and you apply some
idea from class.

1374
01:18:11,330 --> 01:18:13,850
But the goal for Thursday
is to say enough about it

1375
01:18:13,850 --> 01:18:16,070
that I can give you
some real feedback

1376
01:18:16,070 --> 01:18:18,080
on your half-page
write-up and try

1377
01:18:18,080 --> 01:18:21,500
to help you with the
scope and topic to make it

1378
01:18:21,500 --> 01:18:22,780
a good project.

1379
01:18:22,780 --> 01:18:23,660
OK?

1380
01:18:23,660 --> 01:18:25,580
Let me know if
there's any questions.

1381
01:18:25,580 --> 01:18:27,490
See you Thursday.