1
00:00:00,070 --> 00:00:02,430
The following content is
provided under a Creative

2
00:00:02,430 --> 00:00:03,820
Commons license.

3
00:00:03,820 --> 00:00:06,060
Your support will help
MIT OpenCourseWare

4
00:00:06,060 --> 00:00:10,150
continue to offer high quality,
educational resources for free.

5
00:00:10,150 --> 00:00:12,700
To make a donation, or to
view additional materials

6
00:00:12,700 --> 00:00:16,600
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:16,600 --> 00:00:17,260
at ocw.mit.edu.

8
00:00:26,985 --> 00:00:27,860
PROFESSOR: All right.

9
00:00:27,860 --> 00:00:29,760
Let's get started.

10
00:00:29,760 --> 00:00:32,409
So today we're going to
talk about capabilities,

11
00:00:32,409 --> 00:00:36,310
continue our discussion of how
to do privilege separation.

12
00:00:36,310 --> 00:00:39,960
And remember last week we
talked about how Unix provides

13
00:00:39,960 --> 00:00:41,910
some mechanisms for
applications to use

14
00:00:41,910 --> 00:00:45,600
if they want to privilege
separate the application's

15
00:00:45,600 --> 00:00:46,649
internal structure.

16
00:00:46,649 --> 00:00:48,940
And today we're going to talk
about capabilities, which

17
00:00:48,940 --> 00:00:53,720
is a very different way of
thinking about privileges

18
00:00:53,720 --> 00:00:56,220
that an application might have.

19
00:00:56,220 --> 00:00:59,030
And this is why we have actually
these two somewhat distinct

20
00:00:59,030 --> 00:01:06,840
readings for today, one of which
is this confused deputy problem

21
00:01:06,840 --> 00:01:10,982
and how to make your privileges
much more explicit when you're

22
00:01:10,982 --> 00:01:12,940
writing software so that
you don't accidentally

23
00:01:12,940 --> 00:01:14,595
use the wrong privileges.

24
00:01:14,595 --> 00:01:16,470
And then the second
paper is about the system

25
00:01:16,470 --> 00:01:20,700
called Capsicum, which is all
about sandboxing and running

26
00:01:20,700 --> 00:01:22,930
some piece of code
with fewer privileges

27
00:01:22,930 --> 00:01:26,420
so that it, very much
like [INAUDIBLE],

28
00:01:26,420 --> 00:01:29,786
if it's compromised, the
damage isn't that great.

29
00:01:29,786 --> 00:01:31,830
Now it turns out
that the authors

30
00:01:31,830 --> 00:01:34,380
of both of these
readings really think

31
00:01:34,380 --> 00:01:37,610
capabilities are the answer,
because they let you manipulate

32
00:01:37,610 --> 00:01:42,540
privileges in a rather different
way from how Unix, let's say,

33
00:01:42,540 --> 00:01:44,812
thinks about privileges.

34
00:01:44,812 --> 00:01:47,270
So to get started, maybe let's
look at this confused deputy

35
00:01:47,270 --> 00:01:48,880
problem and try
to understand what

36
00:01:48,880 --> 00:01:52,980
is this problem that Norman
Hardy ran into and was

37
00:01:52,980 --> 00:01:54,590
so perplexed by.

38
00:01:54,590 --> 00:01:56,854
So the paper is
written-- well, it

39
00:01:56,854 --> 00:01:58,395
was written quite
a while ago, and it

40
00:01:58,395 --> 00:02:01,020
uses syntax for file names
that's a bit surprising.

41
00:02:01,020 --> 00:02:04,480
But we can try to at least
transcribe his problem

42
00:02:04,480 --> 00:02:07,690
into more familiar syntax
with Unix-style path

43
00:02:07,690 --> 00:02:08,947
names, et cetera.

44
00:02:08,947 --> 00:02:10,530
So as far as I can
tell, what is going

45
00:02:10,530 --> 00:02:13,880
on in their system is that they
had a Fortran compiler, which

46
00:02:13,880 --> 00:02:16,310
sort of dates their
design at some level, too.

47
00:02:16,310 --> 00:02:22,030
But their Fortran compiler
lived in /sysx/fort,

48
00:02:22,030 --> 00:02:26,150
and they wanted to change
this Fortran compiler,

49
00:02:26,150 --> 00:02:29,554
so they would keep statistics
about what was compiled,

50
00:02:29,554 --> 00:02:31,720
what parts of a compiler
were particularly expensive

51
00:02:31,720 --> 00:02:33,410
presumably, et cetera.

52
00:02:33,410 --> 00:02:36,120
So he wanted to make sure this
Fortran compiler would somehow

53
00:02:36,120 --> 00:02:39,110
end up writing to
this file /sysx/stat,

54
00:02:39,110 --> 00:02:44,360
that it would record information
about various invocations

55
00:02:44,360 --> 00:02:46,350
of the compiler.

56
00:02:46,350 --> 00:02:50,070
And the way they did this is,
in their operating system, they

57
00:02:50,070 --> 00:02:52,170
had something kind
of like the setuid

58
00:02:52,170 --> 00:02:54,040
that we talked about in Unix.

59
00:02:54,040 --> 00:02:57,360
Except there, they called
it the home files license.

60
00:02:57,360 --> 00:03:01,380
And what it means is that
if you ran /sysx/fort,

61
00:03:01,380 --> 00:03:05,710
and this program had this
so-called home files license,

62
00:03:05,710 --> 00:03:09,860
then this process that you just
ran would have extra privileges

63
00:03:09,860 --> 00:03:13,102
on being able to write
everything in /sysx.

64
00:03:13,102 --> 00:03:15,310
So it would have these extra
privileges on everything

65
00:03:15,310 --> 00:03:18,819
in /sysx/, basically, star.

66
00:03:18,819 --> 00:03:20,610
It could access all
those files in addition

67
00:03:20,610 --> 00:03:22,985
to anything that it could
access because the user ran it,

68
00:03:22,985 --> 00:03:25,190
for example.

69
00:03:25,190 --> 00:03:27,030
So the particular
problem they ran into

70
00:03:27,030 --> 00:03:31,236
is that some clever user
was able to do this.

71
00:03:31,236 --> 00:03:32,860
So they would run
the Fortran compiler,

72
00:03:32,860 --> 00:03:35,151
and the Fortran compiler
would take arguments very much

73
00:03:35,151 --> 00:03:36,790
like GCC takes arguments.

74
00:03:36,790 --> 00:03:39,590
And they would compile
something like foo.f.

75
00:03:39,590 --> 00:03:41,620
Here is my Fortran source code.

76
00:03:41,620 --> 00:03:48,120
And they'd say, well, put that
output -o into /sysx/stat.

77
00:03:48,120 --> 00:03:50,700
Or more damagingly
in their case,

78
00:03:50,700 --> 00:03:54,470
there was another file in
/sysx that was the billing file

79
00:03:54,470 --> 00:03:56,390
for all the customers
on the system.

80
00:03:56,390 --> 00:04:01,850
So you could similarly ask the
Fortran compiler to compile

81
00:04:01,850 --> 00:04:05,800
the source file and put the
output into some special file

82
00:04:05,800 --> 00:04:07,980
in /sysx.

83
00:04:07,980 --> 00:04:10,860
And in their case,
this actually worked.

84
00:04:10,860 --> 00:04:12,570
Even though the user
themselves didn't

85
00:04:12,570 --> 00:04:15,430
have access to write to
this file or directory,

86
00:04:15,430 --> 00:04:18,620
because the compiler had
this extra privilege--

87
00:04:18,620 --> 00:04:21,660
this home files
license, in their case--

88
00:04:21,660 --> 00:04:24,590
it was able to
overwrite these files

89
00:04:24,590 --> 00:04:28,784
despite that not being really
the developer's intention.

90
00:04:28,784 --> 00:04:29,450
This make sense?

91
00:04:29,450 --> 00:04:31,116
This is the rough
problem they ran into?
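
A minimal sketch of the problem just described, written as a toy Python model rather than anything from the original system. The `ToyFS` class, the `sysx` principal standing in for the home files license, and the path `/sysx/bill` for the billing file are all assumptions made for illustration.

```python
# Toy model of the confused deputy. Every name and path here is
# hypothetical; it only mirrors the scenario described in the lecture.
from dataclasses import dataclass, field


@dataclass
class ToyFS:
    writers: dict                      # path -> set of principals allowed to write
    contents: dict = field(default_factory=dict)

    def write(self, authorities, path, data):
        # Ambient-style check: succeed if ANY authority the process
        # holds is allowed to write the path.
        allowed = self.writers.get(path, set())
        if not (set(authorities) & allowed):
            raise PermissionError(path)
        self.contents[path] = data


def compile_fortran(fs, invoking_user, source, output_path):
    # The compiler runs with the union of the user's authority and its
    # own extra authority (modeled here as the principal "sysx").
    authorities = {invoking_user, "sysx"}
    fs.write(authorities, "/sysx/stat", "compiled " + source)
    # The output path comes straight from the user, yet it is written
    # with the same union of authorities.
    fs.write(authorities, output_path, "object code for " + source)


fs = ToyFS(writers={
    "/sysx/stat": {"sysx"},
    "/sysx/bill": {"sysx"},            # assumed name for the billing file
    "/home/alice/foo.o": {"alice"},
})

# alice cannot write /sysx/bill herself, but the compiler does it for her.
compile_fortran(fs, "alice", "foo.f", "/sysx/bill")
```

In this model the billing file ends up holding compiler output even though a direct `fs.write({"alice"}, "/sysx/bill", ...)` would be refused.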

92
00:04:31,116 --> 00:04:32,515
So who do they blame?

93
00:04:32,515 --> 00:04:33,765
What do they think went wrong?

94
00:04:40,995 --> 00:04:42,780
Or how would you
design it differently

95
00:04:42,780 --> 00:04:46,150
to avoid running
into such problems?

96
00:04:46,150 --> 00:04:48,770
So the thing they sort
of think about here,

97
00:04:48,770 --> 00:04:51,930
or they talk about
in this write up,

98
00:04:51,930 --> 00:04:55,240
is that they believe this
Fortran compiler should

99
00:04:55,240 --> 00:04:57,990
be very careful when it's
using its privileges.

100
00:04:57,990 --> 00:04:59,960
Because, at some level,
the Fortran compiler

101
00:04:59,960 --> 00:05:01,570
has two types of privileges.

102
00:05:01,570 --> 00:05:05,660
It has one stemming from the
fact the user invoked it,

103
00:05:05,660 --> 00:05:08,140
so the user should be
able to access the source

104
00:05:08,140 --> 00:05:10,050
file, like foo.f.

105
00:05:10,050 --> 00:05:11,860
And if it was some
other user, maybe

106
00:05:11,860 --> 00:05:14,680
it wouldn't be able to
access the user source code.

107
00:05:14,680 --> 00:05:17,590
And the other sort of privilege
comes from this home files

108
00:05:17,590 --> 00:05:20,830
license thing that allows it to
write to these special files.

109
00:05:20,830 --> 00:05:23,480
And internally, in the
source code of the compiler,

110
00:05:23,480 --> 00:05:25,920
when they open a
file, the compiler

111
00:05:25,920 --> 00:05:28,900
should have been very explicit
about which of these privileges

112
00:05:28,900 --> 00:05:31,910
it wants to exercise
when opening a file

113
00:05:31,910 --> 00:05:34,372
or performing some
privileged operation.

114
00:05:34,372 --> 00:05:36,330
But their compiler was
not written in this way.

115
00:05:36,330 --> 00:05:38,140
It was just called
open, read, write,

116
00:05:38,140 --> 00:05:39,550
like any other program would do.

117
00:05:39,550 --> 00:05:42,440
And it would implicitly use
all the privileges that it has,

118
00:05:42,440 --> 00:05:45,033
which combines-- well,
in their system design,

119
00:05:45,033 --> 00:05:47,410
it was sort of the union
of the user privileges

120
00:05:47,410 --> 00:05:51,086
and these home files
license privileges.

121
00:05:51,086 --> 00:05:52,790
That make sense?

122
00:05:52,790 --> 00:05:55,390
So these guys were really
interested in fixing

123
00:05:55,390 --> 00:05:56,180
this problem.

124
00:05:56,180 --> 00:05:59,240
And they were sort of calling
this compiler this confused

125
00:05:59,240 --> 00:06:00,964
deputy, because it
needs to disambiguate

126
00:06:00,964 --> 00:06:02,505
these multiple
privileges that it has

127
00:06:02,505 --> 00:06:06,800
and carefully use them
in the right instance.
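
One way to picture "being explicit about which privilege to exercise" is a toy Python model in which every write names exactly one principal. This is only a sketch under assumptions: `ToyFS`, `write_as`, the `sysx` principal, and the paths are hypothetical and do not come from the paper.

```python
# Toy model: the deputy names ONE authority per operation.
# All class, principal, and path names here are hypothetical.

class ToyFS:
    def __init__(self, writers):
        self.writers = writers          # path -> set of principals allowed to write
        self.contents = {}

    def write_as(self, principal, path, data):
        # The check uses only the single principal the caller named.
        if principal not in self.writers.get(path, set()):
            raise PermissionError(f"{principal} may not write {path}")
        self.contents[path] = data


def compile_fortran(fs, invoking_user, source, output_path):
    # Statistics are written with the compiler's own authority...
    fs.write_as("sysx", "/sysx/stat", "compiled " + source)
    # ...but the output file is written with the user's authority only.
    fs.write_as(invoking_user, output_path, "object code for " + source)


fs = ToyFS({
    "/sysx/stat": {"sysx"},
    "/sysx/bill": {"sysx"},
    "/home/alice/foo.o": {"alice"},
})

compile_fortran(fs, "alice", "foo.f", "/home/alice/foo.o")   # allowed
try:
    compile_fortran(fs, "alice", "foo.f", "/sysx/bill")      # refused
    attack_blocked = False
except PermissionError:
    attack_blocked = True
```

The compiler can still record its statistics, but a user-supplied output path is only ever checked against the user's own authority.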

128
00:06:06,800 --> 00:06:09,350
So I guess one thing
we could try to look at

129
00:06:09,350 --> 00:06:15,120
is how would we design
such a compiler in Unix?

130
00:06:15,120 --> 00:06:18,024
So in their system, they had
this home files license thing.

131
00:06:18,024 --> 00:06:20,190
As another mechanism, they then
introduced capabilities.

132
00:06:20,190 --> 00:06:21,750
We'll talk about them shortly.

133
00:06:21,750 --> 00:06:24,830
But could we solve
this in a Unix system?

134
00:06:24,830 --> 00:06:27,080
Suppose you had to write
this Fortran compiler in Unix

135
00:06:27,080 --> 00:06:29,566
and write to a special file
and avoid this confused

136
00:06:29,566 --> 00:06:30,190
deputy problem.

137
00:06:30,190 --> 00:06:32,775
What would you do?

138
00:06:32,775 --> 00:06:33,275
Any ideas?

139
00:06:35,802 --> 00:06:37,760
I guess you could just
declare this a bad plan.

140
00:06:37,760 --> 00:06:40,212
Like don't keep statistics.

141
00:06:40,212 --> 00:06:42,028
Yeah?

142
00:06:42,028 --> 00:06:44,649
AUDIENCE: [INAUDIBLE].

143
00:06:44,649 --> 00:06:45,315
PROFESSOR: Sure.

144
00:06:45,315 --> 00:06:46,670
That could be, right?

145
00:06:46,670 --> 00:06:47,530
Well, yeah.

146
00:06:47,530 --> 00:06:50,530
So you could not
support flags like -o.

147
00:06:50,530 --> 00:06:52,210
On the other hand,
you might want

148
00:06:52,210 --> 00:06:55,980
to allow specifying which
source code you want

149
00:06:55,980 --> 00:06:58,196
to compile so that maybe
you could read the billing

150
00:06:58,196 --> 00:06:59,820
file or read the
statistics file, which

151
00:06:59,820 --> 00:07:01,230
maybe should be secret.

152
00:07:01,230 --> 00:07:02,897
Or maybe the source
code has-- maybe you

153
00:07:02,897 --> 00:07:04,646
can supply the
source code on standard input,

154
00:07:04,646 --> 00:07:06,330
but it has include
statements, so

155
00:07:06,330 --> 00:07:08,370
it needs to include other
pieces of source code.

156
00:07:08,370 --> 00:07:09,354
So that's a little tricky.

157
00:07:09,354 --> 00:07:11,729
AUDIENCE: You could split up
the application [INAUDIBLE].

158
00:07:16,905 --> 00:07:17,530
PROFESSOR: Yes.

159
00:07:17,530 --> 00:07:20,270
So another potentially good
design is to split it up,

160
00:07:20,270 --> 00:07:20,770
right?

161
00:07:20,770 --> 00:07:23,130
And realize that this
fort compiler really

162
00:07:23,130 --> 00:07:25,525
doesn't need both of these
privileges at the same time.

163
00:07:25,525 --> 00:07:33,420
So maybe we should have, in our Unix
world, /bin/fortcc or something,

164
00:07:33,420 --> 00:07:36,570
the compiler, and then this guy
is just a regular program with

165
00:07:36,570 --> 00:07:37,790
no extra privileges.

166
00:07:37,790 --> 00:07:41,980
And then we'll also maybe
have a /bin/fortlog,

167
00:07:41,980 --> 00:07:44,350
which is going to be a special
program with some extra

168
00:07:44,350 --> 00:07:47,640
privileges and it'll log some
statistics about what's going

169
00:07:47,640 --> 00:07:49,410
on in the compiler.

170
00:07:49,410 --> 00:07:53,010
And fortcc is going
to invoke this guy.

171
00:07:53,010 --> 00:07:56,020
So how do we give this
guy extra privileges?

172
00:07:56,020 --> 00:07:56,520
Yeah?

173
00:07:56,520 --> 00:07:58,153
AUDIENCE: Well, maybe if you
use something like setuid

174
00:07:58,153 --> 00:08:00,930
or something, like fortlog,
then presumably any other user

175
00:08:00,930 --> 00:08:03,034
could also log arbitrary
data through it.

176
00:08:03,034 --> 00:08:03,700
PROFESSOR: Yeah.

177
00:08:03,700 --> 00:08:04,719
So this is not so great.

178
00:08:04,719 --> 00:08:06,510
Because on fortlog,
presumably the only way

179
00:08:06,510 --> 00:08:07,968
to give extra
privileges in Unix is

180
00:08:07,968 --> 00:08:11,170
to in fact make it owned by, I
don't know, maybe the fort UID,

181
00:08:11,170 --> 00:08:14,490
and that's also setuid.

182
00:08:14,490 --> 00:08:17,550
So every time you run it, it
switches to this Fortran UID.

183
00:08:17,550 --> 00:08:19,580
And maybe there's some
special stats file.

184
00:08:19,580 --> 00:08:23,170
But then in fact anyone can
invoke this fortlog thingy.

185
00:08:23,170 --> 00:08:24,730
Which is maybe not great.

186
00:08:24,730 --> 00:08:26,940
Now anyone can write
to the stats file.

187
00:08:26,940 --> 00:08:29,782
But maybe this example is not
the biggest security concern

188
00:08:29,782 --> 00:08:31,490
about someone corrupting
your statistics.

189
00:08:31,490 --> 00:08:33,220
But suppose this
was a billing file.

190
00:08:33,220 --> 00:08:36,072
Then maybe the same problems
would be slightly more acute.

191
00:08:36,072 --> 00:08:36,571
Yeah?

192
00:08:36,571 --> 00:08:39,674
AUDIENCE: But you can always
make your [INAUDIBLE] stats

193
00:08:39,674 --> 00:08:40,340
you want, right?

194
00:08:40,340 --> 00:08:41,298
Instead of [INAUDIBLE].

195
00:08:44,940 --> 00:08:46,930
PROFESSOR: So in
some sense, yeah.

196
00:08:46,930 --> 00:08:48,960
If you're willing to
live with arbitrary

197
00:08:48,960 --> 00:08:51,262
stuff in your statistics
or logging file,

198
00:08:51,262 --> 00:08:52,220
then maybe that's true.

199
00:08:52,220 --> 00:08:54,212
AUDIENCE: Even if
you [INAUDIBLE],

200
00:08:54,212 --> 00:08:56,702
you can already make your C
code have whatever statistics

201
00:08:56,702 --> 00:08:57,994
that you'd want to be recorded.

202
00:08:57,994 --> 00:08:58,868
PROFESSOR: You could.

203
00:08:58,868 --> 00:08:59,585
Yeah.

204
00:08:59,585 --> 00:09:00,084
Yeah.

205
00:09:00,084 --> 00:09:01,524
So it might be
that in this case,

206
00:09:01,524 --> 00:09:03,940
it doesn't really matter that
you can log arbitrary stuff.

207
00:09:03,940 --> 00:09:05,240
So that's true.

208
00:09:05,240 --> 00:09:06,040
Yeah.

209
00:09:06,040 --> 00:09:08,480
So if you cared about who can
invoke this fortlog thing,

210
00:09:08,480 --> 00:09:10,063
could you really do
something about it

211
00:09:10,063 --> 00:09:12,484
in Unix, or not so much?

212
00:09:12,484 --> 00:09:12,984
Yeah?

213
00:09:12,984 --> 00:09:14,892
AUDIENCE: [INAUDIBLE].

214
00:09:14,892 --> 00:09:18,090
It would make both
of them setuid.

215
00:09:18,090 --> 00:09:23,120
Now the fortcc would
read the source files.

216
00:09:23,120 --> 00:09:26,430
It would switch back to the
saved UID, just the user UID.

217
00:09:26,430 --> 00:09:31,060
Remote fortlog in
a setuid, which has

218
00:09:31,060 --> 00:09:32,485
permissions to execute fortlog.

219
00:09:32,485 --> 00:09:37,812
And that fortlog would
setuid again [INAUDIBLE].

220
00:09:37,812 --> 00:09:38,520
PROFESSOR: Right.

221
00:09:38,520 --> 00:09:39,020
Yeah.

222
00:09:39,020 --> 00:09:42,710
So there is this rather
elaborate mechanism in Unix

223
00:09:42,710 --> 00:09:46,280
that we skipped on last
Monday's lecture, that

224
00:09:46,280 --> 00:09:48,780
actually allows an
application to switch

225
00:09:48,780 --> 00:09:50,190
between multiple UIDs.

226
00:09:50,190 --> 00:09:53,800
If it was setuid to some
user ID, then it could say,

227
00:09:53,800 --> 00:09:55,730
well, now I want to
run with this user ID.

228
00:09:55,730 --> 00:09:57,480
Now I want to run with
this other user ID.

229
00:09:57,480 --> 00:10:00,820
And it could sort of carefully
alternate between these.

230
00:10:00,820 --> 00:10:02,320
It's a little tricky
to do it right,

231
00:10:02,320 --> 00:10:04,213
but it's probably doable.
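
A rough sketch of that alternation, assuming a setuid program that wants to open user-named files as the invoking user and then go back to its setuid identity. The helper name `as_invoking_user` is hypothetical; `os.getuid`, `os.geteuid`, and `os.seteuid` are the standard Python wrappers around the Unix calls. Group IDs and supplementary groups are ignored here, which a real program could not afford to do.

```python
import os
from contextlib import contextmanager


@contextmanager
def as_invoking_user():
    """Temporarily run with the real (invoking) user's effective UID."""
    saved_euid = os.geteuid()          # e.g. the setuid owner of the binary
    os.seteuid(os.getuid())            # switch to the user who ran us
    try:
        yield
    finally:
        # Switching back is permitted because the saved set-user-ID
        # still holds the original effective UID.
        os.seteuid(saved_euid)


# Example: anything opened inside the block is checked against the
# invoking user's UID, not the setuid identity.
with as_invoking_user():
    uid_inside = os.geteuid()
```

In a program that is not setuid, the real and effective UIDs are already equal, so the block is a no-op.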

232
00:10:04,213 --> 00:10:06,224
So that's one potential design.

233
00:10:06,224 --> 00:10:08,140
I guess another hack you
could maybe try to do

234
00:10:08,140 --> 00:10:10,740
is make this fortlog
binary only executable

235
00:10:10,740 --> 00:10:14,790
to a particular group and
make fortcc a setgid binary

236
00:10:14,790 --> 00:10:15,622
to that group.

237
00:10:15,622 --> 00:10:17,830
It's not great, because it
obliterates whatever group

238
00:10:17,830 --> 00:10:19,950
list the user had initially.

239
00:10:19,950 --> 00:10:21,200
But who knows?

240
00:10:21,200 --> 00:10:24,190
Maybe that's better
than nothing.

241
00:10:24,190 --> 00:10:26,550
Anyway, so it's a
fairly tricky problem

242
00:10:26,550 --> 00:10:29,600
to solve in an entirely
satisfactory fashion

243
00:10:29,600 --> 00:10:31,812
with these Unix mechanisms.

244
00:10:31,812 --> 00:10:33,770
Although, maybe you should
rethink your problem

245
00:10:33,770 --> 00:10:35,640
and not worry about
your statistics

246
00:10:35,640 --> 00:10:38,970
file as much in the first place.

247
00:10:38,970 --> 00:10:45,150
But how do we think about what's
going wrong in the design?

248
00:10:45,150 --> 00:10:47,920
Well, there's two things we
could try to learn from this,

249
00:10:47,920 --> 00:10:49,925
or basically, what went wrong.

250
00:10:53,120 --> 00:10:58,180
And one interpretation that
Norm Hardy wants us to take away

251
00:10:58,180 --> 00:11:01,560
is this notion that he
calls ambient authority.

252
00:11:06,730 --> 00:11:08,300
So what is ambient authority?

253
00:11:08,300 --> 00:11:10,230
Can anyone figure
out what they meant?

254
00:11:10,230 --> 00:11:12,230
They've never
exactly defined this.

255
00:11:12,230 --> 00:11:12,730
Yeah?

256
00:11:12,730 --> 00:11:14,248
AUDIENCE: It means you
have the authority given

257
00:11:14,248 --> 00:11:15,590
to you by the environment.

258
00:11:15,590 --> 00:11:19,464
So as if [INAUDIBLE]
user with no limitations.

259
00:11:19,464 --> 00:11:20,130
PROFESSOR: Yeah.

260
00:11:20,130 --> 00:11:24,040
So you're making an
operation, and you can specify

261
00:11:24,040 --> 00:11:25,177
what operation you want.

262
00:11:25,177 --> 00:11:27,760
But the decision of whether that
operation is going to succeed

263
00:11:27,760 --> 00:11:30,850
comes from some extra implicit
parameters in your process,

264
00:11:30,850 --> 00:11:31,660
for example.

265
00:11:31,660 --> 00:11:34,970
And in Unix, you can figure
out what this ambient authority

266
00:11:34,970 --> 00:11:36,490
check might look like.

267
00:11:36,490 --> 00:11:38,860
So if you make a system
call, then you probably

268
00:11:38,860 --> 00:11:41,080
supplied some sort of a
name to a system call.

269
00:11:41,080 --> 00:11:43,340
And inside of the
kernel, this gets

270
00:11:43,340 --> 00:11:45,570
mapped to some
sort of an object.

271
00:11:45,570 --> 00:11:48,580
And the object presumably has
some kind of an access control

272
00:11:48,580 --> 00:11:52,110
list on it, like the permissions
on a file, et cetera.

273
00:11:52,110 --> 00:11:53,930
So there are some
permissions that you

274
00:11:53,930 --> 00:11:56,460
can get from the object.

275
00:11:56,460 --> 00:11:58,770
And that should decide
whether an operation

276
00:11:58,770 --> 00:12:00,480
is going to be
allowed on this name

277
00:12:00,480 --> 00:12:02,180
that the application supplied.

278
00:12:02,180 --> 00:12:04,400
This is sort of what the
application gets to see.

279
00:12:04,400 --> 00:12:06,850
Inside of the
kernel, there's also

280
00:12:06,850 --> 00:12:09,780
the current user ID of the
process making the calls.

281
00:12:09,780 --> 00:12:12,600
So this is the current proc UID.

282
00:12:15,140 --> 00:12:18,250
And this thing goes
into the decision

283
00:12:18,250 --> 00:12:22,710
of whether to allow a
particular operation or not.

284
00:12:22,710 --> 00:12:24,770
So it's the current
process user ID

285
00:12:24,770 --> 00:12:27,210
that's this ambient privilege.

286
00:12:27,210 --> 00:12:29,240
Whatever operation you're
going to try to do,

287
00:12:29,240 --> 00:12:31,540
the kernel will actually
try, in some sense,

288
00:12:31,540 --> 00:12:35,815
as hard as possible to allow
it by using your current UID,

289
00:12:35,815 --> 00:12:39,410
and your current GID and
whatever other extra privileges

290
00:12:39,410 --> 00:12:40,500
you might have.

291
00:12:40,500 --> 00:12:43,120
And as long as there's some set
of privileges that allow you

292
00:12:43,120 --> 00:12:45,690
to do it, it'll let you do it.

293
00:12:45,690 --> 00:12:47,065
Which is maybe
not the best thing

294
00:12:47,065 --> 00:12:51,080
to do if you aren't fully aware
of what all these privileges are.

295
00:12:51,080 --> 00:12:53,010
Maybe you don't want
to use all of them

296
00:12:53,010 --> 00:12:57,910
to open a particular file or
make some other operation.
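
The check just described can be sketched as a simplified Python model: the caller supplies only a name, and the kernel combines the object's permissions with the process's UID and groups, which the caller never passes explicitly. The `Inode`, `Proc`, and `NAMESPACE` names and the UID values are assumptions for illustration, and the model ignores root, ACLs, and directory permissions along the path.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    owner: int
    group: int
    mode: int          # Unix permission bits, e.g. 0o644


@dataclass(frozen=True)
class Proc:
    uid: int           # the implicit, "ambient" part of every request
    gids: frozenset


# Hypothetical name -> object mapping kept inside the "kernel".
NAMESPACE = {"/sysx/stat": Inode(owner=500, group=500, mode=0o644)}


def may_write(proc, inode):
    # Classic Unix order: owner bits, else group bits, else other bits.
    if proc.uid == inode.owner:
        return bool(inode.mode & stat.S_IWUSR)
    if inode.group in proc.gids:
        return bool(inode.mode & stat.S_IWGRP)
    return bool(inode.mode & stat.S_IWOTH)


def open_for_write(proc, name):
    inode = NAMESPACE[name]            # the name is all the caller supplied
    if not may_write(proc, inode):     # the UID comes from the process itself
        raise PermissionError(name)
    return inode


fort = Proc(uid=500, gids=frozenset({500}))
alice = Proc(uid=1000, gids=frozenset({1000}))
```

The same `open_for_write(..., "/sysx/stat")` call succeeds for `fort` and fails for `alice`, even though both supplied exactly the same argument.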

297
00:12:57,910 --> 00:13:01,867
Does this make sense, roughly
what ambient privilege is?

298
00:13:01,867 --> 00:13:03,325
In the case of an
operating system,

299
00:13:03,325 --> 00:13:05,910
it basically ends up being
the fact that a process has

300
00:13:05,910 --> 00:13:07,680
some sort of a user ID.

301
00:13:07,680 --> 00:13:11,570
Are there non-OS examples
of ambient privilege

302
00:13:11,570 --> 00:13:12,710
you guys can think of?

303
00:13:12,710 --> 00:13:15,280
Like when you're making
an operation, something

304
00:13:15,280 --> 00:13:17,525
about the identity of
the caller determines

305
00:13:17,525 --> 00:13:18,900
whether they'll succeed or not.

306
00:13:21,640 --> 00:13:23,765
Like one example is
probably firewalls, as well.

307
00:13:23,765 --> 00:13:25,610
So this is just an OS example

308
00:13:25,610 --> 00:13:29,940
of ambient privilege. Another is
the firewalls on the network.

309
00:13:29,940 --> 00:13:32,570
Because any operation
you do from a machine

310
00:13:32,570 --> 00:13:35,890
inside of a firewall is
going to be allowed because,

311
00:13:35,890 --> 00:13:37,410
well, you just have
that IP address,

312
00:13:37,410 --> 00:13:39,930
or you're on that
side of a network.

313
00:13:39,930 --> 00:13:43,870
And if you're outside, the same
operation will be disallowed.

314
00:13:43,870 --> 00:13:47,330
So it's also a similar problem.

315
00:13:47,330 --> 00:13:50,850
Say you visit some website,
and the website includes a link

316
00:13:50,850 --> 00:13:53,794
to some different
server, well, maybe you

317
00:13:53,794 --> 00:13:55,710
don't want to use the
privileges that you have

318
00:13:55,710 --> 00:13:58,500
from the inside of your
network to access that link.

319
00:13:58,500 --> 00:14:00,500
Because maybe it'll access
your internal printer

320
00:14:00,500 --> 00:14:02,470
and exploit it in some way.

321
00:14:02,470 --> 00:14:05,021
And really, the guy that
provided you the link

322
00:14:05,021 --> 00:14:06,396
shouldn't have
been able to reach

323
00:14:06,396 --> 00:14:08,397
the printer in the first
place, because they

324
00:14:08,397 --> 00:14:09,230
were on the outside.

325
00:14:09,230 --> 00:14:14,190
But behind the firewall, your browser,
maybe by visiting a link,

326
00:14:14,190 --> 00:14:15,885
will be tricked into doing this.

327
00:14:15,885 --> 00:14:19,510
It's sort of a moral
equivalent of this confused

328
00:14:19,510 --> 00:14:21,010
deputy problem in the network model.

329
00:14:21,010 --> 00:14:22,010
Yeah?

330
00:14:22,010 --> 00:14:25,344
AUDIENCE: [INAUDIBLE] permission
are directly affected also.

331
00:14:25,344 --> 00:14:26,010
PROFESSOR: Yeah.

332
00:14:26,010 --> 00:14:28,070
AUDIENCE: Because it's
essentially DAC, potentially,

333
00:14:28,070 --> 00:14:28,830
in the Capsicum.

334
00:14:28,830 --> 00:14:29,280
PROFESSOR: Yeah.

335
00:14:29,280 --> 00:14:31,250
So this is pretty much
what the Capsicum guys

336
00:14:31,250 --> 00:14:33,550
think of as discretionary
access control.

337
00:14:33,550 --> 00:14:35,800
And the fact that it's
discretionary, well,

338
00:14:35,800 --> 00:14:38,697
this is not quite what
discretionary access control

339
00:14:38,697 --> 00:14:39,470
means.

340
00:14:39,470 --> 00:14:41,790
But what discretionary
access control means

341
00:14:41,790 --> 00:14:45,350
is that the user, or
the owner of an object,

342
00:14:45,350 --> 00:14:48,609
can decide what the security policy
will look like for an object.

343
00:14:48,609 --> 00:14:51,025
Which seems very natural in a
Unix setting. It's my files,

344
00:14:51,025 --> 00:14:51,970
I can decide what I want.

345
00:14:51,970 --> 00:14:54,386
I can give them to you, or I
can keep them private, great.

346
00:14:55,960 --> 00:14:58,700
So almost all DAC
systems do look

347
00:14:58,700 --> 00:15:01,300
like this, because they want to
have some sort of permissions

348
00:15:01,300 --> 00:15:04,450
that a user could modify
to control the security

349
00:15:04,450 --> 00:15:07,800
policy for their files.

350
00:15:07,800 --> 00:15:11,910
The flip side is
mandatory access control.

351
00:15:11,910 --> 00:15:15,257
We'll talk about it in a little
while, but at some level,

352
00:15:15,257 --> 00:15:17,340
they have this very
philosophically different view

353
00:15:17,340 --> 00:15:17,881
of the world.

354
00:15:17,881 --> 00:15:20,000
They think, well,
you're the user.

355
00:15:20,000 --> 00:15:22,240
But someone else will
set the security policy

356
00:15:22,240 --> 00:15:24,460
for how you use this computer.

357
00:15:24,460 --> 00:15:29,000
And this sort of came out of the
military in the '70s or '80s,

358
00:15:29,000 --> 00:15:32,946
when they really wanted to have
classified computer systems

359
00:15:32,946 --> 00:15:34,654
where, well, you're
working on some stuff

360
00:15:34,654 --> 00:15:35,613
and it's marked secret.

361
00:15:35,613 --> 00:15:37,737
I'm working on some stuff
that's marked top secret.

362
00:15:37,737 --> 00:15:39,113
So my stuff just
can't go to you.

363
00:15:39,113 --> 00:15:41,112
It's not up to me whether
to set the permissions

364
00:15:41,112 --> 00:15:42,000
on a file, et cetera.

365
00:15:42,000 --> 00:15:44,830
It's just not allowed
by some guy in charge.

366
00:15:44,830 --> 00:15:46,630
So mandatory access
control is really

367
00:15:46,630 --> 00:15:49,640
trying to enforce these
different kinds of policies

368
00:15:49,640 --> 00:15:52,500
in the first place,
where there's

369
00:15:52,500 --> 00:15:54,610
the user and the
application developer.

370
00:15:54,610 --> 00:15:56,910
And then there's some guy
separate from the user

371
00:15:56,910 --> 00:15:59,472
and the developer
that sets the policy.

372
00:15:59,472 --> 00:16:02,492
And, as you can sort of guess,
it doesn't always work out.

373
00:16:02,492 --> 00:16:03,950
Well, we'll talk
about it in a bit.

374
00:16:03,950 --> 00:16:06,001
But that's what discretionary
versus mandatory

375
00:16:06,001 --> 00:16:10,110
means in access control.

376
00:16:10,110 --> 00:16:11,310
All right.

377
00:16:11,310 --> 00:16:14,480
So there's many other examples
that you could imagine where

378
00:16:14,480 --> 00:16:16,040
we have ambient authority.

379
00:16:16,040 --> 00:16:20,910
And it's not inherently bad,
but it's just something

380
00:16:20,910 --> 00:16:22,637
that you have to be
very careful about.

381
00:16:22,637 --> 00:16:24,470
If you have a system
with ambient authority,

382
00:16:24,470 --> 00:16:27,020
you should probably
be very careful

383
00:16:27,020 --> 00:16:29,595
if you're performing
privileged operations.

384
00:16:29,595 --> 00:16:31,220
You should make sure
that you're really

385
00:16:31,220 --> 00:16:35,980
using the right authority
and not accidentally being

386
00:16:35,980 --> 00:16:39,146
tricked very much like this
Fortran compiler 20 years ago.

387
00:16:39,146 --> 00:16:41,580
25 now.

388
00:16:41,580 --> 00:16:42,450
All right.

389
00:16:42,450 --> 00:16:45,470
So this is one interpretation
of what goes wrong.

390
00:16:45,470 --> 00:16:47,487
And it's not
necessarily the only way

391
00:16:47,487 --> 00:16:49,070
to think about what
goes wrong, right?

392
00:16:49,070 --> 00:16:51,192
Another possibility
is that, well,

393
00:16:51,192 --> 00:16:53,400
wouldn't it be nice if it
was easy for an application

394
00:16:53,400 --> 00:16:56,440
to tell whether it should
access a file on behalf

395
00:16:56,440 --> 00:16:57,445
of some principal?

396
00:16:57,445 --> 00:17:00,700
So maybe another problem
is that the access control

397
00:17:00,700 --> 00:17:02,024
checks are complicated.

398
00:17:07,381 --> 00:17:10,294
So in some sense, when the
Fortran compiler is running,

399
00:17:10,294 --> 00:17:13,900
and it's opening a file
on behalf of a user,

400
00:17:13,900 --> 00:17:16,660
it basically needs to
replicate the same exact logic

401
00:17:16,660 --> 00:17:20,240
we see drawn out here, except
that the Fortran compiler needs

402
00:17:20,240 --> 00:17:22,490
to plug in something else here.

403
00:17:22,490 --> 00:17:25,770
Instead of using its current
privileges, and all of them,

404
00:17:25,770 --> 00:17:27,470
it should just
replicate this check

405
00:17:27,470 --> 00:17:32,150
and try to make it with a
different set of privileges.

406
00:17:32,150 --> 00:17:34,110
So in Unix, this
turns out to be fairly

407
00:17:34,110 --> 00:17:36,920
tricky to do, because
there's many places

408
00:17:36,920 --> 00:17:38,500
where these security
checks happen.

409
00:17:38,500 --> 00:17:41,020
If you have symbolic links,
then the symbolic link

410
00:17:41,020 --> 00:17:43,660
gets looked up, and
that path name also

411
00:17:43,660 --> 00:17:47,540
gets evaluated with someone's
privileges, et cetera.

412
00:17:47,540 --> 00:17:50,220
But it might be
that, in some system,

413
00:17:50,220 --> 00:17:51,940
you could simplify
this access control

414
00:17:51,940 --> 00:17:55,632
check, where you could do it
yourself in an application.

415
00:17:55,632 --> 00:17:59,320
Does that seem like a
reasonable plan to you guys?

416
00:17:59,320 --> 00:18:01,960
Would you go with that?

417
00:18:01,960 --> 00:18:03,640
Any dangers of
replicating these checks?

418
00:18:03,640 --> 00:18:04,260
Yeah?

419
00:18:04,260 --> 00:18:06,865
AUDIENCE: Well, if you do the
checks in the application,

420
00:18:06,865 --> 00:18:08,594
you could just
not do the checks.

421
00:18:08,594 --> 00:18:09,260
PROFESSOR: Yeah.

422
00:18:09,260 --> 00:18:10,360
So you could easily
miss the checks.

423
00:18:10,360 --> 00:18:11,360
That's absolutely right.

424
00:18:11,360 --> 00:18:13,680
So in some sense, what the
Fortran compiler did here,

425
00:18:13,680 --> 00:18:15,370
well, they didn't even bother
trying to do the checks,

426
00:18:15,370 --> 00:18:16,659
not that they screwed them up.

427
00:18:16,659 --> 00:18:18,950
Another possibility, in
addition to missing the checks,

428
00:18:18,950 --> 00:18:21,589
is maybe the kernel
will change over time,

429
00:18:21,589 --> 00:18:23,380
and it will have slightly
different checks.

430
00:18:23,380 --> 00:18:25,100
It will introduce some
extra security measure,

431
00:18:25,100 --> 00:18:26,766
and the application
will be left behind.

432
00:18:26,766 --> 00:18:28,455
And it will implement
old style checks.

433
00:18:28,455 --> 00:18:31,280
And probably not a great plan.

434
00:18:31,280 --> 00:18:34,862
So recall, one good
idea in security

435
00:18:34,862 --> 00:18:36,590
is to have economy
of mechanisms.

436
00:18:36,590 --> 00:18:39,222
So there's only a small number
of places that are enforcing

437
00:18:39,222 --> 00:18:40,180
your security policies.

438
00:18:40,180 --> 00:18:41,890
You probably don't
want to replicate

439
00:18:41,890 --> 00:18:45,520
the same functionality in
applications, in the kernel,

440
00:18:45,520 --> 00:18:46,020
et cetera.

441
00:18:46,020 --> 00:18:48,090
You really want to boil
it down to one place.

442
00:18:48,090 --> 00:18:50,900
That roughly makes sense?

443
00:18:50,900 --> 00:18:52,070
OK.

444
00:18:52,070 --> 00:18:56,980
So what is this
capability, I guess,

445
00:18:56,980 --> 00:19:02,220
idea or thinking that might
solve this authority problem?

446
00:19:02,220 --> 00:19:05,150
Well, there's some formal
definition for the thing.

447
00:19:05,150 --> 00:19:08,570
But really, you can get very
close by thinking of Unix file

448
00:19:08,570 --> 00:19:11,270
descriptors as a capability.

449
00:19:11,270 --> 00:19:15,210
So I guess the alternative
to this picture,

450
00:19:15,210 --> 00:19:18,470
in capability world,
is that instead

451
00:19:18,470 --> 00:19:20,510
of having the
application supply a name,

452
00:19:20,510 --> 00:19:22,510
and you look up an object,
you get a permission,

453
00:19:22,510 --> 00:19:24,180
you decide whether
to allow it based

454
00:19:24,180 --> 00:19:25,910
on some ambient
authority, instead,

455
00:19:25,910 --> 00:19:28,920
the capability
picture looks very simple.

456
00:19:28,920 --> 00:19:32,230
You have a capability, and
if you have a capability,

457
00:19:32,230 --> 00:19:35,270
it points to an object.

458
00:19:35,270 --> 00:19:37,482
And maybe the capability
has some small number

459
00:19:37,482 --> 00:19:40,450
of restrictions on what
you can do with an object.

460
00:19:40,450 --> 00:19:43,340
But basically, if you have
the capability to an object,

461
00:19:43,340 --> 00:19:44,830
you can access the object.

462
00:19:44,830 --> 00:19:46,420
It's actually very simple.

463
00:19:46,420 --> 00:19:49,280
So there's no ambient
authority that

464
00:19:49,280 --> 00:19:51,470
decides whether an
operation on a capability

465
00:19:51,470 --> 00:19:53,310
is going to be allowed.

466
00:19:53,310 --> 00:19:55,290
The only thing is that
maybe the capability has

467
00:19:55,290 --> 00:19:57,623
a couple of extra bits, or
this mask that they described

468
00:19:57,623 --> 00:19:59,629
in the paper, which
says, well, you

469
00:19:59,629 --> 00:20:02,240
have a capability for this
file, but it's restricted

470
00:20:02,240 --> 00:20:03,470
to read operations only.

471
00:20:03,470 --> 00:20:07,440
Or it's restricted to write
or append operations only.

472
00:20:07,440 --> 00:20:10,885
And then your security decisions
are all of a sudden very easy.

473
00:20:10,885 --> 00:20:12,260
Because if you
have a capability,

474
00:20:12,260 --> 00:20:13,410
you can do something.

475
00:20:13,410 --> 00:20:15,248
If you don't, you can't.

476
00:20:15,248 --> 00:20:17,940
Make sense?

477
00:20:17,940 --> 00:20:21,430
So I guess one important
property of capabilities

478
00:20:21,430 --> 00:20:25,000
is that they should
actually be unforgeable,

479
00:20:25,000 --> 00:20:27,396
as the papers talk about.

480
00:20:27,396 --> 00:20:29,020
So what does it mean
to be unforgeable,

481
00:20:29,020 --> 00:20:31,900
or why do we want this
in this capability world?

482
00:20:34,980 --> 00:20:37,700
Well, I guess this actually
may be almost too obvious here.

483
00:20:37,700 --> 00:20:39,324
Well, if you can make
up any capability

484
00:20:39,324 --> 00:20:41,275
you want-- I can make
up a capability for any

485
00:20:41,275 --> 00:20:42,849
of your guys' files
and go access it.

486
00:20:42,849 --> 00:20:44,640
So if I can make it
up, and I'll access it,

487
00:20:44,640 --> 00:20:47,760
and there's nothing else in the
security design that stops me

488
00:20:47,760 --> 00:20:54,030
from accessing an object once
I can manufacture a capability.

489
00:20:54,030 --> 00:20:55,870
So it's important that
these capabilities

490
00:20:55,870 --> 00:20:58,765
can't be made up out of
thin air by the application

491
00:20:58,765 --> 00:21:01,340
or by whatever's running.

492
00:21:01,340 --> 00:21:05,170
How is this getting enforced, if
we think of file descriptors

493
00:21:05,170 --> 00:21:07,249
as capabilities?

494
00:21:07,249 --> 00:21:09,040
So many of you guys
actually submitted this

495
00:21:09,040 --> 00:21:11,300
as the big question
about Capsicum.

496
00:21:11,300 --> 00:21:13,080
What do you think?

497
00:21:13,080 --> 00:21:17,490
What prevents an application
from synthesizing a capability

498
00:21:17,490 --> 00:21:20,490
in this file descriptor world?

499
00:21:20,490 --> 00:21:24,310
Could you synthesize
a capability?

500
00:21:24,310 --> 00:21:24,950
Yeah?

501
00:21:24,950 --> 00:21:26,949
AUDIENCE: Well, it was
probably like a structure

502
00:21:26,949 --> 00:21:29,364
and a construct
that says that they

503
00:21:29,364 --> 00:21:31,504
have a capability for
certain file descriptors.

504
00:21:31,504 --> 00:21:32,170
PROFESSOR: Yeah.

505
00:21:32,170 --> 00:21:35,510
So it's actually
fairly easy to see

506
00:21:35,510 --> 00:21:37,500
what goes on once you
look at what exactly

507
00:21:37,500 --> 00:21:38,666
is a file descriptor, right?

508
00:21:38,666 --> 00:21:40,230
So a file descriptor
is basically

509
00:21:40,230 --> 00:21:42,040
just some sort of an integer.

510
00:21:42,040 --> 00:21:44,756
And this integer--
like in Unix, you

511
00:21:44,756 --> 00:21:46,880
have file descriptor 0,
which refers to your input,

512
00:21:46,880 --> 00:21:48,796
file descriptor 1 which
refers to your output.

513
00:21:48,796 --> 00:21:49,470
[INAUDIBLE]

514
00:21:49,470 --> 00:21:52,580
But really, these are just
integers in user space.

515
00:21:52,580 --> 00:21:56,120
And this is what the
application can presumably do,

516
00:21:56,120 --> 00:21:58,380
and it can choose
any integer it wants.

517
00:21:58,380 --> 00:22:00,190
But whenever you
try to do something

518
00:22:00,190 --> 00:22:02,570
to a file descriptor, which
is one of these integers,

519
00:22:02,570 --> 00:22:05,640
the kernel will always
interpret the integer

520
00:22:05,640 --> 00:22:08,680
according to your current
process's file descriptor

521
00:22:08,680 --> 00:22:09,490
table.

522
00:22:09,490 --> 00:22:12,430
So for every PID-- let's
say, well, this is PID,

523
00:22:12,430 --> 00:22:13,395
I don't know, 57.

524
00:22:13,395 --> 00:22:14,830
So some process is running.

525
00:22:14,830 --> 00:22:18,750
It has an open file
table, and each integer

526
00:22:18,750 --> 00:22:20,560
supplied from
user space refers

527
00:22:20,560 --> 00:22:23,185
to some entry in this table.

528
00:22:23,185 --> 00:22:26,650
And of course, the kernel
should check that the integer

529
00:22:26,650 --> 00:22:28,000
is in bounds in this table.

530
00:22:28,000 --> 00:22:29,630
It isn't negative.

531
00:22:29,630 --> 00:22:31,890
It doesn't go past
the end of the table.

532
00:22:31,890 --> 00:22:34,050
Otherwise, it will have
the usual buffer overflow

533
00:22:34,050 --> 00:22:35,630
problems, et cetera.

534
00:22:35,630 --> 00:22:38,517
But if you carefully
check that the integer is

535
00:22:38,517 --> 00:22:41,380
in bounds in the
kernel implementation,

536
00:22:41,380 --> 00:22:44,670
then the only possible
things that the application

537
00:22:44,670 --> 00:22:46,550
can refer to by
a file descriptor

538
00:22:46,550 --> 00:22:48,910
are entries in this table.

539
00:22:48,910 --> 00:22:51,060
So presumably, the
kernel will somehow

540
00:22:51,060 --> 00:22:54,640
make sure that you legitimately
got a particular capability.

541
00:22:54,640 --> 00:22:58,810
So when you, for example, open a
file outside of this capability

542
00:22:58,810 --> 00:23:03,240
model in Unix, well, the kernel,
after the open call succeeds,

543
00:23:03,240 --> 00:23:07,420
it's going to change that
file descriptor table

544
00:23:07,420 --> 00:23:10,090
entry to point to a
particular open file,

545
00:23:10,090 --> 00:23:11,126
like maybe an open /etc/pwd.

546
00:23:14,350 --> 00:23:17,380
And now, the entry at
this slot on the table

547
00:23:17,380 --> 00:23:18,580
points to an open file.

548
00:23:18,580 --> 00:23:20,080
Some of them might
actually be null.

549
00:23:20,080 --> 00:23:23,260
Maybe you don't have an open
file with a particular index

550
00:23:23,260 --> 00:23:24,660
in this table.

551
00:23:24,660 --> 00:23:29,000
And as a result, what does it
mean to forge a capability?

552
00:23:29,000 --> 00:23:30,700
The only thing you
can do in user space

553
00:23:30,700 --> 00:23:32,460
is make up an integer.

554
00:23:32,460 --> 00:23:35,230
And the only integers that
would make sense to make up

555
00:23:35,230 --> 00:23:38,560
would be entries that point to
non-null entries in this table.

556
00:23:38,560 --> 00:23:42,910
And those guys are exactly the
capabilities that you have.

557
00:23:42,910 --> 00:23:45,620
So does that make sense why
it's difficult, in this file

558
00:23:45,620 --> 00:23:47,750
descriptor world, to
actually forge capabilities

559
00:23:47,750 --> 00:23:48,542
in the first place?

560
00:23:48,542 --> 00:23:49,708
So it's kind of cool, right?

561
00:23:49,708 --> 00:23:52,130
Like the only files that
you have opened are exactly

562
00:23:52,130 --> 00:23:53,420
the things you can operate on.

563
00:23:53,420 --> 00:23:56,740
And there's nothing else
that you can potentially

564
00:23:56,740 --> 00:23:59,996
touch and affect.

565
00:23:59,996 --> 00:24:00,820
Make sense?

566
00:24:00,820 --> 00:24:01,403
Any questions?

567
00:24:05,630 --> 00:24:06,610
All right.

568
00:24:06,610 --> 00:24:07,110
OK.

569
00:24:07,110 --> 00:24:09,990
So I guess, how
would capabilities

570
00:24:09,990 --> 00:24:12,540
help solve the ambient
authority problem

571
00:24:12,540 --> 00:24:14,820
that Norman Hardy is excited
about with his Fortran

572
00:24:14,820 --> 00:24:16,020
compiler?

573
00:24:16,020 --> 00:24:19,790
So what would be the file
descriptor moral equivalent

574
00:24:19,790 --> 00:24:22,600
solution to this
sysx/fort thing?

575
00:24:25,682 --> 00:24:27,140
Do they actually
solve the problem?

576
00:24:27,140 --> 00:24:28,016
Yeah?

577
00:24:28,016 --> 00:24:31,590
AUDIENCE: Well, they just use
the appropriate capabilities

578
00:24:31,590 --> 00:24:33,160
whenever they're needed.

579
00:24:33,160 --> 00:24:36,660
So when you have to access the
output file, in the statistics,

580
00:24:36,660 --> 00:24:39,378
you use the capability
[INAUDIBLE] file.

581
00:24:39,378 --> 00:24:42,320
But when you're accessing the
file you're about to read,

582
00:24:42,320 --> 00:24:44,714
you don't use that capability.

583
00:24:44,714 --> 00:24:45,380
PROFESSOR: Yeah.

584
00:24:45,380 --> 00:24:48,370
So I guess really what it
boils down to is that somehow

585
00:24:48,370 --> 00:24:51,560
the Fortran compiler should just
already have a file descriptor

586
00:24:51,560 --> 00:24:54,280
open for that /sysx/stat file.

587
00:24:54,280 --> 00:24:57,660
So they don't really describe,
in their short paper,

588
00:24:57,660 --> 00:24:59,950
how we'd
get that capability.

589
00:24:59,950 --> 00:25:02,340
But it basically means
you shouldn't really

590
00:25:02,340 --> 00:25:04,250
pass file names around.

591
00:25:04,250 --> 00:25:05,925
You should instead
pass file descriptors.

592
00:25:05,925 --> 00:25:08,270
So you could actually come
up with a perhaps much more

593
00:25:08,270 --> 00:25:12,540
elegant design for a Unix
replacement of the Fortran

594
00:25:12,540 --> 00:25:14,290
compiler using capabilities.

595
00:25:14,290 --> 00:25:19,530
So maybe the plan is we should
just have a Fortran compiler

596
00:25:19,530 --> 00:25:22,310
front end that doesn't
have any extra privileges,

597
00:25:22,310 --> 00:25:25,750
and it takes all these arguments
you give it, and converts

598
00:25:25,750 --> 00:25:30,340
all the path names you supply to
it into open file descriptors.

599
00:25:30,340 --> 00:25:33,540
So the alternative design
I am thinking of here

600
00:25:33,540 --> 00:25:36,160
is that maybe we'd
have a program

601
00:25:36,160 --> 00:25:38,200
fort1, which is the front end.

602
00:25:38,200 --> 00:25:40,345
And it would take some
sort of a file, foo.f,

603
00:25:40,345 --> 00:25:45,390
and all the other
arguments, -o, whatever.

604
00:25:45,390 --> 00:25:48,470
And it doesn't actually
implement any of the compiler

605
00:25:48,470 --> 00:25:50,020
logic, anything else.

606
00:25:50,020 --> 00:25:52,080
All it looks for is path
names in its arguments,

607
00:25:52,080 --> 00:25:54,870
and it's going to open
them and establish

608
00:25:54,870 --> 00:25:55,991
file descriptors for them.

609
00:25:56,471 --> 00:25:58,054
And the cool thing
is that, because it

610
00:25:58,054 --> 00:26:01,570
has no extra privileges, if
the user can't have access

611
00:26:01,570 --> 00:26:03,520
to some file name,
then it will fail.

612
00:26:03,520 --> 00:26:04,720
So that's great.

613
00:26:04,720 --> 00:26:07,280
And then once this front end
has opened all these file

614
00:26:07,280 --> 00:26:10,990
descriptors, it can execute
some privileged extra component,

615
00:26:10,990 --> 00:26:14,500
like the actual setuid
Fortran compiler.

616
00:26:14,500 --> 00:26:16,520
So maybe then it'll run fort.

617
00:26:16,520 --> 00:26:19,075
This guy's maybe setuid to
some special user ID that

618
00:26:19,075 --> 00:26:21,230
has access to the stats file.

619
00:26:21,230 --> 00:26:23,750
But it doesn't actually accept
any path names as input.

620
00:26:23,750 --> 00:26:27,250
All it's going to do is
accept file descriptors.

621
00:26:27,250 --> 00:26:29,550
And, in that case,
the file descriptor

622
00:26:29,550 --> 00:26:33,980
is already proof that the
caller had access to open them.

623
00:26:33,980 --> 00:26:35,845
Does the property make sense?

624
00:26:35,845 --> 00:26:37,800
So it of course doesn't
solve every issue.

625
00:26:37,800 --> 00:26:40,570
I'm just sort of sketching out
how capabilities might help.

626
00:26:40,570 --> 00:26:43,565
But that's roughly the plan,
is that you should demonstrate

627
00:26:43,565 --> 00:26:45,760
the fact that you have
access to a particular name

628
00:26:45,760 --> 00:26:49,190
by just opening it and passing
a capability, instead of saying,

629
00:26:49,190 --> 00:26:51,140
why don't you try
to open this file

630
00:26:51,140 --> 00:26:54,457
and maybe accidentally
use some extra privileges.

631
00:26:54,457 --> 00:26:54,956
Yes.

632
00:26:54,956 --> 00:26:56,354
AUDIENCE: So does
this generalize

633
00:26:56,354 --> 00:26:59,137
to having one process
per capability?

634
00:26:59,137 --> 00:27:00,470
PROFESSOR: Does this generalize?

635
00:27:00,470 --> 00:27:02,330
Well, of course you can have
as many processes as you want.

636
00:27:02,330 --> 00:27:04,288
You can have multiple
processes per capability,

637
00:27:04,288 --> 00:27:05,324
but I'm not sure--

638
00:27:05,324 --> 00:27:06,240
AUDIENCE: [INAUDIBLE].

639
00:27:12,930 --> 00:27:16,480
PROFESSOR: I'm still not sure
what you mean by one process.

640
00:27:16,480 --> 00:27:19,222
AUDIENCE: So we have [INAUDIBLE]
capabilities the user has.

641
00:27:19,801 --> 00:27:20,800
PROFESSOR: That's right.

642
00:27:20,800 --> 00:27:22,633
AUDIENCE: And then we
have the fort.s access

643
00:27:22,633 --> 00:27:24,211
to this stats file.

644
00:27:24,211 --> 00:27:25,210
PROFESSOR: That's right.

645
00:27:25,210 --> 00:27:25,470
Yeah.

646
00:27:25,470 --> 00:27:27,595
So the way to think of it
is, you don't necessarily

647
00:27:27,595 --> 00:27:31,516
need a separate process
for every capability.

648
00:27:31,516 --> 00:27:35,140
Because here, the fort1
thing might open many files

649
00:27:35,140 --> 00:27:38,590
and might pass many capabilities
to the privileged fort

650
00:27:38,590 --> 00:27:40,435
component.

651
00:27:40,435 --> 00:27:42,060
The problem here--
the reason that this

652
00:27:42,060 --> 00:27:44,030
might seem like you
want a separate process

653
00:27:44,030 --> 00:27:48,427
for every capability
is that we're

654
00:27:48,427 --> 00:27:51,010
sort of dealing with this weird
interface between capabilities

655
00:27:51,010 --> 00:27:52,450
and ambient privileges.

656
00:27:52,450 --> 00:27:54,780
Because fort1 sort of does
have ambient privilege.

657
00:27:54,780 --> 00:27:56,155
And what we're
doing is basically

658
00:27:56,155 --> 00:27:59,100
we're converting this ambient
privilege into capabilities

659
00:27:59,100 --> 00:28:00,890
in this fort1 process.

660
00:28:00,890 --> 00:28:02,580
So if you have multiple
different kinds

661
00:28:02,580 --> 00:28:05,035
of ambient privilege, or
multiple different privileges

662
00:28:05,035 --> 00:28:07,730
that you want to carefully
use, then maybe what you want

663
00:28:07,730 --> 00:28:10,320
is a separate process
holding that privilege.

664
00:28:10,320 --> 00:28:12,820
And whenever you want to use a
particular set of privileges,

665
00:28:12,820 --> 00:28:14,520
you'll ask the
corresponding process

666
00:28:14,520 --> 00:28:16,800
to please perform an operation.

667
00:28:16,800 --> 00:28:19,120
And if it succeeds, give
me back the capability.

668
00:28:19,120 --> 00:28:21,210
So that's maybe one
way to think of this.

669
00:28:24,000 --> 00:28:26,336
There's been actually some
operating system designs that

670
00:28:26,336 --> 00:28:30,770
are entirely capability-based,
there are no ambient privileges

671
00:28:30,770 --> 00:28:31,564
whatsoever.

672
00:28:31,564 --> 00:28:32,480
And it's kind of cool.

673
00:28:32,480 --> 00:28:35,961
Unfortunately, it's more of
sort of an interesting reading

674
00:28:35,961 --> 00:28:36,460
experience.

675
00:28:36,460 --> 00:28:37,905
Like oh, yeah, you can do it.

676
00:28:37,905 --> 00:28:38,920
That's pretty cool.

677
00:28:38,920 --> 00:28:42,680
But it's probably not
really practical to use

678
00:28:42,680 --> 00:28:45,540
in a real system, unfortunately.

679
00:28:45,540 --> 00:28:48,300
It turns out that you
really do want not so much

680
00:28:48,300 --> 00:28:51,200
ambient privilege but being
able to name an object

681
00:28:51,200 --> 00:28:53,960
and tell someone about an object
without conveying necessarily

682
00:28:53,960 --> 00:28:56,060
the rights to that object.

683
00:28:56,060 --> 00:28:57,670
So maybe I don't
know what privileges

684
00:28:57,670 --> 00:29:00,599
you might have over some
shared document, but I do

685
00:29:00,599 --> 00:29:02,890
want to tell you, hey, well,
there's a shared document.

686
00:29:02,890 --> 00:29:04,230
If you can read it, read it.

687
00:29:04,230 --> 00:29:05,605
If you can write it,
great, write it.

688
00:29:05,605 --> 00:29:07,830
But I don't want to
necessarily convey any rights.

689
00:29:07,830 --> 00:29:10,960
I just want to tell you, hey,
there's this thing, go try it.

690
00:29:10,960 --> 00:29:13,540
So it's a bit of a bummer
in a capability world

691
00:29:13,540 --> 00:29:16,930
that it really forces
you to never talk

692
00:29:16,930 --> 00:29:21,050
about objects without conveying
rights to that object.

693
00:29:21,050 --> 00:29:24,910
So it's an important
idea to know about,

694
00:29:24,910 --> 00:29:27,240
and to use it in some
parts of a system,

695
00:29:27,240 --> 00:29:29,639
but probably not the
be all end all solution

696
00:29:29,639 --> 00:29:31,930
to security, much like almost
anything else [INAUDIBLE]

697
00:29:31,930 --> 00:29:33,419
about here.

698
00:29:33,419 --> 00:29:33,918
Make sense?

699
00:29:33,918 --> 00:29:34,810
Yeah?

700
00:29:34,810 --> 00:29:37,720
AUDIENCE: So if the process
has capabilities given to it

701
00:29:37,720 --> 00:29:40,811
by some other process,
and it happens

702
00:29:40,811 --> 00:29:43,395
to already have the capability
to that object, that's greater.

703
00:29:43,395 --> 00:29:45,269
Can it compare them to
make sure that they're

704
00:29:45,269 --> 00:29:46,629
about the same object?

705
00:29:46,629 --> 00:29:48,420
Or will it just use
the one that's greater?

706
00:29:48,420 --> 00:29:50,919
PROFESSOR: So the thing is that
a process doesn't implicitly

707
00:29:50,919 --> 00:29:51,794
use the capabilities.

708
00:29:51,794 --> 00:29:53,627
So that's the cool thing
about capabilities.

709
00:29:53,627 --> 00:29:55,760
You have to explicitly name
which one you're using.

710
00:29:55,760 --> 00:29:57,680
So think of it in terms
of file descriptors.

711
00:29:57,680 --> 00:30:01,820
Suppose that I give you an open
file descriptor for some file,

712
00:30:01,820 --> 00:30:02,807
and it's read only.

713
00:30:02,807 --> 00:30:04,890
And then someone else gives
you another capability

714
00:30:04,890 --> 00:30:07,431
for some other-- maybe the same
file, maybe a different file,

715
00:30:07,431 --> 00:30:08,760
and it's read/write.

716
00:30:08,760 --> 00:30:10,390
It's not all of a
sudden that if you're

717
00:30:10,390 --> 00:30:12,869
trying to write to the
first file descriptor

718
00:30:12,869 --> 00:30:14,660
you had that was read
only, all of a sudden

719
00:30:14,660 --> 00:30:16,390
those will start
succeeding because you

720
00:30:16,390 --> 00:30:19,270
have this extra writeable
file descriptor open.

721
00:30:19,270 --> 00:30:21,407
So that's sort of
the cool thing.

722
00:30:21,407 --> 00:30:22,990
You don't want this
ambient privilege.

723
00:30:22,990 --> 00:30:24,920
Because if you think
of these capabilities

724
00:30:24,920 --> 00:30:27,245
as a bunch of privileges
that just keep accumulating

725
00:30:27,245 --> 00:30:29,190
in your process, then
you'll actually just

726
00:30:29,190 --> 00:30:30,690
end up with ambient
privilege again.

727
00:30:30,690 --> 00:30:32,849
You just have all these
magic capabilities,

728
00:30:32,849 --> 00:30:34,765
and people have actually
built such libraries.

729
00:30:34,765 --> 00:30:37,197
Basically, well, they manage
your capabilities for you.

730
00:30:37,197 --> 00:30:38,280
They sort of collect them.

731
00:30:38,280 --> 00:30:39,680
And when you try to
perform an operation,

732
00:30:39,680 --> 00:30:40,670
they look for the
capabilities and find

733
00:30:40,670 --> 00:30:42,250
the one that'll make it work.

734
00:30:42,250 --> 00:30:44,500
That exactly brings you back
to this ambient authority

735
00:30:44,500 --> 00:30:45,890
that you were trying to avoid.

736
00:30:45,890 --> 00:30:47,390
So the cool thing
about capabilities

737
00:30:47,390 --> 00:30:50,670
is that it's almost like
a programming construct,

738
00:30:50,670 --> 00:30:52,875
where it makes it
easy for you-- which

739
00:30:52,875 --> 00:30:54,875
is a rare thing in
security-- it makes it easier

740
00:30:54,875 --> 00:30:56,950
for you to write
code that specifies

741
00:30:56,950 --> 00:30:59,200
exactly what privileges you
want to use from a security

742
00:30:59,200 --> 00:30:59,700
standpoint.

743
00:30:59,700 --> 00:31:02,570
And it's actually a fairly
natural code to write.

744
00:31:02,570 --> 00:31:05,280
So if you get into that mindset
of always carrying around

745
00:31:05,280 --> 00:31:07,450
this privilege with the
object you're accessing,

746
00:31:07,450 --> 00:31:09,210
it seems like a
cool thing to do.

747
00:31:09,210 --> 00:31:12,750
It doesn't always make
sense, but sometimes it does.

748
00:31:12,750 --> 00:31:16,070
Any other questions?

749
00:31:16,070 --> 00:31:16,640
OK.

750
00:31:16,640 --> 00:31:20,150
So that's more on
the ambient authority

751
00:31:20,150 --> 00:31:21,730
that we've looked at here.

752
00:31:21,730 --> 00:31:23,640
It turns out that
capabilities are also

753
00:31:23,640 --> 00:31:26,100
great for other
problems, as well.

754
00:31:26,100 --> 00:31:30,000
And in particular, the
problem of managing privileges

755
00:31:30,000 --> 00:31:33,700
often shows up when you want
to run some untrustworthy code.

756
00:31:33,700 --> 00:31:35,370
Because you want
to really control

757
00:31:35,370 --> 00:31:37,280
which privileges you
give it, because you

758
00:31:37,280 --> 00:31:40,590
think it will misuse any
privileges you give it at all.

759
00:31:40,590 --> 00:31:44,150
And this is the slightly
different point of view

760
00:31:44,150 --> 00:31:46,960
from which the authors
of the Capsicum paper

761
00:31:46,960 --> 00:31:50,640
are coming at capabilities.

762
00:31:50,640 --> 00:31:53,575
So they're of course clearly
aware of this ambient authority

763
00:31:53,575 --> 00:31:55,450
problem, but it's sort
of a different problem

764
00:31:55,450 --> 00:31:57,720
that you might or might
not care about solving.

765
00:31:57,720 --> 00:32:00,960
But the particular thing
they really care about

766
00:32:00,960 --> 00:32:04,776
is they have a really large
privileged application,

767
00:32:04,776 --> 00:32:06,150
and they worry
that there's going

768
00:32:06,150 --> 00:32:10,480
to be bugs in different parts
of that application source code.

769
00:32:10,480 --> 00:32:12,900
So they would like to
reduce the privileges

770
00:32:12,900 --> 00:32:16,380
of different components
of that application.

771
00:32:16,380 --> 00:32:20,480
So in that sense, the story
is very similar to OKWS.

772
00:32:20,480 --> 00:32:24,459
So you have-- for
sandboxing, you

773
00:32:24,459 --> 00:32:27,000
have some large application,
you break it up into components,

774
00:32:27,000 --> 00:32:30,270
and you will limit what
privileges each component has.

775
00:32:30,270 --> 00:32:31,520
So where does this make sense?

776
00:32:31,520 --> 00:32:34,140
Like OKWS is
clearly one example.

777
00:32:34,140 --> 00:32:36,010
What are other situations
where you might

778
00:32:36,010 --> 00:32:40,280
care about privilege separation?

779
00:32:40,280 --> 00:32:43,707
Well, I guess in the paper
they describe the examples they

780
00:32:43,707 --> 00:32:44,540
actually got to run.

781
00:32:44,540 --> 00:32:48,320
So things like tcpdump
and other applications

782
00:32:48,320 --> 00:32:50,285
that parse network data.

783
00:32:50,285 --> 00:32:53,890
So why do they worry so
much about applications

784
00:32:53,890 --> 00:32:56,000
that parse network inputs?

785
00:32:56,000 --> 00:32:57,580
What goes wrong in tcpdump?

786
00:32:57,580 --> 00:32:58,656
Why are they so paranoid?

787
00:32:58,656 --> 00:33:01,036
AUDIENCE: Well, an attacker
can control what's being sent

788
00:33:01,036 --> 00:33:01,988
and what's being called.

789
00:33:01,988 --> 00:33:02,470
PROFESSOR: Yeah.

790
00:33:02,470 --> 00:33:04,020
I think what they
really worry about is,

791
00:33:04,020 --> 00:33:06,603
very much like with OKWS, they
worry about that attack surface

792
00:33:06,603 --> 00:33:08,900
and how much can an attacker
really control the inputs?

793
00:33:08,900 --> 00:33:11,970
And with these network
parsing programs,

794
00:33:11,970 --> 00:33:14,698
there's a lot of control
that the attacker has.

795
00:33:14,698 --> 00:33:16,100
They have the exact packet.

796
00:33:16,100 --> 00:33:18,355
And the reason that
this was so problematic

797
00:33:18,355 --> 00:33:21,400
is that if you're
writing code in C that

798
00:33:21,400 --> 00:33:23,920
has to parse data
structures, you're presumably

799
00:33:23,920 --> 00:33:26,100
going to do lots of
pointer manipulations,

800
00:33:26,100 --> 00:33:28,830
copying bites into
arrays, allocating memory.

801
00:33:28,830 --> 00:33:32,450
And as you are now experts,
this is super fragile.

802
00:33:32,450 --> 00:33:34,875
And you can easily have
memory management errors

803
00:33:34,875 --> 00:33:38,155
that lead to pretty
disastrous consequences.

804
00:33:38,155 --> 00:33:39,530
So this is the
reason why they're

805
00:33:39,530 --> 00:33:43,990
very excited about sandboxing
various network protocol

806
00:33:43,990 --> 00:33:45,790
parsing things, et cetera.

807
00:33:45,790 --> 00:33:47,850
Another probably
real world instance

808
00:33:47,850 --> 00:33:50,070
where you really care about
this is in your browser.

809
00:33:50,070 --> 00:33:52,070
You probably want to
sandbox your Flash plug-in,

810
00:33:52,070 --> 00:33:54,960
or your Java
extension, or whatnot.

811
00:33:54,960 --> 00:33:56,570
Because they're
pretty large attack

812
00:33:56,570 --> 00:33:58,430
surfaces as well
that have gotten

813
00:33:58,430 --> 00:34:01,352
exploited pretty aggressively.

814
00:34:01,352 --> 00:34:02,810
So it seems like
a reasonable plan.

815
00:34:02,810 --> 00:34:04,726
Like if you're writing
some piece of software,

816
00:34:04,726 --> 00:34:06,980
you want to sandbox
different components of it.

817
00:34:06,980 --> 00:34:08,790
What about more generally,
if you download something

818
00:34:08,790 --> 00:34:10,498
from the internet,
and you want to run it

819
00:34:10,498 --> 00:34:12,889
with fewer privileges?

820
00:34:12,889 --> 00:34:16,989
Is this sort of Capsicum style
isolation a good plan for that?

821
00:34:16,989 --> 00:34:19,500
I could download some random
screensaver or some game

822
00:34:19,500 --> 00:34:20,290
from the internet.

823
00:34:20,290 --> 00:34:21,590
And I want to run
it on my computer,

824
00:34:21,590 --> 00:34:23,381
and I want to make sure
it doesn't screw up

825
00:34:23,381 --> 00:34:24,690
whatever I have laying around.

826
00:34:27,802 --> 00:34:28,760
Would you use Capsicum?

827
00:34:28,760 --> 00:34:31,588
Would this be a good plan?

828
00:34:31,588 --> 00:34:33,046
Yeah?

829
00:34:33,046 --> 00:34:35,476
AUDIENCE: You could write
a sandboxing program,

830
00:34:35,476 --> 00:34:38,878
which you'd use Capsicum
to sandbox [INAUDIBLE].

831
00:34:42,652 --> 00:34:43,360
PROFESSOR: Right.

832
00:34:43,360 --> 00:34:44,900
You could try to use Capsicum.

833
00:34:44,900 --> 00:34:46,150
So how would you use Capsicum?

834
00:34:46,150 --> 00:34:49,380
Well, you'd just enter into the
sandbox mode with cap_enter.

835
00:34:49,380 --> 00:34:53,330
And then you run the program.

836
00:34:53,330 --> 00:34:54,514
Would you expect it to work?

837
00:34:56,887 --> 00:34:59,220
I guess the problem is that
if the program wasn't really

838
00:34:59,220 --> 00:35:01,155
expecting to be
sandboxed with Capsicum,

839
00:35:01,155 --> 00:35:04,920
then all of a sudden the
program will try to open any

840
00:35:04,920 --> 00:35:07,460
simplified-- it'll
open a shared library,

841
00:35:07,460 --> 00:35:09,430
and it can't open
the shared library,

842
00:35:09,430 --> 00:35:11,570
because it can't
open /lib/ something else.

843
00:35:11,570 --> 00:35:13,810
That's not allowed
in capability mode.

844
00:35:13,810 --> 00:35:16,790
So it's a bit of a problem.

845
00:35:16,790 --> 00:35:18,800
So typically, these
sandboxing techniques

846
00:35:18,800 --> 00:35:21,685
that we're going to look at
here-- capability-style

847
00:35:21,685 --> 00:35:24,850
stuff, and so on--
really are best

848
00:35:24,850 --> 00:35:27,400
used when the developer
is sort of building

849
00:35:27,400 --> 00:35:30,110
the application aware
that the code is

850
00:35:30,110 --> 00:35:31,882
going to run in this mode.

851
00:35:31,882 --> 00:35:34,260
There's probably other kinds
of sandboxing techniques

852
00:35:34,260 --> 00:35:36,550
that could be used
for unmodified code,

853
00:35:36,550 --> 00:35:40,270
but then the focus, or the
requirements, change a bit.

854
00:35:40,270 --> 00:35:42,410
So in Capsicum,
they don't really

855
00:35:42,410 --> 00:35:43,910
worry about backwards
compatibility.

856
00:35:43,910 --> 00:35:45,320
Well, we have to open
files differently?

857
00:35:45,320 --> 00:35:46,770
Sure, we'll open
them differently.

858
00:35:46,770 --> 00:35:48,820
Whereas, if you want
to run existing code,

859
00:35:48,820 --> 00:35:51,330
you probably want
something more like maybe

860
00:35:51,330 --> 00:35:52,450
a full virtual machine.

861
00:35:52,450 --> 00:35:55,040
So you could open a
VM and run it there.

862
00:35:55,040 --> 00:35:58,400
And it's very
compatible, and there's

863
00:35:58,400 --> 00:36:03,440
no question that it'll just
run, and probably not--

864
00:36:03,440 --> 00:36:07,060
Well, it's actually a
good thought exercise.

865
00:36:07,060 --> 00:36:11,970
Should we use virtual machines
to sandbox instead of Capsicum?

866
00:36:11,970 --> 00:36:12,886
AUDIENCE: [INAUDIBLE].

867
00:36:12,886 --> 00:36:13,690
PROFESSOR: Yeah.

868
00:36:13,690 --> 00:36:16,510
The overheads are probably
quite significant.

869
00:36:16,510 --> 00:36:20,715
So the memory overhead
is pretty bad.

870
00:36:20,715 --> 00:36:21,325
It could be.

871
00:36:21,325 --> 00:36:22,900
But what if we don't care
about memory overhead?

872
00:36:22,900 --> 00:36:24,691
So maybe virtual machines
get really good,

873
00:36:24,691 --> 00:36:28,080
and they don't actually
use that much memory.

874
00:36:28,080 --> 00:36:30,210
Is it still a bad plan?

875
00:36:30,210 --> 00:36:32,708
AUDIENCE: [INAUDIBLE].

876
00:36:32,708 --> 00:36:33,374
PROFESSOR: Yeah.

877
00:36:33,374 --> 00:36:37,160
So it's kind of hard to control
what happens on the network,

878
00:36:37,160 --> 00:36:40,150
because either you give the
virtual machine no access

879
00:36:40,150 --> 00:36:42,570
to the network at all, or
you connect to a network

880
00:36:42,570 --> 00:36:45,800
through NAT mode or something
in Preview or VMware.

881
00:36:45,800 --> 00:36:47,550
And then it can access
the whole internet.

882
00:36:47,550 --> 00:36:52,652
So you have to much more
explicitly control network

883
00:36:52,652 --> 00:36:55,110
by maybe setting up firewall
rules for the virtual machine,

884
00:36:55,110 --> 00:36:55,797
et cetera.

885
00:36:55,797 --> 00:36:56,880
That's maybe not so great.

886
00:36:56,880 --> 00:36:58,890
What if you don't
care about network?

887
00:36:58,890 --> 00:37:04,240
What if you're some simple
video or tcpdump parser.

888
00:37:04,240 --> 00:37:05,260
You just spin up a VM.

889
00:37:05,260 --> 00:37:07,000
It's going to parse
your tcpdump packets

890
00:37:07,000 --> 00:37:09,490
and spit you back
a textual representation

891
00:37:09,490 --> 00:37:11,850
that tcpdump wants
to print to the user.

892
00:37:11,850 --> 00:37:14,190
So there's no real
network I/O. Maybe you're,

893
00:37:14,190 --> 00:37:20,820
for some reason
[INAUDIBLE] still?

894
00:37:20,820 --> 00:37:23,340
AUDIENCE: Because the
initialization overhead

895
00:37:23,340 --> 00:37:24,656
is still large.

896
00:37:24,656 --> 00:37:25,490
PROFESSOR: Yeah.

897
00:37:25,490 --> 00:37:27,823
So it's maybe like an initial
overhead of starting a VM.

898
00:37:27,823 --> 00:37:28,620
So that's true.

899
00:37:28,620 --> 00:37:32,030
There's some performance stuff.

900
00:37:32,030 --> 00:37:32,530
Yeah.

901
00:37:32,530 --> 00:37:34,780
AUDIENCE: Well, you might
want to have database writes

902
00:37:34,780 --> 00:37:35,762
and things like that.

903
00:37:35,762 --> 00:37:36,200
PROFESSOR: Yeah.

904
00:37:36,200 --> 00:37:38,158
But even more generally,
what you're getting at

905
00:37:38,158 --> 00:37:41,140
is what if there's real
data that you care about here?

906
00:37:41,140 --> 00:37:42,840
And it's really hard to share.

907
00:37:42,840 --> 00:37:45,990
So VMs are really
a much more sort

908
00:37:45,990 --> 00:37:50,040
of separation mechanism, where
you can't really share stuff

909
00:37:50,040 --> 00:37:51,970
across VMs very easily.

910
00:37:51,970 --> 00:37:53,640
So it's good for
situations where

911
00:37:53,640 --> 00:37:57,090
you have a very isolated program
you want to run, you basically

912
00:37:57,090 --> 00:37:59,470
don't want to share any
files, any directories,

913
00:37:59,470 --> 00:38:01,830
any processes, any pipes even.

914
00:38:01,830 --> 00:38:03,640
And you just let
it run separately.

915
00:38:03,640 --> 00:38:04,290
So it's great.

916
00:38:04,290 --> 00:38:07,340
It's probably, in some ways,
stronger isolation than what

917
00:38:07,340 --> 00:38:10,340
Capsicum provides, because
there's probably fewer

918
00:38:10,340 --> 00:38:12,865
ways for things to go wrong.

919
00:38:12,865 --> 00:38:14,240
And, you know,
all these problems

920
00:38:14,240 --> 00:38:15,640
we talked about so far.

921
00:38:15,640 --> 00:38:18,189
But it's also not applicable
in many of the situations

922
00:38:18,189 --> 00:38:19,730
where you might want
to use Capsicum,

923
00:38:19,730 --> 00:38:21,880
because in Capsicum,
you can actually

924
00:38:21,880 --> 00:38:26,645
share files at very fine
granularity between sandbox

925
00:38:26,645 --> 00:38:30,342
[INAUDIBLE] by just giving it
capability to [INAUDIBLE] file.

926
00:38:30,342 --> 00:38:32,550
This is something that's
very easy to do in Capsicum,

927
00:38:32,550 --> 00:38:35,220
and would require quite
a bit of machinery

928
00:38:35,220 --> 00:38:37,280
in a virtual machine setting.

929
00:38:37,280 --> 00:38:40,720
That makes sense?

930
00:38:40,720 --> 00:38:43,200
Questions?

931
00:38:43,200 --> 00:38:44,330
All right.

932
00:38:44,330 --> 00:38:47,600
So does that seem like
a useful primitive

933
00:38:47,600 --> 00:38:49,340
to have to maybe sandbox stuff.

934
00:38:49,340 --> 00:38:53,040
So I guess we're going to
talk about different ways

935
00:38:53,040 --> 00:38:54,900
to try to sandbox something.

936
00:38:54,900 --> 00:38:58,060
And Capsicum in particular
is the new thing here

937
00:38:58,060 --> 00:38:59,270
that uses capabilities.

938
00:38:59,270 --> 00:39:05,810
But just by comparison,
I guess, you

939
00:39:05,810 --> 00:39:08,350
can do some sandboxing in
Unix, as we saw with OKWS.

940
00:39:08,350 --> 00:39:08,850
Right?

941
00:39:08,850 --> 00:39:13,170
It's just not great from
several standpoints.

942
00:39:13,170 --> 00:39:17,860
So let's maybe take
the example of tcpdump

943
00:39:17,860 --> 00:39:24,530
and see why tcpdump is difficult
to sandbox with Unix mechanisms.

944
00:39:24,530 --> 00:39:27,880
So remember, in the Capsicum
paper, these guys took tcpdump.

945
00:39:27,880 --> 00:39:32,570
And the way tcpdump
works is that it

946
00:39:32,570 --> 00:39:39,080
opens some special sockets and
then runs basically parsing

947
00:39:39,080 --> 00:39:41,010
logic on network packets.

948
00:39:41,010 --> 00:39:44,860
And it proceeds and prints them
out to the user's terminal.

949
00:39:44,860 --> 00:39:51,180
So what would it take to sandbox
tcpdump with Unix primitives?

950
00:39:51,180 --> 00:39:54,066
How do you restrict privileges?

951
00:39:54,066 --> 00:39:55,870
So I guess the one
problem with Unix

952
00:39:55,870 --> 00:39:59,300
is that you basically have
to-- well, the only way

953
00:39:59,300 --> 00:40:01,890
to really change
privileges is to change

954
00:40:01,890 --> 00:40:04,152
the inputs into the
decision function that

955
00:40:04,152 --> 00:40:06,610
decides whether you can actually
access some object or not.

956
00:40:06,610 --> 00:40:09,160
And the only things
you can really change

957
00:40:09,160 --> 00:40:11,860
are, well, you can change
the privileges of the process,

958
00:40:11,860 --> 00:40:14,300
which means setting the
UID to something else.

959
00:40:14,300 --> 00:40:15,800
Or you could change
the permissions

960
00:40:15,800 --> 00:40:21,510
on various objects that are
laying around in your system.

961
00:40:21,510 --> 00:40:23,330
Or probably both,
in fact, right?

962
00:40:23,330 --> 00:40:25,110
If you wanted to
sandbox tcpdump,

963
00:40:25,110 --> 00:40:27,850
you'd probably have to
pick some extra user ID

964
00:40:27,850 --> 00:40:31,612
and switch to that
while you're running.

965
00:40:31,612 --> 00:40:36,660
Probably not an ideal
plan, because you probably

966
00:40:36,660 --> 00:40:39,340
don't mean for multiple
instances of tcpdump

967
00:40:39,340 --> 00:40:41,049
to run as the same user ID.

968
00:40:41,049 --> 00:40:42,840
So if I compromise one
instance of tcpdump,

969
00:40:42,840 --> 00:40:45,307
it doesn't really mean I
want to allow that attacker

970
00:40:45,307 --> 00:40:47,515
to now control the other
instances of tcpdump running

971
00:40:47,515 --> 00:40:49,070
on my machine.

972
00:40:49,070 --> 00:40:53,614
So that's potentially a bad
part of using user IDs here.

973
00:40:53,614 --> 00:40:55,530
Another problem is that,
in Unix, you actually

974
00:40:55,530 --> 00:40:58,924
have to be root in
order to change the user

975
00:40:58,924 --> 00:41:01,215
ID of the process or something
else, or user privileges

976
00:41:01,215 --> 00:41:03,200
or switch them to
something else.

977
00:41:03,200 --> 00:41:05,060
That's not great either.

978
00:41:05,060 --> 00:41:08,080
And another problem
is that, regardless

979
00:41:08,080 --> 00:41:11,700
of what your user ID
is, there could be files

980
00:41:11,700 --> 00:41:13,830
that allow access to them.

981
00:41:13,830 --> 00:41:16,760
So there could be world
writable or world readable files

982
00:41:16,760 --> 00:41:17,800
in your file system.

983
00:41:17,800 --> 00:41:19,730
Like your /etc/passwd file.

984
00:41:19,730 --> 00:41:22,370
Regardless of what your
UID is, the process

985
00:41:22,370 --> 00:41:24,420
will still be able to
read that password file.

986
00:41:24,420 --> 00:41:26,070
So that's not so nice.

987
00:41:26,070 --> 00:41:29,850
So as a result, in order
to sandbox in Unix,

988
00:41:29,850 --> 00:41:36,257
you probably have to do both--
some UID changing and maybe

989
00:41:36,257 --> 00:41:38,340
a careful look at the
permissions of all the objects

990
00:41:38,340 --> 00:41:40,507
to convince yourself that
there's no world writeable

991
00:41:40,507 --> 00:41:41,714
file that's really sensitive.

992
00:41:41,714 --> 00:41:43,130
Or there's no
world readable file

993
00:41:43,130 --> 00:41:45,742
that you don't want that
hacker to get access to.

994
00:41:45,742 --> 00:41:48,200
And I guess [INAUDIBLE] true
that you get another mechanism

995
00:41:48,200 --> 00:41:49,530
in Unix that you can use.

996
00:41:49,530 --> 00:41:50,920
But it all starts to add up.

997
00:41:50,920 --> 00:41:52,420
If you chroot,
then it might

998
00:41:52,420 --> 00:41:56,681
be hard to share files or
share directories and so on.

999
00:41:56,681 --> 00:41:57,680
So does that make sense?

1000
00:41:57,680 --> 00:42:00,466
Just in terms of
contrast for what

1001
00:42:00,466 --> 00:42:02,393
Capsicum is trying to solve?

1002
00:42:02,393 --> 00:42:06,160
Any questions about Unix stuff?

1003
00:42:06,160 --> 00:42:06,950
All right.

1004
00:42:06,950 --> 00:42:10,650
So let's look at how Capsicum
tries to solve this problem.

1005
00:42:10,650 --> 00:42:13,680
So in Capsicum, as
we keep alluding to,

1006
00:42:13,680 --> 00:42:18,330
the plan is very much that once
you enter the sandboxing mode,

1007
00:42:18,330 --> 00:42:20,879
everything is going
to be accessed only

1008
00:42:20,879 --> 00:42:21,670
through capabilities.

1009
00:42:21,670 --> 00:42:23,490
So if you don't
have a capability,

1010
00:42:23,490 --> 00:42:27,610
you simply cannot
access any objects.

1011
00:42:27,610 --> 00:42:32,000
So these guys, in the
paper, make a huge deal

1012
00:42:32,000 --> 00:42:34,870
about global namespaces.

1013
00:42:34,870 --> 00:42:37,720
So what's this thing
about a global namespace,

1014
00:42:37,720 --> 00:42:39,460
and why are they so
worried about it?

1015
00:42:43,155 --> 00:42:44,780
What's an example of
a global namespace

1016
00:42:44,780 --> 00:42:47,379
these guys worry about?

1017
00:42:47,379 --> 00:42:48,534
AUDIENCE: [INAUDIBLE].

1018
00:42:48,534 --> 00:42:49,200
PROFESSOR: Yeah.

1019
00:42:49,200 --> 00:42:51,634
So a file system for them
is sort of the prime example

1020
00:42:51,634 --> 00:42:52,550
of a global namespace.

1021
00:42:52,550 --> 00:42:55,420
You can start at slash, and
you can basically enumerate

1022
00:42:55,420 --> 00:42:56,690
any file you could, right?

1023
00:42:56,690 --> 00:42:59,450
Like go to someone's
home directory--

1024
00:42:59,450 --> 00:43:03,748
/home/nickolai/
something, something.

1025
00:43:03,748 --> 00:43:04,860
Why is this bad?

1026
00:43:04,860 --> 00:43:08,470
Why are they against global
namespaces in Capsicum?

1027
00:43:14,350 --> 00:43:15,100
What do you think?

1028
00:43:15,100 --> 00:43:15,460
Yeah?

1029
00:43:15,460 --> 00:43:17,543
AUDIENCE: Well, if you
have the wrong permissions,

1030
00:43:17,543 --> 00:43:20,534
then use authorities, and
then you can get in trouble.

1031
00:43:20,534 --> 00:43:21,200
PROFESSOR: Yeah.

1032
00:43:21,200 --> 00:43:23,116
So the problem is that
this is Unix after all.

1033
00:43:23,116 --> 00:43:27,370
So there are still regular
permissions on files.

1034
00:43:27,370 --> 00:43:29,790
So maybe you really want
to sandbox some process

1035
00:43:29,790 --> 00:43:31,804
so it can't read anything
at all in the system

1036
00:43:31,804 --> 00:43:32,970
and can't write to anything.

1037
00:43:32,970 --> 00:43:35,530
But if you can name a file
starting from slash,

1038
00:43:35,530 --> 00:43:38,060
you'll find some stupid user
that has a world writable

1039
00:43:38,060 --> 00:43:39,970
file in their home directory.

1040
00:43:39,970 --> 00:43:43,874
And that would be not so great
for the sandboxing plan.

1041
00:43:43,874 --> 00:43:46,290
And I guess more generally,
the way they're thinking of it

1042
00:43:46,290 --> 00:43:50,430
is that, with capabilities, you
could, in principle, enumerate

1043
00:43:50,430 --> 00:43:53,122
exactly all the objects
that a process has.

1044
00:43:53,122 --> 00:43:56,030
Because you could just
enumerate all the capabilities

1045
00:43:56,030 --> 00:43:58,350
in the file descriptor table,
or whatever it is that's

1046
00:43:58,350 --> 00:44:00,250
storing capabilities for you.

1047
00:44:00,250 --> 00:44:03,970
And those are the only things
that the process could ever

1048
00:44:03,970 --> 00:44:05,734
touch.

1049
00:44:05,734 --> 00:44:07,900
And if you ever have access
to a global namespace,

1050
00:44:07,900 --> 00:44:09,090
then this is
potentially unbounded.

1051
00:44:09,090 --> 00:44:10,540
Because you could--
even if you have

1052
00:44:10,540 --> 00:44:11,920
some limited set
of capabilities,

1053
00:44:11,920 --> 00:44:14,850
maybe you'll start from slash
again and find some new file,

1054
00:44:14,850 --> 00:44:16,510
and you'll never
really know what

1055
00:44:16,510 --> 00:44:19,745
is the set of
operations or objects

1056
00:44:19,745 --> 00:44:22,120
that a process could access.

1057
00:44:22,120 --> 00:44:25,370
So this is the reason they're so
worried about global namespaces

1058
00:44:25,370 --> 00:44:28,775
because it goes against their
goal of precisely controlling

1059
00:44:28,775 --> 00:44:33,880
all the things that a sandboxed
process should have access to.

1060
00:44:33,880 --> 00:44:36,440
Make sense?

1061
00:44:36,440 --> 00:44:37,590
All right.

1062
00:44:37,590 --> 00:44:39,850
So they tried to eliminate
global namespaces

1063
00:44:39,850 --> 00:44:44,590
with a bunch of kernel changes
to the FreeBSD, in their case,

1064
00:44:44,590 --> 00:44:47,960
kernel to make sure that
all the operations go

1065
00:44:47,960 --> 00:44:52,220
through some kind of capability,
which is, in their case,

1066
00:44:52,220 --> 00:44:54,190
a file descriptor.

1067
00:44:54,190 --> 00:44:57,800
So just to double check, do
we really need kernel changes?

1068
00:44:57,800 --> 00:45:00,350
What if we just do
this in a library?

1069
00:45:00,350 --> 00:45:03,040
So we implement Capsicum, which
they already have a library.

1070
00:45:03,040 --> 00:45:05,700
And all we do is we change
all these functions,

1071
00:45:05,700 --> 00:45:08,590
like open, read, and write,
to all exclusively use

1072
00:45:08,590 --> 00:45:09,927
capabilities.

1073
00:45:09,927 --> 00:45:12,010
So all operations will go
through some capability,

1074
00:45:12,010 --> 00:45:16,193
and look it up in the
file table, et cetera.

1075
00:45:16,193 --> 00:45:17,140
Does that work?

1076
00:45:17,140 --> 00:45:17,640
Yeah?

1077
00:45:17,640 --> 00:45:19,730
AUDIENCE: You could
always make a sys call.

1078
00:45:19,730 --> 00:45:20,010
PROFESSOR: Yeah.

1079
00:45:20,010 --> 00:45:22,551
So the problem is that there
is this existing set of system

1080
00:45:22,551 --> 00:45:23,866
calls the kernel will accept.

1081
00:45:23,866 --> 00:45:25,866
And even if you
implement a nice library,

1082
00:45:25,866 --> 00:45:28,240
it doesn't prevent a bad
process or a compromised process

1083
00:45:28,240 --> 00:45:29,656
from making the
sys call directly.

1084
00:45:29,656 --> 00:45:32,270
And then you have to
have the kernel enforce

1085
00:45:32,270 --> 00:45:33,786
something or other.

1086
00:45:33,786 --> 00:45:34,286
Yeah?

1087
00:45:34,286 --> 00:45:36,724
AUDIENCE: [INAUDIBLE].

1088
00:45:36,724 --> 00:45:37,390
PROFESSOR: Yeah.

1089
00:45:37,390 --> 00:45:39,247
So I think it's a
question of-- I guess

1090
00:45:39,247 --> 00:45:40,330
what is your threat model?

1091
00:45:40,330 --> 00:45:40,830
Exactly.

1092
00:45:40,830 --> 00:45:42,580
So for the compiler,
the threat model

1093
00:45:42,580 --> 00:45:47,230
is that the programmer is
maybe not paying attention

1094
00:45:47,230 --> 00:45:50,240
a whole lot, but it's not really
a compromised compiler process,

1095
00:45:50,240 --> 00:45:51,710
not an arbitrary code.

1096
00:45:51,710 --> 00:45:54,750
So if we just help the
well-meaning developer do

1097
00:45:54,750 --> 00:45:58,590
the right thing, then a
library will probably suffice.

1098
00:45:58,590 --> 00:46:00,990
On the other hand, if we're
talking about a process that

1099
00:46:00,990 --> 00:46:03,110
could be executing
arbitrary code

1100
00:46:03,110 --> 00:46:05,610
and could be trying to
bypass our mechanisms

1101
00:46:05,610 --> 00:46:07,210
in any possible
way, then we have

1102
00:46:07,210 --> 00:46:09,370
to have a strong
enforcement boundary.

1103
00:46:09,370 --> 00:46:12,160
And a library doesn't provide
any kind of strong enforcement

1104
00:46:12,160 --> 00:46:12,660
guarantees.

1105
00:46:12,660 --> 00:46:16,311
Whereas a kernel, in
our case, would do that.

1106
00:46:16,311 --> 00:46:16,810
OK.

1107
00:46:16,810 --> 00:46:20,805
So what do they actually make in
terms of changes to the kernel?

1108
00:46:20,805 --> 00:46:25,270
So I guess the first
thing is this system call

1109
00:46:25,270 --> 00:46:26,780
that they call cap_enter.

1110
00:46:30,750 --> 00:46:33,049
And what happens once
you run cap_enter?

1111
00:46:33,049 --> 00:46:35,215
Once you've [INAUDIBLE]
cap_enter from your process?

1112
00:46:38,309 --> 00:46:39,850
So as far as I can
tell, what happens

1113
00:46:39,850 --> 00:46:44,950
is that the kernel will stop
accepting any system calls that

1114
00:46:44,950 --> 00:46:47,635
refer to global namespaces.

1115
00:46:47,635 --> 00:46:49,260
And the only thing
you'll be able to do

1116
00:46:49,260 --> 00:46:52,650
is refer to existing
file descriptors

1117
00:46:52,650 --> 00:46:54,810
that you have open
in your process.

1118
00:46:54,810 --> 00:46:58,340
So cap_enter will put your
process in a special mode where

1119
00:46:58,340 --> 00:47:02,265
you cannot use the regular
system called open,

1120
00:47:02,265 --> 00:47:06,059
and instead you have to
do things like openat.

1121
00:47:06,059 --> 00:47:07,475
So there's this
new sort of family

1122
00:47:07,475 --> 00:47:10,830
of system calls, in
Unix-like operating systems, where

1123
00:47:10,830 --> 00:47:13,280
instead of having open
take a single path name,

1124
00:47:13,280 --> 00:47:15,850
you can actually
use openat, where

1125
00:47:15,850 --> 00:47:17,560
you pass it a first
argument which

1126
00:47:17,560 --> 00:47:20,110
is a file descriptor
for a directory

1127
00:47:20,110 --> 00:47:23,640
and the second is
some sort of a name.

1128
00:47:23,640 --> 00:47:27,610
And the openat system
call will open this name

1129
00:47:27,610 --> 00:47:31,250
relative to whatever directory
the file descriptor points to.

1130
00:47:31,250 --> 00:47:33,430
So this is a much more
capability-like version

1131
00:47:33,430 --> 00:47:36,930
of open, where you can still
have file descriptors pointing

1132
00:47:36,930 --> 00:47:42,580
to directories, but
you can-- well, sorry.

1133
00:47:42,580 --> 00:47:44,795
You can still direct
your operation.

1134
00:47:44,795 --> 00:47:46,170
But in order to
do this, you have

1135
00:47:46,170 --> 00:47:47,872
to have a capability
to the directory

1136
00:47:47,872 --> 00:47:49,830
in the form of an open
file descriptor for that

1137
00:47:49,830 --> 00:47:51,200
[INAUDIBLE].

1138
00:47:51,200 --> 00:47:53,944
Make sense?

1139
00:47:53,944 --> 00:47:55,290
OK.

1140
00:47:55,290 --> 00:47:58,480
So do they need any
other kernel changes?

1141
00:47:58,480 --> 00:48:00,630
Is there anything
else they worry about?

1142
00:48:04,520 --> 00:48:06,086
So I guess there's
another-- yeah?

1143
00:48:06,086 --> 00:48:07,650
AUDIENCE: [INAUDIBLE].

1144
00:48:07,650 --> 00:48:08,316
PROFESSOR: Yeah.

1145
00:48:08,316 --> 00:48:10,274
So what do they do about
network access, right?

1146
00:48:10,274 --> 00:48:12,073
So what happens in
capability mode?

1147
00:48:12,073 --> 00:48:14,281
AUDIENCE: I guess they have
capabilities for security

1148
00:48:14,281 --> 00:48:17,365
packets [INAUDIBLE].

1149
00:48:17,365 --> 00:48:17,990
PROFESSOR: Yes.

1150
00:48:17,990 --> 00:48:19,365
So I think the
way they basically

1151
00:48:19,365 --> 00:48:22,682
do it is that they treat the
network as a global namespace,

1152
00:48:22,682 --> 00:48:23,890
very much like a file system.

1153
00:48:23,890 --> 00:48:28,020
So I think once you
enter capability mode,

1154
00:48:28,020 --> 00:48:30,660
you cannot create a new socket.

1155
00:48:30,660 --> 00:48:33,320
Or you cannot create a new
socket and connect to some

1156
00:48:33,320 --> 00:48:36,321
arbitrary machine, or to some
arbitrary address or port

1157
00:48:36,321 --> 00:48:36,820
number.

1158
00:48:36,820 --> 00:48:40,710
You have to basically create all
the connections you want ahead

1159
00:48:40,710 --> 00:48:42,420
of time and fill them
in as capabilities.

1160
00:48:42,420 --> 00:48:44,670
Or maybe you'd have to get
them from someone that will

1161
00:48:44,670 --> 00:48:46,185
pass you a file descriptor.

1162
00:48:46,185 --> 00:48:48,655
But basically, once
you're in capability mode,

1163
00:48:48,655 --> 00:48:51,280
the set of file descriptors you
have open completely enumerates

1164
00:48:51,280 --> 00:48:52,821
all the machines
you'll ever talk to.

1165
00:48:52,821 --> 00:48:54,430
So you can have
open connections.

1166
00:48:54,430 --> 00:48:55,846
Maybe you're
listening on a port.

1167
00:48:55,846 --> 00:48:57,050
That's OK.

1168
00:48:57,050 --> 00:48:59,790
But you cannot connect
to an address specified

1169
00:48:59,790 --> 00:49:02,453
by an absolute name, kind of
like a global namespace would

1170
00:49:02,453 --> 00:49:03,866
allow you to do it.

1171
00:49:03,866 --> 00:49:05,150
That make sense?

1172
00:49:05,150 --> 00:49:09,310
So it's access through the
networking namespace, as well.

1173
00:49:09,310 --> 00:49:11,840
What do they do for processes?

1174
00:49:11,840 --> 00:49:14,400
So another global
namespace, I guess, in Unix,

1175
00:49:14,400 --> 00:49:16,670
is the the PIDs themselves.

1176
00:49:16,670 --> 00:49:18,875
So the example of a
system call that operates

1177
00:49:18,875 --> 00:49:20,090
in this namespace is "kill."

1178
00:49:20,090 --> 00:49:22,549
So I could kill PID 25.

1179
00:49:22,549 --> 00:49:24,840
And I could-- well, presumably
I'll put a signal number

1180
00:49:24,840 --> 00:49:26,110
in there, too.

1181
00:49:26,110 --> 00:49:31,040
But I could actually kill a
process by its PID number.

1182
00:49:31,040 --> 00:49:35,320
How do they fix
this in Capsicum?

1183
00:49:35,320 --> 00:49:36,130
What's their plan?

1184
00:49:41,553 --> 00:49:42,269
Yeah?

1185
00:49:42,269 --> 00:49:44,018
AUDIENCE: File descriptors
with processes.

1186
00:49:44,018 --> 00:49:44,520
PROFESSOR: Yeah.

1187
00:49:44,520 --> 00:49:45,130
It's actually kind of cool.

1188
00:49:45,130 --> 00:49:47,300
It's like, I wish Unix
had this all along.

1189
00:49:47,300 --> 00:49:50,640
Which is that, instead of
having these different kinds

1190
00:49:50,640 --> 00:49:54,630
of numbers or PIDs, instead,
when you fork off a process,

1191
00:49:54,630 --> 00:49:56,620
they actually have a
new variant of fork

1192
00:49:56,620 --> 00:50:01,300
called pdfork, or
Process Descriptor Fork.

1193
00:50:01,300 --> 00:50:04,560
And what it does is when
it creates a child process,

1194
00:50:04,560 --> 00:50:07,700
it actually sticks a reference
to that child process

1195
00:50:07,700 --> 00:50:10,320
into your file descriptor
table somewhere.

1196
00:50:10,320 --> 00:50:11,730
And this is your new process.

1197
00:50:11,730 --> 00:50:13,700
And you can operate
on a child process

1198
00:50:13,700 --> 00:50:15,409
by specifying the file
descriptor number.

1199
00:50:15,409 --> 00:50:17,491
Well, it would be pretty
cool, because you can now

1200
00:50:17,491 --> 00:50:19,550
pass your child
process to someone else

1201
00:50:19,550 --> 00:50:21,580
and say, well, if you
can go and kill them now,

1202
00:50:21,580 --> 00:50:24,230
or you can manage this
process however you want,

1203
00:50:24,230 --> 00:50:26,560
you'll get notifications
when the process dies.

1204
00:50:26,560 --> 00:50:31,000
It'll look like a readable
file descriptor, et cetera.

1205
00:50:31,000 --> 00:50:34,530
So they really try to
homogenize everything

1206
00:50:34,530 --> 00:50:38,930
into looking like a file
descriptor of some sort here.

1207
00:50:38,930 --> 00:50:40,695
And with these
kernel changes, you

1208
00:50:40,695 --> 00:50:43,300
can finally have all
the functionalities

1209
00:50:43,300 --> 00:50:44,330
you might care about.

1210
00:50:44,330 --> 00:50:46,110
You have the support
for sockets already,

1211
00:50:46,110 --> 00:50:48,160
process descriptors, et cetera.

1212
00:50:48,160 --> 00:50:52,350
And you have a way
of constraining

1213
00:50:52,350 --> 00:50:53,840
what the process can do.

1214
00:50:53,840 --> 00:50:56,470
Because it cannot refer to any
of the global names anymore

1215
00:50:56,470 --> 00:50:59,690
after [INAUDIBLE].

1216
00:50:59,690 --> 00:51:00,610
All right.

1217
00:51:00,610 --> 00:51:03,050
Any questions?

1218
00:51:03,050 --> 00:51:05,700
So here's an interesting puzzle.

1219
00:51:05,700 --> 00:51:07,820
I was trying to
understand from the paper.

1220
00:51:07,820 --> 00:51:10,410
They make a big
deal about dot dot

1221
00:51:10,410 --> 00:51:12,820
in looking up directory names.

1222
00:51:12,820 --> 00:51:16,210
So they basically say, well,
once you're in capability mode,

1223
00:51:16,210 --> 00:51:19,430
when you pass a
particular name to openat,

1224
00:51:19,430 --> 00:51:21,416
you cannot use dot
dot in those names.

1225
00:51:21,416 --> 00:51:23,040
And presumably, if
you have a symlink,

1226
00:51:23,040 --> 00:51:25,205
if a symlink's target
contains dot dot,

1227
00:51:25,205 --> 00:51:28,780
they will reject it if
you're in capability mode.

1228
00:51:28,780 --> 00:51:31,830
So is this strictly required?

1229
00:51:31,830 --> 00:51:33,980
Could you imagine a
safe design in principle

1230
00:51:33,980 --> 00:51:35,610
that allows the use of dot dot?

1231
00:51:40,330 --> 00:51:41,040
Yeah.

1232
00:51:41,040 --> 00:51:43,664
AUDIENCE: Well, you'd need to be
able to find whether they have

1233
00:51:43,664 --> 00:51:46,892
a file or a capability that
allows access to the parent

1234
00:51:46,892 --> 00:51:47,880
directory.

1235
00:51:47,880 --> 00:51:48,640
PROFESSOR: Right.

1236
00:51:48,640 --> 00:51:50,181
AUDIENCE: So it's
trivial to go down,

1237
00:51:50,181 --> 00:51:52,908
because any subdirectory--
you already have access to it

1238
00:51:52,908 --> 00:51:53,490
by having the capability.

1239
00:51:53,490 --> 00:51:54,190
PROFESSOR: That's right.

1240
00:51:54,190 --> 00:51:54,740
Yeah.

1241
00:51:54,740 --> 00:51:56,050
AUDIENCE: But going
up, you need to see

1242
00:51:56,050 --> 00:51:58,050
whether you have any
capabilities for the parent

1243
00:51:58,050 --> 00:51:58,810
directory.

1244
00:51:58,810 --> 00:51:59,810
PROFESSOR: That's right.

1245
00:51:59,810 --> 00:52:00,300
Yeah.

1246
00:52:00,300 --> 00:52:01,060
AUDIENCE: Search for it somehow.

1247
00:52:01,060 --> 00:52:01,220
PROFESSOR: Yeah.

1248
00:52:01,220 --> 00:52:01,955
So that's a little bit tricky.

1249
00:52:01,955 --> 00:52:03,503
And also, it goes
against the grain

1250
00:52:03,503 --> 00:52:06,490
of this whole explicit
authority thing.

1251
00:52:06,490 --> 00:52:09,895
What about if you're
using dot dot inside sort

1252
00:52:09,895 --> 00:52:11,415
of a single open call?

1253
00:52:11,415 --> 00:52:15,800
So for example, what if you
call something like openat some

1254
00:52:15,800 --> 00:52:18,050
particular directory or
file descriptor number,

1255
00:52:18,050 --> 00:52:20,332
and you open something like,
I don't know, b/c/../..?

1256
00:52:26,690 --> 00:52:28,910
In principle, this
might be safe, right?

1257
00:52:28,910 --> 00:52:31,290
Because you go down some
directory, and then you just

1258
00:52:31,290 --> 00:52:33,770
climb back up out of it.

1259
00:52:33,770 --> 00:52:34,660
Yeah?

1260
00:52:34,660 --> 00:52:36,824
AUDIENCE: What if
c is [INAUDIBLE]?

1261
00:52:36,824 --> 00:52:37,490
PROFESSOR: Yeah.

1262
00:52:37,490 --> 00:52:38,560
So it's a little bit
tricky, of course,

1263
00:52:38,560 --> 00:52:40,570
to define exactly what
it means to be safe.

1264
00:52:40,570 --> 00:52:41,070
Right?

1265
00:52:41,070 --> 00:52:44,350
You probably have to make sure
that c isn't a symlink that

1266
00:52:44,350 --> 00:52:46,160
goes somewhere else and so on.

1267
00:52:46,160 --> 00:52:46,660
Yeah.

1268
00:52:46,660 --> 00:52:48,190
That's a fairly tricky
proposition, to get this right.

1269
00:52:48,190 --> 00:52:50,106
And I think, in the
paper, what they basically

1270
00:52:50,106 --> 00:52:52,000
argue about is
that it's actually

1271
00:52:52,000 --> 00:52:54,630
quite difficult in practice
to implement a set of checks

1272
00:52:54,630 --> 00:52:57,990
that's sufficient and
bypasses all the possible race

1273
00:52:57,990 --> 00:52:59,640
conditions here.

1274
00:52:59,640 --> 00:53:02,020
So they basically just
do the conservative thing

1275
00:53:02,020 --> 00:53:04,190
and disallow any
dot dot at any time

1276
00:53:04,190 --> 00:53:07,520
once you're in capability mode.

1277
00:53:07,520 --> 00:53:09,330
There's some interesting
race conditions

1278
00:53:09,330 --> 00:53:10,496
you could come up with here.

1279
00:53:10,496 --> 00:53:14,000
The lecture notes
have more details.

1280
00:53:14,000 --> 00:53:16,010
But basically I
think these guys are

1281
00:53:16,010 --> 00:53:18,560
being extra cautious in
defining what's allowed

1282
00:53:18,560 --> 00:53:22,700
and what's not allowed
in capability mode.

1283
00:53:22,700 --> 00:53:23,567
OK.

1284
00:53:23,567 --> 00:53:25,620
So here, to answer
your question,

1285
00:53:25,620 --> 00:53:27,036
once you enter
capability mode, it

1286
00:53:27,036 --> 00:53:30,505
seems to be all controlled
by your file table.

1287
00:53:30,505 --> 00:53:33,641
Does your UID still matter,
once you enter capability mode?

1288
00:53:41,020 --> 00:53:43,340
[INAUDIBLE]

1289
00:53:43,340 --> 00:53:44,080
Yeah?

1290
00:53:44,080 --> 00:53:46,080
AUDIENCE: Well, you could
still launch a process

1291
00:53:46,080 --> 00:53:48,077
that doesn't use capabilities.

1292
00:53:48,077 --> 00:53:48,660
PROFESSOR: No.

1293
00:53:48,660 --> 00:53:50,187
Actually, no, you can't.

1294
00:53:50,187 --> 00:53:52,520
You have to make sure that--
otherwise you could escape,

1295
00:53:52,520 --> 00:53:54,811
like well, I can't access--
why don't you run this guy?

1296
00:53:54,811 --> 00:53:56,258
[INAUDIBLE]

1297
00:53:56,258 --> 00:53:59,535
So yeah, cap_enter is inherited
by all the children, which

1298
00:53:59,535 --> 00:54:01,550
is actually hugely important.

1299
00:54:01,550 --> 00:54:02,050
Yeah?

1300
00:54:06,000 --> 00:54:09,190
Anyone else?

1301
00:54:09,190 --> 00:54:10,990
So what if we kill the UID?

1302
00:54:10,990 --> 00:54:13,591
So suppose we're
going into cap_enter,

1303
00:54:13,591 --> 00:54:15,590
and we just kill the UID
of the current process.

1304
00:54:15,590 --> 00:54:17,476
We don't actually care
what it is anymore.

1305
00:54:17,476 --> 00:54:19,225
And then the process
tries to open a file.

1306
00:54:19,225 --> 00:54:22,350
What checks should apply?

1307
00:54:22,350 --> 00:54:22,850
Yeah?

1308
00:54:22,850 --> 00:54:25,191
AUDIENCE: Oh, I was
thinking that the UID is

1309
00:54:25,191 --> 00:54:26,690
useful for logging
purposes as well,

1310
00:54:26,690 --> 00:54:28,580
like being able to tell
if you did something.

1311
00:54:28,580 --> 00:54:29,130
PROFESSOR: So
yeah, you're right.

1312
00:54:29,130 --> 00:54:29,460
Actually, yeah.

1313
00:54:29,460 --> 00:54:30,930
So that would be actually
kind of damaging, right?

1314
00:54:30,930 --> 00:54:33,500
Like I spawned some sandbox
process on my machine

1315
00:54:33,500 --> 00:54:34,669
and it loses the UID.

1316
00:54:34,669 --> 00:54:36,460
I'm like I have a
hundred processes running

1317
00:54:36,460 --> 00:54:38,730
on my machine, and I have
no idea what they are.

1318
00:54:38,730 --> 00:54:40,400
So that's probably not a good
plan for a management purpose.

1319
00:54:40,400 --> 00:54:41,555
You're absolutely right.

1320
00:54:41,555 --> 00:54:44,170
But I'm just sort of
hypothetically saying, well,

1321
00:54:44,170 --> 00:54:45,920
do we need it for
access control, I guess.

1322
00:54:45,920 --> 00:54:46,750
Yeah?

1323
00:54:46,750 --> 00:54:48,280
AUDIENCE: Maybe if
this UID is only

1324
00:54:48,280 --> 00:54:50,790
supposed to be able to
access this file by reading

1325
00:54:50,790 --> 00:54:54,075
or whatever, but you have
the file descriptor for it,

1326
00:54:54,075 --> 00:54:55,450
but then if you
lose the UID, you

1327
00:54:55,450 --> 00:54:57,960
might get permissions to write
[INAUDIBLE] or something?

1328
00:54:57,960 --> 00:54:58,780
PROFESSOR: Yeah.

1329
00:54:58,780 --> 00:55:03,410
I think actually what it
shows up in is in directories.

1330
00:55:03,410 --> 00:55:05,287
Because once you have a
capability to a file,

1331
00:55:05,287 --> 00:55:06,120
that's basically it.

1332
00:55:06,120 --> 00:55:08,600
You have it open with particular
privileges, et cetera.

1333
00:55:08,600 --> 00:55:11,519
But the problem is that they
have this hybrid design where

1334
00:55:11,519 --> 00:55:13,560
they say, well, you can
actually have capabilities

1335
00:55:13,560 --> 00:55:15,510
to directories, and
you can open a new file

1336
00:55:15,510 --> 00:55:17,030
as you're running along.

1337
00:55:17,030 --> 00:55:19,375
And it might be the case
that you have a capability

1338
00:55:19,375 --> 00:55:22,200
to a directory, like /etc.

1339
00:55:22,200 --> 00:55:24,450
And you don't have access
to necessarily all the files

1340
00:55:24,450 --> 00:55:25,520
in /etc.

1341
00:55:25,520 --> 00:55:27,440
But once you enter
capability mode,

1342
00:55:27,440 --> 00:55:29,860
you can now try to open
those files by saying, well,

1343
00:55:29,860 --> 00:55:31,840
I have access to
the /etc directory.

1344
00:55:31,840 --> 00:55:32,850
It's open already.

1345
00:55:32,850 --> 00:55:34,620
Why don't you give
me the file named

1346
00:55:34,620 --> 00:55:36,060
password in that directory?

1347
00:55:36,060 --> 00:55:38,780
And the kernel still needs to
make an access control decision

1348
00:55:38,780 --> 00:55:42,090
on whether to allow you to
open a file in that directory

1349
00:55:42,090 --> 00:55:45,010
with either read mode or
write mode or what have you.

1350
00:55:45,010 --> 00:55:47,490
So I think this is the one
place where you still need

1351
00:55:47,490 --> 00:55:50,620
this ambient privilege, to
some extent, because they're

1352
00:55:50,620 --> 00:55:53,140
trying to build this
compatible design where

1353
00:55:53,140 --> 00:55:56,780
you can have semi-natural
semantics for how directories

1354
00:55:56,780 --> 00:55:57,670
work.

1355
00:55:57,670 --> 00:55:59,410
Does that make sense?

1356
00:55:59,410 --> 00:56:02,920
It's like one leftover place,
kind of for compatibility

1357
00:56:02,920 --> 00:56:05,884
reasons, or at least the
way that Unix file systems

1358
00:56:05,884 --> 00:56:07,660
are typically set up.

1359
00:56:07,660 --> 00:56:09,649
AUDIENCE: Are there
any other places?

1360
00:56:09,649 --> 00:56:10,690
PROFESSOR: Good question.

1361
00:56:10,690 --> 00:56:12,240
I couldn't think
of one off hand,

1362
00:56:12,240 --> 00:56:14,531
but I guess I would have to
look at the FreeBSD source

1363
00:56:14,531 --> 00:56:17,980
code to really figure
out what's going on.

1364
00:56:17,980 --> 00:56:20,150
I think most of the
other situations

1365
00:56:20,150 --> 00:56:22,069
don't really
require a UID check.

1366
00:56:22,069 --> 00:56:23,860
Because for networking,
it doesn't show up.

1367
00:56:23,860 --> 00:56:27,406
I think for process descriptors
it doesn't show up, either.

1368
00:56:27,406 --> 00:56:29,660
If you have it, then
you just have it.

1369
00:56:29,660 --> 00:56:33,421
So I think it probably is
just file system operations.

1370
00:56:33,421 --> 00:56:35,920
For shared memory, it's also--
once you have a shared memory

1371
00:56:35,920 --> 00:56:37,760
segment, you have it open.

1372
00:56:41,232 --> 00:56:41,841
Yeah?

1373
00:56:41,841 --> 00:56:43,216
AUDIENCE: Could
you explain again

1374
00:56:43,216 --> 00:56:47,404
how exactly the user ID matters
if you have a capability?

1375
00:56:47,404 --> 00:56:48,070
PROFESSOR: Yeah.

1376
00:56:48,070 --> 00:56:51,810
So I think where it
matters is, you have

1377
00:56:51,810 --> 00:56:54,910
a capability to a directory.

1378
00:56:54,910 --> 00:56:57,770
The question is, what does
the capability represent?

1379
00:56:57,770 --> 00:57:01,260
So one interpretation that--
for example, some capability

1380
00:57:01,260 --> 00:57:03,130
systems take, not Capsicum.

1381
00:57:03,130 --> 00:57:04,130
Pure capability systems.

1382
00:57:04,130 --> 00:57:06,870
They say, well, if you have
a capability to a directory,

1383
00:57:06,870 --> 00:57:08,828
then of course you have
access to all the files

1384
00:57:08,828 --> 00:57:11,392
in that directory, no
questions about it.

1385
00:57:11,392 --> 00:57:13,225
And in Unix, this is
typically not the case.

1386
00:57:13,225 --> 00:57:16,110
You can open a
directory like /etc,

1387
00:57:16,110 --> 00:57:18,670
but there's lots of system
files in there that are maybe

1388
00:57:18,670 --> 00:57:21,917
private, like the private key of
your server is stored in there.

1389
00:57:21,917 --> 00:57:24,250
And just because you can look
at a directory and open it

1390
00:57:24,250 --> 00:57:26,820
and list it doesn't mean that
you can open the files

1391
00:57:26,820 --> 00:57:28,310
in that directory.

1392
00:57:28,310 --> 00:57:32,392
So in Capsicum, if you
open a directory like /etc,

1393
00:57:32,392 --> 00:57:33,850
and then you enter
capability mode.

1394
00:57:33,850 --> 00:57:35,190
And then you say,
well, hey, I don't

1395
00:57:35,190 --> 00:57:36,200
know what this directory is.

1396
00:57:36,200 --> 00:57:37,658
I just have a file
descriptor to it.

1397
00:57:37,658 --> 00:57:39,342
There's a file in
there called "key."

1398
00:57:39,342 --> 00:57:41,390
Why don't you open
that file "key"?

1399
00:57:41,390 --> 00:57:44,070
And at this point,
you probably don't

1400
00:57:44,070 --> 00:57:46,270
want to allow this
capability-based process

1401
00:57:46,270 --> 00:57:48,480
to just open it, because
that wasn't the intent.

1402
00:57:48,480 --> 00:57:52,060
That would allow you to bypass
the Unix permissions on a file.

1403
00:57:52,060 --> 00:57:54,250
So I think the
authors of this paper

1404
00:57:54,250 --> 00:57:59,850
are careful to design a
system which would not violate

1405
00:57:59,850 --> 00:58:01,600
existing security mechanisms.

1406
00:58:01,600 --> 00:58:04,462
AUDIENCE: So you're saying
that you can, in some cases,

1407
00:58:04,462 --> 00:58:06,370
use a combination of the two?

1408
00:58:06,370 --> 00:58:08,760
So even though you have a capability
to the directory,

1409
00:58:08,760 --> 00:58:10,760
inside the directory,
which files you can access

1410
00:58:10,760 --> 00:58:11,839
depends on your user ID?

1411
00:58:11,839 --> 00:58:12,880
PROFESSOR: Yeah, exactly.

1412
00:58:12,880 --> 00:58:16,645
So in Capsicum, the way they
get it to work in practice

1413
00:58:16,645 --> 00:58:19,890
is that, actually, before
you enter capability mode,

1414
00:58:19,890 --> 00:58:20,666
you have to guess.

1415
00:58:20,666 --> 00:58:22,415
Well, what files am I
going to need later?

1416
00:58:22,415 --> 00:58:23,970
I'm going to need
some shared libraries.

1417
00:58:23,970 --> 00:58:25,060
I'll need some text files.

1418
00:58:25,060 --> 00:58:26,644
I'll need some templates.

1419
00:58:26,644 --> 00:58:28,560
I'll need some network
connections, et cetera.

1420
00:58:28,560 --> 00:58:30,960
So you open all these
things ahead of time.

1421
00:58:30,960 --> 00:58:33,970
And you don't always necessarily
know which exact file you need.

1422
00:58:33,970 --> 00:58:35,754
So what these guys
support is, well,

1423
00:58:35,754 --> 00:58:38,045
you can actually just open
a directory file descriptor,

1424
00:58:38,045 --> 00:58:38,780
as well.

1425
00:58:38,780 --> 00:58:41,460
And then I can look up the
particular files later.

1426
00:58:41,460 --> 00:58:42,960
But it might be
that the files don't

1427
00:58:42,960 --> 00:58:44,209
have all the same permissions.

1428
00:58:44,209 --> 00:58:46,760
So that's exactly
the reason, yeah.

1429
00:58:46,760 --> 00:58:49,610
Make sense?

1430
00:58:49,610 --> 00:58:50,940
All right.

1431
00:58:50,940 --> 00:58:55,560
So this is the kernel
mechanism part of it.

1432
00:58:55,560 --> 00:59:01,830
Why do they also need this
library for libcapsicum?

1433
00:59:01,830 --> 00:59:04,410
I guess there's two things that
they support in that library,

1434
00:59:04,410 --> 00:59:07,330
as far as I can tell,
or two main things.

1435
00:59:07,330 --> 00:59:15,342
One is that they implement this
function they call lch_start

1436
00:59:15,342 --> 00:59:21,930
that you should use
instead of cap_enter.

1437
00:59:21,930 --> 00:59:25,600
And the other sort of
feature the library provides

1438
00:59:25,600 --> 00:59:31,120
in libcapsicum is this
notion called fd lists

1439
00:59:31,120 --> 00:59:33,600
instead of passing file
descriptors by number.

1440
00:59:33,600 --> 00:59:35,030
So this fd list
thing is probably

1441
00:59:35,030 --> 00:59:36,460
the easiest thing to explain.

1442
00:59:36,460 --> 00:59:40,940
It's basically a generalization,
or maybe a clean up,

1443
00:59:40,940 --> 00:59:43,520
of how Unix manages
and passes file

1444
00:59:43,520 --> 00:59:46,220
descriptors between processes.

1445
00:59:46,220 --> 00:59:49,580
So in traditional
Unix and Linux,

1446
00:59:49,580 --> 00:59:52,910
how you use it today, typically
when you launch a process,

1447
00:59:52,910 --> 00:59:54,550
you can pass it some
file descriptors.

1448
00:59:54,550 --> 00:59:56,020
You just open some
file descriptors

1449
00:59:56,020 --> 00:59:58,485
at particular integer
numbers in this table

1450
00:59:58,485 --> 01:00:00,610
and you run the child
process that you want to run.

1451
01:00:00,610 --> 01:00:03,180
Or you run a particular
binary, and it

1452
01:00:03,180 --> 01:00:08,000
inherits all these open
slots in the fd table.

1453
01:00:08,000 --> 01:00:10,370
But there's no real good
way to name these things

1454
01:00:10,370 --> 01:00:11,730
other than by number.

1455
01:00:11,730 --> 01:00:15,244
So the somewhat
surprising convention,

1456
01:00:15,244 --> 01:00:16,660
if you haven't
[INAUDIBLE] before,

1457
01:00:16,660 --> 01:00:18,750
is that, well, slot
0 is your input.

1458
01:00:18,750 --> 01:00:20,940
Slot 1 is your output.

1459
01:00:20,940 --> 01:00:24,010
Slot 2 is where you should
print error messages to.

1460
01:00:24,010 --> 01:00:27,370
And that's how
Unix sort of works.

1461
01:00:27,370 --> 01:00:32,240
And it sort of works OK if you
are just passing these three

1462
01:00:32,240 --> 01:00:35,430
files or streams to a process.

1463
01:00:35,430 --> 01:00:37,570
But in Capsicum,
what's happening

1464
01:00:37,570 --> 01:00:41,140
is that you're passing many
more file descriptors around.

1465
01:00:41,140 --> 01:00:43,894
So you're passing a file
descriptor for some files.

1466
01:00:43,894 --> 01:00:46,310
You're passing a file descriptor
for a network connection,

1467
01:00:46,310 --> 01:00:49,320
for a shared library,
what have you.

1468
01:00:49,320 --> 01:00:52,060
And it becomes much more tedious
to manage all these numbers.

1469
01:00:52,060 --> 01:00:55,370
So basically, libcapsicum
provides an abstraction

1470
01:00:55,370 --> 01:00:59,460
for naming these passed file
descriptors between processes

1471
01:00:59,460 --> 01:01:01,810
by some sort of a
hierarchical name,

1472
01:01:01,810 --> 01:01:06,980
instead of just these opaque
integers, if you will.

1473
01:01:06,980 --> 01:01:08,410
So that's one sort
of simple thing

1474
01:01:08,410 --> 01:01:10,240
that they provide
in their library.

1475
01:01:10,240 --> 01:01:13,260
So I can pass a file
descriptor to a process

1476
01:01:13,260 --> 01:01:14,100
and give it a name.

1477
01:01:14,100 --> 01:01:16,100
And it doesn't really
matter what number it has,

1478
01:01:16,100 --> 01:01:16,982
a little easier.

1479
01:01:16,982 --> 01:01:17,968
That make sense?

1480
01:01:17,968 --> 01:01:19,450
OK.

1481
01:01:19,450 --> 01:01:21,120
So then they have
this other mechanism,

1482
01:01:21,120 --> 01:01:25,906
this much more elaborate
way to start a sandbox.

1483
01:01:25,906 --> 01:01:29,740
This lch, libcapsicum Host,
API for starting a sandbox,

1484
01:01:29,740 --> 01:01:33,342
instead of just entering
the capability mode.

1485
01:01:33,342 --> 01:01:34,050
So what happened?

1486
01:01:34,050 --> 01:01:36,396
Why do they need something
more than just entering

1487
01:01:36,396 --> 01:01:37,392
capability mode?

1488
01:01:37,392 --> 01:01:39,950
What are you worried about
when creating a sandbox?

1489
01:01:39,950 --> 01:01:40,810
Yeah?

1490
01:01:40,810 --> 01:01:43,502
AUDIENCE: It erases
all the inherited stuff

1491
01:01:43,502 --> 01:01:45,524
to give you a clean start.

1492
01:01:45,524 --> 01:01:46,190
PROFESSOR: Yeah.

1493
01:01:46,190 --> 01:01:48,430
So I think they
worry about trying

1494
01:01:48,430 --> 01:01:51,230
to enumerate what are all the
things the sandbox has access

1495
01:01:51,230 --> 01:01:51,870
to.

1496
01:01:51,870 --> 01:01:56,160
And the problem is that if
you just call cap_enter,

1497
01:01:56,160 --> 01:01:58,560
technically, at the kernel
mechanism level, as we talked

1498
01:01:58,560 --> 01:01:59,285
about just now, it works.

1499
01:01:59,285 --> 01:01:59,785
Right?

1500
01:01:59,785 --> 01:02:02,270
It just prevents you from
opening any new capabilities.

1501
01:02:02,270 --> 01:02:05,230
But the problem is that there
might be lots of existing stuff

1502
01:02:05,230 --> 01:02:08,780
that the process
already has access to.

1503
01:02:08,780 --> 01:02:11,256
So I guess the simplest
example is maybe

1504
01:02:11,256 --> 01:02:13,930
there are some file descriptors
that you forgot you had opened,

1505
01:02:13,930 --> 01:02:17,310
and it'll just get
inherited by this process.

1506
01:02:17,310 --> 01:02:20,470
So one example is they
were looking at tcpdump.

1507
01:02:20,470 --> 01:02:23,950
And they realized that-- well,
first, they changed tcpdump

1508
01:02:23,950 --> 01:02:27,500
just by calling
cap_enter at the point

1509
01:02:27,500 --> 01:02:30,594
just before they were about to
parse all the network input.

1510
01:02:30,594 --> 01:02:32,760
So this works well, in some
sense, because you can't

1511
01:02:32,760 --> 01:02:34,290
get any more capabilities.

1512
01:02:34,290 --> 01:02:36,331
But then they looked at
the open file descriptors,

1513
01:02:36,331 --> 01:02:39,285
and they realized that you have
complete access to the user's

1514
01:02:39,285 --> 01:02:41,720
terminal, because you have an
open file descriptor to it.

1515
01:02:41,720 --> 01:02:43,145
So you can actually
sniff all the keystrokes

1516
01:02:43,145 --> 01:02:45,225
that the user is typing
and all that stuff.

1517
01:02:45,225 --> 01:02:48,602
So it's probably not a
great plan for tcpdump.

1518
01:02:48,602 --> 01:02:51,060
If it's compromised, you probably
don't want it sniffing everything

1519
01:02:51,060 --> 01:02:52,950
you're typing.

1520
01:02:52,950 --> 01:02:56,520
So instead they-- well,
in tcpdump's case,

1521
01:02:56,520 --> 01:03:00,900
they manually changed
these file descriptors

1522
01:03:00,900 --> 01:03:03,010
to add some capability
bits to them,

1523
01:03:03,010 --> 01:03:05,360
to restrict what kinds
of operations you can do.

1524
01:03:05,360 --> 01:03:07,990
So remember, the capability,
at least in Capsicum,

1525
01:03:07,990 --> 01:03:11,030
has these extra bits that say,
here's the class of operations

1526
01:03:11,030 --> 01:03:13,310
you can perform on
a file descriptor.

1527
01:03:13,310 --> 01:03:17,650
So they basically take what
used to be file descriptor 0.

1528
01:03:17,650 --> 01:03:20,700
It pointed to the
user's terminal, tty.

1529
01:03:20,700 --> 01:03:23,670
And originally, this was
just a direct pointer

1530
01:03:23,670 --> 01:03:25,880
to the tty structure
in the kernel.

1531
01:03:25,880 --> 01:03:27,570
What they do is they
actually-- in order

1532
01:03:27,570 --> 01:03:30,070
to limit the kind of operations
you can perform on this file

1533
01:03:30,070 --> 01:03:31,930
descriptor, they basically
introduced some extra data

1534
01:03:31,930 --> 01:03:32,930
structure in the middle.

1535
01:03:32,930 --> 01:03:34,810
This guy will point
to the terminal.

1536
01:03:34,810 --> 01:03:36,730
And the file
descriptor itself will

1537
01:03:36,730 --> 01:03:39,950
point to some sort of
a capability structure.

1538
01:03:39,950 --> 01:03:43,040
And inside of it is the
pointer to the real file

1539
01:03:43,040 --> 01:03:46,685
that you're trying to access,
as well as some restricted bits

1540
01:03:46,685 --> 01:03:51,590
or permissions on
that file descriptor

1541
01:03:51,590 --> 01:03:53,280
object that you can do.

1542
01:03:53,280 --> 01:03:55,740
In their case, they basically
can say for tcpdump's standard

1543
01:03:55,740 --> 01:03:57,585
input, you cannot
do anything on it.

1544
01:03:57,585 --> 01:03:59,602
You can just see that it
exists, and that's it.

1545
01:03:59,602 --> 01:04:01,564
For the output file
descriptor, they say,

1546
01:04:01,564 --> 01:04:03,980
well, you can write to it, but
you maybe can't reposition.

1547
01:04:03,980 --> 01:04:07,710
You can't [INAUDIBLE]
back and forth, et cetera.

1548
01:04:07,710 --> 01:04:10,280
Make sense?

1549
01:04:10,280 --> 01:04:11,900
So what else would
you worry about,

1550
01:04:11,900 --> 01:04:12,570
in terms of starting a sandbox?

1551
01:04:12,570 --> 01:04:14,810
So there is, I guess, the
file descriptor state.

1552
01:04:14,810 --> 01:04:16,234
Anything else that matters?

1553
01:04:21,448 --> 01:04:24,320
Well, I guess in Unix it's
file descriptors and memory.

1554
01:04:24,320 --> 01:04:25,670
That's pretty much it.

1555
01:04:25,670 --> 01:04:29,400
So the other thing that
these guys worry about

1556
01:04:29,400 --> 01:04:32,250
is that it might be that
in your address space,

1557
01:04:32,250 --> 01:04:34,600
you previously allocated
some sensitive data.

1558
01:04:34,600 --> 01:04:36,920
And the process
that you sandbox

1559
01:04:36,920 --> 01:04:38,830
is going to be able to
read all its memory.

1560
01:04:38,830 --> 01:04:40,205
So if there's
maybe some password

1561
01:04:40,205 --> 01:04:42,420
that you checked before when
the user was logging in,

1562
01:04:42,420 --> 01:04:44,150
and you haven't
cleared that yet,

1563
01:04:44,150 --> 01:04:45,749
well, the sandbox
process will be

1564
01:04:45,749 --> 01:04:47,165
able to read that
and do something

1565
01:04:47,165 --> 01:04:49,050
maybe interesting to that.

1566
01:04:49,050 --> 01:04:50,920
So the way they
solved this problem

1567
01:04:50,920 --> 01:04:55,100
is, in lch_start, you basically
have to start a program fresh.

1568
01:04:55,100 --> 01:04:57,270
You basically take a program.

1569
01:04:57,270 --> 01:04:59,197
You explicitly package
up all the arguments

1570
01:04:59,197 --> 01:05:00,030
you want to give it.

1571
01:05:00,030 --> 01:05:01,590
You explicitly package up
all the file descriptors

1572
01:05:01,590 --> 01:05:02,860
you want to give it.

1573
01:05:02,860 --> 01:05:04,235
And then you start
a new process,

1574
01:05:04,235 --> 01:05:06,410
or you would call
exec to reinitialize

1575
01:05:06,410 --> 01:05:09,200
your whole virtual memory space.

1576
01:05:09,200 --> 01:05:11,080
And then there's no
question about what

1577
01:05:11,080 --> 01:05:14,370
is the set of sensitive
data or extra privileges

1578
01:05:14,370 --> 01:05:15,510
that this process has.

1579
01:05:15,510 --> 01:05:18,160
It's exactly what you
passed to lch_start,

1580
01:05:18,160 --> 01:05:22,040
in terms of a program name,
arguments, and capabilities.

1581
01:05:22,040 --> 01:05:24,540
Does that make sense?

1582
01:05:24,540 --> 01:05:27,160
AUDIENCE: What would happen
if the process that you're

1583
01:05:27,160 --> 01:05:29,494
starting is a setuid 0 binary?

1584
01:05:29,494 --> 01:05:30,160
PROFESSOR: Yeah.

1585
01:05:30,160 --> 01:05:35,380
I think these guys say
that they don't actually

1586
01:05:35,380 --> 01:05:38,020
allow setuid binaries
in capability mode,

1587
01:05:38,020 --> 01:05:39,860
just to avoid some
weird interactions that

1588
01:05:39,860 --> 01:05:40,905
would show up.

1589
01:05:40,905 --> 01:05:42,940
I think the rules
that they implement

1590
01:05:42,940 --> 01:05:45,263
is that you could have
a setuid program that

1591
01:05:45,263 --> 01:05:47,770
gets its privileges
from a setuid binary,

1592
01:05:47,770 --> 01:05:50,950
and then it can call
cap_enter or lch_start.

1593
01:05:50,950 --> 01:05:52,890
But once you're in
capability mode,

1594
01:05:52,890 --> 01:05:54,640
you cannot regain
extra privileges.

1595
01:05:54,640 --> 01:05:58,110
In principle, this could work,
but it would be very weird.

1596
01:05:58,110 --> 01:06:00,680
Because remember, the only
place where the UID matters,

1597
01:06:00,680 --> 01:06:02,275
once you're in
capability mode, is

1598
01:06:02,275 --> 01:06:04,150
in opening these files
inside of a directory.

1599
01:06:04,150 --> 01:06:07,080
So it's not clear this
is really a great plan

1600
01:06:07,080 --> 01:06:10,850
for getting more privileges
or [INAUDIBLE] there.

1601
01:06:10,850 --> 01:06:11,350
Make sense?

1602
01:06:11,350 --> 01:06:12,790
Yeah?

1603
01:06:12,790 --> 01:06:14,270
AUDIENCE: We talked
about earlier

1604
01:06:14,270 --> 01:06:17,575
why the library doesn't really
support strict separation

1605
01:06:17,575 --> 01:06:19,165
between those two.

1606
01:06:19,165 --> 01:06:21,390
And then we just mentioned
all these problems

1607
01:06:21,390 --> 01:06:23,800
that you could use
[INAUDIBLE], so we're still

1608
01:06:23,800 --> 01:06:26,680
not under a restriction to use
lch_start necessarily, right?

1609
01:06:26,680 --> 01:06:27,680
PROFESSOR: That's right.

1610
01:06:27,680 --> 01:06:28,179
Yeah.

1611
01:06:28,179 --> 01:06:30,510
So lch_start, here's sort
of the way to think of it.

1612
01:06:30,510 --> 01:06:32,960
So you have an application,
like maybe tcpdump.

1613
01:06:32,960 --> 01:06:36,309
Or gzip is the other
thing they work with.

1614
01:06:36,309 --> 01:06:37,725
And what you're
basically assuming

1615
01:06:37,725 --> 01:06:40,390
is the application is
probably not compromised,

1616
01:06:40,390 --> 01:06:42,960
and there is some core part
of the application that you

1617
01:06:42,960 --> 01:06:44,730
worry about sandboxing.

1618
01:06:44,730 --> 01:06:47,570
In tcpdump's case, it's
actually parsing packets

1619
01:06:47,570 --> 01:06:48,730
coming from the network.

1620
01:06:48,730 --> 01:06:50,660
In gzip's case, it's
actually taking the file

1621
01:06:50,660 --> 01:06:51,915
and decompressing it.

1622
01:06:51,915 --> 01:06:54,250
And you're basically assuming,
well, up until a point,

1623
01:06:54,250 --> 01:06:56,250
the process is probably
doing all the right things.

1624
01:06:56,250 --> 01:06:57,041
It's not exploited.

1625
01:06:57,041 --> 01:06:59,420
There's probably not a bug
yet for the [INAUDIBLE] even.

1626
01:06:59,420 --> 01:07:00,795
So at that point,
you're trusting

1627
01:07:00,795 --> 01:07:04,210
that it will run lch_start
correctly and correctly set up

1628
01:07:04,210 --> 01:07:06,580
the image, correctly set
up all the capabilities,

1629
01:07:06,580 --> 01:07:09,870
and then restrict itself from
making any further system calls

1630
01:07:09,870 --> 01:07:11,840
outside its capability mode.

1631
01:07:11,840 --> 01:07:13,490
And then you run
the dangerous stuff.

1632
01:07:13,490 --> 01:07:16,590
And by then, this setup
has happened correctly,

1633
01:07:16,590 --> 01:07:20,252
and there's no way to
escape out of that sandbox.

1634
01:07:20,252 --> 01:07:22,570
Make sense?

1635
01:07:22,570 --> 01:07:23,690
All right.

1636
01:07:23,690 --> 01:07:28,230
So I guess let's look at how
you actually use capability mode

1637
01:07:28,230 --> 01:07:30,584
to sandbox applications.

1638
01:07:30,584 --> 01:07:32,250
So we talked a little
bit about tcpdump.

1639
01:07:32,250 --> 01:07:36,005
How do you isolate this process?

1640
01:07:36,005 --> 01:07:38,410
Another interesting
example they had

1641
01:07:38,410 --> 01:07:44,660
was this gzip program that
compresses, decompresses files.

1642
01:07:44,660 --> 01:07:47,010
So why do they worry
about sandboxing it?

1643
01:07:47,010 --> 01:07:50,420
I guess they worry that the
decompression code is going

1644
01:07:50,420 --> 01:07:52,740
to be potentially
buggy, or maybe there's

1645
01:07:52,740 --> 01:07:54,880
some memory management
errors in how

1646
01:07:54,880 --> 01:07:58,100
they manage the buffers during
decompression, et cetera.

1647
01:07:58,100 --> 01:08:05,450
So could they-- well, one
interesting question, I guess,

1648
01:08:05,450 --> 01:08:10,390
is why are the changes to
gzip seemingly much more

1649
01:08:10,390 --> 01:08:16,109
complicated than for tcpdump?

1650
01:08:23,670 --> 01:08:24,170
Any guesses?

1651
01:08:26,655 --> 01:08:28,029
Well as far as
you can tell, it's

1652
01:08:28,029 --> 01:08:31,640
mostly just a question of how
the application is structured

1653
01:08:31,640 --> 01:08:32,439
internally, right?

1654
01:08:32,439 --> 01:08:39,170
So if you had an application
that simply compressed

1655
01:08:39,170 --> 01:08:42,029
a single file, or
decompressed a single file,

1656
01:08:42,029 --> 01:08:48,125
then it might be OK for us to
just run it in capability mode

1657
01:08:48,125 --> 01:08:49,249
without really changing it.

1658
01:08:49,249 --> 01:08:52,540
You just give it a new standard
in for something to decompress,

1659
01:08:52,540 --> 01:08:55,830
and the standard out goes
to the decompressed output,

1660
01:08:55,830 --> 01:08:57,300
and that would work fine.

1661
01:08:57,300 --> 01:08:59,830
The problem, as is
almost always the case

1662
01:08:59,830 --> 01:09:01,899
here with these kind of
sandboxing techniques,

1663
01:09:01,899 --> 01:09:04,830
is that the application actually
has much more complicated logic

1664
01:09:04,830 --> 01:09:05,330
around it.

1665
01:09:05,330 --> 01:09:07,359
So gzip, for
example, can compress

1666
01:09:07,359 --> 01:09:09,490
multiple files, et cetera.

1667
01:09:09,490 --> 01:09:13,580
And in that case, you have some
sort of a driver process on top

1668
01:09:13,580 --> 01:09:15,450
which actually has
these extra privileges

1669
01:09:15,450 --> 01:09:18,899
to open multiple files, to
create things, et cetera.

1670
01:09:18,899 --> 01:09:22,300
And the core logic needs to be
often another helper process.

1671
01:09:22,300 --> 01:09:24,600
And it just so happened to be
the case in gzip

1672
01:09:24,600 --> 01:09:27,359
that the application
wasn't structured

1673
01:09:27,359 --> 01:09:29,890
in a way where this was already
a separate process doing

1674
01:09:29,890 --> 01:09:31,689
all the decompression
or compression.

1675
01:09:31,689 --> 01:09:36,020
So they had to change
gzip's core implementation,

1676
01:09:36,020 --> 01:09:42,050
and, well, some structure of
the gzip application, instead

1677
01:09:42,050 --> 01:09:44,560
of just passing the data
to the decompression

1678
01:09:44,560 --> 01:09:47,060
function to actually
send it over an RPC call

1679
01:09:47,060 --> 01:09:49,859
or really just write it to
some almost file descriptor

1680
01:09:49,859 --> 01:09:52,660
to a helper process that
runs on the side

1681
01:09:52,660 --> 01:09:54,200
and performs all
the decompression

1682
01:09:54,200 --> 01:09:55,940
with almost no privileges.

1683
01:09:55,940 --> 01:09:57,760
The only thing it
can do is return

1684
01:09:57,760 --> 01:10:00,090
the decompressed data,
or the compressed data,

1685
01:10:00,090 --> 01:10:02,670
back to the caller process.

1686
01:10:02,670 --> 01:10:03,670
That roughly make sense?

1687
01:10:03,670 --> 01:10:06,230
What's going on in gzip?

1688
01:10:06,230 --> 01:10:07,820
All right.

1689
01:10:07,820 --> 01:10:12,180
So I guess one thing we asked
for the homework is how do you

1690
01:10:12,180 --> 01:10:13,667
actually use Capsicum in OKWS?

1691
01:10:13,667 --> 01:10:14,750
So what do you guys think?

1692
01:10:14,750 --> 01:10:17,025
Would it be useful?

1693
01:10:17,025 --> 01:10:19,385
Would the OKWS guys
have been excited

1694
01:10:19,385 --> 01:10:23,980
and switched to FreeBSD because
this was much easier to use?

1695
01:10:23,980 --> 01:10:25,590
Or is this a wash?

1696
01:10:25,590 --> 01:10:26,777
So what do you think?

1697
01:10:26,777 --> 01:10:28,360
How would you use
Capsicum in FreeBSD?

1698
01:10:28,360 --> 01:10:30,954
Would this be much different?

1699
01:10:30,954 --> 01:10:31,890
Yeah.

1700
01:10:31,890 --> 01:10:33,765
AUDIENCE: So it means
you can get rid of some

1701
01:10:33,765 --> 01:10:36,944
of the jailing [INAUDIBLE].

1702
01:10:36,944 --> 01:10:37,610
PROFESSOR: Yeah.

1703
01:10:37,610 --> 01:10:38,109
That's true.

1704
01:10:38,109 --> 01:10:40,600
So chroot seems to be completely
superseded by this plan

1705
01:10:40,600 --> 01:10:42,980
of having directory file
descriptors and capabilities.

1706
01:10:42,980 --> 01:10:43,646
So that's great.

1707
01:10:43,646 --> 01:10:45,980
So you don't need the
chroot, setting it up.

1708
01:10:45,980 --> 01:10:46,770
That seems messy.

1709
01:10:46,770 --> 01:10:48,270
And this is much
more precise, also.

1710
01:10:48,270 --> 01:10:49,996
Because you can--
instead of having

1711
01:10:49,996 --> 01:10:51,870
a chroot with lots of
little things in there,

1712
01:10:51,870 --> 01:10:54,397
you have to maybe set the
permissions on there carefully.

1713
01:10:54,397 --> 01:10:56,480
You can just open exactly
the files that you need.
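
The calling pattern for directory file descriptors looks roughly like this. It is a sketch using Python's `dir_fd` parameter (which maps to `openat`); the helper name `read_beneath` and the file name are made up for illustration. Note the assumption: outside of Capsicum's capability mode, an ordinary `openat` does not by itself confine lookups to the directory, so this shows only how names are resolved relative to a descriptor, not the enforcement.

```python
import os
import tempfile


def read_beneath(dir_fd: int, relative_name: str) -> bytes:
    """Open a file by name relative to an already-open directory descriptor."""
    fd = os.open(relative_name, os.O_RDONLY, dir_fd=dir_fd)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as root:
    # Set up one file that the service is supposed to be able to read.
    with open(os.path.join(root, "index.html"), "wb") as f:
        f.write(b"<h1>ok</h1>")
    # The launcher opens the directory once and hands over the descriptor.
    dir_fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
    try:
        content = read_beneath(dir_fd, "index.html")
    finally:
        os.close(dir_fd)
```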

1714
01:10:56,480 --> 01:10:58,800
So that seems like a plus.

1715
01:10:58,800 --> 01:11:00,788
Any other benefits?

1716
01:11:00,788 --> 01:11:01,288
Yeah.

1717
01:11:01,288 --> 01:11:02,236
AUDIENCE: [INAUDIBLE].

1718
01:11:06,502 --> 01:11:08,120
PROFESSOR: In OKWS, you mean?

1719
01:11:08,120 --> 01:11:09,036
AUDIENCE: [INAUDIBLE].

1720
01:11:09,036 --> 01:11:09,438
PROFESSOR: Yeah.

1721
01:11:09,438 --> 01:11:11,880
So in OKWS, right, you have
this OK launcher daemon that

1722
01:11:11,880 --> 01:11:14,150
had to launch all these guys.

1723
01:11:14,150 --> 01:11:15,870
And it was the parent process.

1724
01:11:15,870 --> 01:11:18,030
Only when they die,
the signal goes back

1725
01:11:18,030 --> 01:11:22,197
to this okld to restart
the crashed process.

1726
01:11:22,197 --> 01:11:24,155
And that thing had to
run as root, because it

1727
01:11:24,155 --> 01:11:25,700
had to sandbox things.

1728
01:11:25,700 --> 01:11:28,140
There's actually a number of
things you could do better

1729
01:11:28,140 --> 01:11:31,240
with Capsicum in OKWS.

1730
01:11:31,240 --> 01:11:33,200
So one example is
you could probably

1731
01:11:33,200 --> 01:11:35,410
have okld have many
fewer privileges.

1732
01:11:35,410 --> 01:11:39,410
Because it might need to be
root initially to get port 80.

1733
01:11:39,410 --> 01:11:42,516
But after that, it could set
up sandboxes for everyone else

1734
01:11:42,516 --> 01:11:43,640
without being root anymore.

1735
01:11:43,640 --> 01:11:44,670
So that's kind of cool.

1736
01:11:44,670 --> 01:11:46,620
And maybe you can
even delegate the job

1737
01:11:46,620 --> 01:11:48,870
of respawning a process
to someone else,

1738
01:11:48,870 --> 01:11:50,930
maybe a per service
monitor daemon

1739
01:11:50,930 --> 01:11:54,430
that just has this
process descriptor handle,

1740
01:11:54,430 --> 01:11:56,950
or process descriptor
for child process,

1741
01:11:56,950 --> 01:11:58,870
and whenever it crashes,
starts a new one.

1742
01:11:58,870 --> 01:12:02,745
So I think this process
[INAUDIBLE] helps things a lot.

1743
01:12:02,745 --> 01:12:06,160
And the fact that you can create
a sandbox without being root

1744
01:12:06,160 --> 01:12:09,542
is also quite helpful, as well.

1745
01:12:09,542 --> 01:12:11,000
Any other stuff,
what you could do?

1746
01:12:11,000 --> 01:12:11,440
Yeah?

1747
01:12:11,440 --> 01:12:12,320
AUDIENCE: You
could give each one

1748
01:12:12,320 --> 01:12:14,387
a file descriptor with
append only mode to the log.

1749
01:12:14,387 --> 01:12:15,053
PROFESSOR: Yeah.

1750
01:12:15,053 --> 01:12:16,750
So that's pretty cool.

1751
01:12:16,750 --> 01:12:19,560
So as we were talking
last time, in OKWS,

1752
01:12:19,560 --> 01:12:23,675
well, the oklogd maybe could
tamper with the log file.

1753
01:12:23,675 --> 01:12:25,373
And who knows what
the kernel will

1754
01:12:25,373 --> 01:12:27,710
allow it to do once it has
a file descriptor on the log

1755
01:12:27,710 --> 01:12:28,670
file itself.

1756
01:12:28,670 --> 01:12:30,090
But here, the fact
that we can do

1757
01:12:30,090 --> 01:12:33,010
much more of a
precise capability mask

1758
01:12:33,010 --> 01:12:35,562
on a file descriptor, well,
we could give it a log file

1759
01:12:35,562 --> 01:12:37,895
and say, well, you could just
write to it, but not seek.

1760
01:12:37,895 --> 01:12:40,150
So that basically
means append only,

1761
01:12:40,150 --> 01:12:41,935
if you're the only
writer to that file.

1762
01:12:41,935 --> 01:12:43,060
So that seems kind of nice.

1763
01:12:43,060 --> 01:12:45,270
And you could prevent
it from reading a file.

1764
01:12:45,270 --> 01:12:47,140
You could say, well, you can
only write, but not read,

1765
01:12:47,140 --> 01:12:48,270
which is something
that's probably

1766
01:12:48,270 --> 01:12:50,519
difficult to do with Unix
permissions alone right now.
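
As a rough approximation of the write-only, append-only log descriptor, here is a sketch using ordinary open flags rather than Capsicum rights. This is explicitly not the same mechanism: the holder of a plain descriptor can still clear `O_APPEND` with `fcntl`, whereas a Capsicum rights mask is meant to take those operations away. The file name and the function `open_log_for_append` are hypothetical.

```python
import os
import tempfile


def open_log_for_append(path: str) -> int:
    """Open a log file for writing only, with every write forced to the end."""
    return os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)


with tempfile.TemporaryDirectory() as log_dir:
    log_path = os.path.join(log_dir, "service.log")
    log_fd = open_log_for_append(log_path)
    os.write(log_fd, b"first\n")
    # Seeking back to the start does not let us overwrite earlier records,
    # because O_APPEND repositions to the end before each write.
    os.lseek(log_fd, 0, os.SEEK_SET)
    os.write(log_fd, b"second\n")
    # The descriptor was opened write-only, so reading through it fails.
    try:
        os.read(log_fd, 1)
        read_allowed = True
    except OSError:
        read_allowed = False
    os.close(log_fd)
    # Check the result through a separate, read-only open.
    with open(log_path, "rb") as f:
        log_contents = f.read()
```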

1767
01:12:53,253 --> 01:12:54,630
Make sense?

1768
01:12:54,630 --> 01:12:57,120
Any other ideas for how
Capsicum might help?

1769
01:12:59,680 --> 01:13:01,815
Would you wish there was
more stuff in Capsicum?

1770
01:13:01,815 --> 01:13:03,670
I guess we always wish
there was more stuff.

1771
01:13:03,670 --> 01:13:05,128
AUDIENCE: So one
thing that perhaps

1772
01:13:05,128 --> 01:13:07,326
may be tricky is the
service daemons need

1773
01:13:07,326 --> 01:13:11,617
to connected to their
backend databases somehow.

1774
01:13:11,617 --> 01:13:13,470
Which might be remotely.

1775
01:13:13,470 --> 01:13:15,235
But you don't want
the launch daemon

1776
01:13:15,235 --> 01:13:17,235
to know about which
services each service

1777
01:13:17,235 --> 01:13:18,722
is going to connect to.

1778
01:13:18,722 --> 01:13:19,680
PROFESSOR: Maybe, yeah.

1779
01:13:19,680 --> 01:13:20,930
That's a good question, right?

1780
01:13:20,930 --> 01:13:23,990
So in Capsicum, as we
were talking about,

1781
01:13:23,990 --> 01:13:25,780
the network is in
global namespace.

1782
01:13:25,780 --> 01:13:27,570
You have to have
existing file descriptors

1783
01:13:27,570 --> 01:13:29,910
for all the outstanding
connections ahead of time.

1784
01:13:29,910 --> 01:13:30,576
AUDIENCE: Right.

1785
01:13:30,576 --> 01:13:33,675
But you don't necessarily want
okld to open up all the sockets

1786
01:13:33,675 --> 01:13:34,700
for all the services.

1787
01:13:34,700 --> 01:13:37,940
Because it might not know where
the services are connected.

1788
01:13:37,940 --> 01:13:38,140
PROFESSOR: That's right.

1789
01:13:38,140 --> 01:13:38,510
Yeah.

1790
01:13:38,510 --> 01:13:39,960
So that's a little bit
of an awkward thing.

1791
01:13:39,960 --> 01:13:40,850
I absolutely agree.

1792
01:13:40,850 --> 01:13:42,700
And this is part
of the reason why

1793
01:13:42,700 --> 01:13:44,950
I think capabilities
haven't completely

1794
01:13:44,950 --> 01:13:46,830
subsumed everything
in the security world,

1795
01:13:46,830 --> 01:13:48,350
is because they are
kind of awkward to use.

1796
01:13:48,350 --> 01:13:50,430
Because the guy that gives
you all the privileges

1797
01:13:50,430 --> 01:13:52,638
has to know exactly what
things you're going to need,

1798
01:13:52,638 --> 01:13:55,100
like these connections
to backend servers.

1799
01:13:55,100 --> 01:13:58,150
So at some level, maybe this
is not such a huge problem

1800
01:13:58,150 --> 01:13:58,650
in OKWS.

1801
01:13:58,650 --> 01:14:01,330
Because the launcher
daemon has to read a Config

1802
01:14:01,330 --> 01:14:03,610
file and is going to pass
the token to the service

1803
01:14:03,610 --> 01:14:04,401
in the first place.

1804
01:14:04,401 --> 01:14:07,070
So maybe the token is going
to contain the host and port

1805
01:14:07,070 --> 01:14:08,580
number to which
you're connected.

1806
01:14:08,580 --> 01:14:09,080
But I agree.

1807
01:14:09,080 --> 01:14:10,360
It's not great.

1808
01:14:10,360 --> 01:14:12,590
Because especially,
suppose the database server

1809
01:14:12,590 --> 01:14:13,780
disconnects you.

1810
01:14:13,780 --> 01:14:15,150
Well, you're kind of stuck now.

1811
01:14:15,150 --> 01:14:17,135
The file descriptor is
not connected anymore,

1812
01:14:17,135 --> 01:14:18,060
and you can't
connect to a new one.

1813
01:14:18,060 --> 01:14:20,476
So basically, if the database
server crashes, or restarts,

1814
01:14:20,476 --> 01:14:22,130
or the network
breaks, you basically

1815
01:14:22,130 --> 01:14:24,500
have to terminate it,
get yourself respawned,

1816
01:14:24,500 --> 01:14:27,230
so you can get a new one of
these connections passed to you.

1817
01:14:27,230 --> 01:14:29,104
So it's maybe not a
great plan in that sense.

1818
01:14:29,104 --> 01:14:32,518
AUDIENCE: Could we wrap the
system call, the function

1819
01:14:32,518 --> 01:14:35,144
[INAUDIBLE] to open a
socket so that it faults

1820
01:14:35,144 --> 01:14:37,602
the middleman instead of the
socket that the users send out

1821
01:14:37,602 --> 01:14:39,254
to [INAUDIBLE]?

1822
01:14:39,254 --> 01:14:39,920
PROFESSOR: Yeah.

1823
01:14:39,920 --> 01:14:43,130
This is what I think the
FreeBSD guys have done since.

1824
01:14:43,130 --> 01:14:46,312
Well, there's a
bunch of situations

1825
01:14:46,312 --> 01:14:48,770
like this, where you want to
open some file after the fact,

1826
01:14:48,770 --> 01:14:50,728
or you want to connect
to something after going

1827
01:14:50,728 --> 01:14:51,880
into capability mode.

1828
01:14:51,880 --> 01:14:54,060
So the FreeBSD
developers have added

1829
01:14:54,060 --> 01:14:58,250
this daemon called Casper, that
every capability based process

1830
01:14:58,250 --> 01:14:59,470
has a handle on.

1831
01:14:59,470 --> 01:15:03,010
And this Casper daemon runs
outside of capability mode,

1832
01:15:03,010 --> 01:15:04,470
and basically
listens to requests

1833
01:15:04,470 --> 01:15:06,380
from sandboxed processes.

1834
01:15:06,380 --> 01:15:09,790
And if you want
to open some file,

1835
01:15:09,790 --> 01:15:12,400
or if you want to send a
network connection, or a packet,

1836
01:15:12,400 --> 01:15:14,980
or something, but you didn't
have the right capability

1837
01:15:14,980 --> 01:15:18,250
beforehand, then this Casper
daemon will do it for you.

1838
01:15:18,250 --> 01:15:21,022
But it carefully
maintains a list of things

1839
01:15:21,022 --> 01:15:22,980
that every sandbox process
should or should not

1840
01:15:22,980 --> 01:15:24,010
be able to do.

1841
01:15:24,010 --> 01:15:25,870
So it's like a systems service.

1842
01:15:25,870 --> 01:15:28,400
So when you start a
capability process,

1843
01:15:28,400 --> 01:15:30,900
or enter capability
mode, by default,

1844
01:15:30,900 --> 01:15:33,620
this Casper thing will not allow
you to do anything extra funny.

1845
01:15:33,620 --> 01:15:35,250
But you could say,
well, hey, I'm

1846
01:15:35,250 --> 01:15:37,050
going to start the
sandbox process.

1847
01:15:37,050 --> 01:15:40,750
And you can ask Casper,
well, please allow my process

1848
01:15:40,750 --> 01:15:42,977
to do the following
things later.

1849
01:15:42,977 --> 01:15:43,810
So you could, right?

1850
01:15:43,810 --> 01:15:46,240
And the cool thing is that
you can pass file descriptors

1851
01:15:46,240 --> 01:15:48,700
or capabilities through
fd passing in Unix.

1852
01:15:48,700 --> 01:15:51,520
So once you have a handle
on this Casper guy,

1853
01:15:51,520 --> 01:15:55,120
you can get more
capabilities later on.
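
The idea of asking a broker for a descriptor after the fact can be sketched like this. It is a toy, assuming Python 3.9+ for `socket.send_fds` and `socket.recv_fds`; the function `broker_handle_request`, the one-message protocol, and the allowlist are all invented for illustration and are not Casper's actual interface. Both ends run in one process here only to keep the example short.

```python
import os
import socket
import tempfile


def broker_handle_request(sock: socket.socket, allowed_paths: set) -> None:
    """Serve one request: open the named file if allowed and pass the fd back."""
    requested = sock.recv(1024).decode()
    if requested not in allowed_paths:
        sock.sendall(b"denied")
        return
    fd = os.open(requested, os.O_RDONLY)
    try:
        # The descriptor travels as ancillary data next to the "ok" message.
        socket.send_fds(sock, [b"ok"], [fd])
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "hosts")
    with open(path, "wb") as f:
        f.write(b"127.0.0.1 localhost\n")
    client, broker = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with client, broker:
        # Allowed request: the client gets a fresh descriptor it did not have.
        client.sendall(path.encode())
        broker_handle_request(broker, {path})
        msg, fds, _flags, _addr = socket.recv_fds(client, 1024, 1)
        received = os.read(fds[0], 4096)
        os.close(fds[0])
        # Request outside the allowlist: the broker refuses.
        client.sendall(b"/etc/shadow")
        broker_handle_request(broker, {path})
        denied_reply = client.recv(1024)
```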

1854
01:15:55,120 --> 01:15:58,680
So it's, again, trade off
between being pure capability

1855
01:15:58,680 --> 01:16:04,330
world versus actually being
programmable or easy to use.

1856
01:16:04,330 --> 01:16:06,110
So it seems to be working out.

1857
01:16:06,110 --> 01:16:10,230
I think the particular thing
they use it for in FreeBSD,

1858
01:16:10,230 --> 01:16:13,350
or the thing that shows up
often, is making DNS queries.

1859
01:16:13,350 --> 01:16:15,600
So you want to be able to
make DNS queries once you're

1860
01:16:15,600 --> 01:16:16,150
in a sandbox.

1861
01:16:16,150 --> 01:16:18,608
And actually, this is a problem
they ran into with tcpdump.

1862
01:16:18,608 --> 01:16:20,850
Because when tcpdump is
printing your packets,

1863
01:16:20,850 --> 01:16:22,580
it wants to print the host
name for an IP address.

1864
01:16:22,580 --> 01:16:24,680
In order to do this, it has
to talk to a DNS server.

1865
01:16:24,680 --> 01:16:26,263
But you probably
don't want to connect

1866
01:16:26,263 --> 01:16:28,940
to a DNS server ahead of
time, or to every DNS server

1867
01:16:28,940 --> 01:16:30,320
you might ever need.

1868
01:16:30,320 --> 01:16:32,230
So instead, they use
this helper daemon

1869
01:16:32,230 --> 01:16:35,440
that's going to make
DNS queries for you.

1870
01:16:35,440 --> 01:16:37,388
Make sense?

1871
01:16:37,388 --> 01:16:38,750
All right.

1872
01:16:38,750 --> 01:16:42,905
So I guess the last thing
I wanted to talk about

1873
01:16:42,905 --> 01:16:46,310
is what are the security
guarantees that Capsicum

1874
01:16:46,310 --> 01:16:46,810
provides?

1875
01:16:46,810 --> 01:16:49,120
So should you trust it?

1876
01:16:49,120 --> 01:16:50,700
How could Capsicum go wrong?

1877
01:16:53,399 --> 01:16:55,440
Presumably you can always
have security problems,

1878
01:16:55,440 --> 01:16:57,870
regardless of what mechanism
you're using underneath.

1879
01:16:57,870 --> 01:16:59,370
But what particular
things should we

1880
01:16:59,370 --> 01:17:01,930
worry about in
Capsicum when we're

1881
01:17:01,930 --> 01:17:03,310
building some system here?

1882
01:17:06,710 --> 01:17:08,680
Suppose you have to
attack this thing.

1883
01:17:08,680 --> 01:17:11,970
You have to attack this
tcpdump thing, or gzip,

1884
01:17:11,970 --> 01:17:14,060
or whatever it is
that they implemented.

1885
01:17:14,060 --> 01:17:18,039
What would you look at, in
terms of bugs or problems?

1886
01:17:18,039 --> 01:17:19,872
AUDIENCE: Well, it
depends on the developers

1887
01:17:19,872 --> 01:17:21,524
knowing what they're doing.

1888
01:17:21,524 --> 01:17:24,220
So they might give
a bad capability.

1889
01:17:24,220 --> 01:17:25,220
PROFESSOR: That's right.

1890
01:17:25,220 --> 01:17:25,350
Yeah.

1891
01:17:25,350 --> 01:17:27,710
So it's actually one
interesting property of Capsicum

1892
01:17:27,710 --> 01:17:30,640
is that it's not a guarantee
that the user of the system

1893
01:17:30,640 --> 01:17:31,430
gets.

1894
01:17:31,430 --> 01:17:33,290
It's really a tool
that the developer

1895
01:17:33,290 --> 01:17:38,260
has to build more trustworthy
or better application software.

1896
01:17:38,260 --> 01:17:40,095
But I, as a user of the
system, have no idea

1897
01:17:40,095 --> 01:17:41,553
whether this is a
good or bad thing

1898
01:17:41,553 --> 01:17:43,178
that the application
is using Capsicum.

1899
01:17:43,178 --> 01:17:46,440
You could totally misuse it,
as you're absolutely right.

1900
01:17:46,440 --> 01:17:49,170
So maybe one example is,
as they show in the paper,

1901
01:17:49,170 --> 01:17:51,490
you could give too many
privileges to the sandbox

1902
01:17:51,490 --> 01:17:51,990
process.

1903
01:17:51,990 --> 01:17:53,810
Like the TCP helper, or maybe
helper, or maybe

1904
01:17:53,810 --> 01:17:55,030
it has access to my console.

1905
01:17:55,030 --> 01:17:57,900
And that's not so great,
but it's hard for me

1906
01:17:57,900 --> 01:18:01,130
as a user to really tell this
in a general purpose fashion.

1907
01:18:01,130 --> 01:18:01,828
Yeah?

1908
01:18:01,828 --> 01:18:05,443
AUDIENCE: It might also be that
when you set the permissions

1909
01:18:05,443 --> 01:18:09,100
to the masks on any
given file descriptor

1910
01:18:09,100 --> 01:18:11,304
that you set too
permissive masks.

1911
01:18:11,304 --> 01:18:11,970
PROFESSOR: Yeah.

1912
01:18:11,970 --> 01:18:12,170
Right.

1913
01:18:12,170 --> 01:18:13,720
So it's not just the
file descriptors.

1914
01:18:13,720 --> 01:18:15,610
Also, what can you do with
those file descriptors?

1915
01:18:15,610 --> 01:18:16,220
You're right.

1916
01:18:16,220 --> 01:18:16,450
Yes.

1917
01:18:16,450 --> 01:18:18,140
These masks are another
part of the story

1918
01:18:18,140 --> 01:18:21,460
that you have to watch out for.

1919
01:18:21,460 --> 01:18:21,980
OK.

1920
01:18:21,980 --> 01:18:23,594
So suppose we got
the masks right.

1921
01:18:23,594 --> 01:18:25,010
We got the file
descriptors right.

1922
01:18:25,010 --> 01:18:26,120
We haven't used lch_start.

1923
01:18:26,120 --> 01:18:28,740
There's nothing extra in memory.

1924
01:18:28,740 --> 01:18:30,532
AUDIENCE: [INAUDIBLE].

1925
01:18:30,532 --> 01:18:31,490
PROFESSOR: That's true.

1926
01:18:31,490 --> 01:18:31,990
Yes.

1927
01:18:31,990 --> 01:18:34,030
So maybe there's like
something before you even

1928
01:18:34,030 --> 01:18:35,950
enter capability
mode that's damaging.

1929
01:18:35,950 --> 01:18:39,030
So it only helps
once you jump in.

1930
01:18:39,030 --> 01:18:42,240
And one slightly
annoying thing is

1931
01:18:42,240 --> 01:18:47,360
that it seems like you can't do
a whole lot inside of capability

1932
01:18:47,360 --> 01:18:51,560
mode, not in the sense that you
can't run large computations,

1933
01:18:51,560 --> 01:18:55,010
but you can't really put a large
part of a complicated system

1934
01:18:55,010 --> 01:18:55,900
into capability mode.

1935
01:18:55,900 --> 01:18:57,358
Because inevitably,
in Unix, you'll

1936
01:18:57,358 --> 01:18:59,820
need to do something
with new processes,

1937
01:18:59,820 --> 01:19:01,870
opening network
connections, et cetera.

1938
01:19:01,870 --> 01:19:03,487
And you'll probably
need to use some

1939
01:19:03,487 --> 01:19:05,130
of these global
namespaces that are not

1940
01:19:05,130 --> 01:19:06,790
available in capability mode.

1941
01:19:06,790 --> 01:19:08,330
So it's probably
going to be quite

1942
01:19:08,330 --> 01:19:12,790
difficult to put large chunks
of logic or intricate system

1943
01:19:12,790 --> 01:19:15,370
code inside of capability mode.

1944
01:19:15,370 --> 01:19:19,760
So only well-defined
chunks of an application

1945
01:19:19,760 --> 01:19:22,500
are likely to be running
in capability mode.

1946
01:19:22,500 --> 01:19:23,000
It depends.

1947
01:19:23,000 --> 01:19:25,520
I don't know if this is
entirely true or not.

1948
01:19:25,520 --> 01:19:27,180
In Chrome, for example,
large processes

1949
01:19:27,180 --> 01:19:30,460
do run in capability
mode in their design.

1950
01:19:30,460 --> 01:19:32,960
It might be that
you basically have

1951
01:19:32,960 --> 01:19:37,190
to have non-capability mode
chunks of your application

1952
01:19:37,190 --> 01:19:40,390
because you wanted to
interoperate nicely with Unix,

1953
01:19:40,390 --> 01:19:44,330
or whatever it is you're
running alongside of it.

1954
01:19:44,330 --> 01:19:44,910
OK.

1955
01:19:44,910 --> 01:19:48,460
Any other thing you
should worry about?

1956
01:19:48,460 --> 01:19:49,170
Yeah?

1957
01:19:49,170 --> 01:19:51,450
AUDIENCE: Well, whether they
implemented capabilities

1958
01:19:51,450 --> 01:19:52,090
correctly.

1959
01:19:52,090 --> 01:19:53,012
PROFESSOR: Yeah.

1960
01:19:53,012 --> 01:19:55,320
AUDIENCE: Whether they've
covered all the system calls.

1961
01:19:55,320 --> 01:19:55,750
PROFESSOR: That's right.

1962
01:19:55,750 --> 01:19:56,010
Yes.

1963
01:19:56,010 --> 01:19:58,220
So that's actually a huge
problem, in some sense,

1964
01:19:58,220 --> 01:19:58,980
already.

1965
01:19:58,980 --> 01:20:01,230
If you think about
it, there's probably

1966
01:20:01,230 --> 01:20:03,960
hundreds of system calls
that the kernel provides you.

1967
01:20:03,960 --> 01:20:06,529
And they're not especially
precisely documented,

1968
01:20:06,529 --> 01:20:08,695
so you probably have to
look at their implementation

1969
01:20:08,695 --> 01:20:11,242
and see if, for every
system call, if there's

1970
01:20:11,242 --> 01:20:13,650
some way for the
applications to get

1971
01:20:13,650 --> 01:20:16,010
the system call to
perform some operation

1972
01:20:16,010 --> 01:20:18,600
on some extra object that didn't
have a file descriptor to it.

1973
01:20:18,600 --> 01:20:20,490
And most Unix
system calls weren't

1974
01:20:20,490 --> 01:20:22,870
written with the
expectation of everything

1975
01:20:22,870 --> 01:20:24,600
has to be operation
on a file descriptor.

1976
01:20:24,600 --> 01:20:27,160
So you really have to get
every system call right.

1977
01:20:27,160 --> 01:20:30,100
And probably more worryingly
is that the kernel has

1978
01:20:30,100 --> 01:20:32,300
to be free of bugs,
like buffer overflows

1979
01:20:32,300 --> 01:20:34,884
or whatever other memory
corruption like you guys

1980
01:20:34,884 --> 01:20:35,800
explained [INAUDIBLE].

1981
01:20:35,800 --> 01:20:37,940
Otherwise, all of this
is complete nonsense.

1982
01:20:37,940 --> 01:20:40,300
You just run arbitrary
assembly code in the kernel,

1983
01:20:40,300 --> 01:20:43,298
and you have full
control of the machine.

1984
01:20:43,298 --> 01:20:44,214
AUDIENCE: [INAUDIBLE].

1985
01:20:52,225 --> 01:20:53,300
PROFESSOR: Yeah.

1986
01:20:53,300 --> 01:20:54,100
I guess, yeah.

1987
01:20:54,100 --> 01:20:55,900
So the one thing I didn't
get a chance to talk about

1988
01:20:55,900 --> 01:20:56,816
is alternative things.

1989
01:20:56,816 --> 01:20:58,150
So this is in FreeBSD.

1990
01:20:58,150 --> 01:20:59,990
Linux has this thing
called [INAUDIBLE],

1991
01:20:59,990 --> 01:21:04,140
that allows you to specify which
system calls you can operate.

1992
01:21:04,140 --> 01:21:06,070
If you squint, it's
kind of like Capsicum

1993
01:21:06,070 --> 01:21:08,190
but very different, in
the sense that Capsicum

1994
01:21:08,190 --> 01:21:09,731
talks about specific
file descriptors

1995
01:21:09,731 --> 01:21:11,010
that you can operate.

1996
01:21:11,010 --> 01:21:12,812
And in Linux, the
[INAUDIBLE] mechanism

1997
01:21:12,812 --> 01:21:14,520
lets you talk about
specific system calls

1998
01:21:14,520 --> 01:21:16,040
that you could run.

1999
01:21:16,040 --> 01:21:18,670
So it's probably
less fine grained,

2000
01:21:18,670 --> 01:21:22,110
but it's what's
available in Linux today.

2001
01:21:22,110 --> 01:21:24,289
And it's actually
probably a good idea

2002
01:21:24,289 --> 01:21:26,622
to look at your applications
and see what system call do

2003
01:21:26,622 --> 01:21:29,450
you expect it to make
and then code in a filter

2004
01:21:29,450 --> 01:21:31,770
and allow it to make
only those system calls.

2005
01:21:31,770 --> 01:21:34,311
The problem is that if you have
any interesting applications,

2006
01:21:34,311 --> 01:21:36,215
it'll probably run exec
and open and write,

2007
01:21:36,215 --> 01:21:38,680
and that's probably enough
to do quite a bit of damage

2008
01:21:38,680 --> 01:21:39,264
to the system.

2009
01:21:39,264 --> 01:21:41,763
So that's why you probably want
the more fine-grained system

2010
01:21:41,763 --> 01:21:43,170
like Capsicum,
where you can say,

2011
01:21:43,170 --> 01:21:45,630
well, you can run write,
but only on this thing,

2012
01:21:45,630 --> 01:21:49,350
not on my entire home directory.
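
The difference in granularity can be made concrete with a toy model. This is purely illustrative Python, not any kernel's real interface: `SyscallFilter`, `DescriptorRights`, and the descriptor numbers are hypothetical. A filter keyed on system call names must allow `write` on every descriptor, while per-descriptor rights can allow it on one and refuse it on another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyscallFilter:
    """Toy policy: a decision depends only on which system call is made."""
    allowed_syscalls: frozenset

    def permits(self, syscall: str, fd: int) -> bool:
        return syscall in self.allowed_syscalls


@dataclass(frozen=True)
class DescriptorRights:
    """Toy policy: a decision depends on the rights attached to each descriptor."""
    rights_by_fd: dict

    def permits(self, syscall: str, fd: int) -> bool:
        return syscall in self.rights_by_fd.get(fd, frozenset())


LOG_FD = 3        # hypothetical descriptor for the log file
HOME_FILE_FD = 4  # hypothetical descriptor for a file in the home directory

coarse = SyscallFilter(frozenset({"read", "write"}))
fine = DescriptorRights({LOG_FD: frozenset({"write"})})

coarse_allows_home_write = coarse.permits("write", HOME_FILE_FD)
fine_allows_home_write = fine.permits("write", HOME_FILE_FD)
fine_allows_log_write = fine.permits("write", LOG_FD)
```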

2013
01:21:49,350 --> 01:21:49,850
All right.

2014
01:21:49,850 --> 01:21:51,520
So I guess we're out of
time to talk about Capsicum.

2015
01:21:51,520 --> 01:21:53,250
So let's talk about
Native Client

2016
01:21:53,250 --> 01:21:56,840
on Wednesday and a different
way to sandbox programs.