1
00:00:00,850 --> 00:00:03,220
The following content is
provided under a Creative

2
00:00:03,220 --> 00:00:04,610
Commons license.

3
00:00:04,610 --> 00:00:06,820
Your support will help
MIT OpenCourseWare

4
00:00:06,820 --> 00:00:10,910
continue to offer high quality
educational resources for free.

5
00:00:10,910 --> 00:00:13,480
To make a donation or to
view additional materials

6
00:00:13,480 --> 00:00:17,440
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,440 --> 00:00:18,313
at ocw.mit.edu.

8
00:00:22,100 --> 00:00:23,350
ALIN TOMESCU: My name is Alin.

9
00:00:23,350 --> 00:00:25,590
I work in Stata, in
the Stata Center.

10
00:00:25,590 --> 00:00:27,760
I'm a PhD student
there in my fifth year.

11
00:00:27,760 --> 00:00:30,730
And today we're going
to be talking about one

12
00:00:30,730 --> 00:00:32,740
of our research
projects called Catena.

13
00:00:32,740 --> 00:00:35,020
And Catena is a really
nice way of using bitcoin

14
00:00:35,020 --> 00:00:37,390
to build append-only logs.

15
00:00:37,390 --> 00:00:39,250
Bitcoin itself is
an append-only log.

16
00:00:39,250 --> 00:00:43,270
And a lot of people have been
using it to put data in it.

17
00:00:43,270 --> 00:00:45,040
And I'll describe a
really efficient way

18
00:00:45,040 --> 00:00:47,200
of doing that and
its applications.

19
00:00:47,200 --> 00:00:49,930
And if there is time, we'll talk
about attacks and maybe colored

20
00:00:49,930 --> 00:00:51,580
coins and some other stuff.

21
00:00:51,580 --> 00:00:53,830
We'll talk about the
what, the how and the why.

22
00:00:53,830 --> 00:00:57,040
And that's the overview
of the presentation.

23
00:00:57,040 --> 00:00:59,780
So let's talk about this
problem called the equivocation

24
00:00:59,780 --> 00:01:00,280
problem.

25
00:01:00,280 --> 00:01:02,380
So what is this?

26
00:01:02,380 --> 00:01:05,740
In general, non-equivocation
means saying the same thing

27
00:01:05,740 --> 00:01:06,850
to everybody.

28
00:01:06,850 --> 00:01:09,160
So for example, if you
have a malicious service

29
00:01:09,160 --> 00:01:11,980
and you have Alice
and Bob, the service

30
00:01:11,980 --> 00:01:14,170
should say the same
thing to Alice and Bob.

31
00:01:14,170 --> 00:01:16,060
So it would make a
bunch of statements.

32
00:01:16,060 --> 00:01:19,180
Let's say s1, s2, s3 over
time and Alice and Bob

33
00:01:19,180 --> 00:01:21,047
would see all of
these statements.

34
00:01:21,047 --> 00:01:23,380
So this is very similar to
what bitcoin provides, right?

35
00:01:23,380 --> 00:01:26,050
In bitcoin you see block
one, block two, block three.

36
00:01:26,050 --> 00:01:28,570
And everybody agrees on these
blocks in sequence, right?

37
00:01:28,570 --> 00:01:30,230
Does that make sense?

38
00:01:30,230 --> 00:01:32,290
So this is non-equivocation
and in some sense

39
00:01:32,290 --> 00:01:34,840
this is what bitcoin
already offers.

40
00:01:34,840 --> 00:01:37,363
And in general with
non-equivocation,

41
00:01:37,363 --> 00:01:39,280
what you might get is
some of these statements

42
00:01:39,280 --> 00:01:41,450
might actually be
false or incorrect.

43
00:01:41,450 --> 00:01:43,240
Non-equivocation
doesn't guarantee you

44
00:01:43,240 --> 00:01:45,132
that this statement is
a correct statement.

45
00:01:45,132 --> 00:01:46,840
But it just guarantees
you that everybody

46
00:01:46,840 --> 00:01:47,882
sees the same statements.

47
00:01:47,882 --> 00:01:49,900
And then they can detect
incorrect statements.

48
00:01:49,900 --> 00:01:51,220
In bitcoin you get
a little bit more.

49
00:01:51,220 --> 00:01:52,970
You actually know that
if this is a block,

50
00:01:52,970 --> 00:01:55,930
it's a valid block, assuming
there are enough blocks on top

51
00:01:55,930 --> 00:01:57,400
of it, right?

52
00:01:57,400 --> 00:02:00,520
So equivocation means not saying
the same thing to everybody.

53
00:02:00,520 --> 00:02:03,610
So for example, this malicious
service at time four,

54
00:02:03,610 --> 00:02:06,520
he might show Bob a different
statement than Alice.

55
00:02:06,520 --> 00:02:09,550
So Bob sees s4 and
Alice sees s4 prime.

56
00:02:09,550 --> 00:02:11,590
This is what happens
in bitcoin sometimes.

57
00:02:11,590 --> 00:02:13,750
And that's how you can
double spend in bitcoin

58
00:02:13,750 --> 00:02:16,660
by putting the transaction here
sending money to the merchant,

59
00:02:16,660 --> 00:02:18,160
and then putting
another transaction

60
00:02:18,160 --> 00:02:20,110
here sending money back to you.

61
00:02:20,110 --> 00:02:21,960
Right, are you guys
familiar with this?

62
00:02:21,960 --> 00:02:23,820
Yeah, OK.

63
00:02:23,820 --> 00:02:25,360
All right, so why
does this matter?

64
00:02:25,360 --> 00:02:26,800
Let me give you a silly example.

65
00:02:26,800 --> 00:02:29,860
Suppose we have Jimmy and we
have Jimmy's mom and Jimmy's

66
00:02:29,860 --> 00:02:30,695
dad, right?

67
00:02:30,695 --> 00:02:32,320
And Jimmy wants to
go outside and play.

68
00:02:32,320 --> 00:02:36,070
But he knows that mom and dad
usually don't let him play.

69
00:02:36,070 --> 00:02:39,400
So what he does is he
tells dad, hey dad,

70
00:02:39,400 --> 00:02:41,210
mom said I can go outside.

71
00:02:41,210 --> 00:02:44,110
Right, and then he
tells mom, hey mom.

72
00:02:44,110 --> 00:02:45,470
Dad said I can go outside.

73
00:02:45,470 --> 00:02:47,470
And let's say mom and dad
are in different rooms

74
00:02:47,470 --> 00:02:48,910
and they're watching soap
operas and they're not

75
00:02:48,910 --> 00:02:50,020
talking to one another.

76
00:02:50,020 --> 00:02:52,530
So they can't actually
confirm that.

77
00:02:52,530 --> 00:02:54,713
You know, mom can't confirm
that dad really said that.

78
00:02:54,713 --> 00:02:56,630
And dad can't really confirm
that mom said that.

79
00:02:56,630 --> 00:02:58,642
But they both trust Jimmy.

80
00:02:58,642 --> 00:03:00,850
So you see how equivocation
can be really problematic

81
00:03:00,850 --> 00:03:02,740
because now mom and
dad will say sure,

82
00:03:02,740 --> 00:03:05,860
go outside as long as the
other person said that, right?

83
00:03:05,860 --> 00:03:08,540
But let me give you a
more practical example.

84
00:03:08,540 --> 00:03:11,580
So let's look at something
called a public-key directory.

85
00:03:11,580 --> 00:03:15,200
A public-key directory allows
you to map user's public keys--

86
00:03:15,200 --> 00:03:16,367
a user name to a public key.

87
00:03:16,367 --> 00:03:18,283
Right, so here I have
the public key for Alice

88
00:03:18,283 --> 00:03:19,930
and here I have the
public key for Bob.

89
00:03:19,930 --> 00:03:22,138
And they look up each other's
keys in this directory.

90
00:03:22,138 --> 00:03:24,490
And then they can set
up a secure channel.

91
00:03:24,490 --> 00:03:27,370
How many of you guys use
WhatsApp, for example?

92
00:03:27,370 --> 00:03:30,340
So the WhatsApp server has
a public-key directory.

93
00:03:30,340 --> 00:03:32,050
And when I want to
send you a message,

94
00:03:32,050 --> 00:03:34,480
I look up your phone
number in that directory

95
00:03:34,480 --> 00:03:36,100
and I get your
public key, right?

96
00:03:36,100 --> 00:03:38,940
If that directory equivocates,
the following thing can happen.

97
00:03:38,940 --> 00:03:41,590
What the directory can do is
it can create a new directory

98
00:03:41,590 --> 00:03:44,650
at time two where he puts
a fake public key for Bob

99
00:03:44,650 --> 00:03:47,650
and he shows this
to Alice, right?

100
00:03:47,650 --> 00:03:51,730
And at time two also, he creates
another directory for Bob

101
00:03:51,730 --> 00:03:55,000
where he puts a fake public
key for Alice, right?

102
00:03:55,000 --> 00:03:57,430
So now the problem
here is that when

103
00:03:57,430 --> 00:04:00,640
Alice checks in this directory,
she looks at her own public key

104
00:04:00,640 --> 00:04:02,695
to make sure she's
not impersonated.

105
00:04:02,695 --> 00:04:04,570
And Alice looks in this
version and sees, OK.

106
00:04:04,570 --> 00:04:05,390
That is my public key.

107
00:04:05,390 --> 00:04:05,770
I'm good.

108
00:04:05,770 --> 00:04:06,940
She looks in this version, OK.

109
00:04:06,940 --> 00:04:07,857
This is my public key.

110
00:04:07,857 --> 00:04:08,495
I'm good.

111
00:04:08,495 --> 00:04:10,120
So now I'm ready to
use this directory.

112
00:04:10,120 --> 00:04:12,580
And I'll look up Bob and
I'll get his public key.

113
00:04:12,580 --> 00:04:14,680
But Alice will actually
get the wrong public key.

114
00:04:14,680 --> 00:04:17,070
Does everybody see that?

115
00:04:17,070 --> 00:04:19,420
And similarly Bob
will do the same.

116
00:04:19,420 --> 00:04:22,990
So Bob will look in his fork
of the directory, right?

117
00:04:22,990 --> 00:04:25,493
And he looks up his key
here and his key here.

118
00:04:25,493 --> 00:04:26,410
And he thinks he's OK.

119
00:04:26,410 --> 00:04:27,327
He's not impersonated.

120
00:04:27,327 --> 00:04:30,310
But in fact, Alice is
impersonated there.

121
00:04:30,310 --> 00:04:31,540
OK?

122
00:04:31,540 --> 00:04:35,320
And now as a result, they
will obtain fake keys

123
00:04:35,320 --> 00:04:36,340
for each other.

124
00:04:36,340 --> 00:04:37,960
And this man-in-the-middle
attacker

125
00:04:37,960 --> 00:04:39,970
who knows the
corresponding secret keys

126
00:04:39,970 --> 00:04:42,190
for these public
keys can basically

127
00:04:42,190 --> 00:04:45,890
read all of their
communications.

128
00:04:45,890 --> 00:04:47,460
Any questions about this?

129
00:04:47,460 --> 00:04:49,850
This is just one example
of how equivocation

130
00:04:49,850 --> 00:04:51,680
can be really disastrous.

131
00:04:51,680 --> 00:04:53,900
So in a public-key directory,
if you can equivocate,

132
00:04:53,900 --> 00:04:55,910
you can show fake
public keys for people

133
00:04:55,910 --> 00:04:58,263
and impersonate them.

134
00:04:58,263 --> 00:04:59,930
So in other words,
it's really important

135
00:04:59,930 --> 00:05:02,720
that Alice and Bob both
see the same directory.

136
00:05:02,720 --> 00:05:04,520
Because if Bob saw
this directory,

137
00:05:04,520 --> 00:05:08,030
the same one Alice saw, then Bob
would notice that this is not

138
00:05:08,030 --> 00:05:09,635
the public key he had.

139
00:05:09,635 --> 00:05:11,510
He would notice his
first public key and then

140
00:05:11,510 --> 00:05:12,980
that there's a second one there.

141
00:05:12,980 --> 00:05:14,688
And then he would know
he's impersonated.

142
00:05:14,688 --> 00:05:17,270
And he could let's say,
talk to the New York Times,

143
00:05:17,270 --> 00:05:18,200
and say, look.

144
00:05:18,200 --> 00:05:20,040
This directory is
impersonating me.

145
00:05:23,330 --> 00:05:25,640
So in conclusion, equivocation
can be pretty bad.

146
00:05:25,640 --> 00:05:27,390
So this idea that you
say different things

147
00:05:27,390 --> 00:05:29,360
to different people can
be pretty disastrous.

148
00:05:29,360 --> 00:05:31,950
And what Catena does
is it prevents that.

149
00:05:31,950 --> 00:05:35,150
So in general, if you have
this malicious service that is

150
00:05:35,150 --> 00:05:38,535
backed by Catena, if it
wants to say different things

151
00:05:38,535 --> 00:05:40,160
to different people,
it cannot do that.

152
00:05:40,160 --> 00:05:42,980
It has to show the same
thing to everybody.

153
00:05:42,980 --> 00:05:45,890
And the way we achieve that is
by building on top of bitcoin.

154
00:05:45,890 --> 00:05:48,300
And that's what we're going
to be talking about today.

155
00:05:48,300 --> 00:05:51,320
So any questions about
sort of the general setting

156
00:05:51,320 --> 00:05:54,030
of the problem and
our goals here?

157
00:05:54,030 --> 00:05:55,260
So let's move on then.

158
00:05:55,260 --> 00:05:56,925
So why does this matter?

159
00:05:56,925 --> 00:05:58,800
So this matters for a
bunch of other reasons,

160
00:05:58,800 --> 00:06:01,230
not just public-key directories
and secure messaging.

161
00:06:01,230 --> 00:06:04,890
It matters because when you want
to do secure software update,

162
00:06:04,890 --> 00:06:06,120
equivocation is a problem.

163
00:06:06,120 --> 00:06:07,540
And I'll talk about that later.

164
00:06:07,540 --> 00:06:09,150
So for example, at
some point bitcoin

165
00:06:09,150 --> 00:06:12,240
was concerned about
malicious bitcoin binaries

166
00:06:12,240 --> 00:06:14,240
being published on
the web and people

167
00:06:14,240 --> 00:06:15,990
like you and me
downloading those binaries

168
00:06:15,990 --> 00:06:17,810
and getting our coins stolen.

169
00:06:17,810 --> 00:06:20,310
Right, and it turns out that
that's an equivocation problem.

170
00:06:20,310 --> 00:06:22,620
Somebody is equivocating, right?

171
00:06:22,620 --> 00:06:24,610
It's equivocating about
the bitcoin binary.

172
00:06:24,610 --> 00:06:26,670
It's showing us a
fake version and maybe

173
00:06:26,670 --> 00:06:28,758
other people the real version.

174
00:06:28,758 --> 00:06:30,300
Secure messaging,
like I said before,

175
00:06:30,300 --> 00:06:33,540
has applications here.

176
00:06:33,540 --> 00:06:35,340
Not just secure messaging
but also the web.

177
00:06:35,340 --> 00:06:36,715
Like when you go
on Facebook.com,

178
00:06:36,715 --> 00:06:38,617
you're looking up
Facebook's public key.

179
00:06:38,617 --> 00:06:40,700
And if somebody lies to
you about that public key,

180
00:06:40,700 --> 00:06:43,020
you could be going to
a malicious service

181
00:06:43,020 --> 00:06:46,390
and you could be giving
them your Facebook password.

182
00:06:46,390 --> 00:06:48,930
Does that make sense?

183
00:06:48,930 --> 00:06:50,860
And also it has
applications in the sense

184
00:06:50,860 --> 00:06:53,830
that if you have a way of
building an append-only log,

185
00:06:53,830 --> 00:06:56,470
you really have a way of
building a blockchain right

186
00:06:56,470 --> 00:06:58,280
for whatever purpose you want.

187
00:06:58,280 --> 00:07:00,830
And we'll talk
about that as well.

188
00:07:00,830 --> 00:07:03,940
So the 10,000-foot
view of the system

189
00:07:03,940 --> 00:07:07,060
is we built this bitcoin-based
append-only log.

190
00:07:07,060 --> 00:07:09,380
And the way to think about it
is that bitcoin is already

191
00:07:09,380 --> 00:07:10,832
an append-only log.

192
00:07:10,832 --> 00:07:12,790
It's just that it's kind
of inefficient to look

193
00:07:12,790 --> 00:07:13,305
in that log.

194
00:07:13,305 --> 00:07:15,430
If you want to pick certain
things from the Bitcoin

195
00:07:15,430 --> 00:07:18,010
blockchain, like you put your
certain bits and pieces of data

196
00:07:18,010 --> 00:07:19,060
there, you have to
kind of download

197
00:07:19,060 --> 00:07:21,435
the whole thing to make sure
you're not missing anything.

198
00:07:21,435 --> 00:07:23,020
And I'll tell you why soon.

199
00:07:23,020 --> 00:07:25,450
So instead of
putting stuff naively

200
00:07:25,450 --> 00:07:26,950
in the Bitcoin
blockchain, we put it

201
00:07:26,950 --> 00:07:28,720
in a more principled way.

202
00:07:28,720 --> 00:07:31,540
And in a sense we get
a log and another log.

203
00:07:31,540 --> 00:07:34,660
We get our log and
the bitcoin log.

204
00:07:34,660 --> 00:07:36,910
And this generalizes to
other cryptocurrencies.

205
00:07:36,910 --> 00:07:38,530
Like you could do
this in Litecoin,

206
00:07:38,530 --> 00:07:40,215
for example, or
even in Ethereum.

207
00:07:40,215 --> 00:07:41,590
Though I don't
think you guys yet

208
00:07:41,590 --> 00:07:45,090
talked about how the
Ethereum blockchain works.

209
00:07:45,090 --> 00:07:48,310
And the cool thing about this
Catena log that we're building

210
00:07:48,310 --> 00:07:51,560
is that the Catena log is as
hard to fork as the bitcoin

211
00:07:51,560 --> 00:07:52,060
blockchain.

212
00:07:52,060 --> 00:07:54,880
If you want to fork our log,
you have to fork bitcoin.

213
00:07:54,880 --> 00:07:56,740
However, unlike the
Bitcoin blockchain,

214
00:07:56,740 --> 00:07:59,140
Catena is super
efficient to verify.

215
00:07:59,140 --> 00:08:02,770
So in particular, remember,
I described the log

216
00:08:02,770 --> 00:08:04,250
in terms of the
statements in it.

217
00:08:04,250 --> 00:08:07,840
So if you have 10
statements, each statement

218
00:08:07,840 --> 00:08:09,572
will be 600 bytes to audit.

219
00:08:09,572 --> 00:08:11,530
So you don't have to
download the whole Bitcoin

220
00:08:11,530 --> 00:08:14,820
blockchain to make sure you're
not missing a statement.

221
00:08:14,820 --> 00:08:18,190
And you also have to download
80 bytes per bitcoin block.
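Editor's note: the auditing cost quoted above can be written as a small back-of-the-envelope calculation. This is only a sketch using the two figures from the talk (about 600 bytes per statement, 80 bytes per block header). The function name and the block count of 500,000 are hypothetical, chosen for illustration.

```python
# Figures quoted in the talk.
HEADER_BYTES = 80            # one Bitcoin block header
STATEMENT_PROOF_BYTES = 600  # approximate per-statement audit cost

def audit_bytes(num_blocks: int, num_statements: int) -> int:
    """Bytes a light client downloads: every block header plus one proof per statement."""
    return HEADER_BYTES * num_blocks + STATEMENT_PROOF_BYTES * num_statements

# Hypothetical example: a chain of 500,000 blocks holding a log of 10 statements.
total = audit_bytes(num_blocks=500_000, num_statements=10)
print(f"{total / 1e6:.3f} MB")
```

Under these assumptions the headers dominate the cost, and the statements themselves add only a few kilobytes.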

222
00:08:18,190 --> 00:08:20,020
And we have a Java
implementation of this.

223
00:08:20,020 --> 00:08:23,710
And if you guys are curious,
you can go to my GitHub page,

224
00:08:23,710 --> 00:08:25,630
and take a look at the code.

225
00:08:25,630 --> 00:08:27,880
All right, so before we
start, I know you guys already

226
00:08:27,880 --> 00:08:29,410
know a lot about how
the Bitcoin blockchain works,

227
00:08:29,410 --> 00:08:31,720
but it's important that I
reintroduce some terminology

228
00:08:31,720 --> 00:08:33,490
just so we're on the same page.

229
00:08:33,490 --> 00:08:35,000
So this is the
bitcoin blockchain.

230
00:08:35,000 --> 00:08:36,490
We have a bunch of
blocks connected

231
00:08:36,490 --> 00:08:37,690
by hash chain pointers.

232
00:08:37,690 --> 00:08:40,840
And we have Merkle
trees of transactions.

233
00:08:40,840 --> 00:08:43,610
And you know, these arrows
indicate hash pointers.

234
00:08:43,610 --> 00:08:47,840
It means that block n stores a
hash of block n minus 1 in it,

235
00:08:47,840 --> 00:08:48,340
right?

236
00:08:48,340 --> 00:08:49,690
So in that sense.

237
00:08:49,690 --> 00:08:52,500
Block n has a hash pointer
to block n minus 1.
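Editor's note: the hash-pointer idea above can be sketched in a few lines. This is a toy model, not Bitcoin's actual block format: real Bitcoin hashes an 80-byte header with double SHA-256, whereas here each block is simply a (previous hash, payload) pair. All names are hypothetical.

```python
import hashlib

GENESIS_PREV = b"\x00" * 32  # placeholder "no predecessor" value for the first block

def block_hash(prev_hash: bytes, payload: bytes) -> bytes:
    """Hash of a toy block: SHA-256 over the predecessor's hash plus this block's payload."""
    return hashlib.sha256(prev_hash + payload).digest()

def build_chain(payloads):
    """Build a list of (prev_hash, payload) blocks, each pointing at its predecessor."""
    chain, prev = [], GENESIS_PREV
    for payload in payloads:
        chain.append((prev, payload))
        prev = block_hash(prev, payload)
    return chain

def verify_chain(chain) -> bool:
    """Check that every block stores the hash of the block before it."""
    expected_prev = GENESIS_PREV
    for prev, payload in chain:
        if prev != expected_prev:
            return False
        expected_prev = block_hash(prev, payload)
    return True

chain = build_chain([b"block 1", b"block 2", b"block 3"])

# Editing an old block changes its hash, so the next block's pointer no longer matches.
tampered = list(chain)
tampered[0] = (tampered[0][0], b"block 1 (edited)")
print(verify_chain(chain), verify_chain(tampered))
```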

238
00:08:55,320 --> 00:08:57,802
All right, and like I said,
each tree has a Merkle tree.

239
00:08:57,802 --> 00:08:59,010
Each block has a Merkle tree.

240
00:08:59,010 --> 00:09:00,960
And everybody
agrees on this chain

241
00:09:00,960 --> 00:09:03,870
of blocks via proof of work
consensus, which all of you

242
00:09:03,870 --> 00:09:05,160
already know.

243
00:09:05,160 --> 00:09:08,670
And importantly, in the Merkle
trees, we have transactions.

244
00:09:08,670 --> 00:09:10,230
And transactions
can do two things.

245
00:09:10,230 --> 00:09:13,050
Right, a transaction can
mint coins, create new coins.

246
00:09:13,050 --> 00:09:14,590
So here I have transaction a.

247
00:09:14,590 --> 00:09:17,670
It created four coins by
storing them in an output.

248
00:09:17,670 --> 00:09:18,920
Are you familiar with outputs?

249
00:09:18,920 --> 00:09:21,462
So you guys already discussed
transaction outputs and inputs.

250
00:09:21,462 --> 00:09:25,260
So that's what we're going
over here again real quick.

251
00:09:25,260 --> 00:09:27,450
So an output specifies
the number of coins

252
00:09:27,450 --> 00:09:30,300
and the public key of
the owner of those coins,

253
00:09:30,300 --> 00:09:31,800
as you already know.

254
00:09:31,800 --> 00:09:33,630
And the second purpose
of transactions

255
00:09:33,630 --> 00:09:35,940
is that they can transfer
coins, right, and pay fees

256
00:09:35,940 --> 00:09:36,900
in the process.

257
00:09:36,900 --> 00:09:39,150
So here if you have
a transaction b,

258
00:09:39,150 --> 00:09:42,510
this transaction b
might say, hey, here's

259
00:09:42,510 --> 00:09:45,750
a signature from the
owner of these coins here.

260
00:09:45,750 --> 00:09:47,850
Here's a signature
in transaction b.

261
00:09:47,850 --> 00:09:49,670
And here's the new
owner with public key

262
00:09:49,670 --> 00:09:52,200
b of three of those four coins.

263
00:09:52,200 --> 00:09:54,090
And one of those coins
I'll just give it

264
00:09:54,090 --> 00:09:57,775
to the miners as
a transaction fee.

265
00:09:57,775 --> 00:09:58,650
Does this make sense?

266
00:09:58,650 --> 00:10:00,620
How many of you are with me?

267
00:10:00,620 --> 00:10:03,770
All right, any questions
about how transaction inputs

268
00:10:03,770 --> 00:10:04,658
and outputs work?

269
00:10:04,658 --> 00:10:06,200
So if you remember
in the input here,

270
00:10:06,200 --> 00:10:09,520
you have a hash pointer
to the output here.

271
00:10:09,520 --> 00:10:12,470
All right.

272
00:10:12,470 --> 00:10:14,263
So and the high
level idea here is

273
00:10:14,263 --> 00:10:15,680
that the output
is just the number

274
00:10:15,680 --> 00:10:16,877
of coins and a public key.

275
00:10:16,877 --> 00:10:18,710
And the input is a hash
pointer to an output

276
00:10:18,710 --> 00:10:24,410
plus a digital signature from
that output's public key, OK?

277
00:10:24,410 --> 00:10:26,180
And yeah, so what
happens here is

278
00:10:26,180 --> 00:10:30,920
that transaction b is spending
transaction a's first output.

279
00:10:30,920 --> 00:10:33,415
Right, that's the terminology
that we're going to use.

280
00:10:33,415 --> 00:10:34,790
And I think you
guys have already

281
00:10:34,790 --> 00:10:37,790
used terminology like this.

282
00:10:37,790 --> 00:10:40,430
And in addition, what we're
going to talk about today

283
00:10:40,430 --> 00:10:42,440
is the fact that in these
bitcoin transactions

284
00:10:42,440 --> 00:10:43,670
you can actually embed data.

285
00:10:43,670 --> 00:10:48,080
And I think you touched briefly
on this concept of OP_RETURN

286
00:10:48,080 --> 00:10:50,060
transaction outputs.

287
00:10:50,060 --> 00:10:54,740
Right, so this is an output that
sends coins to a public key.

288
00:10:54,740 --> 00:10:57,800
But here I can have an output
that sends coins to nobody.

289
00:10:57,800 --> 00:10:59,810
It just specifies some data.

290
00:10:59,810 --> 00:11:02,480
And in fact, I'll use that
data to specify the statements

291
00:11:02,480 --> 00:11:04,870
that I was talking
about earlier.
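Editor's note: a data-carrying output of this kind can be sketched as the raw script bytes it contains. The sketch assumes a short payload (1 to 75 bytes), for which Bitcoin Script uses a single length byte as the push opcode. The function name and the example payload are hypothetical, and building or signing a full transaction is out of scope here.

```python
OP_RETURN = 0x6A  # Bitcoin Script opcode that marks an output as unspendable data

def op_return_script(data: bytes) -> bytes:
    """Build the script for a data-only output: OP_RETURN, a push opcode, then the data.

    For 1..75 bytes the push opcode is just the length of the data. Longer payloads
    need OP_PUSHDATA1 and are subject to relay-policy limits, which this sketch ignores.
    """
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles payloads of 1 to 75 bytes")
    return bytes([OP_RETURN, len(data)]) + data

script = op_return_script(b"statement s1")
print(script.hex())
```

Because the script starts with OP_RETURN, nodes can treat the output as provably unspendable and leave it out of the UTXO set.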

292
00:11:04,870 --> 00:11:06,290
So the
high-level point here

293
00:11:06,290 --> 00:11:10,340
is that you can embed data
in bitcoin transactions using

294
00:11:10,340 --> 00:11:11,102
these operations.

295
00:11:11,102 --> 00:11:13,310
And there's a bunch of other
ways to do it, actually.

296
00:11:13,310 --> 00:11:15,500
Like initially
what people did is

297
00:11:15,500 --> 00:11:17,272
they put the data
as the public key.

298
00:11:17,272 --> 00:11:18,980
They just set the
public key to the data.

299
00:11:18,980 --> 00:11:22,570
And in that sense, they
kind of wasted bitcoins.

300
00:11:22,570 --> 00:11:25,280
Right, they said hey,
send these three bitcoins

301
00:11:25,280 --> 00:11:27,890
to this public key, which
is just some random data.

302
00:11:27,890 --> 00:11:30,080
But nobody would know the
corresponding secret key

303
00:11:30,080 --> 00:11:31,580
of that public key.

304
00:11:31,580 --> 00:11:33,860
So therefore those
coins would be burned.

305
00:11:33,860 --> 00:11:36,770
Did you cover
this in class already?

306
00:11:36,770 --> 00:11:37,880
Maybe, maybe a little bit.

307
00:11:37,880 --> 00:11:39,602
So but that's kind
of inefficient

308
00:11:39,602 --> 00:11:41,060
because if you
remember, the miners

309
00:11:41,060 --> 00:11:43,460
have to build this UTXO set.

310
00:11:43,460 --> 00:11:45,770
And they have to
keep this output that

311
00:11:45,770 --> 00:11:47,930
has this bad public
key in their memory

312
00:11:47,930 --> 00:11:50,673
forever because nobody's
going to be able to spend it.

313
00:11:50,673 --> 00:11:51,840
So we don't want to do that.

314
00:11:51,840 --> 00:11:55,730
We want to build a nice system,
a system that treats bitcoin

315
00:11:55,730 --> 00:11:57,060
nicely, the bitcoin miners.

316
00:11:57,060 --> 00:12:00,265
So that's why we use
these OP_RETURN outputs.

317
00:12:00,265 --> 00:12:01,640
All right, so the
high level here

318
00:12:01,640 --> 00:12:03,290
is that Alice gives
Bob three bitcoins.

319
00:12:03,290 --> 00:12:06,083
And the miners collect
a bitcoin as a fee.

320
00:12:06,083 --> 00:12:08,000
And of course, you can
keep doing this, right?

321
00:12:08,000 --> 00:12:10,940
Like Bob can give Carol these
two bitcoins later by creating

322
00:12:10,940 --> 00:12:14,460
another transaction with an
input referring to that output

323
00:12:14,460 --> 00:12:17,830
and with an output specifying
Carol's public key.

324
00:12:17,830 --> 00:12:20,930
All right, and the high level
idea of bitcoin is that you

325
00:12:20,930 --> 00:12:22,610
don't-- you cannot
double spend coins.

326
00:12:22,610 --> 00:12:25,640
And what that means is that
a transaction output can only

327
00:12:25,640 --> 00:12:28,492
be referred to by a
single transaction input.

328
00:12:28,492 --> 00:12:30,950
So this thing right here in
bitcoin where I have two inputs

329
00:12:30,950 --> 00:12:34,440
spending an output
cannot happen, right?

330
00:12:34,440 --> 00:12:36,870
How many of you are
familiar with this already?

331
00:12:36,870 --> 00:12:37,495
OK, good.

332
00:12:37,495 --> 00:12:39,120
So actually this is
the essential trick

333
00:12:39,120 --> 00:12:40,170
that Catena leverages.

334
00:12:40,170 --> 00:12:42,010
And we'll talk about it soon.

335
00:12:42,010 --> 00:12:43,560
And yeah, the
moral of the story,

336
00:12:43,560 --> 00:12:46,290
the reason I told you all
of this is because you know,

337
00:12:46,290 --> 00:12:48,990
basically if you have proof of
work consensus in the bitcoin

338
00:12:48,990 --> 00:12:50,700
sense, you cannot
do double spends.

339
00:12:50,700 --> 00:12:52,890
So this thing right
here, that I said before,

340
00:12:52,890 --> 00:12:56,157
just cannot occur in the Bitcoin
blockchain unless you break

341
00:12:56,157 --> 00:12:58,740
the assumptions, right, unless
you have more mining power than

342
00:12:58,740 --> 00:13:01,050
you should.

343
00:13:01,050 --> 00:13:03,270
In that case, you either
have tx2 or tx2

344
00:13:03,270 --> 00:13:05,750
prime, but not both, right?

345
00:13:05,750 --> 00:13:08,880
And what Catena
really realizes is

346
00:13:08,880 --> 00:13:11,790
that if I put statements
in these transactions now,

347
00:13:11,790 --> 00:13:14,370
what that means is that I can
only have one second statement.

348
00:13:14,370 --> 00:13:16,110
I cannot have two
second statements.

349
00:13:16,110 --> 00:13:19,110
I cannot equivocate about the
second statements if I just

350
00:13:19,110 --> 00:13:22,737
restrict my way of issuing
statements in this way.

351
00:13:22,737 --> 00:13:24,570
I put the first statement
in the transaction

352
00:13:24,570 --> 00:13:25,945
and the second
statement I put it

353
00:13:25,945 --> 00:13:28,830
in a transaction that
spends the first one.

354
00:13:28,830 --> 00:13:31,403
So does everybody agree that
if I do things in this way

355
00:13:31,403 --> 00:13:33,570
and I want to equivocate
about the second statement,

356
00:13:33,570 --> 00:13:36,520
I would have to double spend?

357
00:13:36,520 --> 00:13:40,680
So that's the key insight
behind our system.
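Editor's note: the key insight above can be modeled with a toy ledger. This is a sketch under simplifying assumptions, not Catena's actual Java implementation: every transaction has exactly one spendable output, signatures and fees are omitted, and the class and field names are hypothetical. The ledger stands in for Bitcoin's rule that an output can be spent only once.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Tx:
    txid: str              # hypothetical transaction identifier
    spends: Optional[str]  # txid whose output this transaction spends (None for the first)
    statement: str         # data carried in the OP_RETURN output

class ToyLedger:
    """Tracks unspent outputs and rejects any second spend of the same output."""

    def __init__(self) -> None:
        self.unspent = set()
        self.statements = []

    def genesis(self, tx: Tx) -> None:
        """Start the log with a transaction that spends nothing tracked here."""
        self.unspent.add(tx.txid)
        self.statements.append(tx.statement)

    def append(self, tx: Tx) -> bool:
        """Accept tx only if the output it spends is still unspent."""
        if tx.spends not in self.unspent:
            return False  # double spend: that output was already consumed
        self.unspent.remove(tx.spends)
        self.unspent.add(tx.txid)
        self.statements.append(tx.statement)
        return True

ledger = ToyLedger()
ledger.genesis(Tx("tx1", None, "s1"))
ok = ledger.append(Tx("tx2", "tx1", "s2"))             # second statement
equivocated = ledger.append(Tx("tx2'", "tx1", "s2'"))  # conflicting second statement
print(ok, equivocated, ledger.statements)
```

The conflicting transaction is rejected because it tries to spend the same output as tx2, which is the sense in which equivocating requires a double spend.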

358
00:13:40,680 --> 00:13:42,220
Any questions about this so far?

359
00:13:46,147 --> 00:13:48,480
You know, it's really hard to
talk. If you guys talk back,

360
00:13:48,480 --> 00:13:51,150
it's way easier.

361
00:13:51,150 --> 00:13:51,650
Yeah?

362
00:13:51,650 --> 00:13:52,525
AUDIENCE: A question.

363
00:13:52,525 --> 00:13:55,078
And maybe, I think I'm
getting this right.

364
00:13:55,078 --> 00:13:56,870
If you're setting up
the first transaction,

365
00:13:56,870 --> 00:13:58,890
then you're adding data on?

366
00:13:58,890 --> 00:14:01,130
You're burning bitcoins
every time you're doing it,

367
00:14:01,130 --> 00:14:02,880
so at some point,
you're going to run out.

368
00:14:02,880 --> 00:14:03,810
ALIN TOMESCU: That's
an excellent question.

369
00:14:03,810 --> 00:14:05,790
So let's-- we'll go over that.

370
00:14:05,790 --> 00:14:07,950
But the idea is
that in this output,

371
00:14:07,950 --> 00:14:10,260
I won't burn any bitcoins.

372
00:14:10,260 --> 00:14:12,610
I'll actually specify
my own public key here.

373
00:14:12,610 --> 00:14:14,820
And I'll just send the
bitcoins back to myself.

374
00:14:14,820 --> 00:14:18,450
And in the process I'll pay a
fee to issue this transaction.

375
00:14:18,450 --> 00:14:19,860
Does that make sense?

376
00:14:19,860 --> 00:14:20,400
Yeah?

377
00:14:20,400 --> 00:14:22,690
And we'll talk about it later.

378
00:14:22,690 --> 00:14:25,280
OK, so real quickly,
what did previous work

379
00:14:25,280 --> 00:14:26,030
do regarding this?

380
00:14:26,030 --> 00:14:29,090
So how many of you are
familiar with Blockstack?

381
00:14:29,090 --> 00:14:31,900
1, 2, only two people?

382
00:14:31,900 --> 00:14:33,820
OK how many of you are
familiar with Keybase?

383
00:14:33,820 --> 00:14:37,950
OK, so Blockstack and Keybase
actually post statements

384
00:14:37,950 --> 00:14:39,240
in the Bitcoin blockchain.

385
00:14:39,240 --> 00:14:41,210
And they're both
public directories.

386
00:14:41,210 --> 00:14:43,380
They map user names
to public keys.

387
00:14:43,380 --> 00:14:45,870
And for example, Keybase,
what Keybase does

388
00:14:45,870 --> 00:14:47,922
is they take this
Merkle root hash,

389
00:14:47,922 --> 00:14:49,380
and they put it in
the transaction.

390
00:14:49,380 --> 00:14:51,720
And then six hours later,
they take the new root hash,

391
00:14:51,720 --> 00:14:54,030
and they put it in another
transaction, and so on.

392
00:14:54,030 --> 00:14:56,280
Every six hours, they
post the transaction.

393
00:14:56,280 --> 00:14:59,640
But unfortunately, they don't
actually do what Catena does.

394
00:14:59,640 --> 00:15:03,360
So they don't have their new
transaction spend the old one.

395
00:15:03,360 --> 00:15:05,010
And as a result,
if you're trying

396
00:15:05,010 --> 00:15:09,210
to make sure you see all of
the statements for Keybase,

397
00:15:09,210 --> 00:15:10,860
you don't have a lot
of recourse other

398
00:15:10,860 --> 00:15:14,910
than just downloading each
block and looking in the block

399
00:15:14,910 --> 00:15:17,752
for all of the
relevant transactions.

400
00:15:17,752 --> 00:15:19,710
Another thing you could
do is you could sort of

401
00:15:19,710 --> 00:15:21,750
trust the bitcoin miners--

402
00:15:21,750 --> 00:15:25,200
the bitcoin full nodes to
filter the blocks for you.

403
00:15:25,200 --> 00:15:28,877
So you could contact a
bunch of bitcoin full nodes

404
00:15:28,877 --> 00:15:29,460
and say, look.

405
00:15:29,460 --> 00:15:31,260
I'm only interested
in transactions

406
00:15:31,260 --> 00:15:34,478
that have a certain OP_RETURN
prefix in the data.

407
00:15:34,478 --> 00:15:35,770
And they could do that for you.

408
00:15:35,770 --> 00:15:37,312
But unfortunately,
bitcoin full nodes

409
00:15:37,312 --> 00:15:39,060
could also lie to
you very easily.

410
00:15:39,060 --> 00:15:41,040
And there is no cost
for them to lie to you.

411
00:15:41,040 --> 00:15:43,080
Everybody can be a
bitcoin full node.

412
00:15:43,080 --> 00:15:45,300
So then it becomes a very
bandwidth intensive process

413
00:15:45,300 --> 00:15:48,840
because you have to
ask a lot of full nodes

414
00:15:48,840 --> 00:15:51,310
to deal with the fact that
someone might lie to you.

415
00:15:51,310 --> 00:15:53,340
But you either need to
download full blocks

416
00:15:53,340 --> 00:15:55,930
to find let's say, a missing
statement like this one

417
00:15:55,930 --> 00:15:58,505
that a bitcoin full node
might hide from you,

418
00:15:58,505 --> 00:16:00,630
or you can trust the majority
of bitcoin full nodes

419
00:16:00,630 --> 00:16:03,302
to not hide statements, which
is not very good, right?

420
00:16:03,302 --> 00:16:05,010
So I don't want to
trust these full nodes

421
00:16:05,010 --> 00:16:06,750
because all of you guys
could run a full node right

422
00:16:06,750 --> 00:16:07,630
now in the bitcoin network.

423
00:16:07,630 --> 00:16:08,838
It doesn't cost you anything.

424
00:16:08,838 --> 00:16:12,220
And if I talk to your malicious
full node, I could be screwed.

425
00:16:12,220 --> 00:16:15,750
So our work just says, look.

426
00:16:15,750 --> 00:16:19,350
Instead of issuing
transactions in sort

427
00:16:19,350 --> 00:16:23,430
of an uncorrelated fashion,
just do the following thing.

428
00:16:23,430 --> 00:16:25,860
Every transaction you issue
should spend the previous one.

429
00:16:25,860 --> 00:16:27,780
So as a result, if someone
wants to equivocate

430
00:16:27,780 --> 00:16:29,520
about the third
statement, they have

431
00:16:29,520 --> 00:16:32,880
to double spend
like I said before.

432
00:16:32,880 --> 00:16:33,570
Right?

433
00:16:33,570 --> 00:16:34,350
Yeah?

434
00:16:34,350 --> 00:16:35,975
AUDIENCE: So what's
the connection back

435
00:16:35,975 --> 00:16:38,050
to Keybase and Blockstack again?

436
00:16:38,050 --> 00:16:41,890
ALIN TOMESCU: Yeah, so
Keybase and Blockstack

437
00:16:41,890 --> 00:16:42,930
are public directories.

438
00:16:42,930 --> 00:16:47,800
And what they do is they want
to prevent themselves from--

439
00:16:47,800 --> 00:16:49,990
is there a whiteboard
I can draw here?

440
00:16:49,990 --> 00:16:54,990
Yeah so, let's say, you know,
if you remember the picture

441
00:16:54,990 --> 00:16:57,360
from the beginning, I can
have a public directory

442
00:16:57,360 --> 00:16:59,220
that evolves over time, right?

443
00:16:59,220 --> 00:17:01,200
So this is v1 of the
directory, and it

444
00:17:01,200 --> 00:17:03,790
might have the right public
keys for Alice and Bob.

445
00:17:03,790 --> 00:17:05,770
But at v2, the
directory might do this.

446
00:17:05,770 --> 00:17:07,890
It might do v2.

447
00:17:07,890 --> 00:17:10,980
It might have Alice, Bob, but
then put a fake key for Alice.

448
00:17:10,980 --> 00:17:14,369
And at v2 prime, it might
do Alice, Bob as before,

449
00:17:14,369 --> 00:17:16,109
and put a fake key for Bob.

450
00:17:16,109 --> 00:17:19,050
So in other words it keeps
the directory append-only.

451
00:17:19,050 --> 00:17:22,470
But it just adds fake public
keys for the right people

452
00:17:22,470 --> 00:17:23,500
in the right version.

453
00:17:23,500 --> 00:17:26,650
So here this one
is shown to Bob.

454
00:17:26,650 --> 00:17:28,860
And here this one
is shown to Alice.

455
00:17:28,860 --> 00:17:32,190
So now Alice will use
this fake key for Bob.

456
00:17:32,190 --> 00:17:36,330
So she'll encrypt a message
with b prime for Bob.

457
00:17:36,330 --> 00:17:40,980
And now the attacker can
easily decrypt this message

458
00:17:40,980 --> 00:17:42,780
because he has the secret key.

459
00:17:42,780 --> 00:17:45,810
So what the attacker can
do then is re-encrypt it

460
00:17:45,810 --> 00:17:48,570
with the right
public key for Bob

461
00:17:48,570 --> 00:17:50,822
and now he can read
Alice's messages.

462
00:17:50,822 --> 00:17:52,530
And the whole idea is
that the attacker--

463
00:17:52,530 --> 00:17:57,690
you know, this b prime is the
public key, is pk b prime,

464
00:17:57,690 --> 00:17:58,410
let's say.

465
00:17:58,410 --> 00:18:01,140
But the attacker
has this sk b prime.

466
00:18:01,140 --> 00:18:04,110
He knows sk b prime because
the attacker put this in there.

467
00:18:04,110 --> 00:18:06,568
The attacker being, really,
Blockstack or Keybase.

468
00:18:06,568 --> 00:18:08,610
And of course, they're
not attackers in the sense

469
00:18:08,610 --> 00:18:09,730
that they want to be good guys.

470
00:18:09,730 --> 00:18:11,640
But they're going to be
compromised eventually.

471
00:18:11,640 --> 00:18:13,057
So they want to
prevent themselves

472
00:18:13,057 --> 00:18:14,750
from doing things like these.

473
00:18:14,750 --> 00:18:17,300
Does that sort of
answer your question?

474
00:18:17,300 --> 00:18:18,870
Yeah.

475
00:18:18,870 --> 00:18:22,130
All right, so yeah, a
really simple summary.

476
00:18:22,130 --> 00:18:24,630
If I had two slides to summarize
our work, this would be it.

477
00:18:24,630 --> 00:18:26,310
Right, they would be these two.

478
00:18:26,310 --> 00:18:27,730
Look, don't do things this way.

479
00:18:27,730 --> 00:18:29,180
Do them this way.

480
00:18:29,180 --> 00:18:32,040
All right, so let's see,
let's look a little bit

481
00:18:32,040 --> 00:18:32,640
at the design.

482
00:18:32,640 --> 00:18:36,390
So remember we have
these authorities that

483
00:18:36,390 --> 00:18:38,390
could equivocate about
statements they issued

484
00:18:38,390 --> 00:18:39,580
like Blockstack and Keybase.

485
00:18:39,580 --> 00:18:42,600
So what we propose is
look, these authorities

486
00:18:42,600 --> 00:18:45,930
can run a log server,
a Catena log server.

487
00:18:45,930 --> 00:18:49,943
And they start with some
funds locked in some output.

488
00:18:49,943 --> 00:18:51,360
And what they can
do first is they

489
00:18:51,360 --> 00:18:53,940
can issue this
genesis transaction

490
00:18:53,940 --> 00:18:55,290
to start a new log.

491
00:18:55,290 --> 00:18:58,080
So for example, Keybase would
issue this genesis transaction

492
00:18:58,080 --> 00:19:02,310
starting the log of their
public directory Merkle roots.

493
00:19:02,310 --> 00:19:04,080
And this genesis
transaction can be

494
00:19:04,080 --> 00:19:05,760
thought of as the
public key of the log.

495
00:19:05,760 --> 00:19:08,250
Once you have this
genesis transaction,

496
00:19:08,250 --> 00:19:10,830
you know its transaction ID.

497
00:19:10,830 --> 00:19:12,750
You can verify any
future statements,

498
00:19:12,750 --> 00:19:17,610
and you can implicitly prevent
equivocation about statements.

499
00:19:17,610 --> 00:19:20,290
And what the log
server is going to do

500
00:19:20,290 --> 00:19:21,780
is it's going to
take these coins

501
00:19:21,780 --> 00:19:24,450
and send them back to the
server to answer your question.

502
00:19:24,450 --> 00:19:26,283
So if there was a public
key, if these coins

503
00:19:26,283 --> 00:19:27,825
are owned by some
public key, they're

504
00:19:27,825 --> 00:19:29,500
just sent back to
same public key here

505
00:19:29,500 --> 00:19:32,310
and paying some
fees in the process.

506
00:19:32,310 --> 00:19:34,230
Right, so we're not
burning coins in the sense

507
00:19:34,230 --> 00:19:35,730
that we're just
paying fees to the

508
00:19:35,730 --> 00:19:36,897
miners, which we have to do.

509
00:19:40,220 --> 00:19:42,120
OK, so now what you
can do is if you

510
00:19:42,120 --> 00:19:43,578
want to issue the
first statement,

511
00:19:43,578 --> 00:19:46,020
you create a transaction.

512
00:19:46,020 --> 00:19:48,480
You send the coins from this
output to this other output.

513
00:19:48,480 --> 00:19:49,855
You pay some fees
in the process.

514
00:19:49,855 --> 00:19:53,850
You put your statement
in an OP_RETURN output.

515
00:19:53,850 --> 00:19:56,760
And as a result, if this log
server wants to equivocate,

516
00:19:56,760 --> 00:19:59,140
it has to again,
double spend here,

517
00:19:59,140 --> 00:20:02,460
which it cannot do unless
it has enough mining power.

518
00:20:02,460 --> 00:20:04,230
So Keybase and
Blockstack, if they

519
00:20:04,230 --> 00:20:05,730
were to use a system
like this, they

520
00:20:05,730 --> 00:20:08,518
could prevent themselves
from equivocating.

521
00:20:08,518 --> 00:20:09,810
And this can keep going, right?

522
00:20:09,810 --> 00:20:11,185
So you issue
another transaction.

523
00:20:11,185 --> 00:20:12,660
Spend the previous output.

524
00:20:12,660 --> 00:20:13,680
Put the new statement.

525
00:20:13,680 --> 00:20:14,190
Yes?

526
00:20:14,190 --> 00:20:16,148
AUDIENCE: This doesn't
seem like a new problem.

527
00:20:16,148 --> 00:20:20,795
How have authorities prevented
equivocation in the past?

528
00:20:20,795 --> 00:20:22,920
ALIN TOMESCU: This doesn't
seem like a new problem.

529
00:20:22,920 --> 00:20:24,180
How did authorities do it?

530
00:20:24,180 --> 00:20:25,200
The problem is not new.

531
00:20:25,200 --> 00:20:27,150
The problem is eternal.

532
00:20:27,150 --> 00:20:28,260
So you are correct there.

533
00:20:28,260 --> 00:20:29,860
How did they do it in the past?

534
00:20:29,860 --> 00:20:32,578
They just used a Byzantine
consensus algorithm.

535
00:20:32,578 --> 00:20:34,870
So in some sense this is what
we're doing here as well.

536
00:20:34,870 --> 00:20:37,940
We're just piggybacking
on top of Bitcoin's

537
00:20:37,940 --> 00:20:39,690
Byzantine consensus algorithm.

538
00:20:39,690 --> 00:20:42,960
AUDIENCE: So you're rolling down
to a newer Byzantine consensus

539
00:20:42,960 --> 00:20:44,630
algorithm basically.

540
00:20:44,630 --> 00:20:47,005
ALIN TOMESCU: Sure, I'm not
sure what rolling down means.

541
00:20:47,005 --> 00:20:49,197
But yeah, we're piggybacking
on top of bitcoin.

542
00:20:49,197 --> 00:20:51,030
The idea is that look,
a byzantine consensus

543
00:20:51,030 --> 00:20:53,520
is actually quite
complex to get right.

544
00:20:53,520 --> 00:20:56,920
We already have a publicly
verifiable Byzantine consensus

545
00:20:56,920 --> 00:20:57,420
algorithm.

546
00:20:57,420 --> 00:20:58,500
It's bitcoin.

547
00:20:58,500 --> 00:21:02,280
Why can't we use it to verify,
let's say, a log of statements

548
00:21:02,280 --> 00:21:03,600
super efficiently?

549
00:21:03,600 --> 00:21:07,380
So up until our work, people
didn't seem to do this.

550
00:21:07,380 --> 00:21:08,550
So Keybase didn't do this.

551
00:21:08,550 --> 00:21:09,900
Blockstack didn't do this.

552
00:21:09,900 --> 00:21:12,150
They kind of forced you to
download the entire bitcoin

553
00:21:12,150 --> 00:21:14,575
blockchain to verify, let's
say, three statements.

554
00:21:14,575 --> 00:21:16,950
So in our case, you only have
to download a few kilobytes

555
00:21:16,950 --> 00:21:18,870
of data to verify
these three statements,

556
00:21:18,870 --> 00:21:21,120
assuming you have the
bitcoin block headers, right,

557
00:21:21,120 --> 00:21:23,652
which we think is
a step forward.

558
00:21:23,652 --> 00:21:25,110
And it's sort of
like the right way

559
00:21:25,110 --> 00:21:28,260
to use these systems
should be you

560
00:21:28,260 --> 00:21:30,330
know, the efficient way,
not the inefficient way.

561
00:21:30,330 --> 00:21:32,095
Because bandwidth
is expensive, right?

562
00:21:32,095 --> 00:21:32,970
Computation is cheap.

563
00:21:32,970 --> 00:21:34,220
Bandwidth is expensive.

564
00:21:36,860 --> 00:21:40,680
All right, so anyway the idea is
that if the log server becomes

565
00:21:40,680 --> 00:21:43,140
malicious, if Keybase or
Blockstack gets hacked,

566
00:21:43,140 --> 00:21:46,050
they cannot equivocate
about the third statement.

567
00:21:46,050 --> 00:21:49,320
They can only issue one
unique third statement.

568
00:21:49,320 --> 00:21:50,820
And the advantages
are, you know,

569
00:21:50,820 --> 00:21:52,100
it's hard to fork this log.

570
00:21:52,100 --> 00:21:54,202
It's hard to equivocate
about the third statement.

571
00:21:54,202 --> 00:21:55,410
But it's efficient to verify.

572
00:21:55,410 --> 00:21:57,493
And I'll walk you through
how clients verify soon.

573
00:22:00,590 --> 00:22:03,020
The disadvantages
are that if I want

574
00:22:03,020 --> 00:22:05,390
to know that this is the
second statement in the log,

575
00:22:05,390 --> 00:22:07,250
I have to wait for
six more blocks

576
00:22:07,250 --> 00:22:10,832
to be built on top of this
statement's block, right?

577
00:22:10,832 --> 00:22:13,040
Just like in bitcoin, you
have to wait for six blocks

578
00:22:13,040 --> 00:22:15,380
to make sure a transaction
is confirmed, right?

579
00:22:15,380 --> 00:22:16,180
Why do you do that?

580
00:22:16,180 --> 00:22:18,055
The reason you do that
is because there could

581
00:22:18,055 --> 00:22:20,030
be another transaction
here, double spending

582
00:22:20,030 --> 00:22:23,280
because sometimes there are
accidental forks in bitcoin

583
00:22:23,280 --> 00:22:26,330
and things like that.

584
00:22:26,330 --> 00:22:27,890
What are some
other disadvantages

585
00:22:27,890 --> 00:22:29,057
that you guys can point out?

586
00:22:32,600 --> 00:22:33,224
Yeah?

587
00:22:33,224 --> 00:22:35,040
AUDIENCE: It's going
to be expensive.

588
00:22:35,040 --> 00:22:36,450
ALIN TOMESCU: It's going
to be expensive to issue

589
00:22:36,450 --> 00:22:36,992
these transactions.

590
00:22:36,992 --> 00:22:40,240
So Alin-- Alin and I
share the same name.

591
00:22:40,240 --> 00:22:41,880
So Alin is pointing
out that you know,

592
00:22:41,880 --> 00:22:43,380
every time I issue
these statements,

593
00:22:43,380 --> 00:22:44,735
I have to pay a fee, right?

594
00:22:44,735 --> 00:22:46,860
And if you remember, the
fees were quite ridiculous

595
00:22:46,860 --> 00:22:48,240
in bitcoin.

596
00:22:48,240 --> 00:22:49,530
So that's a problem, right?

597
00:22:49,530 --> 00:22:51,330
So let's see, is
that the next thing?

598
00:22:51,330 --> 00:22:53,400
The next thing was
you have to issue--

599
00:22:53,400 --> 00:22:56,170
you can only issue a statement
every 10 minutes, right?

600
00:22:56,170 --> 00:22:58,128
So if you want to issue
statements really fast,

601
00:22:58,128 --> 00:22:59,250
you can't do that.

602
00:22:59,250 --> 00:23:04,150
All right, like Alin said you
have to pay bitcoin transaction

603
00:23:04,150 --> 00:23:04,900
fees.

604
00:23:04,900 --> 00:23:07,510
And the other problem is
that you don't get freshness

605
00:23:07,510 --> 00:23:11,500
in the sense that it's kind
of easy for this log server

606
00:23:11,500 --> 00:23:13,353
to hide from you the
latest statement.

607
00:23:13,353 --> 00:23:15,520
You know, unless you have
a lot of these log servers

608
00:23:15,520 --> 00:23:18,100
and you ask many of them, hey,
what's the latest statement?

609
00:23:18,100 --> 00:23:20,872
And they show you back
the latest statement.

610
00:23:20,872 --> 00:23:23,080
If there is just one log
server and it's compromised,

611
00:23:23,080 --> 00:23:24,538
it could always
pretend no, no, no.

612
00:23:24,538 --> 00:23:25,850
This is the latest statement.

613
00:23:25,850 --> 00:23:27,760
And if you don't trust it,
the best recourse you have

614
00:23:27,760 --> 00:23:30,250
is to download the full block
and look for the statement

615
00:23:30,250 --> 00:23:31,360
yourself.

616
00:23:31,360 --> 00:23:34,030
So you don't get freshness.

617
00:23:34,030 --> 00:23:35,240
Those are some disadvantages.

618
00:23:35,240 --> 00:23:38,020
Now let's look at how
clients audit this log.

619
00:23:38,020 --> 00:23:39,928
So I was claiming that
it's very efficient

620
00:23:39,928 --> 00:23:41,470
to get these statements
and make sure

621
00:23:41,470 --> 00:23:44,087
that no equivocation happened.

622
00:23:44,087 --> 00:23:45,670
So let's say, you
have a Catena client

623
00:23:45,670 --> 00:23:47,753
and you're running on your
phone with this client.

624
00:23:47,753 --> 00:23:50,590
And your goal is to get
that list of statements.

625
00:23:50,590 --> 00:23:53,020
And there's the Catena log
server over there in the back.

626
00:23:53,020 --> 00:23:54,520
And there's the
bitcoin peer to peer

627
00:23:54,520 --> 00:23:58,105
network which at the moment
has about 11,000 nodes.

628
00:23:58,105 --> 00:23:59,980
And remember, I said
the first thing you need

629
00:23:59,980 --> 00:24:03,790
is the genesis transaction.

630
00:24:03,790 --> 00:24:05,690
Does everybody
sort of understand

631
00:24:05,690 --> 00:24:08,150
that if you get the wrong
Genesis transaction,

632
00:24:08,150 --> 00:24:09,800
you're completely
screwed, right?

633
00:24:09,800 --> 00:24:11,988
Because it's very
easy to equivocate

634
00:24:11,988 --> 00:24:14,030
if you have the wrong
Genesis transaction, right?

635
00:24:14,030 --> 00:24:18,720
I mean, you know there's the
right GTX here where you have,

636
00:24:18,720 --> 00:24:19,520
let's say, a s1.

637
00:24:19,520 --> 00:24:23,180
And then you have s2 in their
own transactions, right?

638
00:24:23,180 --> 00:24:26,810
But if there's another GTX prime
here and you're using that one,

639
00:24:26,810 --> 00:24:30,970
you're going to get s1 prime,
s2 prime, different statements.

640
00:24:30,970 --> 00:24:35,610
So if Alice uses GTX
but Bob uses GTX prime,

641
00:24:35,610 --> 00:24:38,237
Alice and Bob are
back to square one.

642
00:24:38,237 --> 00:24:40,820
So in some sense, you might ask,
OK, so then what's the point?

643
00:24:40,820 --> 00:24:41,903
What have you solved here?

644
00:24:41,903 --> 00:24:43,940
I still need to get
this GTX, right?

645
00:24:43,940 --> 00:24:46,820
So what we claim is that this is
a step forward because you only

646
00:24:46,820 --> 00:24:47,725
have to do this once.

647
00:24:47,725 --> 00:24:49,100
Once you've got
this GTX, you can

648
00:24:49,100 --> 00:24:52,010
be sure you're never
equivocated to, right?

649
00:24:52,010 --> 00:24:53,750
Whereas in the
past, you would have

650
00:24:53,750 --> 00:24:55,820
to for each
individual statement,

651
00:24:55,820 --> 00:24:57,560
you'd have to do
additional checks

652
00:24:57,560 --> 00:24:59,268
to make sure you're
not being equivocated

653
00:24:59,268 --> 00:25:02,180
to, like you would have to
ask a full node, let's say.

654
00:25:02,180 --> 00:25:05,035
Right, so as long as you have
the right GTX, you're good.

655
00:25:05,035 --> 00:25:06,410
And how do you
get the right GTX?

656
00:25:06,410 --> 00:25:09,403
Well, usually you ship it with
the software on your phone.

657
00:25:09,403 --> 00:25:11,070
And there's some
problems there as well,

658
00:25:11,070 --> 00:25:13,833
like there is no problem
solved in computer

659
00:25:13,833 --> 00:25:14,750
science in some sense.

660
00:25:14,750 --> 00:25:18,470
But you know, we're trying
to make progress here.

661
00:25:18,470 --> 00:25:20,690
OK, so let's say you have
the right GTX because it

662
00:25:20,690 --> 00:25:22,402
got shipped with your software.

663
00:25:22,402 --> 00:25:24,860
Now the next thing you want to
do is get the block headers.

664
00:25:24,860 --> 00:25:26,870
So you have header
i, but there are

665
00:25:26,870 --> 00:25:28,438
some new headers being posted.

666
00:25:28,438 --> 00:25:30,980
Let's say the bitcoin peer to
peer network sends them to you.

667
00:25:30,980 --> 00:25:31,938
You have these headers.

668
00:25:31,938 --> 00:25:33,500
You verify the proof
of work, right?

669
00:25:33,500 --> 00:25:35,950
So this only costs you 80
bytes per header, right?
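
[Editor's sketch] The header check described here can be illustrated roughly as follows. This is a minimal sketch, not Catena's actual code: the function names are hypothetical, it assumes the standard 80-byte Bitcoin header layout (version, previous hash, Merkle root, time, bits, nonce), it ignores the sign bit of the compact target, and it omits the check that each header links to the previous one.

```python
import hashlib
import struct


def dsha256(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field into the full target (sign bit ignored)."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def header_has_valid_pow(header: bytes) -> bool:
    """True if the 80-byte header's hash is at or below its own stated target."""
    if len(header) != 80:
        return False
    # The 'bits' field sits at byte offset 72 (4 + 32 + 32 + 4).
    bits = struct.unpack_from("<I", header, 72)[0]
    block_hash = int.from_bytes(dsha256(header), "little")
    return block_hash <= bits_to_target(bits)


def make_header(prev_hash: bytes, merkle_root: bytes, bits: int, nonce: int,
                version: int = 1, timestamp: int = 0) -> bytes:
    """Hypothetical helper that serializes a header for experimentation."""
    return struct.pack("<I32s32sIII", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)
```

A real client would also check that the stated target matches the network's difficulty rules; this sketch only checks a header against the target it claims.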

670
00:25:35,950 --> 00:25:38,330
Does everyone see that
this is very cheap?

671
00:25:38,330 --> 00:25:43,040
So far, so far I have the GTX,
which is let's say 235 bytes.

672
00:25:43,040 --> 00:25:44,650
And now I'm downloading
some headers.

673
00:25:44,650 --> 00:25:46,567
And now I'm ready to ask
the log server what's

674
00:25:46,567 --> 00:25:48,703
the first statement
in the log, right?

675
00:25:48,703 --> 00:25:50,120
And what the log
server will do is

676
00:25:50,120 --> 00:25:52,370
he's going to reply
with the transaction

677
00:25:52,370 --> 00:25:55,370
with the statement, which is
600 bytes and the Merkle proof.

678
00:25:55,370 --> 00:25:58,430
So all of this is
600 bytes, actually.

679
00:25:58,430 --> 00:26:01,280
Right, and now what the
Catena client will do

680
00:26:01,280 --> 00:26:03,530
is he'll check the
Merkle proof against one

681
00:26:03,530 --> 00:26:06,740
of the headers to see
which header it fits in.

682
00:26:06,740 --> 00:26:09,200
And then he'll also
check that the input here

683
00:26:09,200 --> 00:26:12,950
has a valid signature from the
public key in the output here.

684
00:26:16,000 --> 00:26:18,870
All right, I want at least
one question about this.

685
00:26:18,870 --> 00:26:19,440
Yeah?

686
00:26:19,440 --> 00:26:21,773
AUDIENCE: Sorry I came in
late, but is the Catena client

687
00:26:21,773 --> 00:26:23,940
over there similar to the SPV?

688
00:26:23,940 --> 00:26:25,440
ALIN TOMESCU: Yeah,
so that's exactly it.

689
00:26:25,440 --> 00:26:26,460
It's an SPV client.

690
00:26:26,460 --> 00:26:29,050
Yeah, so the idea is
that we want SPV clients.

691
00:26:29,050 --> 00:26:31,140
We don't want these
mobile phone clients

692
00:26:31,140 --> 00:26:34,500
to download 150
gigabytes of data.

693
00:26:34,500 --> 00:26:37,560
We want them to download let's
say 40 megabytes worth of block

694
00:26:37,560 --> 00:26:41,045
headers, which they can discard
very quickly as they verify.

695
00:26:41,045 --> 00:26:42,420
And then we want
them to download

696
00:26:42,420 --> 00:26:44,610
600 bytes per
statement, but still

697
00:26:44,610 --> 00:26:47,370
be sure that they saw all of
the statements in sequence,

698
00:26:47,370 --> 00:26:48,840
and that there was
no equivocation.

699
00:26:51,550 --> 00:26:52,050
Yeah?

700
00:26:53,988 --> 00:26:55,530
AUDIENCE: So in our
previous classes,

701
00:26:55,530 --> 00:26:59,890
we discussed, if you can have
a full node and an SPV node,

702
00:26:59,890 --> 00:27:03,310
do these sort of vulnerabilities
exist with the [INAUDIBLE]

703
00:27:03,310 --> 00:27:04,000
client?

704
00:27:04,000 --> 00:27:05,583
ALIN TOMESCU: So
which vulnerabilities

705
00:27:05,583 --> 00:27:06,830
did you talk about?

706
00:27:06,830 --> 00:27:07,930
AUDIENCE: I forget.

707
00:27:07,930 --> 00:27:08,590
ALIN TOMESCU: Did
you talk about--

708
00:27:08,590 --> 00:27:11,020
AUDIENCE: There was just a
box that was like less secure.

709
00:27:11,020 --> 00:27:14,440
And then there was another box
that was something else bad.

710
00:27:14,440 --> 00:27:18,210
There was one about [INAUDIBLE].

711
00:27:18,210 --> 00:27:21,160
The clients would
lie to you about--

712
00:27:21,160 --> 00:27:24,680
If you say, here's
some transactions

713
00:27:24,680 --> 00:27:26,240
or here are some
unspent outputs,

714
00:27:26,240 --> 00:27:29,280
then they could just tell
you something different.

715
00:27:29,280 --> 00:27:30,030
ALIN TOMESCU: Yes.

716
00:27:30,030 --> 00:27:30,440
Yeah?

717
00:27:30,440 --> 00:27:30,730
Sorry.

718
00:27:30,730 --> 00:27:32,188
AUDIENCE: Well,
they can't tell you

719
00:27:32,188 --> 00:27:33,980
that transactions
exist or don't exist.

720
00:27:33,980 --> 00:27:35,693
They can just not tell you--

721
00:27:35,693 --> 00:27:37,610
ALIN TOMESCU: Yeah, they
can hide transactions

722
00:27:37,610 --> 00:27:39,920
from you, which gets back
to the freshness issue

723
00:27:39,920 --> 00:27:41,120
that we discussed.

724
00:27:41,120 --> 00:27:43,460
They could also--
this block header

725
00:27:43,460 --> 00:27:46,610
could be a header
for an invalid block.

726
00:27:46,610 --> 00:27:49,850
But remember, that before
you accept this tx1,

727
00:27:49,850 --> 00:27:51,860
you wait for enough
proof of work,

728
00:27:51,860 --> 00:27:54,710
you wait for more block
headers on top of this guy

729
00:27:54,710 --> 00:27:56,750
to sort of get some
assurance that no, this

730
00:27:56,750 --> 00:27:59,270
was a valid block because a
bunch of other miners built

731
00:27:59,270 --> 00:28:00,370
on top of it.

732
00:28:00,370 --> 00:28:00,870
Right?

733
00:28:03,037 --> 00:28:05,370
So as long as you're willing
to trust that the miners do

734
00:28:05,370 --> 00:28:07,440
the right thing, which
they have an incentive

735
00:28:07,440 --> 00:28:09,750
to do the right thing,
you should be good.

736
00:28:09,750 --> 00:28:12,350
But like you said,
what's your name?

737
00:28:12,350 --> 00:28:13,170
AUDIENCE: Anne.

738
00:28:13,170 --> 00:28:13,962
ALIN TOMESCU: Anne?

739
00:28:13,962 --> 00:28:16,260
Like Anne said, there are
actually bigger problems

740
00:28:16,260 --> 00:28:17,048
with SPV clients.

741
00:28:17,048 --> 00:28:18,840
And if there is time,
we can talk about it.

742
00:28:18,840 --> 00:28:22,650
But it's actually easier to
trick SPV clients to fork them.

743
00:28:22,650 --> 00:28:26,100
And there's something called a
generalized vector 76 attack.

744
00:28:26,100 --> 00:28:28,650
Have any of you guys
heard about this?

745
00:28:28,650 --> 00:28:30,420
So it's like a pre-mining
attack but it's

746
00:28:30,420 --> 00:28:32,490
a bit easier to pull off.

747
00:28:32,490 --> 00:28:34,740
Actually, it's a lot easier
to pull off on an SPV node

748
00:28:34,740 --> 00:28:35,540
than on a full node.

749
00:28:35,540 --> 00:28:37,748
And if there's time at the
end, we can talk about it.

750
00:28:37,748 --> 00:28:39,810
If there isn't, you
can read our paper,

751
00:28:39,810 --> 00:28:41,260
which is online on my website.

752
00:28:41,260 --> 00:28:44,160
And you can read about
these pre-mining attacks

753
00:28:44,160 --> 00:28:46,645
that work more easily against SPV nodes.

754
00:28:46,645 --> 00:28:48,270
But anyway, this can
keep going, right?

755
00:28:48,270 --> 00:28:50,710
You get block headers,
80 bytes each.

756
00:28:50,710 --> 00:28:52,570
You ask the log
server, hey what's

757
00:28:52,570 --> 00:28:53,830
the next statement in the log?

758
00:28:53,830 --> 00:28:56,140
You get a Merkle proof
and a transaction.

759
00:28:56,140 --> 00:28:58,000
And then you verify
the Merkle proof.

760
00:28:58,000 --> 00:29:00,430
You put this transaction
in one of these blocks

761
00:29:00,430 --> 00:29:03,210
and you verify that it
spends the previous one.

762
00:29:03,210 --> 00:29:05,320
Right, and as a
result, you implicitly

763
00:29:05,320 --> 00:29:07,733
by doing this verification,
by checking that hey,

764
00:29:07,733 --> 00:29:09,400
this is a transaction
and a valid block.

765
00:29:09,400 --> 00:29:11,770
This block has enough
stuff built on top of it

766
00:29:11,770 --> 00:29:14,020
and this transaction
spends this guy here,

767
00:29:14,020 --> 00:29:16,042
you implicitly
prevent equivocation.
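
[Editor's sketch] The per-statement check walked through above might look roughly like this. It is a simplified model under stated assumptions, not the Catena implementation: `LogTx`, `merkle_root_from_proof`, and `verify_next` are hypothetical names, a transaction is reduced to its ID, the ID of the transaction it spends, and its statement, and the signature check mentioned in the talk is omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import List


def dsha256(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_from_proof(leaf: bytes, index: int, siblings: List[bytes]) -> bytes:
    """Recompute the Merkle root from a leaf, its position, and its sibling hashes."""
    h = leaf
    for sibling in siblings:
        if index & 1:
            h = dsha256(sibling + h)  # current node is a right child
        else:
            h = dsha256(h + sibling)  # current node is a left child
        index >>= 1
    return h


@dataclass
class LogTx:
    txid: bytes         # ID of this log transaction
    spends_txid: bytes  # ID of the transaction whose output it spends
    statement: bytes    # data carried in the OP_RETURN output


def verify_next(prev_txid: bytes, tx: LogTx, index: int, siblings: List[bytes],
                header_merkle_root: bytes, confirmations: int,
                min_confirmations: int = 6) -> bool:
    """Accept tx as the next statement only if all three checks pass."""
    spends_previous = tx.spends_txid == prev_txid
    in_block = merkle_root_from_proof(tx.txid, index, siblings) == header_merkle_root
    buried = confirmations >= min_confirmations
    return spends_previous and in_block and buried
```

Starting from the genesis transaction ID, a client would call `verify_next` once per statement, feeding each accepted `txid` in as the next `prev_txid`.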

768
00:29:16,042 --> 00:29:18,250
Right, then you don't have
to download anything else.

769
00:29:18,250 --> 00:29:21,210
Right, you only have to
download these Merkle proofs

770
00:29:21,210 --> 00:29:23,410
and transactions and
these block headers.

771
00:29:23,410 --> 00:29:26,240
Whereas in previous work, you
could be missing these s1's,

772
00:29:26,240 --> 00:29:28,000
these s2's, they
could be hidden away

773
00:29:28,000 --> 00:29:30,280
in some other branch
of the Merkle tree.

774
00:29:30,280 --> 00:29:32,380
And you'd have to do
peer to peer bloom

775
00:29:32,380 --> 00:29:34,270
filtering on the full nodes.

776
00:29:34,270 --> 00:29:36,092
And those full nodes
could lie to you.

777
00:29:39,023 --> 00:29:40,940
Yeah, so the bandwidth
is actually very small.

778
00:29:40,940 --> 00:29:43,250
So suppose we have
500k block headers--

779
00:29:43,250 --> 00:29:45,320
I think bitcoin has a
bit more right now--

780
00:29:45,320 --> 00:29:46,820
which are 80 bytes
each and we have

781
00:29:46,820 --> 00:29:49,790
10,000 statements in this
log, which are 600 bytes each.

782
00:29:49,790 --> 00:29:53,057
Then we only need to download
46 megabytes, right?

783
00:29:53,057 --> 00:29:54,890
What's the
other way of doing it?

784
00:29:54,890 --> 00:29:59,410
You have to download
hundreds of gigabytes.
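
[Editor's sketch] The 46-megabyte figure follows directly from the numbers given in the talk (80 bytes per header, 600 bytes per statement with its Merkle proof). The function name below is just for illustration.

```python
HEADER_BYTES = 80      # size of one Bitcoin block header
STATEMENT_BYTES = 600  # transaction plus Merkle proof, per the talk


def catena_download_bytes(num_headers: int, num_statements: int) -> int:
    """Total bytes a client downloads to audit the whole log."""
    return num_headers * HEADER_BYTES + num_statements * STATEMENT_BYTES


total = catena_download_bytes(500_000, 10_000)
print(total / 1_000_000)  # 46.0 (40 MB of headers + 6 MB of statements)
```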

785
00:29:59,410 --> 00:30:02,320
All right, so let's talk about
scalability a little bit.

786
00:30:02,320 --> 00:30:04,750
So suppose the system
gets deployed widely.

787
00:30:04,750 --> 00:30:09,900
Let's say WhatsApp starts to
use the system to witness--

788
00:30:09,900 --> 00:30:13,000
to publish their public
directory in bitcoin, right?

789
00:30:13,000 --> 00:30:15,100
And everybody, a lot of
you here have WhatsApp.

790
00:30:15,100 --> 00:30:16,270
And this is you guys.

791
00:30:16,270 --> 00:30:18,445
Let's say there are 200,000
people using WhatsApp.

792
00:30:18,445 --> 00:30:20,710
I think there's
more like a billion.

793
00:30:20,710 --> 00:30:21,980
So what are they going to do?

794
00:30:21,980 --> 00:30:24,730
Remember that part of
the verification protocol

795
00:30:24,730 --> 00:30:26,170
is asking for
these block headers

796
00:30:26,170 --> 00:30:28,450
from the peer to
peer network, right?

797
00:30:28,450 --> 00:30:30,170
And in fact, if
you're an SPV client,

798
00:30:30,170 --> 00:30:32,770
you usually open up around
eight connections to the peer

799
00:30:32,770 --> 00:30:34,180
to peer network.

800
00:30:34,180 --> 00:30:37,300
And if you're a full node in the
bitcoin peer to peer network,

801
00:30:37,300 --> 00:30:41,910
you usually have around
117 incoming connections.

802
00:30:41,910 --> 00:30:43,300
That's how
much you support.

803
00:30:43,300 --> 00:30:46,395
You support 117 incoming
connections as a full node.

804
00:30:46,395 --> 00:30:48,520
So that means in total,
you support about a million

805
00:30:48,520 --> 00:30:49,930
incoming connections.

806
00:30:49,930 --> 00:30:51,700
So you know, this guy
supports a million.

807
00:30:51,700 --> 00:30:55,840
But we need about 1.6
million connections

808
00:30:55,840 --> 00:30:57,830
from these 200,000
clients, right?
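
[Editor's sketch] The capacity problem can be checked with the figures quoted in the talk: roughly 11,000 full nodes, 117 incoming connections each, and 8 outgoing connections per SPV client. The names are illustrative, and real network capacity varies over time.

```python
FULL_NODES = 11_000           # approximate size of the peer-to-peer network
INCOMING_PER_FULL_NODE = 117  # incoming connection slots per full node
CONNECTIONS_PER_CLIENT = 8    # connections each SPV client opens


def network_capacity() -> int:
    """Total incoming connection slots across all full nodes."""
    return FULL_NODES * INCOMING_PER_FULL_NODE


def client_demand(num_clients: int) -> int:
    """Connections needed if every client talks to the network directly."""
    return num_clients * CONNECTIONS_PER_CLIENT


capacity = network_capacity()    # 1,287,000: "about a million"
demand = client_demand(200_000)  # 1,600,000
max_clients = capacity // CONNECTIONS_PER_CLIENT  # 160,875
print(capacity, demand, demand > capacity, max_clients)
```

Even with every slot free, the network tops out well short of 200,000 clients under these assumptions.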

809
00:30:57,830 --> 00:30:59,830
So it's a bit of
a problem if you

810
00:30:59,830 --> 00:31:04,020
deploy Catena and it
becomes wildly popular.

811
00:31:04,020 --> 00:31:05,770
Being a bit optimistic
here, but you know,

812
00:31:05,770 --> 00:31:08,440
let's say that happened, right?

813
00:31:08,440 --> 00:31:09,800
So how can we fix this?

814
00:31:09,800 --> 00:31:12,550
How can we avoid this problem
because in this case, what

815
00:31:12,550 --> 00:31:14,110
we would basically
be doing is we

816
00:31:14,110 --> 00:31:16,030
would be accidentally
DDOSing bitcoin.

817
00:31:16,030 --> 00:31:17,500
And we don't want to do that.

818
00:31:17,500 --> 00:31:20,042
Does everybody see that there's
a problem here, first of all?

819
00:31:23,310 --> 00:31:25,270
OK, so the idea is very simple.

820
00:31:25,270 --> 00:31:28,005
We just introduce something
called a header relay network.

821
00:31:28,005 --> 00:31:29,880
And what that means is
look, you don't really

822
00:31:29,880 --> 00:31:32,400
have to ask for these block
headers from the Bitcoin peer

823
00:31:32,400 --> 00:31:33,360
to peer network.

824
00:31:33,360 --> 00:31:36,210
You could just outsource
these block headers anywhere

825
00:31:36,210 --> 00:31:37,900
because they're
publicly verifiable.

826
00:31:37,900 --> 00:31:41,050
Right, the block headers
have proof of work on them.

827
00:31:41,050 --> 00:31:46,530
So you can use volunteer
nodes that sort of push block

828
00:31:46,530 --> 00:31:49,230
headers to whoever
asks for them.

829
00:31:49,230 --> 00:31:51,570
You could use blockchain
explorers like blockchain.info,

830
00:31:51,570 --> 00:31:52,350
right?

831
00:31:52,350 --> 00:31:53,580
You could use Facebook.

832
00:31:53,580 --> 00:31:56,350
You could just post block
headers on Facebook.

833
00:31:56,350 --> 00:31:57,988
Right, like in a Facebook feed.

834
00:31:57,988 --> 00:31:59,280
You could use Twitter for that.

835
00:31:59,280 --> 00:32:01,420
You could use GitHub gists.

836
00:32:01,420 --> 00:32:02,893
You know, so you could--

837
00:32:02,893 --> 00:32:04,560
there's a lot of ways
to implement this.

838
00:32:04,560 --> 00:32:07,230
The simplest way is
have servers and have

839
00:32:07,230 --> 00:32:11,070
them send these headers
to whoever asks for them.

840
00:32:11,070 --> 00:32:12,900
So it's easy to
scale in that sense

841
00:32:12,900 --> 00:32:15,270
because if you now
ask this header relay

842
00:32:15,270 --> 00:32:17,577
network for the
block headers, you

843
00:32:17,577 --> 00:32:20,160
know, it's much easier to scale
this than to scale the bitcoin

844
00:32:20,160 --> 00:32:23,190
peer to peer network, which
has to do a bit more than just

845
00:32:23,190 --> 00:32:23,820
block headers.

846
00:32:23,820 --> 00:32:25,050
They have to verify blocks.

847
00:32:25,050 --> 00:32:28,330
They have to verify signatures.

848
00:32:28,330 --> 00:32:28,960
Yeah?

849
00:32:28,960 --> 00:32:29,865
AUDIENCE: Did you
consider having

850
00:32:29,865 --> 00:32:31,232
the clients being peer to peer?

851
00:32:31,232 --> 00:32:32,690
ALIN TOMESCU: Yes,
so another way.

852
00:32:32,690 --> 00:32:36,510
And I think we'll talk
about that in the paper--

853
00:32:36,510 --> 00:32:39,600
is you can implement the header
relay network as a peer to peer

854
00:32:39,600 --> 00:32:41,620
network on top of the clients.

855
00:32:41,620 --> 00:32:45,920
Yeah, so that's
another way to do it.

856
00:32:45,920 --> 00:32:48,350
There's some subtleties there
that you have to get right.

857
00:32:48,350 --> 00:32:49,690
But you can do it, I think.

858
00:32:49,690 --> 00:32:49,840
Yeah?

859
00:32:49,840 --> 00:32:52,423
AUDIENCE: Would you expect
that if a company like Whatsapp

860
00:32:52,423 --> 00:32:55,670
decided to adopt Catena, they
would run their own servers

861
00:32:55,670 --> 00:32:57,918
to make sure that there
was the capacity for it?

862
00:32:57,918 --> 00:32:59,710
ALIN TOMESCU: For the
header relay network?

863
00:32:59,710 --> 00:33:00,830
AUDIENCE: Yes.

864
00:33:00,830 --> 00:33:03,260
ALIN TOMESCU: I mean, I
would just be speculating,

865
00:33:03,260 --> 00:33:05,750
wishful thinking.

866
00:33:05,750 --> 00:33:07,730
They could.

867
00:33:07,730 --> 00:33:10,490
There is some problem with this
header relay network as well.

868
00:33:10,490 --> 00:33:11,865
And we talk about
it in the paper

869
00:33:11,865 --> 00:33:13,960
because this header relay
network could withhold

870
00:33:13,960 --> 00:33:14,930
block headers from you.

871
00:33:14,930 --> 00:33:16,520
So you do have to distribute it.

872
00:33:16,520 --> 00:33:18,320
Like usually you don't
want to just trust

873
00:33:18,320 --> 00:33:21,050
Whatsapp who's also doing
the statements, who's

874
00:33:21,050 --> 00:33:23,360
also pushing the statements
in the blockchain.

875
00:33:23,360 --> 00:33:25,460
You don't want to trust
them to also give you

876
00:33:25,460 --> 00:33:26,210
the block headers.

877
00:33:26,210 --> 00:33:28,730
You actually want to fetch
them from a different source

878
00:33:28,730 --> 00:33:30,890
that Whatsapp
doesn't collude with.

879
00:33:30,890 --> 00:33:32,722
Yeah, Anne, you had a question?

880
00:33:32,722 --> 00:33:34,680
AUDIENCE: Are there other
header relay networks

881
00:33:34,680 --> 00:33:36,648
that are deployed already?

882
00:33:36,648 --> 00:33:38,940
ALIN TOMESCU: Yeah, there
was actually one on ethereum.

883
00:33:38,940 --> 00:33:40,887
There's a smart
contract in ethereum,

884
00:33:40,887 --> 00:33:43,220
I think, that if you submit
bitcoin block headers to it,

885
00:33:43,220 --> 00:33:44,900
you get something back.

886
00:33:44,900 --> 00:33:46,970
And then you can just
query bitcoin block headers

887
00:33:46,970 --> 00:33:48,990
from the ethereum blockchain.

888
00:33:48,990 --> 00:33:50,690
Is anyone familiar with this?

889
00:33:50,690 --> 00:33:52,190
So I guess I didn't
include it here.

890
00:33:52,190 --> 00:33:54,440
But another way to do it is
to just publish the headers

891
00:33:54,440 --> 00:33:56,230
in an ethereum
smart contract.

892
00:33:56,230 --> 00:33:57,290
Yeah.

893
00:33:57,290 --> 00:33:59,660
So there's crazy ways
you could do this too.

894
00:33:59,660 --> 00:34:00,620
Yeah?

895
00:34:00,620 --> 00:34:02,910
AUDIENCE: Will that
one go out of gas?

896
00:34:02,910 --> 00:34:05,563
I don't know, my understanding
of ethereum is pretty decent,

897
00:34:05,563 --> 00:34:08,580
but would that smart contract
eventually run out of gas

898
00:34:08,580 --> 00:34:10,090
and not publish anymore?

899
00:34:10,090 --> 00:34:14,590
ALIN TOMESCU: So to fetch from
it, you don't need to pay gas.

900
00:34:14,590 --> 00:34:18,991
But I suspect to push
in it, I actually

901
00:34:18,991 --> 00:34:20,449
don't know who
funds that contract.

902
00:34:20,449 --> 00:34:23,322
So I guess you fund that
when you push maybe.

903
00:34:23,322 --> 00:34:25,489
Maybe not because then you
also want something back.

904
00:34:25,489 --> 00:34:27,648
Why would you push?

905
00:34:27,648 --> 00:34:28,190
I'm not sure.

906
00:34:28,190 --> 00:34:30,790
But we can look at it after.

907
00:34:30,790 --> 00:34:32,639
Yeah.

908
00:34:32,639 --> 00:34:35,050
It's a good question.

909
00:34:35,050 --> 00:34:37,550
So anyway, even if this header
relay network is compromised,

910
00:34:37,550 --> 00:34:39,092
if you implement it
in the right way,

911
00:34:39,092 --> 00:34:42,770
you've distributed on a
sufficient number of parties,

912
00:34:42,770 --> 00:34:46,083
you can still get all of the
properties that you need to,

913
00:34:46,083 --> 00:34:47,750
meaning freshness for
the block headers.

914
00:34:47,750 --> 00:34:49,540
That's really the only
property that you need to.

915
00:34:49,540 --> 00:34:51,123
The header relay
network should always

916
00:34:51,123 --> 00:34:53,210
reply with the
latest block headers.

917
00:34:53,210 --> 00:34:57,610
So let's look at costs since
Alin was mentioning the costs.

918
00:34:57,610 --> 00:35:01,490
So to append a statement, you have
to issue a transaction, right?

919
00:35:01,490 --> 00:35:06,230
And the size of our transactions
is around 235 bytes.

920
00:35:06,230 --> 00:35:11,840
And the fee as of December
13 was $16.24 for transactions

921
00:35:11,840 --> 00:35:14,360
if you guys remember
those great bitcoin times.

922
00:35:14,360 --> 00:35:17,690
I think it went up to
$40 at some point too.

923
00:35:17,690 --> 00:35:20,510
So it was really hard
for me to talk to people

924
00:35:20,510 --> 00:35:21,820
about this research back then.

925
00:35:21,820 --> 00:35:23,278
But guess what
the fees are today?

926
00:35:26,660 --> 00:35:28,100
So today, this
morning I checked.

927
00:35:28,100 --> 00:35:30,650
And they were $0.78, right?

928
00:35:30,650 --> 00:35:34,670
When we wrote the paper,
they were like $0.12.

929
00:35:34,670 --> 00:35:36,620
So you know, here I
am standing in front

930
00:35:36,620 --> 00:35:38,720
of you pitching our work.

931
00:35:38,720 --> 00:35:41,840
In two minutes they could be
back to $100, but who knows?

932
00:35:41,840 --> 00:35:43,050
Yes, you had a question.

933
00:35:43,050 --> 00:35:45,050
AUDIENCE: Maybe I'm wrong,
but in a transaction,

934
00:35:45,050 --> 00:35:46,520
you can have some outputs.

935
00:35:46,520 --> 00:35:49,273
Can you have several
statements in there?

936
00:35:49,273 --> 00:35:50,940
ALIN TOMESCU: So
that's a good question.

937
00:35:50,940 --> 00:35:52,535
So can you batch statements?

938
00:35:52,535 --> 00:35:53,660
And then the answer is yes.

939
00:35:53,660 --> 00:35:55,760
You can definitely
batch statements.

940
00:35:55,760 --> 00:35:59,150
What we've said so far
is in a transaction--

941
00:35:59,150 --> 00:36:02,390
in a Catena transaction,
you have this output.

942
00:36:02,390 --> 00:36:04,120
And you have this
OP_RETURN output

943
00:36:04,120 --> 00:36:06,620
where you put the
statement, right?

944
00:36:06,620 --> 00:36:10,580
And you know, it spends
a previous transaction.

945
00:36:10,580 --> 00:36:16,670
But as a matter of fact, what
you can do and some of you

946
00:36:16,670 --> 00:36:19,950
may already notice this.

947
00:36:19,950 --> 00:36:22,490
There is no reason to put
just one statement in here.

948
00:36:22,490 --> 00:36:24,530
There is some reason--
so the only reason

949
00:36:24,530 --> 00:36:27,150
is that it only fits 80 bytes.

950
00:36:27,150 --> 00:36:29,870
So you could put easily, let's
say, two or three statements

951
00:36:29,870 --> 00:36:32,518
in there if you hashed them
with the right hash function.

952
00:36:32,518 --> 00:36:34,310
But a better way to do
it is, why don't you

953
00:36:34,310 --> 00:36:36,620
put here a Merkle root hash?

954
00:36:36,620 --> 00:36:39,380
And then you can have
as many statements

955
00:36:39,380 --> 00:36:43,630
as you want in the leaves
of that Merkle tree, right?

956
00:36:43,630 --> 00:36:45,380
In fact, here you could
have I don't know,

957
00:36:45,380 --> 00:36:49,600
billions of statements.

958
00:36:49,600 --> 00:36:52,320
So keep in mind,
you will only be

959
00:36:52,320 --> 00:36:55,380
able to issue billions of
statements every 10 minutes.

960
00:36:55,380 --> 00:36:56,880
But you can definitely
have billions

961
00:36:56,880 --> 00:36:59,010
of statements in a
single transaction

962
00:36:59,010 --> 00:37:00,270
if you just batch them.

963
00:37:00,270 --> 00:37:03,750
So now, remember the blockchain
will only store the root hash.

964
00:37:03,750 --> 00:37:06,840
This Merkle tree will be stored
by the log server perhaps

965
00:37:06,840 --> 00:37:08,880
or by a different party.

966
00:37:08,880 --> 00:37:11,280
They don't have to
be the same party.

967
00:37:11,280 --> 00:37:12,910
Does that make sense?

968
00:37:12,910 --> 00:37:14,370
Does that answer your question?

969
00:37:14,370 --> 00:37:14,870
Yeah.

970
00:37:17,450 --> 00:37:20,100
Right, that was my next point.

971
00:37:20,100 --> 00:37:22,510
Statements can be batched
with Merkle trees.
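
[Editor's sketch: batching can be illustrated in Python as follows. The leaf and node prefixes, the rule for an odd node, and all function names are assumptions made for this example, not the format Catena actually uses. The point is that only the 32-byte root needs to fit in the 80-byte output, while whoever stores the tree hands out short inclusion proofs.]

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(statements):
    """Return all tree levels, from the leaf hashes up to the single root."""
    if not statements:
        raise ValueError("need at least one statement")
    level = [_h(b"\x00" + s) for s in statements]  # 0x00 prefix marks a leaf
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_h(b"\x01" + level[i] + level[i + 1]))  # 0x01 marks an inner node
            else:
                nxt.append(level[i])  # an unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def merkle_root(statements) -> bytes:
    return merkle_levels(statements)[-1][0]


def merkle_proof(statements, index):
    """Sibling hashes needed to recompute the root from statement number 'index'."""
    proof = []
    for level in merkle_levels(statements)[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        index //= 2
    return proof


def verify_proof(statement: bytes, proof, root: bytes) -> bool:
    """Recompute the root from one statement and its proof."""
    node = _h(b"\x00" + statement)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = _h(b"\x01" + sibling + node)
        else:
            node = _h(b"\x01" + node + sibling)
    return node == root
```

[A client that has the root from the blockchain calls verify_proof on the statement it cares about. The proof grows with the logarithm of the batch size, so even a very large batch gives short proofs.]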

972
00:37:22,510 --> 00:37:24,900
OK, so let's talk about
the why since so far

973
00:37:24,900 --> 00:37:27,150
we've been talking abstractly
about these statements.

974
00:37:27,150 --> 00:37:29,490
But what could these
statements actually be?

975
00:37:29,490 --> 00:37:31,290
So let's look at a
secure software update.

976
00:37:31,290 --> 00:37:34,140
So how do you do
secure software update?

977
00:37:34,140 --> 00:37:36,540
An example attack on a
software update scheme

978
00:37:36,540 --> 00:37:41,330
is that somebody compromises
the bitcoin.org domain.

979
00:37:41,330 --> 00:37:44,390
And they change the bitcoin
binary to a malicious binary.

980
00:37:44,390 --> 00:37:47,417
And they wait for people to
install that malicious binary.

981
00:37:47,417 --> 00:37:48,750
And then they steal their coins.

982
00:37:48,750 --> 00:37:49,820
They steal their data.

983
00:37:49,820 --> 00:37:51,890
They could execute
arbitrary code.

984
00:37:51,890 --> 00:37:54,230
And an example of this was
sort of this binary safety

985
00:37:54,230 --> 00:37:55,875
warning on the bitcoin website.

986
00:37:55,875 --> 00:37:57,500
At some point, they
were very concerned

987
00:37:57,500 --> 00:38:01,130
that a state actor is going
to mess with the DNS servers

988
00:38:01,130 --> 00:38:04,010
and redirect clients
to a different server

989
00:38:04,010 --> 00:38:07,290
and make them download
a bad bitcoin binary.

990
00:38:07,290 --> 00:38:08,540
So does the attack make sense?

991
00:38:08,540 --> 00:38:10,410
Does everybody see
why this is possible?

992
00:38:10,410 --> 00:38:15,590
You do need to sort of accept
that the DNS service that we

993
00:38:15,590 --> 00:38:17,930
currently have on the internet
is fundamentally flawed.

994
00:38:17,930 --> 00:38:21,630
It's not built for security.

995
00:38:21,630 --> 00:38:24,680
So the typical
defense for this

996
00:38:24,680 --> 00:38:26,960
is that the Bitcoin developers--
they sign the bitcoin

997
00:38:26,960 --> 00:38:29,270
binaries with some secret key.

998
00:38:29,270 --> 00:38:31,820
And they protect
that secret key.

999
00:38:31,820 --> 00:38:34,455
And then there's a
public key associated

1000
00:38:34,455 --> 00:38:36,830
with the secret key that's
posted on the bitcoin website,

1001
00:38:36,830 --> 00:38:37,670
right?

1002
00:38:37,670 --> 00:38:39,795
And maybe some of you'll
notice that sometimes it's

1003
00:38:39,795 --> 00:38:42,087
also very easy to change the
public key on the website,

1004
00:38:42,087 --> 00:38:43,580
you know, if you
can just redirect

1005
00:38:43,580 --> 00:38:45,290
the victim to another website.

1006
00:38:45,290 --> 00:38:47,990
And another problem is that not
everyone checks the signature.

1007
00:38:47,990 --> 00:38:50,448
Even if let's say you have the
public key on your computer,

1008
00:38:50,448 --> 00:38:52,303
you know what the
right public key is,

1009
00:38:52,303 --> 00:38:53,720
only if you're
like an expert user

1010
00:38:53,720 --> 00:38:57,380
and you know how to use GPG,
you will check that signature,

1011
00:38:57,380 --> 00:38:58,430
right?

1012
00:38:58,430 --> 00:39:00,145
And the other problem
that's probably I

1013
00:39:00,145 --> 00:39:01,520
think a much, much
bigger problem

1014
00:39:01,520 --> 00:39:03,770
is that for the bitcoin
devs themselves,

1015
00:39:03,770 --> 00:39:06,280
it's very hard to detect if
someone stole their secret key.

1016
00:39:06,280 --> 00:39:09,500
Like if I'm a state actor
and I break your computer

1017
00:39:09,500 --> 00:39:12,440
and I steal your secret key, I
will sign this bitcoin binary.

1018
00:39:12,440 --> 00:39:14,270
And I'll give it to
let's say, one guy.

1019
00:39:14,270 --> 00:39:15,282
I'll give it to you.

1020
00:39:15,282 --> 00:39:17,240
You know, and I'll just
target you individually

1021
00:39:17,240 --> 00:39:18,320
because I'm really--

1022
00:39:18,320 --> 00:39:19,790
I know you have
a lot of bitcoin.

1023
00:39:19,790 --> 00:39:23,047
Right, and then the bitcoin devs
will never find out about it

1024
00:39:23,047 --> 00:39:25,130
unless, you know, you kind
of realize what happened.

1025
00:39:25,130 --> 00:39:26,685
Then you take your
bitcoin binary

1026
00:39:26,685 --> 00:39:28,310
and you go with it
to the bitcoin devs.

1027
00:39:28,310 --> 00:39:30,828
And they check the signature
on it and they say, oh wow.

1028
00:39:30,828 --> 00:39:32,870
This is a valid signature
and we never signed it.

1029
00:39:32,870 --> 00:39:36,440
So somebody must have
stolen our secret key.

1030
00:39:36,440 --> 00:39:39,553
So this is really a bad--

1031
00:39:39,553 --> 00:39:40,970
kind of the core
of the problem is

1032
00:39:40,970 --> 00:39:43,322
that it's hard for
whoever publishes software

1033
00:39:43,322 --> 00:39:44,780
to detect that
their secret key has

1034
00:39:44,780 --> 00:39:47,360
been stolen-- to detect
malicious signatures

1035
00:39:47,360 --> 00:39:48,200
on their binaries.

1036
00:39:48,200 --> 00:39:49,530
Does that make sense?

1037
00:39:49,530 --> 00:39:51,860
So the solution, of course,
is you know, publish

1038
00:39:51,860 --> 00:39:55,460
the signatures of bitcoin
binaries in a Catena log.

1039
00:39:55,460 --> 00:39:57,830
So now if there's a malicious
binary being published

1040
00:39:57,830 --> 00:40:00,770
by a state actor, people
won't accept that binary

1041
00:40:00,770 --> 00:40:02,780
unless it's in the
Catena log, which

1042
00:40:02,780 --> 00:40:07,050
means people and the bitcoin
devs will see the same binary.

1043
00:40:07,050 --> 00:40:09,800
Right, so let me show
you what I mean with a picture.

1044
00:40:09,800 --> 00:40:13,610
So we have this Catena
log for Bitcoin binaries.

1045
00:40:13,610 --> 00:40:15,110
And let's say, the
first transaction

1046
00:40:15,110 --> 00:40:21,260
has a hash of the bitcoin
0.001 tar file, right,

1047
00:40:21,260 --> 00:40:23,840
the bitcoin binaries.

1048
00:40:23,840 --> 00:40:27,230
And this hash here
is implicitly signed

1049
00:40:27,230 --> 00:40:29,780
by the signature in
this input because it

1050
00:40:29,780 --> 00:40:32,510
signs the whole transaction.

1051
00:40:32,510 --> 00:40:35,210
So now if I put this
hash in a Catena log,

1052
00:40:35,210 --> 00:40:38,240
I get a signature
on it for free.

1053
00:40:38,240 --> 00:40:42,320
And now, if let's say, a state
actor compromises the log

1054
00:40:42,320 --> 00:40:44,720
server, gets the
secret key, he can

1055
00:40:44,720 --> 00:40:48,110
publish this second malicious
binary in the log, right?

1056
00:40:48,110 --> 00:40:51,090
But what that malicious
state actor will want to do,

1057
00:40:51,090 --> 00:40:53,240
he will want to hide this
from the Bitcoin devs

1058
00:40:53,240 --> 00:40:54,620
and show it to all of you guys.

1059
00:40:54,620 --> 00:40:57,197
All right, so he'll
want to equivocate.

1060
00:40:57,197 --> 00:40:59,780
So as a result, he will want to
create a different transaction

1061
00:40:59,780 --> 00:41:02,420
with the right
bitcoin binary there,

1062
00:41:02,420 --> 00:41:05,350
show this to the bitcoin devs
while showing this to you guys.

1063
00:41:05,350 --> 00:41:08,240
All right, so the bitcoin
devs would think they're good.

1064
00:41:08,240 --> 00:41:11,540
This is the binary they wanted
to publish while you guys would

1065
00:41:11,540 --> 00:41:17,535
be using this malicious binary
published by the state actor.

1066
00:41:17,535 --> 00:41:19,910
Of course this cannot happen
because in Catena you cannot

1067
00:41:19,910 --> 00:41:21,316
equivocate.

1068
00:41:21,316 --> 00:41:24,190
Right?

1069
00:41:24,190 --> 00:41:27,360
Does everybody see this?

1070
00:41:27,360 --> 00:41:28,832
Right, any questions about this?

1071
00:41:28,832 --> 00:41:30,290
There has to be a
question on this.

1072
00:41:37,122 --> 00:41:39,090
No?

1073
00:41:39,090 --> 00:41:42,240
So this mechanism is called
software transparency.

1074
00:41:42,240 --> 00:41:45,720
It's this idea that rather
than just downloading software

1075
00:41:45,720 --> 00:41:50,040
like a crazy person from the
internet and installing it,

1076
00:41:50,040 --> 00:41:51,930
we should just be
publishing these binaries

1077
00:41:51,930 --> 00:41:54,990
in a log that everybody can see,
including the software vendors

1078
00:41:54,990 --> 00:41:56,770
that created those binaries.

1079
00:41:56,770 --> 00:41:59,947
So in this way, if somebody
compromises a software vendor,

1080
00:41:59,947 --> 00:42:01,530
that vendor can
notice that in the log

1081
00:42:01,530 --> 00:42:03,197
there's a new version
for their software

1082
00:42:03,197 --> 00:42:04,350
that they didn't publish.
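
[Editor's sketch: the two checks in software transparency can be written down in a few lines of Python. The log is modeled here as a plain list of hex-encoded SHA-256 hashes that everyone sees identically, and the function names are hypothetical. In the real system the log would be read from Catena transactions.]

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def client_accepts(binary: bytes, log) -> bool:
    """A client installs a binary only if its hash appears in the public log."""
    return sha256_hex(binary) in log


def vendor_audit(log, released_hashes):
    """The vendor scans the same log for entries it never published."""
    return [h for h in log if h not in released_hashes]
```

[If an attacker with the stolen key appends the hash of a malicious binary, client_accepts will return True for it, so the attack is not prevented. But vendor_audit on the same log returns that hash, so the vendor finds out.]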

1083
00:42:04,350 --> 00:42:07,890
So you know, this isn't to say
that it'll prevent attacks.

1084
00:42:07,890 --> 00:42:10,890
You know, what a state
actor can do anyway

1085
00:42:10,890 --> 00:42:12,180
is they can just do this.

1086
00:42:12,180 --> 00:42:13,950
They can post this
H2 prime in here,

1087
00:42:13,950 --> 00:42:16,770
show it to you guys
including the bitcoin devs

1088
00:42:16,770 --> 00:42:18,960
and still screw everyone over.

1089
00:42:18,960 --> 00:42:21,890
But at least these attacks
no longer go undetected.

1090
00:42:21,890 --> 00:42:27,325
All right, so it's a step
forward in that sense.

1091
00:42:27,325 --> 00:42:28,700
Yes, so the idea
is that you have

1092
00:42:28,700 --> 00:42:30,590
to double spend to equivocate.

1093
00:42:30,590 --> 00:42:34,760
And the other
example that really--

1094
00:42:34,760 --> 00:42:36,680
the reason I wanted
to start this research

1095
00:42:36,680 --> 00:42:38,340
had to do with public
key distribution.

1096
00:42:38,340 --> 00:42:40,550
So let's say we have
Alice and we have Bob.

1097
00:42:40,550 --> 00:42:42,530
And they both have
their public keys.

1098
00:42:42,530 --> 00:42:44,648
And I'm using this
letter b to denote

1099
00:42:44,648 --> 00:42:46,190
Bob and his public
key and the letter

1100
00:42:46,190 --> 00:42:48,500
a to denote Alice
and her public key.

1101
00:42:48,500 --> 00:42:50,540
And they have their
corresponding secret keys,

1102
00:42:50,540 --> 00:42:51,110
right?

1103
00:42:51,110 --> 00:42:53,318
And Alice and Bob, they want
to chat securely, right?

1104
00:42:53,318 --> 00:42:58,580
So they want to set
up a secure channel.

1105
00:42:58,580 --> 00:43:01,280
And there's this directory
which stores their public keys.

1106
00:43:01,280 --> 00:43:03,320
So this guy stores
Alice, pk Alice.

1107
00:43:03,320 --> 00:43:05,300
This guy stores Bob, pk Bob.

1108
00:43:05,300 --> 00:43:09,440
All right, and the directory
gets updated over time,

1109
00:43:09,440 --> 00:43:13,670
maybe Karl, Ellen
and Dan registered.

1110
00:43:13,670 --> 00:43:17,090
And if you have
non-equivocation,

1111
00:43:17,090 --> 00:43:19,940
if the attacker wants to
impersonate Alice and Bob,

1112
00:43:19,940 --> 00:43:22,730
he kind of has to put
their public keys,

1113
00:43:22,730 --> 00:43:24,570
the fake public keys
in the same directory,

1114
00:43:24,570 --> 00:43:27,800
which means that when
Alice and Bob monitor--

1115
00:43:27,800 --> 00:43:29,870
they check their
own public keys,

1116
00:43:29,870 --> 00:43:34,030
they both notice they've
been impersonated, right?

1117
00:43:34,030 --> 00:43:37,420
So again, the idea is
that you can detect.

1118
00:43:37,420 --> 00:43:41,030
Now how can this
attacker still trick

1119
00:43:41,030 --> 00:43:43,640
Alice to send an
encrypted message to Bob

1120
00:43:43,640 --> 00:43:45,590
with Bob's fake public key?

1121
00:43:45,590 --> 00:43:48,057
Is there a way even if
you have non-equivocation?

1122
00:43:50,980 --> 00:43:52,030
So what's the attack?

1123
00:43:52,030 --> 00:43:53,775
Even if I have
non-equivocation and I

1124
00:43:53,775 --> 00:43:55,150
claim that the
attacker can still

1125
00:43:55,150 --> 00:43:59,710
get Alice to send a fake,
an encrypted message to Bob

1126
00:43:59,710 --> 00:44:01,573
that the attacker can decrypt.

1127
00:44:01,573 --> 00:44:02,740
What should the attacker do?

1128
00:44:02,740 --> 00:44:07,750
So pretend that we are
here without the ability

1129
00:44:07,750 --> 00:44:08,862
to equivocate.

1130
00:44:11,453 --> 00:44:12,870
So the attacker
cannot equivocate.

1131
00:44:12,870 --> 00:44:14,790
But I claimed that
the attacker can still

1132
00:44:14,790 --> 00:44:17,070
trick Alice into
sending a message to Bob

1133
00:44:17,070 --> 00:44:19,790
that the attacker can read.

1134
00:44:19,790 --> 00:44:22,166
So now it's time to see if
you guys paid attention.

1135
00:44:28,010 --> 00:44:29,920
Somebody?

1136
00:44:29,920 --> 00:44:31,040
Alin?

1137
00:44:31,040 --> 00:44:32,097
Oh, you?

1138
00:44:32,097 --> 00:44:34,430
AUDIENCE: Does the attacker
have to have the secret key?

1139
00:44:34,430 --> 00:44:35,060
ALIN TOMESCU: No, no.

1140
00:44:35,060 --> 00:44:35,680
He does not.

1141
00:44:35,680 --> 00:44:36,920
Yeah.

1142
00:44:36,920 --> 00:44:39,112
He does not have to
have the secret key.

1143
00:44:41,840 --> 00:44:44,527
The attacker just
creates fake public keys.

1144
00:44:44,527 --> 00:44:45,110
Here's a hint.

1145
00:44:48,290 --> 00:44:51,010
AUDIENCE: If you only
changed one person to choose,

1146
00:44:51,010 --> 00:44:53,965
they don't know that it's a
fake key so they could send it

1147
00:44:53,965 --> 00:44:55,960
to a fake key for a bit?

1148
00:44:55,960 --> 00:44:58,042
ALIN TOMESCU: Yeah, so
which person should the

1149
00:44:58,042 --> 00:44:59,250
attacker change the key for?

1150
00:44:59,250 --> 00:45:00,870
AUDIENCE: Like if they
change Bob's, Alice

1151
00:45:00,870 --> 00:45:02,245
will still think
Bob's is correct

1152
00:45:02,245 --> 00:45:04,873
so she'll send it to the
fake Bob until Bob checks it.

1153
00:45:04,873 --> 00:45:05,790
ALIN TOMESCU: Exactly.

1154
00:45:05,790 --> 00:45:07,350
So that's exactly right.

1155
00:45:07,350 --> 00:45:08,483
So what's your name?

1156
00:45:08,483 --> 00:45:09,150
AUDIENCE: Lucas.

1157
00:45:09,150 --> 00:45:09,983
ALIN TOMESCU: Lucas.

1158
00:45:09,983 --> 00:45:15,810
So what Lucas is saying is
look, even without equivocation,

1159
00:45:15,810 --> 00:45:18,390
I had this directory at T1.

1160
00:45:18,390 --> 00:45:19,950
I had another one at T2.

1161
00:45:19,950 --> 00:45:25,590
And both of these had
keys for Alice and Bob, right?

1162
00:45:25,590 --> 00:45:30,190
But at T3, Lucas is saying look,
just put the fake key for Bob

1163
00:45:30,190 --> 00:45:30,690
here.

1164
00:45:30,690 --> 00:45:32,130
And that's it.

1165
00:45:32,130 --> 00:45:35,310
Don't put a fake key for
Alice there, just for Bob.

1166
00:45:35,310 --> 00:45:41,335
And now when Alice looks up
this public key for Bob here,

1167
00:45:41,335 --> 00:45:42,960
she sends a query to
the directory hey,

1168
00:45:42,960 --> 00:45:44,940
what's Bob's public key?

1169
00:45:44,940 --> 00:45:50,280
She gets back b prime,
which is equal to Bob pk

1170
00:45:50,280 --> 00:45:53,980
Bob prime, right?

1171
00:45:53,980 --> 00:45:57,168
Alice can't tell whether that's
really Bob's public key or a fake one.

1172
00:45:57,168 --> 00:45:58,960
That's the reason she's
using the directory

1173
00:45:58,960 --> 00:46:00,160
in the first place.

1174
00:46:00,160 --> 00:46:02,920
She wants sort of a trustworthy
place to get it from.

1175
00:46:02,920 --> 00:46:04,610
Bob can tell if Bob looks.

1176
00:46:04,610 --> 00:46:06,610
But by the time Bob looks,
it might be too late.

1177
00:46:06,610 --> 00:46:08,880
Alice might have already
encrypted a message, right?

1178
00:46:08,880 --> 00:46:10,880
So again, what's the point
of doing all of this?

1179
00:46:10,880 --> 00:46:14,103
It's not like you're
preventing attacks, right?

1180
00:46:14,103 --> 00:46:15,520
And the point of
doing all of this

1181
00:46:15,520 --> 00:46:16,840
is that you get transparency.

1182
00:46:16,840 --> 00:46:19,810
Bob can detect, whereas
right now Bob has no hope.
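
[Editor's sketch: the lookup and monitoring steps can be modeled in Python with the directory as an append-only list of (user, public key) entries that everyone sees identically. The names and the string keys are placeholders for illustration only.]

```python
from typing import List, Optional, Tuple

Entry = Tuple[str, str]  # (user name, public key)


def lookup(directory: List[Entry], user: str) -> Optional[str]:
    """What Alice does: take the most recent key listed for a user."""
    for name, pk in reversed(directory):
        if name == user:
            return pk
    return None


def monitor(directory: List[Entry], user: str, my_keys) -> List[str]:
    """What Bob does: list every key published under his name that is not his."""
    return [pk for name, pk in directory if name == user and pk not in my_keys]
```

[After the attacker appends ("bob", "pk_bob_fake"), lookup gives Alice the fake key, so she can still be tricked. But monitor shows Bob the same entry the next time he checks, because the directory cannot show him a different history.]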

1183
00:46:19,810 --> 00:46:23,110
In fact, so you said, a
lot of you use Whatsapp.

1184
00:46:23,110 --> 00:46:24,793
So you know in
Whatsapp, if you really

1185
00:46:24,793 --> 00:46:26,710
want to be sure, so I
have a conversation here

1186
00:46:26,710 --> 00:46:28,570
with Alin Dragos.

1187
00:46:28,570 --> 00:46:31,060
So, Alin, do you want
to bring your phone here?

1188
00:46:31,060 --> 00:46:32,950
Do you have Whatsapp?

1189
00:46:32,950 --> 00:46:35,440
So if you really want
to be sure that you're

1190
00:46:35,440 --> 00:46:37,697
talking to the real Alin
and not some other guy,

1191
00:46:37,697 --> 00:46:39,280
you have to go on
this encryption tab.

1192
00:46:39,280 --> 00:46:42,548
Can you tape this?

1193
00:46:42,548 --> 00:46:43,840
So my phone is black and white.

1194
00:46:43,840 --> 00:46:45,760
It's going through
a depression phase.

1195
00:46:45,760 --> 00:46:47,890
I apologize.

1196
00:46:47,890 --> 00:46:50,140
So you have to go here and
there's a code here, right?

1197
00:46:50,140 --> 00:46:51,640
And Alin, can you
do the same thing?

1198
00:46:51,640 --> 00:46:52,973
You know what I'm talking about?

1199
00:46:56,250 --> 00:46:59,490
I really hope I don't get some
weird text message right now

1200
00:46:59,490 --> 00:47:01,930
with the camera on the phone.

1201
00:47:01,930 --> 00:47:05,095
OK, so now with Alin's
phone, is that the same code?

1202
00:47:05,095 --> 00:47:05,970
Can somebody tell me?

1203
00:47:05,970 --> 00:47:07,410
I can't see it.

1204
00:47:07,410 --> 00:47:09,740
AUDIENCE: It's hard to see.

1205
00:47:09,740 --> 00:47:12,333
ALIN TOMESCU: OK, so
we have 27836 and yeah.

1206
00:47:12,333 --> 00:47:13,250
So it's the same code.

1207
00:47:13,250 --> 00:47:15,050
Can you see it on the camera?

1208
00:47:15,050 --> 00:47:16,915
So now because we have
the same code here.

1209
00:47:16,915 --> 00:47:18,290
With this code
here, it really is

1210
00:47:18,290 --> 00:47:20,673
a hash of my public key
and Alin's public key.

1211
00:47:20,673 --> 00:47:23,090
And if we've got the same hash
of both of our public keys,

1212
00:47:23,090 --> 00:47:24,840
then we know we're
talking to one another.
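
[Editor's sketch: a toy version of such a code in Python. This is not WhatsApp's actual algorithm. The hash function, the length prefixes, and the number of digits are all assumptions chosen for illustration. It only shows why both phones display the same code when they hold the same pair of public keys.]

```python
import hashlib


def security_code(pk_a: bytes, pk_b: bytes, digits: int = 10) -> str:
    """Toy comparison code: a hash of both public keys, shown as decimal digits."""
    # Sort the keys so both parties compute the same value regardless of order.
    first, second = sorted([pk_a, pk_b])
    # Length prefixes keep the concatenation unambiguous.
    data = (len(first).to_bytes(4, "big") + first
            + len(second).to_bytes(4, "big") + second)
    digest = hashlib.sha256(data).digest()
    number = int.from_bytes(digest, "big") % (10 ** digits)
    return str(number).zfill(digits)
```

[If an attacker substituted a different key for either party, the two phones would compute codes that differ, except with small probability for a code this short.]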

1213
00:47:24,840 --> 00:47:27,290
But we won't really know that's
the case until we actually

1214
00:47:27,290 --> 00:47:29,360
meet in person and do
this exchange, right?

1215
00:47:29,360 --> 00:47:31,880
So what this system
does instead is

1216
00:47:31,880 --> 00:47:33,710
it allows Alin to check
his own public key

1217
00:47:33,710 --> 00:47:36,410
and it allows me to
check my own public key.

1218
00:47:36,410 --> 00:47:38,078
This way if we check
our own public key,

1219
00:47:38,078 --> 00:47:40,370
we'll always know when we're
impersonated even though I

1220
00:47:40,370 --> 00:47:43,130
might send an encrypted
message to the wrong Alin,

1221
00:47:43,130 --> 00:47:44,703
Alin will eventually find out.

1222
00:47:44,703 --> 00:47:46,870
It's a bit confusing because
we're both called Alin.

1223
00:47:50,505 --> 00:47:51,880
So does that sort
of make sense?

1224
00:47:51,880 --> 00:47:52,380
Yeah?

1225
00:47:52,380 --> 00:47:54,290
AUDIENCE: So what do
you do when you realize

1226
00:47:54,290 --> 00:47:55,943
that your key is the wrong key?

1227
00:47:55,943 --> 00:47:57,110
ALIN TOMESCU: Good question.

1228
00:47:57,110 --> 00:47:58,400
So that's really the
crucial question.

1229
00:47:58,400 --> 00:47:59,442
What the hell can you do?

1230
00:47:59,442 --> 00:48:01,490
Right, so this directory
impersonated you.

1231
00:48:01,490 --> 00:48:03,788
In fact, you can't even--

1232
00:48:03,788 --> 00:48:04,580
here's the problem.

1233
00:48:04,580 --> 00:48:07,713
If you're Bob here and you
see this fake public key.

1234
00:48:07,713 --> 00:48:09,380
And you go to the New
York Times and you

1235
00:48:09,380 --> 00:48:12,110
say hey, New York Times,
this Whatsapp directory

1236
00:48:12,110 --> 00:48:13,820
started impersonating me.

1237
00:48:13,820 --> 00:48:15,920
And the New York Times
can go to the directory

1238
00:48:15,920 --> 00:48:18,087
and say, hey directory, why
did you impersonate Bob?

1239
00:48:18,087 --> 00:48:20,462
And the directory can say,
no, I did not impersonate Bob.

1240
00:48:20,462 --> 00:48:22,280
Bob really just asked
for a new public key.

1241
00:48:22,280 --> 00:48:24,620
And this was the public
key that Bob gave me.

1242
00:48:24,620 --> 00:48:28,133
And it's just a he said, they
said, kind of a thing, right?

1243
00:48:28,133 --> 00:48:30,050
So it's really a sort
of an open research area

1244
00:48:30,050 --> 00:48:32,960
to figure out what's the right
way to whistle blow here.

1245
00:48:32,960 --> 00:48:35,570
So for example, one project
that we're trying to work on

1246
00:48:35,570 --> 00:48:38,690
is there a way to track
the directory somehow

1247
00:48:38,690 --> 00:48:40,640
so that when he does
stuff like this,

1248
00:48:40,640 --> 00:48:44,868
you get a publicly verifiable
cryptographic proof that he

1249
00:48:44,868 --> 00:48:45,910
really misbehaved, right?

1250
00:48:45,910 --> 00:48:47,120
Now, there's no
cryptographic proof.

1251
00:48:47,120 --> 00:48:48,860
The fact that that
public key is there

1252
00:48:48,860 --> 00:48:50,693
could have come from
the malicious directory

1253
00:48:50,693 --> 00:48:52,430
or could have come
from an honest Bob who

1254
00:48:52,430 --> 00:48:54,720
just changed his public key.

1255
00:48:54,720 --> 00:48:55,560
So yeah.

1256
00:48:55,560 --> 00:48:57,440
So again, a step forward
but we're not there.

1257
00:48:57,440 --> 00:49:00,660
You know, it's just that there's
much more work to do here.

1258
00:49:00,660 --> 00:49:02,752
And I think you
also had a question.

1259
00:49:02,752 --> 00:49:06,360
AUDIENCE: Can't Alice just
ask Bob if this is you?

1260
00:49:06,360 --> 00:49:08,930
ALIN TOMESCU: So that's a
chicken and an egg, right?

1261
00:49:08,930 --> 00:49:10,790
So we have Alice.

1262
00:49:10,790 --> 00:49:11,600
We have Bob.

1263
00:49:11,600 --> 00:49:13,130
And we have the attacker.

1264
00:49:13,130 --> 00:49:17,510
Alice asks Bob, is
this your public key?

1265
00:49:17,510 --> 00:49:20,730
You know, let's say, b prime.

1266
00:49:20,730 --> 00:49:22,580
Let me make this more readable.

1267
00:49:22,580 --> 00:49:24,860
So attacker, right?

1268
00:49:24,860 --> 00:49:26,360
So Alice asks, hey Bob.

1269
00:49:26,360 --> 00:49:28,490
Is this b prime your public key?

1270
00:49:28,490 --> 00:49:30,410
The attacker changes it.

1271
00:49:30,410 --> 00:49:32,930
Hey Bob, is b your public key?

1272
00:49:32,930 --> 00:49:35,460
The attacker-- Bob says yes.

1273
00:49:35,460 --> 00:49:38,030
Attacker forwards
yes to Alice, right?

1274
00:49:38,030 --> 00:49:40,358
Remember, Bob and Alice
don't have a secure channel.

1275
00:49:40,358 --> 00:49:42,900
That's the problem we're trying
to solve with this directory,

1276
00:49:42,900 --> 00:49:44,910
right?

1277
00:49:44,910 --> 00:49:47,610
So the attacker can always
man in the middle people.

1278
00:49:47,610 --> 00:49:49,030
Well, people like Alice and Bob.

1279
00:49:49,030 --> 00:49:51,100
If the attacker can man
in the middle everything,

1280
00:49:51,100 --> 00:49:52,320
then there's really no hope.

1281
00:49:52,320 --> 00:49:56,330
And we're living in a very sad,
sad world if that's the case.

1282
00:49:56,330 --> 00:50:00,920
Yeah, it actually might
be the case but we'll see.

1283
00:50:00,920 --> 00:50:04,470
Anyway, so yeah, so I claim here
that this is a step forward.

1284
00:50:04,470 --> 00:50:05,893
But there's still
much work to do.

1285
00:50:05,893 --> 00:50:07,310
All right, so we
get transparency.

1286
00:50:07,310 --> 00:50:08,090
Bob can detect.

1287
00:50:08,090 --> 00:50:08,930
He'll know.

1288
00:50:08,930 --> 00:50:10,700
He won't be able to
convince anybody.

1289
00:50:10,700 --> 00:50:13,310
But if a lot of Bobs
get compromised,

1290
00:50:13,310 --> 00:50:15,610
you're still like in a
place where everybody

1291
00:50:15,610 --> 00:50:18,110
knows that something's off and
will all stop using WhatsApp,

1292
00:50:18,110 --> 00:50:18,740
for example.

1293
00:50:18,740 --> 00:50:19,358
Right?

1294
00:50:19,358 --> 00:50:20,900
By the way, WhatsApp
is a great tool.

1295
00:50:20,900 --> 00:50:23,090
You should continue using it.

1296
00:50:23,090 --> 00:50:25,050
I'm just saying it's
difficult to use right

1297
00:50:25,050 --> 00:50:26,920
like if somebody really
wants to target it,

1298
00:50:26,920 --> 00:50:31,460
they can play a lot of
tricks to still trick you.

1299
00:50:31,460 --> 00:50:34,730
So yeah, so all right, we
already talked about this.

1300
00:50:34,730 --> 00:50:37,970
And yeah, again if the directory
can equivocate, then you know,

1301
00:50:37,970 --> 00:50:41,300
all bets are off because now
Bob will look in this directory,

1302
00:50:41,300 --> 00:50:42,763
he'll think he's
not impersonated.

1303
00:50:42,763 --> 00:50:44,180
Alice will look
in this directory.

1304
00:50:44,180 --> 00:50:46,240
She'll think she's not
impersonated, right?

1305
00:50:46,240 --> 00:50:48,140
So the reason we
started this research

1306
00:50:48,140 --> 00:50:49,700
is because I really want to--

1307
00:50:49,700 --> 00:50:52,310
my thesis is on building
these directories that

1308
00:50:52,310 --> 00:50:55,370
are efficiently auditable and
have a hard time impersonating

1309
00:50:55,370 --> 00:50:56,148
people.

1310
00:50:56,148 --> 00:50:58,190
So that's why we decided
to look at how could you

1311
00:50:58,190 --> 00:50:59,960
do this with Bitcoin.

1312
00:50:59,960 --> 00:51:02,600
Yeah, so of course there's
one project called KeyChat

1313
00:51:02,600 --> 00:51:04,940
that we're working on with
some high school students.

1314
00:51:04,940 --> 00:51:07,065
And we're using the Keybase
public key directory.

1315
00:51:07,065 --> 00:51:09,050
And we're witnessing
it in a Catena log

1316
00:51:09,050 --> 00:51:13,370
so that stuff like
that doesn't happen.

1317
00:51:13,370 --> 00:51:16,700
OK, so now let's talk
about the blockchains.

1318
00:51:16,700 --> 00:51:18,710
In general, people
nowadays like to say I

1319
00:51:18,710 --> 00:51:20,940
need a blockchain for x, right?

1320
00:51:20,940 --> 00:51:22,940
I need a blockchain for
supply chain management.

1321
00:51:22,940 --> 00:51:25,820
I need a blockchain
for cats, for whatever.

1322
00:51:25,820 --> 00:51:27,770
I've heard a lot
of crazy stories.

1323
00:51:27,770 --> 00:51:30,430
IoT, self-driving
cars, blah, blah, blah.

1324
00:51:30,430 --> 00:51:35,120
And I think the right way
to think about blockchain

1325
00:51:35,120 --> 00:51:37,310
is to never ever say
that word unless you

1326
00:51:37,310 --> 00:51:38,850
use quotes, first of all.

1327
00:51:38,850 --> 00:51:41,870
And second of all, to
understand what Byzantine state

1328
00:51:41,870 --> 00:51:43,185
machine replication is.

1329
00:51:43,185 --> 00:51:45,560
Right, and if you understand
what Byzantine state machine

1330
00:51:45,560 --> 00:51:47,687
replication is or a
Byzantine consensus,

1331
00:51:47,687 --> 00:51:48,770
you understand blockchain.

1332
00:51:48,770 --> 00:51:50,210
And you understand all the hype.

1333
00:51:50,210 --> 00:51:52,790
And then you can make some
progress in solving problems.

1334
00:51:52,790 --> 00:51:54,350
In the sense that
what is blockchain?

1335
00:51:54,350 --> 00:51:55,737
So what we're
doing here is we're

1336
00:51:55,737 --> 00:51:57,320
doing a Byzantine
consensus algorithm.

1337
00:51:57,320 --> 00:51:59,520
We're agreeing on a
log of operations.

1338
00:51:59,520 --> 00:52:01,550
Right, by the way, that's
what Catena does too

1339
00:52:01,550 --> 00:52:03,960
by piggybacking on bitcoin.

1340
00:52:03,960 --> 00:52:07,080
It agrees on
a log of operations.

1341
00:52:07,080 --> 00:52:11,390
Right, but the other thing that
SMR or Byzantine consensus does

1342
00:52:11,390 --> 00:52:13,340
is that it also
allows you to agree

1343
00:52:13,340 --> 00:52:16,520
on the execution of
the ops in that log.

1344
00:52:16,520 --> 00:52:18,530
So in Catena, you don't
agree on the execution,

1345
00:52:18,530 --> 00:52:20,430
you just agree on
the statements.

1346
00:52:20,430 --> 00:52:23,030
But there is no execution of
those statements in the sense

1347
00:52:23,030 --> 00:52:25,550
that you can't build
another bitcoin

1348
00:52:25,550 --> 00:52:29,870
on top of bitcoin in Catena
because you can't prevent

1349
00:52:29,870 --> 00:52:33,560
double spends of
transactions that

1350
00:52:33,560 --> 00:52:34,768
are Catena statements, right?

1351
00:52:34,768 --> 00:52:37,102
Like the Catena statements,
you have to look in each one

1352
00:52:37,102 --> 00:52:38,260
and tell if it's correct.

1353
00:52:38,260 --> 00:52:41,360
So to detect a double spend in
a Catena backed cryptocurrency,

1354
00:52:41,360 --> 00:52:43,510
you would have to download
all of the transactions

1355
00:52:43,510 --> 00:52:46,850
because you cannot execute
them like the bitcoin miners do

1356
00:52:46,850 --> 00:52:49,430
and build this UTXO set.

1357
00:52:49,430 --> 00:52:51,290
I'm not sure this is
making a lot of sense.

1358
00:52:51,290 --> 00:52:54,060
But let's put it another way.

1359
00:52:54,060 --> 00:52:57,677
In bitcoin, you have
block one and then

1360
00:52:57,677 --> 00:52:58,760
you have block two, right?

1361
00:53:01,307 --> 00:53:03,890
And there's a hash pointer and
there's a bunch of transactions

1362
00:53:03,890 --> 00:53:05,450
here, right?

1363
00:53:05,450 --> 00:53:08,570
And remember that what prevents
me from double spending

1364
00:53:08,570 --> 00:53:09,770
something here--

1365
00:53:09,770 --> 00:53:11,510
I can have two
transactions in this block

1366
00:53:11,510 --> 00:53:13,670
that double spend
the same one here.

1367
00:53:13,670 --> 00:53:16,340
What prevents me from doing
that is exactly this execution

1368
00:53:16,340 --> 00:53:18,140
stage, right?

1369
00:53:18,140 --> 00:53:21,110
Because in the execution stage,
when I execute this

1370
00:53:21,110 --> 00:53:24,020
first transaction and I
mark this output as spent,

1371
00:53:24,020 --> 00:53:25,910
when I execute the
second transaction,

1372
00:53:25,910 --> 00:53:27,660
I cannot spend that
output anymore, right?

1373
00:53:27,660 --> 00:53:30,285
In Catena, you can't do anything
like that with the statements.

1374
00:53:30,285 --> 00:53:31,740
You just agree on
the statements.

1375
00:53:31,740 --> 00:53:33,170
In Catena, you would
put this transaction

1376
00:53:33,170 --> 00:53:35,503
in the log, this one and then
that one and someone would

1377
00:53:35,503 --> 00:53:38,060
have to detect that the second
one is a bad one by actually

1378
00:53:38,060 --> 00:53:38,700
downloading it.

1379
00:53:38,700 --> 00:53:40,975
That's kind of what
I'm trying to say here.
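
The difference between agreeing on a log and also agreeing on its execution can be sketched in a few lines. This is a toy model, not Bitcoin's or Catena's actual code: the `Tx` type, the `execute` function, and the `(txid, index)` output naming are all hypothetical. Executing the log against a UTXO set rejects the second spend of an output; a log that only records statements would keep both entries and leave detection to whoever downloads them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    spends: tuple   # outputs consumed, as (txid, index) pairs
    n_outputs: int  # number of new outputs this transaction creates


def execute(log, utxo):
    """Replay `log` in order against a copy of `utxo`.

    Returns (accepted, rejected) lists of txids. A transaction is rejected
    if any output it spends is missing from the UTXO set (already spent or
    never created) or if it lists the same output twice.
    """
    utxo = set(utxo)
    accepted, rejected = [], []
    for tx in log:
        distinct = len(set(tx.spends)) == len(tx.spends)
        if distinct and all(out in utxo for out in tx.spends):
            utxo.difference_update(tx.spends)  # mark outputs as spent
            utxo.update((tx.txid, i) for i in range(tx.n_outputs))
            accepted.append(tx.txid)
        else:
            rejected.append(tx.txid)
    return accepted, rejected


genesis = {("coinbase", 0)}
log = [
    Tx("a", (("coinbase", 0),), 1),
    Tx("b", (("coinbase", 0),), 1),  # tries to spend the same output again
]
accepted, rejected = execute(log, genesis)
print(accepted, rejected)
```

Without the `execute` step, both "a" and "b" simply sit in the agreed log, which is the situation described for statements in Catena.
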

1380
00:53:40,975 --> 00:53:42,350
So in general,
the way you

1381
00:53:42,350 --> 00:53:43,808
should be thinking
about blockchain

1382
00:53:43,808 --> 00:53:45,740
is through the lens of
Byzantine consensus.

1383
00:53:45,740 --> 00:53:48,230
And that'll get you
ahead of the curve

1384
00:53:48,230 --> 00:53:50,100
in this overly
hyped space, right,

1385
00:53:50,100 --> 00:53:51,350
because it's really just this.

1386
00:53:51,350 --> 00:53:52,710
You're agreeing on
a log of operations.

1387
00:53:52,710 --> 00:53:55,010
And then you're agreeing on the
execution of those operations

1388
00:53:55,010 --> 00:53:56,030
according to some rules.

1389
00:53:56,030 --> 00:53:59,390
The rules in bitcoin
are transaction cannot--

1390
00:53:59,390 --> 00:54:02,880
there cannot be two inputs
spending the same output more

1391
00:54:02,880 --> 00:54:03,380
or less.

1392
00:54:03,380 --> 00:54:05,180
There's other things too.

1393
00:54:05,180 --> 00:54:06,680
What that gives you
is it allows you

1394
00:54:06,680 --> 00:54:09,510
to agree on a final state
which in bitcoin are

1395
00:54:09,510 --> 00:54:10,760
the valid transactions.

1396
00:54:10,760 --> 00:54:12,345
That's the final state, right?

1397
00:54:12,345 --> 00:54:14,510
In ethereum, for
example, the final state

1398
00:54:14,510 --> 00:54:16,520
are the valid transactions,
and the account

1399
00:54:16,520 --> 00:54:19,490
balances of everything,
and the smart contract

1400
00:54:19,490 --> 00:54:20,930
state for everything.

1401
00:54:20,930 --> 00:54:23,040
And I guess you'll
learn about that later.

1402
00:54:23,040 --> 00:54:25,460
So you can build
arbitrarily complex things

1403
00:54:25,460 --> 00:54:29,020
with Byzantine consensus
or with blockchains.

1404
00:54:29,020 --> 00:54:31,670
And the high level bit, if you
want to look at it another way,

1405
00:54:31,670 --> 00:54:35,390
is that you have a program p,
right, which could be anything,

1406
00:54:35,390 --> 00:54:36,808
could be a cryptocurrency.

1407
00:54:36,808 --> 00:54:38,600
And then, instead of
running this program p

1408
00:54:38,600 --> 00:54:41,540
on a single server s,
what you do is you

1409
00:54:41,540 --> 00:54:47,380
distribute it on a bunch of
servers, s1, s2, s3, s4, right?

1410
00:54:47,380 --> 00:54:49,610
And now as a result, to
mess with this program p,

1411
00:54:49,610 --> 00:54:51,320
it's not enough to
compromise one server,

1412
00:54:51,320 --> 00:54:53,860
you have to compromise
a bunch of them.

1413
00:54:53,860 --> 00:54:55,220
Right?

1414
00:54:55,220 --> 00:54:57,620
OK so, and some
of you might also

1415
00:54:57,620 --> 00:55:00,230
be familiar with this term
permissioned blockchain.

1416
00:55:00,230 --> 00:55:03,890
So when you distribute this
program p amongst n servers

1417
00:55:03,890 --> 00:55:07,760
where n is equal to, let's
say, 3f plus 1 and f

1418
00:55:07,760 --> 00:55:10,300
is equal to 1 in
this particular case.

1419
00:55:10,300 --> 00:55:13,220
In a permissioned blockchain,
this n is fixed, right?

1420
00:55:13,220 --> 00:55:15,980
Once you've set n to
4, it has to stay 4.

1421
00:55:15,980 --> 00:55:17,640
These servers have
to know one another.

1422
00:55:17,640 --> 00:55:19,640
They need to know each
other's public keys.

1423
00:55:19,640 --> 00:55:23,720
And only one of the servers,
f is equal to 1, can fail.

1424
00:55:23,720 --> 00:55:26,510
If more than one server
fails, then all bets are off.

1425
00:55:26,510 --> 00:55:28,840
Your program can start
doing arbitrary things.

1426
00:55:28,840 --> 00:55:30,590
In particular, if your
program is bitcoin,

1427
00:55:30,590 --> 00:55:33,170
it can start double spending.

1428
00:55:33,170 --> 00:55:36,390
I'm moving a little bit fast,
so I'll take some questions up

1429
00:55:36,390 --> 00:55:39,326
until this point before I go on.

1430
00:55:39,326 --> 00:55:40,315
Yes?

1431
00:55:40,315 --> 00:55:41,940
AUDIENCE: In a
permissioned blockchain,

1432
00:55:41,940 --> 00:55:43,270
do they use proof of work?

1433
00:55:43,270 --> 00:55:44,770
ALIN TOMESCU: No,
you don't have to.

1434
00:55:44,770 --> 00:55:47,400
And that's kind of
what the hype is about.

1435
00:55:47,400 --> 00:55:49,800
This stuff, there is like--

1436
00:55:49,800 --> 00:55:52,560
the first interesting
paper on this

1437
00:55:52,560 --> 00:55:55,200
was 1976 or something like that.

1438
00:55:55,200 --> 00:55:58,420
So this is 30- or
40-year-old research.

1439
00:55:58,420 --> 00:56:00,510
We've known how to do
permissioned consensus-- we

1440
00:56:00,510 --> 00:56:03,060
used to call it Byzantine
consensus for 30 or 40

1441
00:56:03,060 --> 00:56:04,560
years, right?

1442
00:56:04,560 --> 00:56:06,030
So there's nothing new there.

1443
00:56:06,030 --> 00:56:09,575
It's just that it's more useful
nowadays to say blockchain than

1444
00:56:09,575 --> 00:56:10,950
to say consensus
because then you

1445
00:56:10,950 --> 00:56:14,243
get 10 more million from your
venture capitalist folks.

1446
00:56:14,243 --> 00:56:16,410
AUDIENCE: But if you don't
have to do proof of work,

1447
00:56:16,410 --> 00:56:19,440
do they ever do proof of work?

1448
00:56:19,440 --> 00:56:21,660
ALIN TOMESCU: It would
be such a bad idea

1449
00:56:21,660 --> 00:56:24,090
technically to do proof of
work in a permissioned consensus

1450
00:56:24,090 --> 00:56:24,590
algorithm.

1451
00:56:24,590 --> 00:56:30,510
It just-- completely
unnecessary, plus probably

1452
00:56:30,510 --> 00:56:32,450
insecure too.

1453
00:56:32,450 --> 00:56:36,090
Yeah, so now the reason
you do proof of work

1454
00:56:36,090 --> 00:56:39,510
is because in a
permissionless blockchain,

1455
00:56:39,510 --> 00:56:40,830
this n is not fixed.

1456
00:56:40,830 --> 00:56:43,080
n could go, let's say, n was 4.

1457
00:56:43,080 --> 00:56:45,330
It could go to 8.

1458
00:56:45,330 --> 00:56:47,760
Then it could go to 3.

1459
00:56:47,760 --> 00:56:50,750
Then it could go to 12.

1460
00:56:50,750 --> 00:56:53,460
In other words, people
are joining and leaving

1461
00:56:53,460 --> 00:56:55,038
as they please.

1462
00:56:55,038 --> 00:56:56,580
And the reason you
need proof of work

1463
00:56:56,580 --> 00:57:00,030
in bitcoin, one way
you can look at it

1464
00:57:00,030 --> 00:57:03,090
is that you're really turning
a permissioned consensus

1465
00:57:03,090 --> 00:57:05,280
algorithm into a
permissionless one.

1466
00:57:05,280 --> 00:57:08,100
And a consensus
algorithm is just voting.

1467
00:57:08,100 --> 00:57:10,440
These n folks are just voting.

1468
00:57:10,440 --> 00:57:16,170
And you need 2f plus 1 votes
to sort of move on, right?

1469
00:57:16,170 --> 00:57:20,520
And if this n changes over time,
like if the n becomes bigger,

1470
00:57:20,520 --> 00:57:23,670
it's very easy to take over
a majority of the voters.

1471
00:57:23,670 --> 00:57:26,730
If I can just add fake voters
to a permissioned consensus

1472
00:57:26,730 --> 00:57:29,920
algorithm, I can just take
over the consensus algorithm.

1473
00:57:29,920 --> 00:57:32,520
In other words, I can take
over more than f nodes.
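
The arithmetic behind this can be sketched as follows. It assumes the n = 3f + 1 sizing and the 2f + 1 vote threshold mentioned in the talk; the function names and the example numbers are hypothetical. If joining is free, an attacker who adds enough fake voters ends up holding a full quorum by themselves.

```python
def max_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Votes needed to move on: 2f + 1."""
    return 2 * max_faults(n) + 1


def attacker_controls_quorum(honest: int, fake: int) -> bool:
    """True if the fake voters alone reach the quorum of the enlarged group."""
    n = honest + fake
    return fake >= quorum(n)


# Fixed membership: n = 4 tolerates f = 1 and needs 3 votes.
print(max_faults(4), quorum(4))

# Open membership with free joining: 3 honest voters, 7 fake ones.
# n = 10 tolerates f = 3, so the quorum is 7, which the attacker holds alone.
print(attacker_controls_quorum(honest=3, fake=7))
```
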

1474
00:57:32,520 --> 00:57:35,460
Right, so the trick
there is you have

1475
00:57:35,460 --> 00:57:37,360
to prevent that from happening.

1476
00:57:37,360 --> 00:57:39,270
And the only way to
prevent that is to say,

1477
00:57:39,270 --> 00:57:42,450
look if you're going to join
and then make my n bigger,

1478
00:57:42,450 --> 00:57:43,458
you better do some work.

1479
00:57:43,458 --> 00:57:45,000
So that it's not
easy for you to join

1480
00:57:45,000 --> 00:57:48,540
because if you're a bad guy
and you want to join, you know,

1481
00:57:48,540 --> 00:57:50,040
you can do that
very easily unless I

1482
00:57:50,040 --> 00:57:53,727
require you to do some work.

1483
00:57:53,727 --> 00:57:55,310
So that's kind of
the trick in turning

1484
00:57:55,310 --> 00:57:56,970
a permissioned
consensus algorithm

1485
00:57:56,970 --> 00:57:58,460
into a permissionless one.

1486
00:57:58,460 --> 00:58:01,040
And in fact, the way these
permissioned animals work

1487
00:58:01,040 --> 00:58:02,915
is completely
different than bitcoin.

1488
00:58:02,915 --> 00:58:04,040
They are much more complex.

1489
00:58:04,040 --> 00:58:07,490
Bitcoin is incredibly simple
as a consensus algorithm.

1490
00:58:07,490 --> 00:58:09,980
If you ever read a
consensus algorithm paper,

1491
00:58:09,980 --> 00:58:11,228
you know, it's a bit insane.

1492
00:58:11,228 --> 00:58:12,770
Also to implement,
it's a bit insane.

1493
00:58:12,770 --> 00:58:14,620
Bitcoin is very simple
to implement compared

1494
00:58:14,620 --> 00:58:16,370
to these other things,
I mean, bitcoin is,

1495
00:58:16,370 --> 00:58:17,850
of course, a complex
beast as well.

1496
00:58:17,850 --> 00:58:21,170
But you should look at, let's
say, the Practical Byzantine Fault

1497
00:58:21,170 --> 00:58:25,880
Tolerance paper, PBFT, and
try and implement that.

1498
00:58:28,177 --> 00:58:30,010
So anyway, why am I
telling you all of this?

1499
00:58:30,010 --> 00:58:31,593
The reason I'm telling
you all of this

1500
00:58:31,593 --> 00:58:33,920
is because if you want to
do a permissioned blockchain

1501
00:58:33,920 --> 00:58:35,902
for whatever reason,
one way to do

1502
00:58:35,902 --> 00:58:38,360
that is to use your favorite
Byzantine consensus algorithm.

1503
00:58:38,360 --> 00:58:39,500
So that would be--

1504
00:58:39,500 --> 00:58:41,000
let's say PBFT.

1505
00:58:41,000 --> 00:58:46,473
This was 1999 from MIT.

1506
00:58:46,473 --> 00:58:47,390
So you could use that.

1507
00:58:47,390 --> 00:58:50,250
You could have a lot
of fun implementing it.

1508
00:58:50,250 --> 00:58:52,970
Another thing you could do is
you could take your program p

1509
00:58:52,970 --> 00:58:55,522
and just give it to an
ethereum smart contract.

1510
00:58:55,522 --> 00:58:57,480
And you know that the
ethereum smart contract,

1511
00:58:57,480 --> 00:59:00,935
if the ethereum security
assumption holds,

1512
00:59:00,935 --> 00:59:02,060
it will do the right thing.

1513
00:59:02,060 --> 00:59:04,902
It will execute your
program p correctly, right?

1514
00:59:04,902 --> 00:59:06,860
But the other thing that
you could do actually,

1515
00:59:06,860 --> 00:59:10,010
is you could use Catena
to agree on these logs,

1516
00:59:10,010 --> 00:59:13,280
on the log of operations
for your program.

1517
00:59:13,280 --> 00:59:17,630
And then you could use another
2f plus 1 servers or replicas

1518
00:59:17,630 --> 00:59:19,880
to do the execution
stuff so that you

1519
00:59:19,880 --> 00:59:22,495
can agree on the final state.

1520
00:59:22,495 --> 00:59:25,120
And this gives you a very simple
Byzantine consensus algorithm.

1521
00:59:25,120 --> 00:59:27,440
So remember Catena doesn't
give you execution.

1522
00:59:27,440 --> 00:59:29,240
It allows you to agree
on the log of ops.

1523
00:59:29,240 --> 00:59:32,690
To get the execution, you'd
basically take a majority vote.

1524
00:59:32,690 --> 00:59:37,790
If you see you have 2f
plus 1 replica servers

1525
00:59:37,790 --> 00:59:41,780
and if you see f plus 1
votes on a final state,

1526
00:59:41,780 --> 00:59:46,070
you know that's the right
state because only f of them

1527
00:59:46,070 --> 00:59:46,850
are malicious.
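
A minimal sketch of that vote, assuming each of the 2f + 1 replicas reports its final state as a comparable value such as a hash string. The function name and the example states are hypothetical. Any state with f + 1 matching votes must include at least one honest replica's vote, since at most f replicas are malicious.

```python
from collections import Counter


def agreed_state(votes, f):
    """Return the state reported by at least f + 1 replicas, or None.

    With 2f + 1 votes, at most one state can reach f + 1 matching votes,
    so the result is unambiguous.
    """
    if not votes:
        return None
    state, count = Counter(votes).most_common(1)[0]
    return state if count >= f + 1 else None


f = 1
votes = ["state-A", "state-A", "state-B"]  # 2f + 1 = 3 replicas, one lying
result = agreed_state(votes, f)
print(result)
```
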

1528
00:59:51,320 --> 00:59:53,600
So in fact, if you use Catena
with 2f plus 1 replicas,

1529
00:59:53,600 --> 00:59:56,630
I claim that you can get sort of
a permissioned blockchain that

1530
00:59:56,630 --> 01:00:00,200
sort of leverages the bitcoin
blockchain to do the agreement

1531
01:00:00,200 --> 01:00:01,280
on the log of ops.

1532
01:00:01,280 --> 01:00:03,050
So in that sense, it's sort
of a mix of a permissioned

1533
01:00:03,050 --> 01:00:03,980
and permissionless.

1534
01:00:03,980 --> 01:00:05,730
We haven't studied
this, so we don't know

1535
01:00:05,730 --> 01:00:07,850
what properties it would have.

1536
01:00:07,850 --> 01:00:10,440
So that's future work.

1537
01:00:10,440 --> 01:00:14,400
And if you don't need the
execution, for example,

1538
01:00:14,400 --> 01:00:16,520
if all you're doing
is you're agreeing

1539
01:00:16,520 --> 01:00:19,070
on a public key directory--
like here, there's no execution.

1540
01:00:19,070 --> 01:00:21,920
This directory just is
supposed to stay append-only,

1541
01:00:21,920 --> 01:00:24,020
we have some
research that allows

1542
01:00:24,020 --> 01:00:26,450
you to prove that
every transition is

1543
01:00:26,450 --> 01:00:28,832
an append-only directory.

1544
01:00:28,832 --> 01:00:30,290
And if you don't
need execution, you

1545
01:00:30,290 --> 01:00:32,900
can just use Catena
directly as I already

1546
01:00:32,900 --> 01:00:35,030
told you guys for the
software transparency

1547
01:00:35,030 --> 01:00:39,860
application or for the public
key directory application.

1548
01:00:39,860 --> 01:00:43,020
And if you want to do a
permissionless blockchain then,

1549
01:00:43,020 --> 01:00:46,340
of course, you would
have to roll your own.

1550
01:00:46,340 --> 01:00:48,770
But you have to proceed
with caution there, right?

1551
01:00:48,770 --> 01:00:51,140
It's not an easy thing to do.

1552
01:00:51,140 --> 01:00:52,940
OK, so let's conclude now.

1553
01:00:52,940 --> 01:00:55,400
What we did here is that we
enabled these applications

1554
01:00:55,400 --> 01:00:59,090
to efficiently leverage
bitcoin's consensus, right?

1555
01:00:59,090 --> 01:01:02,240
So clients can download
transactions selectively

1556
01:01:02,240 --> 01:01:05,360
rather than the full blockchain
and prevent equivocation.

1557
01:01:05,360 --> 01:01:07,940
Right, and you only need
to get 46 megabytes instead

1558
01:01:07,940 --> 01:01:11,420
of gigabytes from the
Bitcoin blockchain.

1559
01:01:11,420 --> 01:01:13,223
So why does this matter?

1560
01:01:13,223 --> 01:01:15,140
These are just I think
the three killer apps--

1561
01:01:15,140 --> 01:01:17,550
secure software update,
public key directories--

1562
01:01:17,550 --> 01:01:18,967
by the way, the
public key directories

1563
01:01:18,967 --> 01:01:21,260
also are applicable to HTTPS.

1564
01:01:21,260 --> 01:01:22,760
So when you go on
Facebook, you have

1565
01:01:22,760 --> 01:01:24,325
to get Facebook's public key.

1566
01:01:24,325 --> 01:01:26,870
The certificate authorities
that sign the public keys

1567
01:01:26,870 --> 01:01:28,533
are often compromised.

1568
01:01:28,533 --> 01:01:30,200
So there are often fake
certs for Google,

1569
01:01:30,200 --> 01:01:32,180
for big companies like that.

1570
01:01:32,180 --> 01:01:33,420
And you might use it.

1571
01:01:33,420 --> 01:01:35,462
But if you have a
public key directory,

1572
01:01:35,462 --> 01:01:36,920
Facebook and Google
can immediately

1573
01:01:36,920 --> 01:01:39,980
notice those fake certs.

1574
01:01:39,980 --> 01:01:41,220
It's a step forward.

1575
01:01:41,220 --> 01:01:44,560
And for more, of course,
you can read our paper.

1576
01:01:44,560 --> 01:01:48,980
It appeared in IEEE
Security and Privacy 2017.

1577
01:01:48,980 --> 01:01:51,420
And I'll post the
slides on GitHub too.

1578
01:01:51,420 --> 01:01:52,870
So there are links there.

1579
01:01:52,870 --> 01:01:56,030
Yeah so, again this is the high
level overview of everything

1580
01:01:56,030 --> 01:01:58,340
that's previous
work and our work.

1581
01:01:58,340 --> 01:02:01,940
The difference is very small.

1582
01:02:01,940 --> 01:02:04,760
And now we can also
talk about other stuff.

1583
01:02:04,760 --> 01:02:06,400
In fact, I have
more stuff to talk about.

1584
01:02:06,400 --> 01:02:09,920
But before we do, I'd like to
have a discussion with you guys

1585
01:02:09,920 --> 01:02:11,500
if you have questions.

1586
01:02:11,500 --> 01:02:12,410
So?

1587
01:02:12,410 --> 01:02:17,050
AUDIENCE: So could you implement
Catena in the actual bitcoin

1588
01:02:17,050 --> 01:02:18,510
node?

1589
01:02:18,510 --> 01:02:21,290
Would that be something
that they would want?

1590
01:02:21,290 --> 01:02:26,480
It seems like it would
be a good feature to add?

1591
01:02:26,480 --> 01:02:28,523
Or is it strictly separate?

1592
01:02:28,523 --> 01:02:30,440
ALIN TOMESCU: Yeah, I
don't think you need to.

1593
01:02:30,440 --> 01:02:31,800
That's the whole point, right?

1594
01:02:31,800 --> 01:02:33,217
The whole point
of the research is

1595
01:02:33,217 --> 01:02:36,517
how do we use bitcoin without
getting the miners to accept

1596
01:02:36,517 --> 01:02:38,600
a new version of bitcoin,
without changing bitcoin

1597
01:02:38,600 --> 01:02:40,990
in any way?

1598
01:02:40,990 --> 01:02:43,610
So no, I don't think--
there's nothing to do really,

1599
01:02:43,610 --> 01:02:44,655
we're just--

1600
01:02:44,655 --> 01:02:46,280
we're taking bitcoin
as it is and we're

1601
01:02:46,280 --> 01:02:48,740
piggybacking on top of it.

1602
01:02:48,740 --> 01:02:49,890
We could change it.

1603
01:02:49,890 --> 01:02:51,410
I mean, there's a lot of things
you can do in some sense.

1604
01:02:51,410 --> 01:02:53,600
But then you get a
very different system.

1605
01:02:53,600 --> 01:02:56,430
Very different, we can talk
about it more if you want.

1606
01:02:56,430 --> 01:02:57,270
Yeah?

1607
01:02:57,270 --> 01:03:00,020
AUDIENCE: You were talking about
how you can use this system

1608
01:03:00,020 --> 01:03:03,060
to verify software
binaries and how

1609
01:03:03,060 --> 01:03:06,220
you want this to run in
SPV mode on phones. So

1610
01:03:06,220 --> 01:03:08,707
how do you install
software on the phones say,

1611
01:03:08,707 --> 01:03:10,040
through Apple and the App Store?

1612
01:03:10,040 --> 01:03:12,470
Is there a way to sign the
binary that you actually

1613
01:03:12,470 --> 01:03:14,420
get from the app store?

1614
01:03:14,420 --> 01:03:16,730
ALIN TOMESCU: I think your
question is really about how

1615
01:03:16,730 --> 01:03:20,188
do App Store binaries--

1616
01:03:20,188 --> 01:03:21,730
how do you verify
App Store binaries?

1617
01:03:21,730 --> 01:03:23,260
It's a chicken and an
egg in some sense right,

1618
01:03:23,260 --> 01:03:24,385
is that what you're saying?

1619
01:03:24,385 --> 01:03:27,640
Yeah, you're right.

1620
01:03:27,640 --> 01:03:30,610
Eventually, I mean
in the best case,

1621
01:03:30,610 --> 01:03:32,440
wishful thinking would
be to say that look,

1622
01:03:32,440 --> 01:03:35,470
the app store does this
already for all of the binaries

1623
01:03:35,470 --> 01:03:42,120
that they publish, allowing the
developers to make sure nobody

1624
01:03:42,120 --> 01:03:44,480
is posting malicious binaries
on the app store for them.

1625
01:03:48,120 --> 01:03:49,444
Yes?

1626
01:03:49,444 --> 01:03:51,772
AUDIENCE: Who do you
envision running it?

1627
01:03:51,772 --> 01:03:54,230
ALIN TOMESCU: So I'd really
like to see Keybase run Catena,

1628
01:03:54,230 --> 01:03:56,060
it seems like a
missed opportunity

1629
01:03:56,060 --> 01:03:57,380
that they don't do this.

1630
01:03:57,380 --> 01:04:02,750
I'm sure they have better stuff
to do, but it would just really

1631
01:04:02,750 --> 01:04:04,450
easily allow Keybase--

1632
01:04:04,450 --> 01:04:06,200
let's say Keybase has
a mobile phone app,

1633
01:04:06,200 --> 01:04:09,110
it would allow that mobile phone
app to verify the directory

1634
01:04:09,110 --> 01:04:11,490
and get much, much,
much more security.

1635
01:04:11,490 --> 01:04:15,080
You know, no equivocation as
long as nobody forks bitcoin.

1636
01:04:15,080 --> 01:04:17,150
Keybase is already
publishing these digests,

1637
01:04:17,150 --> 01:04:20,030
but they cannot be audited
efficiently on a mobile phone.

1638
01:04:20,030 --> 01:04:21,980
I mean they can but
not securely you know,

1639
01:04:21,980 --> 01:04:23,480
because full nodes can lie.

1640
01:04:33,090 --> 01:04:36,450
So there's a big problem with
everything I said so far.

1641
01:04:36,450 --> 01:04:37,950
And nobody caught it.

1642
01:04:37,950 --> 01:04:40,710
So one problem is
that what do you

1643
01:04:40,710 --> 01:04:44,247
do when you run out of funds?

1644
01:04:44,247 --> 01:04:46,580
Remember I said the log server
starts with two bitcoins.

1645
01:04:46,580 --> 01:04:49,220
Let's say it issues
thousands of transactions,

1646
01:04:49,220 --> 01:04:52,885
starts paying those $40 fees
and it runs out of funds.

1647
01:04:52,885 --> 01:04:53,510
What do you do then?

1648
01:04:53,510 --> 01:04:55,820
Yeah?

1649
01:04:55,820 --> 01:04:59,274
AUDIENCE: You can maybe
reload the new transactions

1650
01:04:59,274 --> 01:05:03,188
with this transaction
[INAUDIBLE].

1651
01:05:03,188 --> 01:05:04,480
ALIN TOMESCU: What's your name?

1652
01:05:04,480 --> 01:05:05,170
AUDIENCE: Raul.

1653
01:05:05,170 --> 01:05:06,110
ALIN TOMESCU: Raul.

1654
01:05:06,110 --> 01:05:07,560
So Raul is saying
you can reload.

1655
01:05:07,560 --> 01:05:08,750
And that's exactly right.

1656
01:05:08,750 --> 01:05:13,680
We just have to change the
transaction format slightly.

1657
01:05:13,680 --> 01:05:15,860
We talk about this
in the paper as well.

1658
01:05:15,860 --> 01:05:19,190
But just to demonstrate
real quickly.

1659
01:05:19,190 --> 01:05:23,510
Suppose, let's take a ridiculous
example which hopefully

1660
01:05:23,510 --> 01:05:25,220
will never happen in bitcoin.

1661
01:05:25,220 --> 01:05:29,300
But suppose that the
Bitcoin fee is 1 bitcoin.

1662
01:05:32,450 --> 01:05:35,070
So now I have one bitcoin here.

1663
01:05:35,070 --> 01:05:37,380
And I have s1 here.

1664
01:05:37,380 --> 01:05:40,640
And now I have
zero bitcoins here.

1665
01:05:44,860 --> 01:05:49,270
All right, so zero
bitcoins in this output.

1666
01:05:49,270 --> 01:05:52,097
And maybe s2 here.

1667
01:05:52,097 --> 01:05:53,180
So that would be terrible.

1668
01:05:53,180 --> 01:05:55,790
Right, now I can't go on.

1669
01:05:55,790 --> 01:05:58,650
Does everybody see
this as a problem?

1670
01:05:58,650 --> 01:06:01,810
Right, so what Raul is saying
is, look, add another input

1671
01:06:01,810 --> 01:06:07,130
here and make it take coins
from some other transaction

1672
01:06:07,130 --> 01:06:08,610
whatever, 20 bitcoins.

1673
01:06:08,610 --> 01:06:11,090
And now you get
20 bitcoins here.

1674
01:06:11,090 --> 01:06:13,982
Right, so you can easily
refund transactions.

1675
01:06:19,070 --> 01:06:21,920
There is a bit more
subtlety there in the sense

1676
01:06:21,920 --> 01:06:26,180
that you don't
want to join logs--

1677
01:06:26,180 --> 01:06:31,460
let's say if you have two
logs, GTX and GTX prime

1678
01:06:31,460 --> 01:06:33,630
for different applications.

1679
01:06:33,630 --> 01:06:36,710
Right, and they start
issuing statements-- s1, s2.

1680
01:06:36,710 --> 01:06:38,660
You don't want to
be able to join--

1681
01:06:38,660 --> 01:06:40,850
I'm sorry, s1, s1 prime--

1682
01:06:40,850 --> 01:06:45,200
these two logs into a single
log, for certain reasons, right?

1683
01:06:45,200 --> 01:06:47,960
But this doesn't actually
allow you to join them

1684
01:06:47,960 --> 01:06:48,965
in the sense that--

1685
01:06:48,965 --> 01:06:50,840
let's say you actually
do this and join them.

1686
01:06:50,840 --> 01:06:53,990
Right, so let's say
this transaction here

1687
01:06:53,990 --> 01:06:58,840
came from GTX prime, right?

1688
01:06:58,840 --> 01:07:00,970
And there was an s1 prime here.

1689
01:07:04,428 --> 01:07:08,380
And you did this, right?

1690
01:07:08,380 --> 01:07:10,610
So the problem is this is
no longer a valid Catena

1691
01:07:10,610 --> 01:07:15,560
transaction for this log
because in a valid Catena

1692
01:07:15,560 --> 01:07:17,840
transaction-- the
first input spends

1693
01:07:17,840 --> 01:07:20,450
the previous
transaction's output.

1694
01:07:20,450 --> 01:07:24,110
But in this chain, it's
the second input that

1695
01:07:24,110 --> 01:07:26,090
spends the previous output.

1696
01:07:26,090 --> 01:07:28,280
So I cannot join logs.
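The rule just described can be sketched in a few lines of Python. This is a minimal illustration, not Catena's actual code: the `Tx` class, the field names, and the choice of output index 0 as the "continuation" output are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Tx:
    """Hypothetical, stripped-down transaction model (not a real Bitcoin API)."""
    txid: str
    inputs: List[Tuple[str, int]]  # each input is (txid being spent, output index)
    statement: str = ""

def is_valid_successor(prev: Tx, nxt: Tx) -> bool:
    # Assumed rule: the FIRST input of the next transaction must spend
    # output 0 of the previous transaction in the same log.
    return len(nxt.inputs) > 0 and nxt.inputs[0] == (prev.txid, 0)

def verify_log(genesis: Tx, chain: List[Tx]) -> bool:
    # Walk the chain from the genesis transaction, checking each link.
    prev = genesis
    for tx in chain:
        if not is_valid_successor(prev, tx):
            return False
        prev = tx
    return True
```

Under this rule, a transaction may carry extra inputs after the first one (for example, to add funds), but a transaction whose second input spends the previous output is not a valid continuation of that log.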

1697
01:07:28,280 --> 01:07:30,800
And this matters for
a bunch of reasons

1698
01:07:30,800 --> 01:07:33,110
that we don't have to go into.

1699
01:07:37,350 --> 01:07:41,768
Yeah, so we talked
about batching statements.

1700
01:07:41,768 --> 01:07:43,560
I want to show you guys
some previous work,

1701
01:07:43,560 --> 01:07:47,190
so how did some previous
work do this, since

1702
01:07:47,190 --> 01:07:49,152
we seem to have a bit of time.

1703
01:07:49,152 --> 01:07:50,610
OK, so there is
some previous work

1704
01:07:50,610 --> 01:07:53,670
called Liar, Liar,
Coins on Fire.

1705
01:07:53,670 --> 01:07:55,380
Have you guys seen this?

1706
01:07:55,380 --> 01:07:58,290
So the idea here is that it
is a really nice piece of work

1707
01:07:58,290 --> 01:08:00,240
and lots of people in
the bitcoin community

1708
01:08:00,240 --> 01:08:02,740
already know about this.

1709
01:08:02,740 --> 01:08:05,830
And I think it was an idea
before the paper or maybe not,

1710
01:08:05,830 --> 01:08:07,330
I'm not sure.

1711
01:08:07,330 --> 01:08:10,330
But Tadge has a similar idea.

1712
01:08:10,330 --> 01:08:15,450
So for example, suppose
we have this authority.

1713
01:08:15,450 --> 01:08:17,100
Imagine this is a
Catena log server

1714
01:08:17,100 --> 01:08:21,149
and it publishes a transaction
which locks two bitcoins

1715
01:08:21,149 --> 01:08:24,420
and locks those bitcoins
to that public key.

1716
01:08:24,420 --> 01:08:26,473
And this authority
sometimes will want

1717
01:08:26,473 --> 01:08:27,640
to say two different things.

1718
01:08:27,640 --> 01:08:31,250
It will want to say
s and s prime, right?

1719
01:08:31,250 --> 01:08:35,319
And it will sign the statements
with their secret key.

1720
01:08:37,990 --> 01:08:39,970
But the secret key
that the authority

1721
01:08:39,970 --> 01:08:44,290
uses to sign statements is
also a bitcoin secret key.

1722
01:08:44,290 --> 01:08:46,569
It's the same secret
key that the authority

1723
01:08:46,569 --> 01:08:50,470
used to lock two bitcoins,
so $20,000 or something.

1724
01:08:50,470 --> 01:08:53,410
I'm not sure if bitcoin
plummeted since yesterday,

1725
01:08:53,410 --> 01:08:54,880
can never be sure.

1726
01:08:54,880 --> 01:08:56,560
Does the setting make sense?

1727
01:08:56,560 --> 01:08:57,720
So I have an authority.

1728
01:08:57,720 --> 01:08:59,710
It issues statements
just like before.

1729
01:08:59,710 --> 01:09:04,649
And it numbers them
with i, let's say.

1730
01:09:04,649 --> 01:09:07,535
And we want to prevent this
authority from equivocating.

1731
01:09:07,535 --> 01:09:09,160
We're not actually
going to prevent it.

1732
01:09:09,160 --> 01:09:12,340
We're just going to
disincentivize it in the sense

1733
01:09:12,340 --> 01:09:15,430
that if this authority
equivocates like this

1734
01:09:15,430 --> 01:09:18,700
for the same statement
i, what I claim

1735
01:09:18,700 --> 01:09:21,490
is that anybody can then
steal that authority's

1736
01:09:21,490 --> 01:09:23,800
bitcoin because
equivocating like this

1737
01:09:23,800 --> 01:09:25,973
reveals the secret key.

1738
01:09:25,973 --> 01:09:27,640
And the reason it
reveals the secret key

1739
01:09:27,640 --> 01:09:32,750
is because the signatures
share the same i here.

1740
01:09:32,750 --> 01:09:35,270
So how many of you
guys actually, you

1741
01:09:35,270 --> 01:09:37,340
did cover Schnorr
signatures, right?

1742
01:09:37,340 --> 01:09:41,712
So did Tadge talk about how
to do this with Schnorr?

1743
01:09:41,712 --> 01:09:43,170
AUDIENCE: This is
the [INAUDIBLE].

1744
01:09:46,270 --> 01:09:47,930
ALIN TOMESCU: But
did Tadge cover it?

1745
01:09:47,930 --> 01:09:48,520
AUDIENCE: Yes.

1746
01:09:48,520 --> 01:09:49,600
ALIN TOMESCU: OK, great.

1747
01:09:49,600 --> 01:09:53,080
So, yeah, let's go
over that briefly.

1748
01:09:53,080 --> 01:09:56,440
So again, the idea is that
if someone observes these two

1749
01:09:56,440 --> 01:09:59,440
signatures on conflicting
statements for the same i,

1750
01:09:59,440 --> 01:10:02,020
there is this box where you
can put the two signatures

1751
01:10:02,020 --> 01:10:03,650
and get back the secret key.

1752
01:10:03,650 --> 01:10:05,130
And once you have
the secret key,

1753
01:10:05,130 --> 01:10:06,760
we can spend this
transaction and get

1754
01:10:06,760 --> 01:10:09,325
that authority's bitcoins.

1755
01:10:09,325 --> 01:10:10,700
And then there's
a lot of details

1756
01:10:10,700 --> 01:10:13,010
to get it right because
you might notice

1757
01:10:13,010 --> 01:10:16,860
that if the authority does
this, before they do this,

1758
01:10:16,860 --> 01:10:19,280
they might already be spending
this transaction themselves

1759
01:10:19,280 --> 01:10:22,970
so as to prevent
you from taking it.

1760
01:10:22,970 --> 01:10:25,490
And details on how to prevent
the authority from doing that

1761
01:10:25,490 --> 01:10:27,350
are in that paper.

1762
01:10:27,350 --> 01:10:29,870
Yeah, and then you
know, whoever discovered

1763
01:10:29,870 --> 01:10:31,950
this can spend those bitcoins.

1764
01:10:31,950 --> 01:10:36,170
And the idea is that this
disincentivizes equivocation

1765
01:10:36,170 --> 01:10:38,200
by locking these funds
under the secret key

1766
01:10:38,200 --> 01:10:40,190
of the bad authority.

1767
01:10:40,190 --> 01:10:41,420
But it does not prevent it.

1768
01:10:41,420 --> 01:10:44,150
Right, so in Catena we
actually prevent equivocation.

1769
01:10:44,150 --> 01:10:46,220
We say, if you
want to equivocate,

1770
01:10:46,220 --> 01:10:48,540
you better fork bitcoin.

1771
01:10:48,540 --> 01:10:52,100
Here they say, if you
want to equivocate,

1772
01:10:52,100 --> 01:10:54,750
you're going to lose $20,000.

1773
01:10:54,750 --> 01:10:56,870
Right?

1774
01:10:56,870 --> 01:10:58,520
But you have to
understand like this

1775
01:10:58,520 --> 01:11:00,872
could be a good authority
that locked $20,000 here.

1776
01:11:00,872 --> 01:11:03,080
But the attackers are going
to be-- they're not going

1777
01:11:03,080 --> 01:11:04,610
to care about those $20,000.

1778
01:11:04,610 --> 01:11:06,620
They're just going to
steal the secret key,

1779
01:11:06,620 --> 01:11:08,780
equivocate and
then the authority

1780
01:11:08,780 --> 01:11:11,690
is going to be
left without money.

1781
01:11:11,690 --> 01:11:13,203
If the authority
is the attacker,

1782
01:11:13,203 --> 01:11:14,120
then this makes sense.

1783
01:11:14,120 --> 01:11:15,828
But if the attacker
is not the authority,

1784
01:11:15,828 --> 01:11:18,110
then this makes less sense
because the authority is sort

1785
01:11:18,110 --> 01:11:22,070
of risking their bitcoins
on the assumption

1786
01:11:22,070 --> 01:11:24,662
that no attacker can compromise
them, which you know,

1787
01:11:24,662 --> 01:11:26,370
if you could do that
in computer science,

1788
01:11:26,370 --> 01:11:28,328
I wouldn't be sitting
here talking to you guys.

1789
01:11:30,330 --> 01:11:33,230
OK, so now how do you do this?

1790
01:11:33,230 --> 01:11:36,500
So do you remember Schnorr
signatures real quickly?

1791
01:11:39,050 --> 01:11:44,700
An easy way to do this is
using Schnorr signatures.

1792
01:11:44,700 --> 01:11:45,760
And I think--

1793
01:11:45,760 --> 01:11:48,450
I could be wrong, but
I think this new SegWit

1794
01:11:48,450 --> 01:11:54,360
update to bitcoin allows
Schnorr signatures, right?

1795
01:11:54,360 --> 01:11:58,160
So with SegWit and
with Schnorr signatures,

1796
01:11:58,160 --> 01:11:59,750
you can definitely do this.

1797
01:11:59,750 --> 01:12:02,480
And the idea is that
a Schnorr signature,

1798
01:12:02,480 --> 01:12:11,210
if you recall, is just k
plus h of (m, g to the k) times s,

1799
01:12:11,210 --> 01:12:13,160
where s is the secret key.

1800
01:12:13,160 --> 01:12:16,230
Right, so this is a Schnorr
signature on m, right?

1801
01:12:18,840 --> 01:12:20,782
How many recall this?

1802
01:12:20,782 --> 01:12:22,490
Right, and of course,
it's not just this.

1803
01:12:22,490 --> 01:12:31,880
I mean, the signature is
really this and this h of m,

1804
01:12:31,880 --> 01:12:33,710
g to the k.

1805
01:12:33,710 --> 01:12:36,740
So it's these two things, right.

1806
01:12:36,740 --> 01:12:44,870
But now I want to show you that
if I sign two different things,

1807
01:12:44,870 --> 01:12:47,720
I can actually get s.

1808
01:12:47,720 --> 01:12:50,030
But if I sign them
in a certain way--

1809
01:12:50,030 --> 01:12:53,330
so remember what I said before
is I would like this authority

1810
01:12:53,330 --> 01:12:59,490
to sign (i, m) and (i, m prime).

1811
01:12:59,490 --> 01:13:03,200
Right, and if the
authority does this,

1812
01:13:03,200 --> 01:13:05,150
I claim that I can get
the secret key out.

1813
01:13:05,150 --> 01:13:07,100
But I have to have the same i.

1814
01:13:07,100 --> 01:13:10,340
So we relax this in the
sense that we're not

1815
01:13:10,340 --> 01:13:15,950
going to use i here, we're just
going to use this g to the k.

1816
01:13:15,950 --> 01:13:19,320
So if the authority uses
the same g to the k to sign,

1817
01:13:19,320 --> 01:13:20,790
then we can extract
the secret key.

1818
01:13:20,790 --> 01:13:23,200
And the way we can do that
is I'll just show an example,

1819
01:13:23,200 --> 01:13:32,120
sig1 would be k plus
h of m1 g to the k.

1820
01:13:32,120 --> 01:13:36,950
And it'll be sig 1 and I
guess the associated hash

1821
01:13:36,950 --> 01:13:41,370
e1 would be h of m1, g to the k.

1822
01:13:41,370 --> 01:13:43,490
Does everybody see this?

1823
01:13:43,490 --> 01:13:50,030
And then sig2 would be k
plus h of m 2 g to the k.

1824
01:13:50,030 --> 01:13:53,440
So again, oh, you guys are
clearly not paying attention.

1825
01:13:53,440 --> 01:13:54,440
I forgot the secret key.

1826
01:13:57,200 --> 01:14:00,960
e2 is hash of m2, g to the k.

1827
01:14:00,960 --> 01:14:05,090
OK, so I'm using the same
g to the k and the same k.

1828
01:14:12,520 --> 01:14:14,320
So now how can I
extract the secret key?

1829
01:14:14,320 --> 01:14:15,820
Does anybody see a
solution to this?

1830
01:14:18,970 --> 01:14:21,417
AUDIENCE: It's now just system
of two equations and two

1831
01:14:21,417 --> 01:14:23,500
variables where k and s
are the unknown variables.

1832
01:14:23,500 --> 01:14:25,542
ALIN TOMESCU: So I think
what Raul is saying

1833
01:14:25,542 --> 01:14:32,500
is that s is just sig1 minus
sig2 divided by e1 minus e2.

1834
01:14:32,500 --> 01:14:33,140
Is that right?

1835
01:14:33,140 --> 01:14:33,640
Yeah.

1836
01:14:39,140 --> 01:14:42,230
See this, because if I
subtract this from this,

1837
01:14:42,230 --> 01:14:48,470
I just get h m1--

1838
01:14:48,470 --> 01:14:50,240
I'm not going to
have space here.

1839
01:14:50,240 --> 01:14:54,470
I get-- let's say it here--

1840
01:14:54,470 --> 01:15:06,150
h of m1, g to the k, minus h
of m2, g to the k, times s.

1841
01:15:06,150 --> 01:15:08,680
All right?

1842
01:15:08,680 --> 01:15:12,590
And now I take this
here in the denominator

1843
01:15:12,590 --> 01:15:14,580
and I simplify and I get s.

1844
01:15:14,580 --> 01:15:15,080
All right.
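The extraction on the board can be tried out with a short Python sketch (Python 3.8 or later). It assumes textbook Schnorr over the secp256k1 curve, so g to the k is written as the point k times G; it is not BIP 340 and not necessarily the exact scheme from the paper. The function names and the hash encoding are made up for the example.

```python
import hashlib

# secp256k1 field prime, group order, and generator point
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    # Affine point addition; None stands for the point at infinity.
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def point_mul(k, pt):
    # Double-and-add scalar multiplication.
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def H(message: bytes, R) -> int:
    # e = h(m, g^k), reduced into the scalar field (encoding is an assumption).
    data = message + R[0].to_bytes(32, "big") + R[1].to_bytes(32, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(s: int, k: int, message: bytes):
    # sig = k + h(m, g^k) * s; returns (sig, e, R).
    R = point_mul(k, G)
    e = H(message, R)
    return (k + e * s) % N, e, R

def verify(pub, message: bytes, sig: int, R) -> bool:
    # Checks sig * G == R + e * pub.
    return point_mul(sig, G) == point_add(R, point_mul(H(message, R), pub))

def extract_secret_key(sig1, e1, sig2, e2) -> int:
    # s = (sig1 - sig2) / (e1 - e2) mod N, valid when both signatures reuse k.
    return (sig1 - sig2) * pow((e1 - e2) % N, -1, N) % N
```

Signing two different messages with the same k gives two signatures with the same g to the k, and `extract_secret_key` then returns s. The k cancels in the subtraction, which is why reusing it is fatal.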

1845
01:15:18,430 --> 01:15:20,830
So it turns out that
that's kind of the trick

1846
01:15:20,830 --> 01:15:22,900
that this work leverages,
more or less.

1847
01:15:22,900 --> 01:15:27,000
The only sort of caveat here
is that remember I said,

1848
01:15:27,000 --> 01:15:29,080
there is a position
i for the statement.

1849
01:15:29,080 --> 01:15:31,330
But now I'm saying, there's
no longer a position.

1850
01:15:31,330 --> 01:15:33,130
We have to use this g to the k.

1851
01:15:33,130 --> 01:15:35,290
And how we map positions
to g to the k

1852
01:15:35,290 --> 01:15:38,090
is another trick that you
have to do but you can do it.

1853
01:15:38,090 --> 01:15:40,257
And if you want more details,
you can read the paper

1854
01:15:40,257 --> 01:15:43,930
and I'm sure Tadge will tell you
even more details about this.

1855
01:15:43,930 --> 01:15:45,850
I think the lightning
network also leverages

1856
01:15:45,850 --> 01:15:48,550
this trick in some cases.

1857
01:15:51,850 --> 01:15:55,150
All right now, let's talk a
little bit about some attacks

1858
01:15:55,150 --> 01:15:56,380
if there is time.

1859
01:15:56,380 --> 01:15:59,950
The most interesting attack
would be the generalized vector

1860
01:15:59,950 --> 01:16:00,748
76 attack.

1861
01:16:00,748 --> 01:16:02,290
So this is a screenshot
from a paper.

1862
01:16:02,290 --> 01:16:05,020
Can everybody see this?

1863
01:16:05,020 --> 01:16:09,070
So the generalized Vector76
attack is very interesting.

1864
01:16:09,070 --> 01:16:11,200
So you have
an attacker, right?

1865
01:16:11,200 --> 01:16:14,970
And remember, the goal of
the attacker is to replace--

1866
01:16:14,970 --> 01:16:19,120
let's say this
TX1 with that TX2.

1867
01:16:19,120 --> 01:16:22,180
It's going to show this TX1
to a merchant saying hey,

1868
01:16:22,180 --> 01:16:23,560
I paid you money.

1869
01:16:23,560 --> 01:16:26,560
And then it's going to show
this TX2 to the merchant sending

1870
01:16:26,560 --> 01:16:27,700
the money back to himself.

1871
01:16:27,700 --> 01:16:30,640
So the whole trick is to fork
the merchant on this side

1872
01:16:30,640 --> 01:16:32,980
and then to switch him
back to that, right.

1873
01:16:32,980 --> 01:16:37,030
It's a sort of a pre-mining
attack and this attack is much

1874
01:16:37,030 --> 01:16:40,780
easier to pull off on SPV nodes
because what the attacker does

1875
01:16:40,780 --> 01:16:41,290
is--

1876
01:16:41,290 --> 01:16:46,020
let's say-- he works on a
secret chain with TX1 in it.

1877
01:16:46,020 --> 01:16:48,410
And at some point, the main
chain might take him over,

1878
01:16:48,410 --> 01:16:48,910
might win.

1879
01:16:48,910 --> 01:16:51,250
So the attacker kind of
gives up and keeps trying.

1880
01:16:51,250 --> 01:16:53,740
But at some point the
attacker gets ahead,

1881
01:16:53,740 --> 01:16:54,867
right, in the secret chain.

1882
01:16:54,867 --> 01:16:56,200
He gets ahead of the main chain.

1883
01:16:56,200 --> 01:16:57,940
This stuff is not posted yet.

1884
01:16:57,940 --> 01:17:01,600
And let's say this merchant,
for some reason, because they're

1885
01:17:01,600 --> 01:17:06,190
silly, they only need one
confirmation to accept the TX1.

1886
01:17:06,190 --> 01:17:07,940
And for other silly
reasons, this merchant

1887
01:17:07,940 --> 01:17:10,960
is an SPV merchant, right?

1888
01:17:10,960 --> 01:17:12,730
Let's draw a figure here.

1889
01:17:12,730 --> 01:17:14,290
So this merchant
is an SPV merchant

1890
01:17:14,290 --> 01:17:19,686
that only needs one confirmation
to accept the transaction.

1891
01:17:23,970 --> 01:17:25,875
So we have the attacker.

1892
01:17:29,550 --> 01:17:31,470
And we have let's
say, the SPV merchant.

1893
01:17:35,670 --> 01:17:38,300
Right, and now the
attacker's sent you know,

1894
01:17:38,300 --> 01:17:42,750
this was the main block
there, let's say, this was bi.

1895
01:17:42,750 --> 01:17:44,290
And the attacker forked it.

1896
01:17:44,290 --> 01:17:47,230
He put TX1 here.

1897
01:17:47,230 --> 01:17:49,720
And then had a
confirmation on it, right?

1898
01:17:49,720 --> 01:17:52,780
So he sends this
chain to the merchant.

1899
01:17:52,780 --> 01:17:55,137
And the other miners
haven't found anything yet.

1900
01:17:55,137 --> 01:17:56,720
Right, so the attacker
is a bit ahead.

1901
01:17:56,720 --> 01:17:58,240
So he's pre-mining.

1902
01:17:58,240 --> 01:17:59,320
So far, so good.

1903
01:17:59,320 --> 01:18:01,510
Are you all with me?

1904
01:18:01,510 --> 01:18:05,620
And remember that he really
just is showing block headers.

1905
01:18:05,620 --> 01:18:06,937
So these are not full blocks.

1906
01:18:06,937 --> 01:18:08,270
He's just showing block headers.

1907
01:18:08,270 --> 01:18:11,620
They're much smaller
blocks, 80 bytes.

1908
01:18:11,620 --> 01:18:14,160
And the blocks are missing.

1909
01:18:14,160 --> 01:18:15,940
All right, so when
he shows TX1, he

1910
01:18:15,940 --> 01:18:21,340
is just showing like a Merkle
path to TX1 to the merchant.
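What the SPV merchant checks here can be sketched in Python. This is a simplified illustration: it uses Bitcoin's double SHA-256 and its rule of duplicating the last hash on odd levels, but it ignores byte-order details and header parsing, and the function names are made up for the example.

```python
import hashlib

def dsha(data: bytes) -> bytes:
    # Double SHA-256, as used for Bitcoin's Merkle tree.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def next_level(level):
    # Pair up hashes; an odd-length level duplicates its last hash.
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = next_level(level)
    return level[0]

def merkle_path(leaves, index):
    # Returns a list of (sibling_hash, sibling_is_left) from leaf to root.
    path = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        level = next_level(level)
        index //= 2
    return path

def verify_path(leaf, path, root) -> bool:
    # This is all the merchant can check: the transaction hash and the path
    # must hash up to the Merkle root committed to in the 80-byte header.
    h = leaf
    for sibling, sibling_is_left in path:
        h = dsha(sibling + h) if sibling_is_left else dsha(h + sibling)
    return h == root
```

The check proves that TX1 is committed to by a header, but the merchant never holds the block itself, which matters for what follows.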

1911
01:18:21,340 --> 01:18:23,200
You guys with me?

1912
01:18:23,200 --> 01:18:25,020
OK, so now he did
this to the merchant.

1913
01:18:25,020 --> 01:18:26,850
And now what the
attacker is going to do,

1914
01:18:26,850 --> 01:18:29,430
he's going to relax,
sit back, and post

1915
01:18:29,430 --> 01:18:33,240
the TX2 to the miners--
give TX2 to the miners.

1916
01:18:33,240 --> 01:18:35,700
Let's say the miners are here.

1917
01:18:35,700 --> 01:18:38,370
He's going to post
TX2 to the miners.

1918
01:18:38,370 --> 01:18:39,930
And he'll stop
mining, the attacker.

1919
01:18:39,930 --> 01:18:41,100
He's not going to mine anymore.

1920
01:18:41,100 --> 01:18:43,017
But he got the merchant
to accept the payment.

1921
01:18:43,017 --> 01:18:44,730
And the merchant
shipped the goods.

1922
01:18:44,730 --> 01:18:47,220
Maybe this is an
online purchase.

1923
01:18:47,220 --> 01:18:52,740
And so now the miners, they will
find the next block bi plus 1.

1924
01:18:52,740 --> 01:18:55,590
They'll put TX2 here.

1925
01:18:55,590 --> 01:18:58,650
They'll find the next block
and then the next block.

1926
01:18:58,650 --> 01:18:59,760
And they'll take over.

1927
01:18:59,760 --> 01:19:01,385
Their chain will
be the main chain.

1928
01:19:01,385 --> 01:19:02,760
And eventually
this merchant will

1929
01:19:02,760 --> 01:19:04,770
hear about this main chain.

1930
01:19:04,770 --> 01:19:06,830
And he'll just see the
headers, of course,

1931
01:19:06,830 --> 01:19:08,350
because he's an SPV merchant.

1932
01:19:08,350 --> 01:19:10,860
So now the SPV merchant
just got double spent.

1933
01:19:10,860 --> 01:19:12,620
Does everybody see this?

1934
01:19:12,620 --> 01:19:14,340
That I just double
spent the merchant?

1935
01:19:14,340 --> 01:19:16,470
And sort of the
fundamental problem

1936
01:19:16,470 --> 01:19:20,760
here is that the merchant
only received block headers.

1937
01:19:20,760 --> 01:19:23,250
So he cannot take these block
headers and broadcast them

1938
01:19:23,250 --> 01:19:26,460
to the miners because miners
don't mine on top of block

1939
01:19:26,460 --> 01:19:27,510
headers.

1940
01:19:27,510 --> 01:19:29,760
Miners mine on top
of full blocks.

1941
01:19:29,760 --> 01:19:32,340
So this merchant has
no protection against this

1942
01:19:32,340 --> 01:19:34,150
if he's doing SPV--

1943
01:19:34,150 --> 01:19:35,960
if he's accepting SPV payments.

1944
01:19:35,960 --> 01:19:39,210
If these were full blocks, what
the merchant could have done

1945
01:19:39,210 --> 01:19:41,273
was the following.

1946
01:19:41,273 --> 01:19:42,690
The merchant
would have

1947
01:19:42,690 --> 01:19:51,060
received the full block here
with the transaction in it,

1948
01:19:51,060 --> 01:19:51,560
right.

1949
01:19:51,560 --> 01:19:53,390
So again the attacker got ahead.

1950
01:19:53,390 --> 01:19:55,940
But now the merchant
because he saw a block,

1951
01:19:55,940 --> 01:19:59,900
he's going to ship this
block with TX1 to the miners

1952
01:19:59,900 --> 01:20:01,010
because he's a full node.

1953
01:20:01,010 --> 01:20:01,910
Full nodes do that.

1954
01:20:01,910 --> 01:20:05,000
When they hear about a
block, they broadcast it.

1955
01:20:05,000 --> 01:20:07,370
So now the miners
know about this block.

1956
01:20:07,370 --> 01:20:09,480
They're going to continue
mining on top of it.

1957
01:20:09,480 --> 01:20:11,480
And in fact, the merchant
will send both blocks,

1958
01:20:11,480 --> 01:20:12,680
right to the miners.

1959
01:20:12,680 --> 01:20:15,800
So now miners will
continue building here.

1960
01:20:15,800 --> 01:20:21,380
So now the attacker has a bit of
a tougher problem on his hands.

1961
01:20:21,380 --> 01:20:23,480
There is still a way to
trick even full nodes,

1962
01:20:23,480 --> 01:20:25,460
even if this guy
is a full node, you

1963
01:20:25,460 --> 01:20:27,730
can sort of leverage
some timing assumptions

1964
01:20:27,730 --> 01:20:29,050
to still trick the full node.

1965
01:20:29,050 --> 01:20:32,000
And the details are in
that paper over there.

1966
01:20:32,000 --> 01:20:35,193
But that's one way--

1967
01:20:35,193 --> 01:20:37,610
somebody was asking, I think
you were asking, Anne, right--
you were asking Anne, right--

1968
01:20:37,610 --> 01:20:39,170
about SPV nodes and how
they're less secure.

1969
01:20:39,170 --> 01:20:41,712
And this is one fundamental way
in which they're less secure.

1970
01:20:41,712 --> 01:20:44,510
If you accept payments
with SPV nodes

1971
01:20:44,510 --> 01:20:46,410
you're really playing with fire.

1972
01:20:46,410 --> 01:20:50,930
You know, you do need a
sufficiently powerful attacker

1973
01:20:50,930 --> 01:20:52,680
who can get ahead.

1974
01:20:52,680 --> 01:20:54,160
But yeah.

1975
01:20:54,160 --> 01:20:59,010
OK, so with that I think that
kind of concludes the lecture.

1976
01:20:59,010 --> 01:21:00,260
Any final questions?

1977
01:21:05,190 --> 01:21:06,250
All right, cool.

1978
01:21:06,250 --> 01:21:07,970
Thank you guys.