1
00:00:15,580 --> 00:00:17,680
PROFESSOR: So welcome, everyone.

2
00:00:17,680 --> 00:00:20,740
Today is the first
of what will be

3
00:00:20,740 --> 00:00:25,070
a series of four guest lectures
throughout the semester.

4
00:00:25,070 --> 00:00:28,412
There will be two
guest lectures,

5
00:00:28,412 --> 00:00:30,370
starting a week from
today, and then there'll

6
00:00:30,370 --> 00:00:32,350
be another one towards
the end of the semester.

7
00:00:32,350 --> 00:00:34,930
And what Pete and
I decided to do

8
00:00:34,930 --> 00:00:37,120
is to bring in people who
know a lot more than us

9
00:00:37,120 --> 00:00:39,298
about some area of expertise.

10
00:00:39,298 --> 00:00:40,840
In today's instance,
it's going to be

11
00:00:40,840 --> 00:00:44,590
about cardiovascular
medicine, in particular

12
00:00:44,590 --> 00:00:47,260
about how to use imaging
and machine learning

13
00:00:47,260 --> 00:00:49,240
on images in that context.

14
00:00:49,240 --> 00:00:52,150
And for today's
lecture, we're very

15
00:00:52,150 --> 00:00:57,220
excited to have Professor
Rahul Deo to speak.

16
00:00:57,220 --> 00:01:00,240
Rahul's name kept on showing
up, as I did research

17
00:01:00,240 --> 00:01:01,660
over the last couple of years.

18
00:01:01,660 --> 00:01:04,660
First, my group was
starting to get interested

19
00:01:04,660 --> 00:01:07,315
in echocardiography,
and we said, oh, here's

20
00:01:07,315 --> 00:01:08,920
an interesting
paper to read on it.

21
00:01:08,920 --> 00:01:12,610
We read it, and then
we read another paper

22
00:01:12,610 --> 00:01:17,050
on doing subtyping of heart failure
with preserved ejection fraction,

23
00:01:17,050 --> 00:01:19,840
which is a type of heart
failure, and we read it.

24
00:01:19,840 --> 00:01:22,450
I wasn't really paying attention
to the names on the papers,

25
00:01:22,450 --> 00:01:23,908
and then suddenly,
someone told me,

26
00:01:23,908 --> 00:01:26,878
there's this guy moving
to Boston next month who's

27
00:01:26,878 --> 00:01:29,170
doing a lot of interesting
work and interesting machine

28
00:01:29,170 --> 00:01:29,950
learning.

29
00:01:29,950 --> 00:01:31,762
You should go meet him.

30
00:01:31,762 --> 00:01:33,220
And of course, I
meet him, and then

31
00:01:33,220 --> 00:01:34,762
I tell him about
these papers I read,

32
00:01:34,762 --> 00:01:38,270
and he said, oh, I wrote
all of those papers.

33
00:01:38,270 --> 00:01:40,360
He was a senior author on them.

34
00:01:40,360 --> 00:01:42,410
So Rahul's been
around for a while.

35
00:01:42,410 --> 00:01:48,440
He is already
senior in his field.

36
00:01:48,440 --> 00:01:52,900
He started out doing his medical
school training at Cornell,

37
00:01:52,900 --> 00:01:55,450
in Cornell Medical
School, in New York

38
00:01:55,450 --> 00:01:58,630
City, at the same time as
doing his PhD at Rockefeller

39
00:01:58,630 --> 00:01:59,860
University.

40
00:01:59,860 --> 00:02:01,690
And then he spent the
first large chunk--

41
00:02:01,690 --> 00:02:05,770
after his post-doctoral
training, up here in Boston,

42
00:02:05,770 --> 00:02:07,270
at Harvard Medical
School-- he spent

43
00:02:07,270 --> 00:02:12,910
a large chunk of his career as
faculty at UCSF, in California.

44
00:02:12,910 --> 00:02:15,880
And just moved back this
past year to take a position

45
00:02:15,880 --> 00:02:18,900
as the chief data
scientist-- is that right--

46
00:02:18,900 --> 00:02:21,070
for the One Brave
Idea project which

47
00:02:21,070 --> 00:02:24,685
is a very large initiative
joint between MIT and Brigham

48
00:02:24,685 --> 00:02:27,805
and Women's Hospital to study
cardiovascular medicine.

49
00:02:27,805 --> 00:02:29,830
He'll tell you more maybe.

50
00:02:29,830 --> 00:02:34,360
And Rahul's research has
really gone the full spectrum,

51
00:02:34,360 --> 00:02:36,407
but the type of things
you'll hear about today

52
00:02:36,407 --> 00:02:38,740
is actually not what he's
been doing most of his career,

53
00:02:38,740 --> 00:02:39,820
amazingly so.

54
00:02:39,820 --> 00:02:40,990
Most of his career,
he's been thinking more

55
00:02:40,990 --> 00:02:42,640
about genotype and
how to really bridge

56
00:02:42,640 --> 00:02:48,803
that genotype-phenotype gap,
but I asked him specifically

57
00:02:48,803 --> 00:02:49,720
to talk about imaging.

58
00:02:49,720 --> 00:02:51,190
So that's what he'll be focusing
on today in his lecture.

59
00:02:51,190 --> 00:02:53,648
And without further ado, thank
you, Rahul, for coming here.

60
00:02:53,648 --> 00:02:56,150
[APPLAUSE]

61
00:02:56,650 --> 00:02:58,663
RAHUL DEO: So I'm
used to lecturing

62
00:02:58,663 --> 00:03:00,580
to clinical audiences,
so you guys are by far

63
00:03:00,580 --> 00:03:02,190
the most technical audience.

64
00:03:02,190 --> 00:03:04,510
So please spare me a
little bit, but I actually

65
00:03:04,510 --> 00:03:08,530
want to encourage
interruptions, questions.

66
00:03:08,530 --> 00:03:10,120
This is a very
opinionated lecture,

67
00:03:10,120 --> 00:03:13,780
so if anybody has
any questions, reservations,

68
00:03:13,780 --> 00:03:15,280
please bring them
up during lecture.

69
00:03:15,280 --> 00:03:17,200
Don't wait till the end.

70
00:03:17,200 --> 00:03:22,150
And in part, it's opinionated
because I feel passionately

71
00:03:22,150 --> 00:03:27,370
that the stuff we're doing needs
to make its way into practice.

72
00:03:27,370 --> 00:03:30,083
It's not by itself purely
academically interesting.

73
00:03:30,083 --> 00:03:31,750
We need to study the
things we're doing.

74
00:03:31,750 --> 00:03:33,970
We're already picking up
what everybody else here

75
00:03:33,970 --> 00:03:35,380
is already doing.

76
00:03:35,380 --> 00:03:38,740
So it's OK from that
standpoint, but it really

77
00:03:38,740 --> 00:03:39,648
has to make its way.

78
00:03:39,648 --> 00:03:42,190
And that means that we have to
have some mature understanding

79
00:03:42,190 --> 00:03:44,042
of what makes its
way into practice,

80
00:03:44,042 --> 00:03:45,250
where the resistance will be.

81
00:03:45,250 --> 00:03:48,850
So the lecture will be peppered
throughout with some opinions

82
00:03:48,850 --> 00:03:52,070
and comments in that, and
hopefully, that will be useful.

83
00:03:52,070 --> 00:03:53,778
So just a quick
outline, just going

84
00:03:53,778 --> 00:03:56,320
to introduce cardiac structure
and function which is probably

85
00:03:56,320 --> 00:03:59,350
not part of the regular
undergraduate and graduate

86
00:03:59,350 --> 00:04:00,730
training here at MIT.

87
00:04:00,730 --> 00:04:03,683
Talk a little bit about what the
major cardiac diagnostics are

88
00:04:03,683 --> 00:04:04,600
and how they use them.

89
00:04:04,600 --> 00:04:09,032
And all this is really
to help guide the thought

90
00:04:09,032 --> 00:04:10,990
and the decision making
about how we would ever

91
00:04:10,990 --> 00:04:12,850
automate and bring this into--

92
00:04:12,850 --> 00:04:15,280
how to bring machine learning,
artificial intelligence,

93
00:04:15,280 --> 00:04:16,530
into actual clinical practice.

94
00:04:16,530 --> 00:04:18,220
Because I need to
give enough background

95
00:04:18,220 --> 00:04:20,568
so you realize what
the challenges are,

96
00:04:20,568 --> 00:04:23,110
and then the question probably
every has is where's the data?

97
00:04:23,110 --> 00:04:24,657
How would
one get access

98
00:04:24,657 --> 00:04:26,740
to some of this stuff to
be able to potentially do

99
00:04:26,740 --> 00:04:28,210
work in this area?

100
00:04:28,210 --> 00:04:31,270
And then, I'm going to venture a
little bit into computer vision

101
00:04:31,270 --> 00:04:33,130
and just talk about
some of the topics

102
00:04:33,130 --> 00:04:35,088
that at least I've been
thinking about that are

103
00:04:35,088 --> 00:04:36,328
relevant to what we're doing.

104
00:04:36,328 --> 00:04:37,870
And then talk about
some of this work

105
00:04:37,870 --> 00:04:40,948
around an automated pipeline for
echocardiograms, not by any

106
00:04:40,948 --> 00:04:42,490
means a gold standard
but really just

107
00:04:42,490 --> 00:04:44,350
as sort of an initial
foray into trying

108
00:04:44,350 --> 00:04:47,350
to make a dent into this.

109
00:04:47,350 --> 00:04:50,158
And then thinking a little
bit about what lessons--

110
00:04:50,158 --> 00:04:52,450
David mentioned that you
talked about electrocardiogram

111
00:04:52,450 --> 00:04:56,110
last week or last class,
and so a little bit of some

112
00:04:56,110 --> 00:04:59,050
of the ideas from there, and
how they would lend themselves

113
00:04:59,050 --> 00:05:01,240
to insights about future
types of approaches

114
00:05:01,240 --> 00:05:02,950
with automated interpretation.

115
00:05:02,950 --> 00:05:05,380
And then my background is
actually more in biology.

116
00:05:05,380 --> 00:05:07,392
So I'm going to come
back and say, OK,

117
00:05:07,392 --> 00:05:09,850
enough with all this imaging
stuff, what about the biology?

118
00:05:09,850 --> 00:05:12,450
How can we make
some insights there?

119
00:05:12,450 --> 00:05:13,330
OK.

120
00:05:13,330 --> 00:05:17,320
So every time people
try to get funding

121
00:05:17,320 --> 00:05:19,870
for coronary heart disease,
they try to talk up

122
00:05:19,870 --> 00:05:21,340
just how important it is.

123
00:05:21,340 --> 00:05:23,380
So this is still--

124
00:05:23,380 --> 00:05:25,570
we have some battles with
the oncology people--

125
00:05:25,570 --> 00:05:30,340
but this is still the leading
cause of death in the world.

126
00:05:30,340 --> 00:05:33,610
And then people are like,
you're just emphasizing

127
00:05:33,610 --> 00:05:34,560
the developed world.

128
00:05:34,560 --> 00:05:37,127
There's lots of communicable
diseases that matter much more.

129
00:05:37,127 --> 00:05:39,460
So even if you look at those,
and you look at the bottom

130
00:05:39,460 --> 00:05:44,560
here, this still, if this is all
causes of death age-adjusted,

131
00:05:44,560 --> 00:05:46,920
cardiovascular disease is
still number one amongst that.

132
00:05:46,920 --> 00:05:52,180
So certainly it remains
important and increasingly so

133
00:05:52,180 --> 00:05:54,550
in some of the
developing world also.

134
00:05:54,550 --> 00:05:57,598
So it's important to think
a little bit about what

135
00:05:57,598 --> 00:05:59,140
the heart does,
because this is going

136
00:05:59,140 --> 00:06:01,740
to guide at least the way that
diseases have been classified.

137
00:06:01,740 --> 00:06:03,740
So the main thing the
heart does is it's a pump,

138
00:06:03,740 --> 00:06:06,593
and it delivers oxygenated
blood throughout the circulatory

139
00:06:06,593 --> 00:06:08,260
system to all the
tissues that need it--

140
00:06:08,260 --> 00:06:11,830
the brain, the kidneys, the
muscles, and oxygen, of course,

141
00:06:11,830 --> 00:06:14,440
is required for ATP production.

142
00:06:14,440 --> 00:06:16,240
So it's a pretty
impressive organ.

143
00:06:16,240 --> 00:06:18,160
It pumps about five
liters of blood a minute,

144
00:06:18,160 --> 00:06:21,660
and with exercise, that can go
up five to seven-fold or so,

145
00:06:21,660 --> 00:06:24,280
with conditioned athletes,
not me, but other people

146
00:06:24,280 --> 00:06:26,660
can ramp that up substantially.

147
00:06:26,660 --> 00:06:29,890
And we have this need to keep
a very, very regular beat,

148
00:06:29,890 --> 00:06:33,340
so if you pause for
about three seconds,

149
00:06:33,340 --> 00:06:36,310
you are likely to get
lightheaded or pass out.

150
00:06:36,310 --> 00:06:40,330
So you have to maintain this
rhythmic beating of your heart,

151
00:06:40,330 --> 00:06:42,310
and you can compute
what that would be,

152
00:06:42,310 --> 00:06:45,370
and it's somewhere around two billion
beats in a typical lifetime.

153
00:06:45,370 --> 00:06:49,353
So I'm going to show a
lot of pictures and videos

154
00:06:49,353 --> 00:06:50,020
throughout this.

155
00:06:50,020 --> 00:06:52,562
So it's probably worthwhile just
to take a pause a little bit

156
00:06:52,562 --> 00:06:54,790
and talk about what the
anatomy of the heart is.

157
00:06:54,790 --> 00:06:57,880
So the heart sits like
this, so the pointy part

158
00:06:57,880 --> 00:07:01,200
is kind of sitting out
to the side, like that.

159
00:07:01,200 --> 00:07:04,540
And so I'm going to just sort
of describe the flow of blood.

160
00:07:04,540 --> 00:07:07,180
So the blood comes in through something
called the inferior vena

161
00:07:07,180 --> 00:07:10,510
cava or the superior vena cava,
that's draining from the brain.

162
00:07:10,510 --> 00:07:12,880
This is draining
from the lower body,

163
00:07:12,880 --> 00:07:16,312
and then enters into a chamber
called the right atrium.

164
00:07:16,312 --> 00:07:18,520
It moves through something
called the tricuspid valve

165
00:07:18,520 --> 00:07:20,080
into what's called
the right ventricle.

166
00:07:20,080 --> 00:07:21,997
The right ventricle has
got some muscle to it.

167
00:07:21,997 --> 00:07:23,935
It pumps into the lungs.

168
00:07:23,935 --> 00:07:25,850
There, the blood
picks up oxygen,

169
00:07:25,850 --> 00:07:29,060
so that's why it's
shown as being red here.

170
00:07:29,060 --> 00:07:31,647
The oxygenated blood
comes through the left atrium

171
00:07:31,647 --> 00:07:33,730
and then into the left
ventricle through something

172
00:07:33,730 --> 00:07:34,840
called the mitral valve.

173
00:07:34,840 --> 00:07:37,630
We'll show you some pictures
of the mitral valve later on.

174
00:07:37,630 --> 00:07:39,400
And then the left
ventricle, which

175
00:07:39,400 --> 00:07:41,170
is the big workhorse
of the heart,

176
00:07:41,170 --> 00:07:44,080
pumps blood through
the rest of the body,

177
00:07:44,080 --> 00:07:46,510
through a structure
called the aorta.

178
00:07:46,510 --> 00:07:48,970
So in through the right
heart, through the lungs,

179
00:07:48,970 --> 00:07:51,148
through the left heart,
to the rest of the body.

180
00:07:51,148 --> 00:07:53,440
And then shown here in yellow
is the conduction system.

181
00:07:53,440 --> 00:07:56,400
So you guys got a little bit
of a conversation last class

182
00:07:56,400 --> 00:07:57,810
on the electrical system.

183
00:07:57,810 --> 00:08:02,165
So the sinoatrial node is
up here in the right atrium,

184
00:08:02,165 --> 00:08:03,540
and then conduction
goes through.

185
00:08:03,540 --> 00:08:06,660
So the P wave on an EKG
represents the conduction

186
00:08:06,660 --> 00:08:07,410
through there.

187
00:08:07,410 --> 00:08:08,827
You get through
the AV node, where

188
00:08:08,827 --> 00:08:10,530
there's a delay which
is the PR interval,

189
00:08:10,530 --> 00:08:12,822
and then you get spreading
through the ventricles which

190
00:08:12,822 --> 00:08:16,290
is the QRS complex, and then
repolarization is the T wave.

191
00:08:16,290 --> 00:08:18,840
So that's the electrical system,
and of course, these things

192
00:08:18,840 --> 00:08:20,910
have to work
intimately together.

193
00:08:24,570 --> 00:08:27,080
Every single basic
cardiac physiology course

194
00:08:27,080 --> 00:08:29,850
will show this diagram called
the Wiggers diagram which

195
00:08:29,850 --> 00:08:31,838
really just shows the
interconnectedness

196
00:08:31,838 --> 00:08:32,880
of the electrical system.

197
00:08:32,880 --> 00:08:34,590
So there's the EKG up there.

198
00:08:34,590 --> 00:08:37,929
These are the heart sounds
that a provider would listen to

199
00:08:37,929 --> 00:08:39,720
with the stethoscope,
and this is

200
00:08:39,720 --> 00:08:43,530
capturing the flow of sort
of the changes in pressure

201
00:08:43,530 --> 00:08:45,120
in the heart and in the aorta.

202
00:08:45,120 --> 00:08:49,050
So the heart fills during a period
of time called diastole.

203
00:08:49,050 --> 00:08:50,940
The mitral valve closes.

204
00:08:50,940 --> 00:08:52,020
The ventricle contracts.

205
00:08:52,020 --> 00:08:53,230
The pressure increases.

206
00:08:53,230 --> 00:08:54,907
This is a period of
time called systole.

207
00:08:54,907 --> 00:08:57,240
Eventually, something called
the aortic valve pops open,

208
00:08:57,240 --> 00:08:59,073
and blood goes through
the rest of the body.

209
00:08:59,073 --> 00:09:01,230
The heart finally
starts to relax.

210
00:09:01,230 --> 00:09:03,025
The aortic
valve closes.

211
00:09:03,025 --> 00:09:03,900
Then, you fill again.

212
00:09:03,900 --> 00:09:06,960
So this happens again and again
and again in a cyclical way,

213
00:09:06,960 --> 00:09:09,450
and you have this combination
of electrical and mechanical

214
00:09:09,450 --> 00:09:11,300
properties.

215
00:09:11,300 --> 00:09:11,920
OK.

216
00:09:11,920 --> 00:09:12,890
So I have some pictures here.

217
00:09:12,890 --> 00:09:13,520
These are all MRIs.

218
00:09:13,520 --> 00:09:15,437
I'm going to talk about
echocardiography which

219
00:09:15,437 --> 00:09:17,990
is these very ugly, grainy
things that I unfortunately

220
00:09:17,990 --> 00:09:18,860
have to work with.

221
00:09:18,860 --> 00:09:20,952
MRIs are beautiful
but very expensive.

222
00:09:20,952 --> 00:09:22,160
So there's a reason for that.

223
00:09:22,160 --> 00:09:26,160
So this is something called the
long axis view of the heart.

224
00:09:26,160 --> 00:09:28,340
So this is the thick walled
left ventricle there.

225
00:09:28,340 --> 00:09:29,923
This is the left
atrium there, and you

226
00:09:29,923 --> 00:09:32,840
can see this beautiful turbulent
flow of blood in there,

227
00:09:32,840 --> 00:09:35,150
and it's flowing from the
atrium to the ventricle.

228
00:09:35,150 --> 00:09:37,190
This is another patient's.

229
00:09:37,190 --> 00:09:38,570
It's called the short axis view.

230
00:09:38,570 --> 00:09:41,090
There is the left ventricle
and the right ventricle there.

231
00:09:41,090 --> 00:09:43,402
So we're kind of looking
at it somewhat obliquely,

232
00:09:43,402 --> 00:09:45,485
and then this is another
view called the apical.

233
00:09:45,485 --> 00:09:46,340
It's a little bit dull there.

234
00:09:46,340 --> 00:09:47,150
I'm sorry.

235
00:09:47,150 --> 00:09:48,650
We can brighten it a little bit.

236
00:09:48,650 --> 00:09:52,022
This is what's called
the four chamber view.

237
00:09:52,022 --> 00:09:54,230
So you can see the left
ventricle and right ventricle

238
00:09:54,230 --> 00:09:54,980
here.

239
00:09:54,980 --> 00:09:57,500
So the reason for
these different views

240
00:09:57,500 --> 00:10:01,550
is, ultimately, that
people have measures

241
00:10:01,550 --> 00:10:04,007
of function and measures
of disease that go along

242
00:10:04,007 --> 00:10:05,090
with these specific views.

243
00:10:05,090 --> 00:10:08,190
So you're going to see them
coming back again and again.

244
00:10:08,190 --> 00:10:08,690
OK.

245
00:10:08,690 --> 00:10:14,290
So the way that physicians like
to organize disease definitions

246
00:10:14,290 --> 00:10:16,540
is really around some of these
same kinds of functions.

247
00:10:16,540 --> 00:10:20,710
So failures of the
heart to pump properly

248
00:10:20,710 --> 00:10:23,380
cause a disease
called heart failure,

249
00:10:23,380 --> 00:10:26,380
and this shows up in terms of
being out of breath, having

250
00:10:26,380 --> 00:10:28,780
fluid buildup in the
belly and in the legs,

251
00:10:28,780 --> 00:10:30,812
and this is treated
with medications.

252
00:10:30,812 --> 00:10:32,770
Sometimes, you can have
some artificial devices

253
00:10:32,770 --> 00:10:34,562
to help the heart pump,
and ultimately, you

254
00:10:34,562 --> 00:10:37,310
could even have a transplant,
depending on how severe it is.

255
00:10:37,310 --> 00:10:38,920
So that's the pump.

256
00:10:38,920 --> 00:10:42,220
Blood supply to the heart
ultimately can also be blocked,

257
00:10:42,220 --> 00:10:44,830
and that causes a disease
called coronary artery disease.

258
00:10:44,830 --> 00:10:46,410
If blood is completely
blocked, you

259
00:10:46,410 --> 00:10:48,618
can get something called a
heart attack or myocardial

260
00:10:48,618 --> 00:10:49,210
infarction.

261
00:10:49,210 --> 00:10:51,490
That's chest pain, sometimes
shortness of breath,

262
00:10:51,490 --> 00:10:54,370
and we open up those blocked
vessels by angioplasty,

263
00:10:54,370 --> 00:10:57,790
stick a stent in there,
or bypass them altogether.

264
00:10:57,790 --> 00:11:02,350
And then the flow of
blood has to be one way.

265
00:11:02,350 --> 00:11:05,980
So abnormalities of flow
of the blood through valves

266
00:11:05,980 --> 00:11:09,030
is valvular disease, and so
you can have either too-tight

267
00:11:09,030 --> 00:11:10,480
valves, so that's
called stenosis.

268
00:11:10,480 --> 00:11:11,688
Or you can have leaky valves.

269
00:11:11,688 --> 00:11:12,855
That's called regurgitation.

270
00:11:12,855 --> 00:11:14,688
That shows up as
light-headedness, shortness

271
00:11:14,688 --> 00:11:17,290
of breath, fainting, and then
you've got to fix those valves.

272
00:11:17,290 --> 00:11:19,510
And finally, there's
abnormalities of rhythm.

273
00:11:19,510 --> 00:11:21,170
So something like
atrial fibrillation

274
00:11:21,170 --> 00:11:24,640
which is a quivering of the
atrium, or too-slow heartbeats,

275
00:11:24,640 --> 00:11:27,100
which would be bradycardia, can present as palpitations,
can present as palpitations,

276
00:11:27,100 --> 00:11:28,925
fainting, or even sudden death.

277
00:11:28,925 --> 00:11:31,550
And you can stick a pacemaker in
there, defibrillator in there,

278
00:11:31,550 --> 00:11:34,190
or try to burn off
the arrhythmia.

279
00:11:34,190 --> 00:11:34,690
OK.

280
00:11:34,690 --> 00:11:38,440
So this is like the very
physiology-centric view,

281
00:11:38,440 --> 00:11:41,042
but the truth is that the
heart has a whole lot of cells.

282
00:11:41,042 --> 00:11:43,000
So there's a lot more
biology there than simply

283
00:11:43,000 --> 00:11:46,120
just thinking about the pumping
and the electrical function.

284
00:11:46,120 --> 00:11:50,000
Only 30% of the cells or so
are these cardiomyocytes.

285
00:11:50,000 --> 00:11:52,640
So these are the cells that
are involved in contraction.

286
00:11:52,640 --> 00:11:55,148
These are cells that are
excitable, but that's only 30%

287
00:11:55,148 --> 00:11:55,690
of the cells.

288
00:11:55,690 --> 00:11:57,850
There are endothelial
cells in there.

289
00:11:57,850 --> 00:11:58,780
There's fibroblasts.

290
00:11:58,780 --> 00:12:00,850
There's a bunch of
blood cells in there

291
00:12:00,850 --> 00:12:02,560
too, certainly a lot of red
blood cells in there too.

292
00:12:02,560 --> 00:12:03,710
So you have lots
of other things.

293
00:12:03,710 --> 00:12:05,360
So we're going to come
back to here a little bit

294
00:12:05,360 --> 00:12:07,818
when talking about how should
we be thinking about disease?

295
00:12:07,818 --> 00:12:10,750
The historic way is
to think about pumping

296
00:12:10,750 --> 00:12:12,820
and electrical activation,
but really, there's

297
00:12:12,820 --> 00:12:14,485
maybe a little bit
more complexity here

298
00:12:14,485 --> 00:12:15,610
that needs to be addressed.

299
00:12:15,610 --> 00:12:16,330
OK.

300
00:12:16,330 --> 00:12:20,290
So there's a lot of different--

301
00:12:20,290 --> 00:12:23,560
so cardiology is
very imaging-centric,

302
00:12:23,560 --> 00:12:25,930
and as a result,
it's very expensive.

303
00:12:25,930 --> 00:12:28,630
Because imaging costs
a lot of money to do,

304
00:12:28,630 --> 00:12:31,120
and so I have dollar
signs here reflecting

305
00:12:31,120 --> 00:12:32,650
the sorts of
different tests we do.

306
00:12:32,650 --> 00:12:36,410
So you saw the
cheapest one last week,

307
00:12:36,410 --> 00:12:38,990
electrocardiogram,
so one dollar sign,

308
00:12:38,990 --> 00:12:41,950
and that has lots of utility.

309
00:12:41,950 --> 00:12:44,170
For example, one could
diagnose an acute heart attack

310
00:12:44,170 --> 00:12:46,250
with that.

311
00:12:46,250 --> 00:12:48,680
Echocardiography, which
involves sound waves,

312
00:12:48,680 --> 00:12:51,280
is ultimately more used
for quantifying structure

313
00:12:51,280 --> 00:12:54,740
and function, can pick up heart
failure, valvular disease,

314
00:12:54,740 --> 00:12:56,280
high blood pressure
in the lungs.

315
00:12:56,280 --> 00:12:57,760
So that's another modality.

316
00:12:57,760 --> 00:13:00,760
MRI, which is just not used
all that much in this country,

317
00:13:00,760 --> 00:13:01,910
is very expensive.

318
00:13:01,910 --> 00:13:04,160
It does largely the same
things, and you can imagine,

319
00:13:04,160 --> 00:13:05,620
even though it's
beautiful, people

320
00:13:05,620 --> 00:13:07,990
have not had an easy
time being able to justify

321
00:13:07,990 --> 00:13:11,980
why it's any better than this
slightly cheaper modality.

322
00:13:11,980 --> 00:13:15,010
And then you have angiography
which can either be by CAT scan

323
00:13:15,010 --> 00:13:16,240
or by X-ray.

324
00:13:16,240 --> 00:13:19,870
And that visualizes the flow
of blood through the heart

325
00:13:19,870 --> 00:13:22,990
and looks for blockages which
are going to be stented,

326
00:13:22,990 --> 00:13:24,760
ballooned up and stented.

327
00:13:24,760 --> 00:13:28,840
And then you have these kinds
of non-invasive technologies,

328
00:13:28,840 --> 00:13:32,970
like PET and SPECT that
use radionuclides,

329
00:13:32,970 --> 00:13:35,200
like technetium,
rubidium, and they

330
00:13:35,200 --> 00:13:37,030
look for abnormalities
in blood flow

331
00:13:37,030 --> 00:13:38,980
to detect non-invasively
whether

332
00:13:38,980 --> 00:13:40,480
there's some patch
of the heart that

333
00:13:40,480 --> 00:13:41,787
isn't getting enough blood.

334
00:13:41,787 --> 00:13:43,870
If you get one of these,
and it's abnormal, often,

335
00:13:43,870 --> 00:13:45,790
you go over there, and you
take a trip to the movies--

336
00:13:45,790 --> 00:13:47,350
as my old teachers used to say--

337
00:13:47,350 --> 00:13:50,800
and then you may find yourself
with an angioplasty or stent

338
00:13:50,800 --> 00:13:52,420
or bypass.

339
00:13:52,420 --> 00:13:54,708
So one of the sad
things about cardiology

340
00:13:54,708 --> 00:13:56,500
is we don't define our
diseases by biology.

341
00:13:56,500 --> 00:13:58,853
We define our
diseases often related

342
00:13:58,853 --> 00:14:00,520
to whether the anatomy
or the physiology
of the physiology

343
00:14:00,520 --> 00:14:03,520
is abnormal or normal, usually
based on some of these images

344
00:14:03,520 --> 00:14:04,860
or some of these numbers.

345
00:14:04,860 --> 00:14:05,990
OK.

346
00:14:05,990 --> 00:14:07,840
So we have to make
decisions, and we often

347
00:14:07,840 --> 00:14:09,443
use these very same
things too to be

348
00:14:09,443 --> 00:14:10,610
able to make some decisions.

349
00:14:10,610 --> 00:14:13,510
So we have to decide whether
we want to put in a defibrillator,

350
00:14:13,510 --> 00:14:16,745
and to do so, you often need
to get an echocardiogram

351
00:14:16,745 --> 00:14:18,620
to look at the pumping
function of the heart.

352
00:14:18,620 --> 00:14:21,120
If you want to decide on whether
somebody needs angioplasty,

353
00:14:21,120 --> 00:14:22,647
you have to get an angiogram.

354
00:14:22,647 --> 00:14:24,730
If you want to decide to
get a valve replacement,

355
00:14:24,730 --> 00:14:26,470
you need an echo.

356
00:14:26,470 --> 00:14:28,102
But some of these
other ones actually

357
00:14:28,102 --> 00:14:29,560
don't involve any
imaging, and this

358
00:14:29,560 --> 00:14:31,268
is sort of one of the
challenges that I'm

359
00:14:31,268 --> 00:14:34,300
going to talk about is
that all of the future--

360
00:14:34,300 --> 00:14:36,680
you can imagine building
brand new risk models,

361
00:14:36,680 --> 00:14:38,140
new classification models.

362
00:14:38,140 --> 00:14:40,510
You're stuck with the
data that's out there,

363
00:14:40,510 --> 00:14:42,310
and the data that's
out there is ultimately

364
00:14:42,310 --> 00:14:44,890
being collected because
somebody feels like it's worth

365
00:14:44,890 --> 00:14:46,360
paying for it already.

366
00:14:46,360 --> 00:14:48,790
So if you want to
build a brand new risk

367
00:14:48,790 --> 00:14:51,550
model for who's going to
have a myocardial infarction,

368
00:14:51,550 --> 00:14:54,010
you're probably not going to
have any echocardiograms to be

369
00:14:54,010 --> 00:14:55,510
able to use for
that, because nobody

370
00:14:55,510 --> 00:14:57,460
is going to have paid
for that to be collected

371
00:14:57,460 --> 00:14:58,450
in the first place.

372
00:14:58,450 --> 00:14:59,350
So this is a problem.

373
00:14:59,350 --> 00:15:01,570
To be able to innovate, I've got
to keep on coming back to that,

374
00:15:01,570 --> 00:15:04,153
because I think you're going to
be shocked by the small sample

375
00:15:04,153 --> 00:15:05,950
sizes that we face in
some of these things.

376
00:15:05,950 --> 00:15:06,970
And part of it is
because if you just

377
00:15:06,970 --> 00:15:08,803
want to piggyback on
what insurers are going

378
00:15:08,803 --> 00:15:10,973
to be willing to pay
for to get your data,

379
00:15:10,973 --> 00:15:12,390
you're going to
be stuck with only

380
00:15:12,390 --> 00:15:14,340
being able to work off
the stuff we already

381
00:15:14,340 --> 00:15:15,215
know something about.

382
00:15:15,215 --> 00:15:17,280
So much of my work
has been really trying

383
00:15:17,280 --> 00:15:20,070
to think about how
we can change that.

384
00:15:20,070 --> 00:15:22,670
OK, so just a little
bit more, and then we

385
00:15:22,670 --> 00:15:24,520
can get into a
little bit more meat.

386
00:15:24,520 --> 00:15:26,640
So sort of the
universal standard

387
00:15:26,640 --> 00:15:29,870
for how imaging data is stored
is something called DICOM,

388
00:15:29,870 --> 00:15:33,058
or the Digital Imaging and
Communications in Medicine standard,

389
00:15:33,058 --> 00:15:34,600
and really, at the end
of the day, there

390
00:15:34,600 --> 00:15:36,457
is some compressed
data for the images.

391
00:15:36,457 --> 00:15:38,790
There's a DICOM header, which
I'll show you in a moment.

392
00:15:38,790 --> 00:15:40,282
There's lots of nice
Python libraries

393
00:15:40,282 --> 00:15:42,490
that are available to be
able to work with this data,

394
00:15:42,490 --> 00:15:45,180
and there's a free
viewer you could use too.

395
00:15:45,180 --> 00:15:46,555
OK.

396
00:15:46,555 --> 00:15:47,930
So where do I get
access to this?

397
00:15:47,930 --> 00:15:49,805
So this has actually
been an incredible pain.

398
00:15:49,805 --> 00:15:52,800
So hospitals are set up
to be clinical operations.

399
00:15:52,800 --> 00:15:54,300
They're not set
up to make it easy

400
00:15:54,300 --> 00:15:56,623
for you to get gobs
of data for being

401
00:15:56,623 --> 00:15:57,790
able to do machine learning.

402
00:15:57,790 --> 00:16:00,840
It's just not really there.

403
00:16:00,840 --> 00:16:04,230
And so sometimes, you have
some of these data archives

404
00:16:04,230 --> 00:16:05,730
that store this
data, but there's

405
00:16:05,730 --> 00:16:08,710
lots of reasons for why
people make that difficult.

406
00:16:08,710 --> 00:16:10,530
And one of them is
because often images

407
00:16:10,530 --> 00:16:13,100
have these burned-in pixels
with identifiable information.

408
00:16:13,100 --> 00:16:16,077
So you'll have a patient's
name emblazoned in the image.

409
00:16:16,077 --> 00:16:17,160
You'll have date of birth.

410
00:16:17,160 --> 00:16:18,520
You'll have kind of
other attributes.

411
00:16:18,520 --> 00:16:20,100
So you're stuck
with that, and not

412
00:16:20,100 --> 00:16:22,133
only is it a problem
that they're there,

413
00:16:22,133 --> 00:16:24,300
the vendors don't make it
easy to be able to get rid

414
00:16:24,300 --> 00:16:25,133
of that information.

415
00:16:25,133 --> 00:16:28,170
So you actually have a
problem that they don't really

416
00:16:28,170 --> 00:16:31,202
make it easy to download in
bulk or de-identify this.

417
00:16:31,202 --> 00:16:32,910
And part of the reason
is because then it

418
00:16:32,910 --> 00:16:35,118
would make it easy for you
to switch vendors and have

419
00:16:35,118 --> 00:16:36,200
somebody else take over.

420
00:16:36,200 --> 00:16:37,950
So they make it a
little bit hard for you.

421
00:16:37,950 --> 00:16:40,530
Once it's in there, it's
hard for you to get it out,

422
00:16:40,530 --> 00:16:42,480
and people are
selling their data.

423
00:16:42,480 --> 00:16:43,900
That's certainly happening too.

424
00:16:43,900 --> 00:16:45,570
So there's a little
bit of attempts

425
00:16:45,570 --> 00:16:49,050
to try to control things that
way, and many of the labels

426
00:16:49,050 --> 00:16:50,710
you want are stored separately.

427
00:16:50,710 --> 00:16:52,390
So you want to know what the
diseases of these people are.

428
00:16:52,390 --> 00:16:53,730
So you have the
raw imaging data,

429
00:16:53,730 --> 00:16:55,605
but all the clinical
stuff is somewhere else.

430
00:16:55,605 --> 00:16:57,130
So you have to
sometimes link that,

431
00:16:57,130 --> 00:16:58,740
and so you need to
get access there.

432
00:16:58,740 --> 00:17:01,115
And so just to give you a
little bit of an idea of scale,

433
00:17:01,115 --> 00:17:03,660
so we're about to get all the
ECGs from Brigham and Women's

434
00:17:03,660 --> 00:17:06,960
which is about 30
million historically,

435
00:17:06,960 --> 00:17:08,410
and this is all related to cost.

436
00:17:08,410 --> 00:17:11,609
So positron emission tomography,
you can get about 8,000 or so,

437
00:17:11,609 --> 00:17:13,800
and we're one of the
busiest centers for that.

438
00:17:13,800 --> 00:17:16,510
Echocardiograms are in
the 300,000 to 500,000

439
00:17:16,510 --> 00:17:17,260
range in archives.

440
00:17:17,260 --> 00:17:19,060
So that gets a little
bit more interesting.

441
00:17:19,060 --> 00:17:19,680
OK.

442
00:17:19,680 --> 00:17:21,599
This is what a DICOM
header looks like.

443
00:17:21,599 --> 00:17:23,849
You have some sort
of identifiers,

444
00:17:23,849 --> 00:17:25,980
and then you have some
information there,

445
00:17:25,980 --> 00:17:27,660
attributes of the
images, patient

446
00:17:27,660 --> 00:17:29,468
name, date of birth, frame rate.

447
00:17:29,468 --> 00:17:32,010
These kind of things are there,
and there's some variability.

448
00:17:32,010 --> 00:17:34,880
So it's never quite easy.

449
00:17:34,880 --> 00:17:36,170
OK.

450
00:17:36,170 --> 00:17:40,585
So these different modalities
have some different benefits

451
00:17:40,585 --> 00:17:41,960
to them which is
why they're used

452
00:17:41,960 --> 00:17:44,580
for one disease or the other.

453
00:17:44,580 --> 00:17:47,330
And so one of the real headaches
is that the heart moves.

454
00:17:47,330 --> 00:17:49,600
So the chest wall moves,
because we breathe,

455
00:17:49,600 --> 00:17:50,600
and the heart moves too.

456
00:17:50,600 --> 00:17:52,580
So you have to
image something that

457
00:17:52,580 --> 00:17:55,880
has enough temporal frequency
that you're not overwhelmed

458
00:17:55,880 --> 00:17:58,752
by the basic movement
of the heart itself,

459
00:17:58,752 --> 00:18:00,460
and so some of these
things aren't great.

460
00:18:00,460 --> 00:18:03,110
So SPECT or PET
acquire their images,

461
00:18:03,110 --> 00:18:05,443
which are radioactive
counts, over minutes.

462
00:18:05,443 --> 00:18:06,860
So that's certainly
a problem when

463
00:18:06,860 --> 00:18:08,880
it comes to something
that's moving like that,

464
00:18:08,880 --> 00:18:10,547
and if you want to
have high resolution.

465
00:18:10,547 --> 00:18:13,130
So typically, you have very
poor spatial resolution

466
00:18:13,130 --> 00:18:16,010
for something that
ultimately doesn't deal well

467
00:18:16,010 --> 00:18:17,570
with the moving aspect.

468
00:18:17,570 --> 00:18:20,160
So coronary angiography has
very, very fast frame rates.

469
00:18:20,160 --> 00:18:22,510
So that's X-ray, and
that's sort of very fast.

470
00:18:22,510 --> 00:18:24,740
Echocardiography
can be quite fast.

471
00:18:24,740 --> 00:18:26,765
MRI and CT are
not quite as good,

472
00:18:26,765 --> 00:18:28,640
and so there's some
degradation of the image.

473
00:18:28,640 --> 00:18:30,770
As a result, people do
something called gating,

474
00:18:30,770 --> 00:18:33,920
where they'll take the
electrocardiogram, the ECG,

475
00:18:33,920 --> 00:18:36,610
and try to line up
different portions

476
00:18:36,610 --> 00:18:37,610
of different heartbeats.

477
00:18:37,610 --> 00:18:40,332
And say, well, we'll take
this image from here,

478
00:18:40,332 --> 00:18:42,290
line it up with this one
from there, this one--

479
00:18:42,290 --> 00:18:44,873
I'm going to talk a little bit
about that, about registration,

480
00:18:44,873 --> 00:18:47,573
but ultimately, that's a problem
that people have to deal with.

481
00:18:47,573 --> 00:18:49,490
So it's a computer vision
problem of interest.
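
Editorial sketch: the gating idea described above can be illustrated by assigning each frame a phase within its own heartbeat, so frames from different beats can be grouped. This is a minimal illustration under assumptions, not the method any scanner vendor uses. The function names, the R-peak times, and the four-bin choice are all hypothetical.

```python
from bisect import bisect_right

def cardiac_phase(frame_time, r_peaks):
    """Phase in [0, 1) of a frame within its R-R interval.

    r_peaks is a sorted list of R-peak times (seconds), assumed to come
    from the ECG. Returns None for frames outside the recorded beats.
    """
    i = bisect_right(r_peaks, frame_time) - 1
    if i < 0 or i + 1 >= len(r_peaks):
        return None
    start, end = r_peaks[i], r_peaks[i + 1]
    return (frame_time - start) / (end - start)

def gate_frames(frame_times, r_peaks, n_bins=4):
    """Group frame indices by phase bin, pooling frames across heartbeats."""
    bins = [[] for _ in range(n_bins)]
    for idx, t in enumerate(frame_times):
        phase = cardiac_phase(t, r_peaks)
        if phase is not None:
            bins[min(int(phase * n_bins), n_bins - 1)].append(idx)
    return bins

# Hypothetical data: two beats of unequal length and five frame timestamps.
r_peaks = [0.0, 0.8, 1.8]
frame_times = [0.1, 0.5, 0.9, 1.4, 1.7]
gated = gate_frames(frame_times, r_peaks)
print(gated)
```

Frames that land in the same bin come from the same part of the cardiac cycle even though they were acquired in different beats. Aligning them spatially is the registration problem mentioned in the talk.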

482
00:18:49,490 --> 00:18:50,750
OK.

483
00:18:50,750 --> 00:18:52,730
Preamble is almost done.

484
00:18:52,730 --> 00:18:54,770
OK.

485
00:18:54,770 --> 00:18:56,983
So why do we even
imagine any of this stuff

486
00:18:56,983 --> 00:18:57,900
is going to be useful?

487
00:18:57,900 --> 00:19:02,330
So it turns out that the
practice of interpreting

488
00:19:02,330 --> 00:19:04,520
involves a lot of
manual measurements.

489
00:19:04,520 --> 00:19:07,010
So people like
me, and people who

490
00:19:07,010 --> 00:19:08,660
have trained for
way too long, find

491
00:19:08,660 --> 00:19:11,550
themselves getting little rulers
and measuring various things.

492
00:19:11,550 --> 00:19:14,600
So for example, this is
a narrowing of an artery.

493
00:19:14,600 --> 00:19:16,850
So you could take a little
bit of calipers and measure

494
00:19:16,850 --> 00:19:19,190
across that and
compare it to here

495
00:19:19,190 --> 00:19:22,220
and say, ah, this
is 80% narrowed.

496
00:19:22,220 --> 00:19:24,440
You could measure the
area of this chamber,

497
00:19:24,440 --> 00:19:27,200
the left ventricle, and you
can measure what its area is,

498
00:19:27,200 --> 00:19:29,660
and you can see, ah,
its peak area is this.

499
00:19:29,660 --> 00:19:31,125
Its minimum area is this.

500
00:19:31,125 --> 00:19:33,000
Therefore, it's contracting
a certain amount.
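
Editorial sketch: the two hand measurements just described reduce to simple ratios. The code below assumes the usual definitions of percent diameter stenosis and fractional area change. The function names and the numbers are made up for illustration and are not values from the talk's slides.

```python
def percent_diameter_stenosis(narrowed_mm, reference_mm):
    """Percent narrowing of a vessel relative to a nearby normal segment."""
    return 100.0 * (1.0 - narrowed_mm / reference_mm)

def fractional_area_change(max_area_cm2, min_area_cm2):
    """Percent change in chamber area between its largest and smallest size."""
    return 100.0 * (max_area_cm2 - min_area_cm2) / max_area_cm2

# Hypothetical caliper readings: 1 mm at the narrowing, 5 mm next to it.
stenosis = percent_diameter_stenosis(1.0, 5.0)

# Hypothetical traced areas of the left ventricle at peak and minimum.
fac = fractional_area_change(40.0, 24.0)

print(round(stenosis, 1), round(fac, 1))
```

Both numbers depend entirely on where the reader clicks or traces, which is why these measurements are natural targets for segmentation.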

501
00:19:33,000 --> 00:19:33,917
So we do those things.

502
00:19:33,917 --> 00:19:36,410
We measure those things by hand.

503
00:19:36,410 --> 00:19:39,230
And the other thing we do is we
actually diagnose things just

504
00:19:39,230 --> 00:19:40,080
by looking at them.

505
00:19:40,080 --> 00:19:43,040
So this is a disease called
cardiac amyloid characterized

506
00:19:43,040 --> 00:19:44,030
by some thickening.

507
00:19:44,030 --> 00:19:45,110
I'll show you a little
bit more about that

508
00:19:45,110 --> 00:19:46,130
and some sparkling here.

509
00:19:46,130 --> 00:19:48,440
So people do look and say,
ah, this is what this is.

510
00:19:48,440 --> 00:19:50,703
So there's kind of a
classification problem

511
00:19:50,703 --> 00:19:52,620
that comes either at the
image or video level.

512
00:19:52,620 --> 00:19:54,828
So we'll talk about whether
this is even worth doing.

513
00:19:54,828 --> 00:19:56,020
AUDIENCE: I have a question.

514
00:19:56,020 --> 00:19:56,820
RAHUL DEO: Yes.

515
00:19:56,820 --> 00:19:58,987
AUDIENCE: Is this with
software, or do you literally

516
00:19:58,987 --> 00:20:00,340
take a ruler and measure?

517
00:20:00,340 --> 00:20:03,500
RAHUL DEO: So the software
involves clicking at one point,

518
00:20:03,500 --> 00:20:05,710
stretching something, and
clicking another point.

519
00:20:05,710 --> 00:20:07,793
So it's a little better
than pulling the ruler out

520
00:20:07,793 --> 00:20:11,140
of your back pocket, but
not that much better.

521
00:20:11,140 --> 00:20:11,690
OK.

522
00:20:11,690 --> 00:20:14,580
So we're going to talk
about two or three little areas,

523
00:20:14,580 --> 00:20:15,770
and again, this is not--

524
00:20:15,770 --> 00:20:18,187
I got involved in this really
in the last two years or so.

525
00:20:18,187 --> 00:20:19,978
It's nice of David to
ask me to speak here,

526
00:20:19,978 --> 00:20:21,560
but I think there
are probably people

527
00:20:21,560 --> 00:20:24,295
in this room who have a lot
more experience in this space.

528
00:20:24,295 --> 00:20:26,420
But the areas that have
been relevant to what we've

529
00:20:26,420 --> 00:20:29,870
been doing have been image
classification and then

530
00:20:29,870 --> 00:20:30,850
semantic segmentation.

531
00:20:30,850 --> 00:20:33,350
So image classification being
assigning a label to an image,

532
00:20:33,350 --> 00:20:34,340
very great.

533
00:20:34,340 --> 00:20:37,820
Semantic segmentation, assigning
each pixel to a class label,

534
00:20:37,820 --> 00:20:40,278
and we haven't done anything
around the image registration,

535
00:20:40,278 --> 00:20:41,903
but there are some
interesting problems

536
00:20:41,903 --> 00:20:43,260
I've been thinking about there.

537
00:20:43,260 --> 00:20:45,343
And that's really mapping
different sets of images

538
00:20:45,343 --> 00:20:46,830
onto one coordinate system.

539
00:20:46,830 --> 00:20:47,330
OK.

540
00:20:47,330 --> 00:20:50,580
So seems obvious that
image classification would

541
00:20:50,580 --> 00:20:53,458
be something that you would
imagine a physician does,

542
00:20:53,458 --> 00:20:54,750
and so maybe we can mimic that.

543
00:20:54,750 --> 00:20:56,542
Seems like a reasonable
thing that happens.

544
00:20:56,542 --> 00:20:59,610
So lots of things that
radiologists, people

545
00:20:59,610 --> 00:21:03,690
who interpret images, do
involve pattern recognition,

546
00:21:03,690 --> 00:21:04,890
and they're really fast.

547
00:21:04,890 --> 00:21:08,400
So it takes them a couple of
minutes to often do things

548
00:21:08,400 --> 00:21:10,860
like detect if there's
cancer, detect if somebody has

549
00:21:10,860 --> 00:21:13,530
pneumonia, detect if there's
breast cancer in a mammogram,

550
00:21:13,530 --> 00:21:14,580
tell if there's
fluid in the heart,

551
00:21:14,580 --> 00:21:17,038
and then even less than that,
one minute often, 30 seconds,

552
00:21:17,038 --> 00:21:19,300
they can be very, very fast.

553
00:21:19,300 --> 00:21:23,160
So you can imagine
the wave of excitement

554
00:21:23,160 --> 00:21:26,730
around image classification
was really post-ImageNet,

555
00:21:26,730 --> 00:21:28,830
so maybe about three years,
four years, or so ago.

556
00:21:28,830 --> 00:21:30,455
We're always a little
slow in medicine,

557
00:21:30,455 --> 00:21:32,985
so a little bit
behind other fields.

558
00:21:32,985 --> 00:21:34,860
And the places that they
went were the places

559
00:21:34,860 --> 00:21:36,810
where there are huge
data sets already,

560
00:21:36,810 --> 00:21:38,620
and where there's simple
recognition tasks.

561
00:21:38,620 --> 00:21:40,710
So chest X-rays and
mammograms are both places

562
00:21:40,710 --> 00:21:43,410
that had a lot of
attention, and other places

563
00:21:43,410 --> 00:21:46,310
have been slowed down by just
how hard it is to get data.

564
00:21:46,310 --> 00:21:48,060
So if you can't get a
big enough data set,

565
00:21:48,060 --> 00:21:49,893
then you're not going
to be able to do much.

566
00:21:49,893 --> 00:21:50,460
OK.

567
00:21:50,460 --> 00:21:54,378
So David mentioned you guys
already covered this very nicely,

568
00:21:54,378 --> 00:21:55,920
and this is probably
kind of old hat.

569
00:21:55,920 --> 00:21:58,650
But I would say that prior to
convolutional neural networks,

570
00:21:58,650 --> 00:22:00,840
nothing was happening in
the image classification

571
00:22:00,840 --> 00:22:01,600
space in medicine.

572
00:22:01,600 --> 00:22:02,593
It was just not.

573
00:22:02,593 --> 00:22:05,010
People weren't even thinking
that it was even worth doing.

574
00:22:05,010 --> 00:22:07,110
Now, there's a lot
of interest, and so I

575
00:22:07,110 --> 00:22:10,650
have many different companies
coming and asking for help

576
00:22:10,650 --> 00:22:11,780
with some of these things.

577
00:22:11,780 --> 00:22:16,290
And so it is now a
very attractive thing

578
00:22:16,290 --> 00:22:17,713
in terms of
thinking, and I think

579
00:22:17,713 --> 00:22:19,380
people haven't thought
out all that well

580
00:22:19,380 --> 00:22:21,630
how we're going to use that.

581
00:22:21,630 --> 00:22:23,885
So for example, if it takes
a radiologist a minute

582
00:22:23,885 --> 00:22:25,260
to two minutes to
read something,

583
00:22:25,260 --> 00:22:28,360
how much benefit are you
going to get to automate it?

584
00:22:28,360 --> 00:22:30,600
And the real
problem is you can't

585
00:22:30,600 --> 00:22:31,860
take that radiologist away.

586
00:22:31,860 --> 00:22:33,360
They're still there,
because they're

587
00:22:33,360 --> 00:22:34,740
the ones who are on the hook.

588
00:22:34,740 --> 00:22:36,365
And they're going to
get sued, and it's

589
00:22:36,365 --> 00:22:38,460
among the most sued
professions in medicine.

590
00:22:38,460 --> 00:22:42,000
So there's lots of people
who can read an X-ray.

591
00:22:42,000 --> 00:22:44,140
You don't need to have
all that training.

592
00:22:44,140 --> 00:22:46,350
But if you're the one
who's going to be sued,

593
00:22:46,350 --> 00:22:48,070
it ends up being that
there really isn't

594
00:22:48,070 --> 00:22:49,320
any task shifting in medicine.

595
00:22:49,320 --> 00:22:51,150
There isn't that
kind of, oh, I'm

596
00:22:51,150 --> 00:22:53,940
going to let such
and such take on 99%,

597
00:22:53,940 --> 00:22:55,690
and just tell me when
there is a problem.

598
00:22:55,690 --> 00:22:58,260
It just doesn't happen, because
they ultimately don't feel

599
00:22:58,260 --> 00:22:59,610
comfortable passing that on.

600
00:22:59,610 --> 00:23:01,540
So that's something
to think about.

601
00:23:01,540 --> 00:23:03,540
So you have a task
that's relatively

602
00:23:03,540 --> 00:23:06,240
easy for a very, very expensive
and skilled person to do,

603
00:23:06,240 --> 00:23:07,990
and they refuse to give it up.

604
00:23:07,990 --> 00:23:08,490
OK.

605
00:23:08,490 --> 00:23:10,603
So that's a problem,
but you can imagine

606
00:23:10,603 --> 00:23:13,020
there are some scenarios-- and
we'll talk more about this--

607
00:23:13,020 --> 00:23:14,103
as to where that could be.

608
00:23:14,103 --> 00:23:16,050
So let's say it's overnight.

609
00:23:16,050 --> 00:23:18,660
The radiologist is sleeping
comfortably at home,

610
00:23:18,660 --> 00:23:20,790
and you have a bunch
of studies being

611
00:23:20,790 --> 00:23:22,235
done in the emergency room.

612
00:23:22,235 --> 00:23:24,360
And you want to figure out,
OK, which one should we

613
00:23:24,360 --> 00:23:25,050
call them about?

614
00:23:25,050 --> 00:23:26,760
So you can imagine
there could be triage,

615
00:23:26,760 --> 00:23:30,300
because the status quo would
be, we'll take them one by one.

616
00:23:30,300 --> 00:23:32,850
Maybe you could imagine sifting
through them quickly and then

617
00:23:32,850 --> 00:23:34,230
re-prioritizing them.

618
00:23:34,230 --> 00:23:35,412
They'll still be looked at.

619
00:23:35,412 --> 00:23:37,120
Every single one will
still be looked at.

620
00:23:37,120 --> 00:23:38,412
It's just the order may change.
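
Editorial sketch: the triage scenario above amounts to sorting a worklist by a model score without dropping anything. The study identifiers and scores below are hypothetical, and the score is assumed to be some model's estimate that a study contains an urgent finding.

```python
def reprioritize(studies):
    """Return the same studies, most suspicious first.

    Nothing is filtered out: every study stays on the list and is still
    read by the radiologist. Only the reading order changes.
    """
    return sorted(studies, key=lambda s: s["score"], reverse=True)

# Hypothetical overnight queue in arrival order.
overnight = [
    {"id": "study-A", "score": 0.05},
    {"id": "study-B", "score": 0.92},
    {"id": "study-C", "score": 0.40},
]

worklist = reprioritize(overnight)
print([s["id"] for s in worklist])
```

Because the model only reorders the queue, a wrong score delays a read rather than suppressing it, which fits the point that the radiologist remains responsible for every study.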

621
00:23:38,412 --> 00:23:39,960
So that's an example,
and you could

622
00:23:39,960 --> 00:23:42,450
imagine there could be
separate-- someone else could

623
00:23:42,450 --> 00:23:43,540
read at the same time.

624
00:23:43,540 --> 00:23:44,910
And we'll come back
to this in terms of

625
00:23:44,910 --> 00:23:46,285
whether or not
you could have two

626
00:23:46,285 --> 00:23:49,500
streams and whether or not
that is a scenario that

627
00:23:49,500 --> 00:23:50,478
would make some sense.

628
00:23:50,478 --> 00:23:52,020
And maybe, in
resource-poor settings,

629
00:23:52,020 --> 00:23:53,670
where we're not teeming
with radiologists,

630
00:23:53,670 --> 00:23:54,795
maybe that makes sense too.

631
00:23:54,795 --> 00:23:57,210
So we'll come back to that too.

632
00:23:57,210 --> 00:23:57,810
OK.

633
00:23:57,810 --> 00:23:59,460
So here's another problem.

634
00:23:59,460 --> 00:24:03,420
So almost everything in
medicine requires some element

635
00:24:03,420 --> 00:24:06,090
of confirmation of
a visual finding,

636
00:24:06,090 --> 00:24:08,100
and some of the reasons
are very simple.

637
00:24:08,100 --> 00:24:11,248
So let's say you want to talk
about there being a tumor.

638
00:24:11,248 --> 00:24:13,290
So if you're going to ask
a surgeon to biopsy it,

639
00:24:13,290 --> 00:24:15,440
you better tell
them where it is.

640
00:24:15,440 --> 00:24:17,220
It's not enough to
just say, this image

641
00:24:17,220 --> 00:24:18,990
has a tumor somewhere on it.

642
00:24:18,990 --> 00:24:20,893
So there is some element
of that that you're

643
00:24:20,893 --> 00:24:23,310
going to need to be a little
bit more detailed than simply

644
00:24:23,310 --> 00:24:26,100
making a classification
at the level of one image,

645
00:24:26,100 --> 00:24:28,702
but I would say beyond that.

646
00:24:28,702 --> 00:24:30,910
Let's say, I'm going to try
to get one of my patients

647
00:24:30,910 --> 00:24:33,600
to go for valve surgery.

648
00:24:33,600 --> 00:24:36,060
I'll sit with them,
bring up their echo,

649
00:24:36,060 --> 00:24:39,120
sit side by side with them,
and point them to where it is.

650
00:24:39,120 --> 00:24:41,090
Bring up a normal
one and compare,

651
00:24:41,090 --> 00:24:43,240
because I want them to be
involved in the decision.

652
00:24:43,240 --> 00:24:45,387
I want them to feel like
they're not just trust--

653
00:24:45,387 --> 00:24:46,470
and they have to trust me.

654
00:24:46,470 --> 00:24:47,370
At the end of the
day, they don't even

655
00:24:47,370 --> 00:24:48,210
know that I'm showing--

656
00:24:48,210 --> 00:24:50,210
I'll show them their name,
but ultimately, there

657
00:24:50,210 --> 00:24:51,310
is some element of trust.

658
00:24:51,310 --> 00:24:53,440
They're not able to do
this, but at the same time,

659
00:24:53,440 --> 00:24:55,560
there is this sense of
shared decision making.

660
00:24:55,560 --> 00:24:58,920
You're trying to communicate to
somebody, whose life is really

661
00:24:58,920 --> 00:25:02,380
at risk here, that this is
why we're making this decision.

662
00:25:02,380 --> 00:25:04,698
So the more you could imagine
that there is obscuring,

663
00:25:04,698 --> 00:25:06,490
the more difficult it
is to make that case.

664
00:25:06,490 --> 00:25:08,460
So medicine is this--

665
00:25:08,460 --> 00:25:12,560
I found this review by Bin Yu
from Berkeley, just came out,

666
00:25:12,560 --> 00:25:15,900
and it talks about this tension
between predictive accuracy

667
00:25:15,900 --> 00:25:17,510
and descriptive accuracy.

668
00:25:17,510 --> 00:25:21,005
So this is the typical thing
we think about that matters,

669
00:25:21,005 --> 00:25:22,380
and there's lots
of people who've

670
00:25:22,380 --> 00:25:24,340
written about this thing.

671
00:25:24,340 --> 00:25:28,470
Medicine is tough in that it's
very demanding in this space

672
00:25:28,470 --> 00:25:32,340
here, and it's almost
inflexible in this space here.

673
00:25:32,340 --> 00:25:34,260
So it's a tough nut
to crack in terms

674
00:25:34,260 --> 00:25:35,760
of being able to
make some progress,

675
00:25:35,760 --> 00:25:38,460
and so we'll talk more about
when that's likely to happen.

676
00:25:38,460 --> 00:25:38,960
OK.

677
00:25:38,960 --> 00:25:42,060
So this again may be something
that's very familiar to you.

678
00:25:42,060 --> 00:25:44,910
So we had this problem in
terms of some of the disease

679
00:25:44,910 --> 00:25:46,620
detection models,
and I didn't find

680
00:25:46,620 --> 00:25:48,150
this all that
satisfying in terms

681
00:25:48,150 --> 00:25:50,010
of being able to
successfully localize.

682
00:25:50,010 --> 00:25:51,635
So just digging
through the literature,

683
00:25:51,635 --> 00:25:55,230
it looks like this idea
of being able to explain

684
00:25:55,230 --> 00:25:57,690
what part of the
image is driving

685
00:25:57,690 --> 00:26:01,510
a certain classification.

686
00:26:01,510 --> 00:26:03,690
That field is modestly old.

687
00:26:03,690 --> 00:26:05,400
Maybe it goes back before that.

688
00:26:05,400 --> 00:26:07,110
But ultimately,
there's two broad ways.

689
00:26:07,110 --> 00:26:10,260
You can imagine finding an
exemplary image that maximally

690
00:26:10,260 --> 00:26:12,960
activates the classifier,
or you can take a given image

691
00:26:12,960 --> 00:26:17,010
and say, what aspect of it is
driving the classification?

692
00:26:17,010 --> 00:26:20,040
And so this paper here
did both those things.

693
00:26:20,040 --> 00:26:23,135
They either went
through and optimized--

694
00:26:23,135 --> 00:26:25,260
starting from an average
of all the training data--

695
00:26:25,260 --> 00:26:28,020
they optimized the intensities
until they maximized

696
00:26:28,020 --> 00:26:29,580
the score for a given class.

697
00:26:29,580 --> 00:26:31,270
So that's what's shown here.

698
00:26:31,270 --> 00:26:33,660
And then another way to do
it is in some sense you could

699
00:26:33,660 --> 00:26:36,240
take a derivative of
the score function

700
00:26:36,240 --> 00:26:38,430
relative to the intensities
of all the pixels

701
00:26:38,430 --> 00:26:39,210
and come up with
something like this.

702
00:26:39,210 --> 00:26:41,070
But you could imagine, if
you showed this to a patient,

703
00:26:41,070 --> 00:26:42,750
they wouldn't be very satisfied.

704
00:26:42,750 --> 00:26:47,850
So it's very difficult to make a
case that this is super useful,

705
00:26:47,850 --> 00:26:50,400
but it seems like this field
has progressed somewhat,

706
00:26:50,400 --> 00:26:51,970
and I haven't tried this out.

707
00:26:51,970 --> 00:26:53,930
This is a paper by Max
Welling and company,

708
00:26:53,930 --> 00:26:55,320
out a couple of
years ago, and maybe you guys

709
00:26:55,320 --> 00:26:56,550
are familiar with this.

710
00:26:56,550 --> 00:26:59,008
But this ultimately is a little
bit of a different approach

711
00:26:59,008 --> 00:27:01,230
in the sense that
they take patches,

712
00:27:01,230 --> 00:27:03,570
the sort of
purple-like patch here,

713
00:27:03,570 --> 00:27:10,440
and they compare the final
score, or class label,

714
00:27:10,440 --> 00:27:12,180
relative to what it--

715
00:27:12,180 --> 00:27:15,090
so taking the intensity
here and replacing it

716
00:27:15,090 --> 00:27:18,342
by a conditional result
sampling from the periphery.

717
00:27:18,342 --> 00:27:19,800
And just comparing
those two things

718
00:27:19,800 --> 00:27:22,210
and seeing whether or not
you either get activation,

719
00:27:22,210 --> 00:27:24,960
which is the red here.

720
00:27:24,960 --> 00:27:27,360
This is the way that they
did the conditional sampling,

721
00:27:27,360 --> 00:27:29,802
and then blue would be
the negative contributors.
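
Editorial sketch: the patch-replacement idea described above can be shown in a heavily simplified form. Assumptions: the paper samples the replacement from a conditional distribution given the surrounding pixels, whereas this sketch just uses the mean of the one-pixel ring around the patch. It also uses non-overlapping patches and a toy scoring function in place of a trained classifier. All names are hypothetical.

```python
def ring_mean(image, r0, c0, size):
    """Mean of the one-pixel ring around a patch, clipped to the image.

    Assumes the patch is smaller than the image so the ring is not empty.
    """
    h, w = len(image), len(image[0])
    vals = []
    for r in range(max(r0 - 1, 0), min(r0 + size + 1, h)):
        for c in range(max(c0 - 1, 0), min(c0 + size + 1, w)):
            inside = r0 <= r < r0 + size and c0 <= c < c0 + size
            if not inside:
                vals.append(image[r][c])
    return sum(vals) / len(vals)

def occlusion_map(image, score_fn, size=2):
    """Score drop when each patch is replaced by its surroundings' mean.

    Positive values mark patches that support the class (the red regions
    in the talk); negative values mark patches that count against it.
    """
    h, w = len(image), len(image[0])
    base = score_fn(image)
    relevance = [[0.0] * w for _ in range(h)]
    for r0 in range(0, h - size + 1, size):
        for c0 in range(0, w - size + 1, size):
            fill = ring_mean(image, r0, c0, size)
            occluded = [row[:] for row in image]
            for r in range(r0, r0 + size):
                for c in range(c0, c0 + size):
                    occluded[r][c] = fill
            diff = base - score_fn(occluded)
            for r in range(r0, r0 + size):
                for c in range(c0, c0 + size):
                    relevance[r][c] = diff
    return relevance

def toy_score(image):
    """Stand-in classifier: responds to brightness in the top-left corner."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0

image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
heat = occlusion_map(image, toy_score)
print(heat[0])
```

The `size` argument plays the role of the patch size mentioned for the brain MRI example: larger patches give coarser localization.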

722
00:27:29,802 --> 00:27:31,260
And there, you can
imagine, there's

723
00:27:31,260 --> 00:27:32,460
a little bit more
distinction here,

724
00:27:32,460 --> 00:27:34,793
and then something a little
bit more on the medical side

725
00:27:34,793 --> 00:27:35,910
is this is a brain MRI.

726
00:27:35,910 --> 00:27:37,980
And so depending
on this patch size,

727
00:27:37,980 --> 00:27:40,860
you get a different
degree of resolution

728
00:27:40,860 --> 00:27:44,590
to localizing some areas of
the image that are relevant.

729
00:27:44,590 --> 00:27:46,710
So this is something
that we're going

730
00:27:46,710 --> 00:27:50,955
to expect a lot of demands
from the medical field in terms

731
00:27:50,955 --> 00:27:52,080
of being able to show this.

732
00:27:52,080 --> 00:27:53,550
And at least our
initial forays weren't

733
00:27:53,550 --> 00:27:55,675
very satisfying doing this
with what we were doing,

734
00:27:55,675 --> 00:27:57,930
but maybe these algorithms
have gotten better.

735
00:27:57,930 --> 00:27:58,430
OK.

736
00:27:58,430 --> 00:27:59,730
So next thing that matters.

737
00:27:59,730 --> 00:28:00,230
OK.

738
00:28:00,230 --> 00:28:01,710
So this is what people do.

739
00:28:01,710 --> 00:28:06,360
So I did my cardiology
fellowship in MGH,

740
00:28:06,360 --> 00:28:07,820
and I just traced circles.

741
00:28:07,820 --> 00:28:08,570
That's what I did.

742
00:28:08,570 --> 00:28:11,820
I just traced circles, and
I stretched a ruler across,

743
00:28:11,820 --> 00:28:12,750
and then fed that in.

744
00:28:12,750 --> 00:28:14,910
At least the program
computed the volumes

745
00:28:14,910 --> 00:28:17,980
for me, the areas and
volumes, but otherwise, you

746
00:28:17,980 --> 00:28:20,100
have to do this yourself.

747
00:28:20,100 --> 00:28:23,880
And so this is like
a task that's done,

748
00:28:23,880 --> 00:28:26,040
and sometimes you may have to--

749
00:28:26,040 --> 00:28:28,680
here's an example of
volumes being computed

750
00:28:28,680 --> 00:28:32,238
by tracing these sorts of things
and many radiology reports just

751
00:28:32,238 --> 00:28:33,030
involve doing that.

752
00:28:33,030 --> 00:28:34,800
So this seems like a
very obvious task we

753
00:28:34,800 --> 00:28:36,870
should be able to improve on.

754
00:28:36,870 --> 00:28:39,060
So medicine tends
to be not the most

755
00:28:39,060 --> 00:28:40,675
creative in terms
of trying a bunch

756
00:28:40,675 --> 00:28:41,800
of different architectures.

757
00:28:41,800 --> 00:28:44,190
So if you look at the papers,
they all jump on the U-Net

758
00:28:44,190 --> 00:28:47,310
as being the
favorite architecture

759
00:28:47,310 --> 00:28:48,870
for semantic segmentation.

760
00:28:48,870 --> 00:28:51,000
So maybe familiar
to people here,

761
00:28:51,000 --> 00:28:55,760
really, it just captures this
encoding or contracting layer.

762
00:28:55,760 --> 00:28:58,020
Where you're downsampling,
and then there's

763
00:28:58,020 --> 00:29:00,600
a symmetric upsampling
that takes place.

764
00:29:00,600 --> 00:29:03,300
And then ultimately, there's
these skip connections, where

765
00:29:03,300 --> 00:29:07,290
you take an image, and
then you concatenate it

766
00:29:07,290 --> 00:29:10,410
with this upsampled layer, and
this helps get a little bit

767
00:29:10,410 --> 00:29:11,160
more localization.

768
00:29:11,160 --> 00:29:12,827
So we used this for
our paper, and we'll

769
00:29:12,827 --> 00:29:15,090
talk about this a little
bit, and it's very popular

770
00:29:15,090 --> 00:29:16,742
within the medical literature.

771
00:29:16,742 --> 00:29:18,450
One of the things that
was quite annoying

772
00:29:18,450 --> 00:29:20,802
is that what you would find
for some of the images,

773
00:29:20,802 --> 00:29:22,260
you'd find, let's
say, a ventricle.

774
00:29:22,260 --> 00:29:24,180
You'd find this
nicely segmented area,

775
00:29:24,180 --> 00:29:26,305
and then you'd find this
little satellite ventricle

776
00:29:26,305 --> 00:29:28,200
that the image would just pick.

777
00:29:28,200 --> 00:29:31,350
The problem is that this
pixel-level classification

778
00:29:31,350 --> 00:29:33,690
tends to be a
problem, and a human

779
00:29:33,690 --> 00:29:35,010
would never make that mistake.

780
00:29:35,010 --> 00:29:38,370
But that tends to be something
that sounds like it is common

781
00:29:38,370 --> 00:29:40,740
in the-- this is a
common tension is

782
00:29:40,740 --> 00:29:45,750
that this sort of focusing
on relatively limited scales

783
00:29:45,750 --> 00:29:50,040
ends up being problematic,
when it comes to picking up

784
00:29:50,040 --> 00:29:51,100
the global architecture.

785
00:29:51,100 --> 00:29:52,440
And so there's lots
of different solutions

786
00:29:52,440 --> 00:29:53,800
it looks like in the literature.

787
00:29:53,800 --> 00:29:55,675
I just highlighted some
of these from a paper

788
00:29:55,675 --> 00:29:57,785
that was published from
Google a little while ago.

789
00:29:57,785 --> 00:29:59,160
One of the things
that's captured

790
00:29:59,160 --> 00:30:01,500
is these ideas of
dilated convolutions,

791
00:30:01,500 --> 00:30:04,440
and so that you
have convolutions

792
00:30:04,440 --> 00:30:05,520
built on convolutions.

793
00:30:05,520 --> 00:30:08,280
And so ultimately, you have
a much bigger receptive field

794
00:30:08,280 --> 00:30:11,513
for this layer, though
you haven't really

795
00:30:11,513 --> 00:30:12,930
increased the
number of parameters

796
00:30:12,930 --> 00:30:13,560
that you have to learn.

797
00:30:13,560 --> 00:30:14,350
So there is some.

798
00:30:14,350 --> 00:30:15,475
It seems like there's lots.

799
00:30:15,475 --> 00:30:17,400
This is not just
a problem for us

800
00:30:17,400 --> 00:30:19,323
but a problem for many
people in this field.

801
00:30:19,323 --> 00:30:21,240
So we need to be a little
bit more adventurous

802
00:30:21,240 --> 00:30:23,198
in terms of trying some
of these other methods.

803
00:30:23,198 --> 00:30:26,190
We did try a little bit of
that and didn't find any gains,

804
00:30:26,190 --> 00:30:27,690
but I think,
ultimately, there still

805
00:30:27,690 --> 00:30:29,270
needs to be a little
bit more work there.

806
00:30:29,270 --> 00:30:29,610
OK.

807
00:30:29,610 --> 00:30:31,318
So the last thing I'm
going to talk about

808
00:30:31,318 --> 00:30:33,840
before getting into my
work is really this idea

809
00:30:33,840 --> 00:30:35,490
of image registration.

810
00:30:35,490 --> 00:30:38,400
So I talked about how there are
sometimes some techniques that

811
00:30:38,400 --> 00:30:42,462
have limitations, either in
terms of spatial resolution

812
00:30:42,462 --> 00:30:43,420
or temporal resolution.

813
00:30:43,420 --> 00:30:46,940
So this is a PET scan here,
this sort of reddish glow here,

814
00:30:46,940 --> 00:30:49,780
and in the background, we
have a CAT scan of the heart.

815
00:30:49,780 --> 00:30:52,280
And so clearly, this is a
poorly registered image,

816
00:30:52,280 --> 00:30:55,450
where you have the PET scan kind
of floating out here, when it

817
00:30:55,450 --> 00:30:56,785
really should be lined up here.

818
00:30:56,785 --> 00:30:59,160
And so you have something
that's registered better there.

819
00:30:59,160 --> 00:31:00,868
I also mentioned this
problem of gating.

820
00:31:00,868 --> 00:31:03,520
So ultimately, if you
have an image taken

821
00:31:03,520 --> 00:31:05,560
from different
cardiac cycles, you're

822
00:31:05,560 --> 00:31:08,500
going to have to align
them in some way.

823
00:31:08,500 --> 00:31:10,870
It seems like a very mature
problem in computer vision

824
00:31:10,870 --> 00:31:11,530
world.

825
00:31:11,530 --> 00:31:13,640
We haven't done
anything in this space,

826
00:31:13,640 --> 00:31:16,300
but ultimately, it has
been around for decades.

827
00:31:16,300 --> 00:31:19,907
If not, I would just at least
touch it, touch upon it.

828
00:31:19,907 --> 00:31:21,490
So this is sort of
the old school way,

829
00:31:21,490 --> 00:31:23,110
and then now people
are starting to use

830
00:31:23,110 --> 00:31:24,610
conditional variational
autoencoders

831
00:31:24,610 --> 00:31:27,880
to be able to learn
geometric transformations.

832
00:31:27,880 --> 00:31:32,430
This is the Siemens group out in
Princeton that has this paper.

833
00:31:32,430 --> 00:31:34,180
Again, nothing I'm
going to focus on, just

834
00:31:34,180 --> 00:31:36,250
wanted to bring it up
as being an area that

835
00:31:36,250 --> 00:31:38,300
remains of interest.

836
00:31:38,300 --> 00:31:39,280
OK.

837
00:31:39,280 --> 00:31:45,320
So I think we're doing
OK, but you said 4:00.

838
00:31:45,320 --> 00:31:46,060
PROFESSOR: 3:55.

839
00:31:46,060 --> 00:31:46,600
RAHUL DEO: 3:55.

840
00:31:46,600 --> 00:31:47,110
OK.

841
00:31:47,110 --> 00:31:47,920
All right, and interrupt.

842
00:31:47,920 --> 00:31:48,670
Please, interrupt.

843
00:31:48,670 --> 00:31:49,720
OK?

844
00:31:49,720 --> 00:31:52,650
I'm hoping that I'm
not talking too fast.

845
00:31:52,650 --> 00:31:54,530
OK.

846
00:31:54,530 --> 00:31:58,160
As David said, this
was not my field,

847
00:31:58,160 --> 00:32:00,328
but increasingly,
there is some interest

848
00:32:00,328 --> 00:32:02,120
in terms of getting
involved in it, in part

849
00:32:02,120 --> 00:32:03,920
because of my frustrations
with clinical medicine.

850
00:32:03,920 --> 00:32:05,295
So this is one of
my frustrations

851
00:32:05,295 --> 00:32:06,470
with clinical medicine.

852
00:32:06,470 --> 00:32:10,380
So cardiology has
not really changed,

853
00:32:10,380 --> 00:32:13,850
and one of the things
it fails at miserably

854
00:32:13,850 --> 00:32:17,630
is picking up
early-onset disease.

855
00:32:17,630 --> 00:32:20,690
So here's the typical
profile, a little facetious.

856
00:32:20,690 --> 00:32:24,230
So people like me
in our early 40s,

857
00:32:24,230 --> 00:32:26,365
start to already
have some problems

858
00:32:26,365 --> 00:32:27,490
with some of these numbers.

859
00:32:27,490 --> 00:32:29,930
So I like to joke that, since I
came back to the Harvard system

860
00:32:29,930 --> 00:32:31,847
from California, my blood
pressure has gone up

861
00:32:31,847 --> 00:32:34,450
10 points which is
true, unfortunately.

862
00:32:34,450 --> 00:32:38,020
So these changes
already start to happen,

863
00:32:38,020 --> 00:32:40,310
and nobody does
anything about it.

864
00:32:40,310 --> 00:32:43,495
So you can go to your doctor,
and you're also saying,

865
00:32:43,495 --> 00:32:45,120
no, I don't want to
be on any medicine.

866
00:32:45,120 --> 00:32:47,412
They're like, no, no, you
shouldn't be on any medicine.

867
00:32:47,412 --> 00:32:52,223
So you kind of hem and haw, and a
decade goes by, 15 years go by.

868
00:32:52,223 --> 00:32:53,890
And then finally,
you're like, OK, well,

869
00:32:53,890 --> 00:32:55,380
it looks like at
least my coworkers

870
00:32:55,380 --> 00:32:58,733
are on some medicines, or maybe
I'll be willing to do that.

871
00:32:58,733 --> 00:33:00,900
And so they've got lots of
stuff you can be treated with,

872
00:33:00,900 --> 00:33:03,382
but it is often very
difficult, and you see

873
00:33:03,382 --> 00:33:04,590
this at the doctor level too.

874
00:33:04,590 --> 00:33:05,155
Yes.

875
00:33:05,155 --> 00:33:06,530
AUDIENCE: For the
optimal values,

876
00:33:06,530 --> 00:33:11,990
how much personal deviation
is there for the values?

877
00:33:11,990 --> 00:33:17,860
RAHUL DEO: So the optimal
value is fixed and is just

878
00:33:17,860 --> 00:33:19,780
like a reference value.

879
00:33:19,780 --> 00:33:21,900
And you can be off--

880
00:33:21,900 --> 00:33:23,380
so blood pressure, let's say.

881
00:33:23,380 --> 00:33:25,420
So people consider optimal
to be less than 120

882
00:33:25,420 --> 00:33:27,600
over less than 80.

883
00:33:27,600 --> 00:33:29,715
People are in the 200s.

884
00:33:29,715 --> 00:33:31,590
So you'd be treated in
the 200s, but there'll

885
00:33:31,590 --> 00:33:34,077
be lots of people in
the 140s and the 150s,

886
00:33:34,077 --> 00:33:35,910
and there'll be a degree
of kind of nihilism

887
00:33:35,910 --> 00:33:38,010
about that for some time.

888
00:33:38,010 --> 00:33:40,560
And my patients would be
like, oh, I got into a fight

889
00:33:40,560 --> 00:33:42,930
with the parking attendant.

890
00:33:42,930 --> 00:33:45,160
I just had a really bad phone--

891
00:33:45,160 --> 00:33:47,250
there's like countless
excuses for why

892
00:33:47,250 --> 00:33:49,552
it is that one shouldn't
start a medication,

893
00:33:49,552 --> 00:33:51,010
and this can go on
for a long time.

894
00:33:51,010 --> 00:33:51,553
Yes.

895
00:33:51,553 --> 00:33:52,470
AUDIENCE: [INAUDIBLE].

896
00:33:52,470 --> 00:33:55,676
How can you assess the risk
[INAUDIBLE] for blood pressure?

897
00:33:55,676 --> 00:33:59,127
Is that, like,
noise [INAUDIBLE]?

898
00:34:03,933 --> 00:34:04,600
RAHUL DEO: Yeah.

899
00:34:04,600 --> 00:34:05,130
So OK.

900
00:34:05,130 --> 00:34:06,720
So that's a great point.

901
00:34:06,720 --> 00:34:08,100
So yeah.

902
00:34:08,100 --> 00:34:12,030
So the question is
that many of the things

903
00:34:12,030 --> 00:34:14,260
that we're seeing
as risk factors have

904
00:34:14,260 --> 00:34:15,750
inherent variability to them.

905
00:34:15,750 --> 00:34:19,050
Blood sugar is another great
example of those things.

906
00:34:19,050 --> 00:34:21,210
If you could have a
single-point estimate that

907
00:34:21,210 --> 00:34:23,370
arises in the setting of
a single clinic visit,

908
00:34:23,370 --> 00:34:24,270
how much do you trust that?

909
00:34:24,270 --> 00:34:26,062
So it's a couple of
things related to that.

910
00:34:26,062 --> 00:34:28,139
So one of them is
that people could

911
00:34:28,139 --> 00:34:31,710
be sent home with monitors, and
they can have 24-hour monitors.

912
00:34:31,710 --> 00:34:34,710
In Europe, that's much
more often done than here.

913
00:34:34,710 --> 00:34:37,290
And then, the thing is that
often they'll say that,

914
00:34:37,290 --> 00:34:40,020
and then you go look at
like six consecutive visits,

915
00:34:40,020 --> 00:34:42,420
and they all have something
elevated, but it's true.

916
00:34:42,420 --> 00:34:46,260
This is a noisy point
estimate, and people

917
00:34:46,260 --> 00:34:49,270
have shown that averages
tend to do better.

918
00:34:49,270 --> 00:34:53,090
But at the same time,
if that's all you have--

919
00:34:53,090 --> 00:34:54,449
and the bias is interesting.

920
00:34:54,449 --> 00:34:57,685
Because the bias comes
from some degree of stress,

921
00:34:57,685 --> 00:34:59,310
but we have lots of
stress in our life.

922
00:34:59,310 --> 00:35:01,060
I hopefully am not the
most stressful part

923
00:35:01,060 --> 00:35:04,050
of my patient's life, and so
I think that ultimately there

924
00:35:04,050 --> 00:35:05,760
are--

925
00:35:05,760 --> 00:35:07,742
and the problem
with that is it's

926
00:35:07,742 --> 00:35:09,450
a good reason for
someone to talk you out

927
00:35:09,450 --> 00:35:10,930
of starting
them on anything.

928
00:35:10,930 --> 00:35:13,320
And that's what
ends up happening,

929
00:35:13,320 --> 00:35:16,270
and so this can be a
really long period of time.

930
00:35:16,270 --> 00:35:16,770
OK.

931
00:35:16,770 --> 00:35:17,530
So this is the grim part.

932
00:35:17,530 --> 00:35:18,030
OK?

933
00:35:18,030 --> 00:35:19,560
So it turns out
that once symptoms

934
00:35:19,560 --> 00:35:22,830
develop for something like
heart failure, decline is fast.

935
00:35:22,830 --> 00:35:26,322
So 50% mortality in five
years, after somebody gets

936
00:35:26,322 --> 00:35:28,530
hospitalized for their first
heart failure admission,

937
00:35:28,530 --> 00:35:30,960
and often the symptoms
are just around that time.

938
00:35:30,960 --> 00:35:32,670
So unfortunately,
these things tend

939
00:35:32,670 --> 00:35:36,982
to be irreversible changes
that happen in the background,

940
00:35:36,982 --> 00:35:38,940
and largely, you don't
really have any symptoms

941
00:35:38,940 --> 00:35:40,030
until late in the game.

942
00:35:40,030 --> 00:35:42,630
So we have this problem, where
we have this huge stretch.

943
00:35:42,630 --> 00:35:44,250
We know that there
are risk factors,

944
00:35:44,250 --> 00:35:46,792
but we have this huge stretch,
where nobody is doing anything

945
00:35:46,792 --> 00:35:47,470
about them.

946
00:35:47,470 --> 00:35:49,740
And then we have sort of things
going downhill relatively

947
00:35:49,740 --> 00:35:51,030
quickly after that.

948
00:35:51,030 --> 00:35:53,310
And unfortunately,
I would make a case

949
00:35:53,310 --> 00:35:55,050
that probably
responsiveness is probably

950
00:35:55,050 --> 00:35:56,850
best during this phase over there.

951
00:35:56,850 --> 00:35:59,160
Expense is really
all over there.

952
00:35:59,160 --> 00:36:00,600
So we really want to find--

953
00:36:00,600 --> 00:36:02,850
and this is what I consider
to be missing in medicine.

954
00:36:02,850 --> 00:36:04,620
I'm going to come back to
this again a little bit later

955
00:36:04,620 --> 00:36:06,390
on-- but really, we
want to have these--

956
00:36:06,390 --> 00:36:08,910
if you're going to do something
in this asymptomatic phase,

957
00:36:08,910 --> 00:36:09,930
it better be cheap.

958
00:36:09,930 --> 00:36:11,970
You're not going to be
getting MRIs every day

959
00:36:11,970 --> 00:36:16,820
or every year for people
who have no symptoms.

960
00:36:16,820 --> 00:36:18,570
The system would
go bankrupt if you had that.

961
00:36:18,570 --> 00:36:20,460
So we need these
low cost metrics

962
00:36:20,460 --> 00:36:22,570
that can tell us, at
an individual level,

963
00:36:22,570 --> 00:36:25,260
not just if we had
1,000 people like you,

964
00:36:25,260 --> 00:36:27,053
somebody would benefit.

965
00:36:27,053 --> 00:36:28,470
And this is what
my patients would

966
00:36:28,470 --> 00:36:32,190
say is that they would be
so excited about their EKG

967
00:36:32,190 --> 00:36:33,690
or their echo being
done every year,

968
00:36:33,690 --> 00:36:35,130
because they want to know,
how does it look like

969
00:36:35,130 --> 00:36:36,047
compared to last year?

970
00:36:36,047 --> 00:36:38,710
They want some comparison
at their level,

971
00:36:38,710 --> 00:36:41,070
not just some
public health report

972
00:36:41,070 --> 00:36:45,520
about this being a benefit
to 100 people like you.

973
00:36:45,520 --> 00:36:48,480
And so it should
be both low cost,

974
00:36:48,480 --> 00:36:49,860
should be reflective
of something at

975
00:36:49,860 --> 00:36:51,840
an individual level,
should be relatively

976
00:36:51,840 --> 00:36:55,200
specific to the disease
process, expressive in some way,

977
00:36:55,200 --> 00:36:56,670
and should get
better with therapy.

978
00:36:56,670 --> 00:36:57,870
I think that's one
of the things that's

979
00:36:57,870 --> 00:36:59,430
pretty important
is if somebody does

980
00:36:59,430 --> 00:37:01,380
the things you ask
them to do, hopefully,

981
00:37:01,380 --> 00:37:02,580
that will look better.

982
00:37:02,580 --> 00:37:04,920
And then that would
be motivating,

983
00:37:04,920 --> 00:37:06,930
and I think that's how
people get motivated is

984
00:37:06,930 --> 00:37:08,910
that they get responses.

985
00:37:08,910 --> 00:37:11,502
So I would make a case
that even simple things

986
00:37:11,502 --> 00:37:12,960
like an ultrasound--
and I have one

987
00:37:12,960 --> 00:37:14,310
shown here--
really does capture

988
00:37:14,310 --> 00:37:15,780
some of these things,
and not all those things,

989
00:37:15,780 --> 00:37:17,460
but they have some
of those things.

990
00:37:17,460 --> 00:37:20,100
So you have, for example, that
in the setting of high blood

991
00:37:20,100 --> 00:37:22,530
pressure, the left ventricular
mass starts to thicken,

992
00:37:22,530 --> 00:37:25,080
and this is a quantitative,
continuous measure.

993
00:37:25,080 --> 00:37:28,380
It just thickens over time,
and the heart starts to change.

994
00:37:28,380 --> 00:37:31,180
The pumping function
can get worse over time.

995
00:37:31,180 --> 00:37:33,570
The left atrium, which is
this structure over here,

996
00:37:33,570 --> 00:37:35,730
this thin-walled structure
is amazing in the sense

997
00:37:35,730 --> 00:37:38,447
that it's almost this barometer
for the pressure in the heart.

998
00:37:38,447 --> 00:37:39,780
Oh, that's a horrible reference.

999
00:37:39,780 --> 00:37:41,820
OK, but it tends to
get kind of bigger

1000
00:37:41,820 --> 00:37:44,460
and bigger in a very subtle
way before any symptoms happen.

1001
00:37:44,460 --> 00:37:46,320
So you have this, and
this is just one view.

1002
00:37:46,320 --> 00:37:46,820
Right?

1003
00:37:46,820 --> 00:37:48,240
So this is a simple
view acquired

1004
00:37:48,240 --> 00:37:50,250
from an ultrasound
that captures some

1005
00:37:50,250 --> 00:37:52,420
of these things at
an individual level.

1006
00:37:52,420 --> 00:37:54,270
So this gets to
some of my thoughts

1007
00:37:54,270 --> 00:37:57,590
around where we could imagine
automated interpretation

1008
00:37:57,590 --> 00:37:58,660
benefiting.

1009
00:37:58,660 --> 00:38:03,060
So if you want to think about
where you're less likely.

1010
00:38:03,060 --> 00:38:08,400
So with these very, very
difficult, end-stage,

1011
00:38:08,400 --> 00:38:09,960
or complex decisions,
where you have

1012
00:38:09,960 --> 00:38:12,423
a super skilled
person even collecting

1013
00:38:12,423 --> 00:38:13,590
the data in the first place.

1014
00:38:13,590 --> 00:38:14,840
They've gone through training.

1015
00:38:14,840 --> 00:38:16,320
They're super experienced.

1016
00:38:16,320 --> 00:38:18,210
You have a very expensive
piece of hardware

1017
00:38:18,210 --> 00:38:19,740
used to collect the data.

1018
00:38:19,740 --> 00:38:21,210
You have an expert
interpreting it.

1019
00:38:21,210 --> 00:38:23,310
This is done late in
the disease course.

1020
00:38:23,310 --> 00:38:25,590
You have to make
really hard decisions,

1021
00:38:25,590 --> 00:38:27,120
and you don't want
to mess it up.

1022
00:38:27,120 --> 00:38:29,040
So probably not
good places to try

1023
00:38:29,040 --> 00:38:32,303
to stick in an automated
system in there,

1024
00:38:32,303 --> 00:38:33,720
but what would be
attractive would

1025
00:38:33,720 --> 00:38:37,120
be to try to enable studies that
are not even being done at all.

1026
00:38:37,120 --> 00:38:40,470
So move to the
primary care setting.

1027
00:38:40,470 --> 00:38:41,830
Use low cost handhelds.

1028
00:38:41,830 --> 00:38:43,890
So there's even
now companies that

1029
00:38:43,890 --> 00:38:46,710
are starting to try to automate
acquisition of the data

1030
00:38:46,710 --> 00:38:48,780
by helping people
collect it and guide them

1031
00:38:48,780 --> 00:38:50,700
to collecting the right views.

1032
00:38:50,700 --> 00:38:53,280
Early in the disease course,
no real symptoms here.

1033
00:38:53,280 --> 00:38:55,350
Decision support just
around whether you

1034
00:38:55,350 --> 00:38:58,040
should start some meds or
intensify them, low liability,

1035
00:38:58,040 --> 00:38:58,843
low cost.

1036
00:38:58,843 --> 00:39:00,260
So this is a place
where we wanted

1037
00:39:00,260 --> 00:39:02,093
to focus in terms of
being able to introduce

1038
00:39:02,093 --> 00:39:05,460
some kind of innovations
in this space.

1039
00:39:05,460 --> 00:39:05,960
OK.

1040
00:39:05,960 --> 00:39:07,760
So this comes back
to this slide that I

1041
00:39:07,760 --> 00:39:10,190
talked about where you could
imagine some of these things

1042
00:39:10,190 --> 00:39:12,680
being low hanging fruit, but
maybe those aren't the ones

1043
00:39:12,680 --> 00:39:14,638
that we should be focusing
on. We should instead

1044
00:39:14,638 --> 00:39:19,100
be focusing on enabling
more data at low cost,

1045
00:39:19,100 --> 00:39:21,870
getting more out of the
data that we're collecting,

1046
00:39:21,870 --> 00:39:24,120
and helping people even
acquire it in the first place.

1047
00:39:24,120 --> 00:39:26,287
So that's one category of
things, and that's the one

1048
00:39:26,287 --> 00:39:28,105
I just highlighted in
the previous slide.

1049
00:39:28,105 --> 00:39:29,480
You can imagine
something running

1050
00:39:29,480 --> 00:39:31,380
in the background at a
hospital system level

1051
00:39:31,380 --> 00:39:33,380
and just checking to see
whether there's anybody

1052
00:39:33,380 --> 00:39:34,753
who was missed in some ways.

1053
00:39:34,753 --> 00:39:37,170
And then triage I'm going to
talk about in the next slide.

1054
00:39:37,170 --> 00:39:39,140
I'll come back to that,
and then really-- and this

1055
00:39:39,140 --> 00:39:40,280
is, again, one of
the reasons I got

1056
00:39:40,280 --> 00:39:41,947
into this-- we want
to do something that

1057
00:39:41,947 --> 00:39:44,000
elevates practice beyond
just simply repeating

1058
00:39:44,000 --> 00:39:45,540
what we already do.

1059
00:39:45,540 --> 00:39:48,230
And so this idea of
quantitative tracking

1060
00:39:48,230 --> 00:39:50,147
of intermediate states,
subclasses of disease,

1061
00:39:50,147 --> 00:39:52,438
which is actually the real
reason I got into this space

1062
00:39:52,438 --> 00:39:54,410
is because I wanted to
increase scale of data

1063
00:39:54,410 --> 00:39:57,412
to be able to do this, and
this is where you potentially

1064
00:39:57,412 --> 00:39:58,120
would like to go.

1065
00:39:58,120 --> 00:40:00,470
So the ECG example is
an interesting one,

1066
00:40:00,470 --> 00:40:02,810
because automated systems
for ECG interpretation

1067
00:40:02,810 --> 00:40:05,240
have been around
for 40 or 50 years,

1068
00:40:05,240 --> 00:40:11,030
and they really got going
around the early 2000s, when

1069
00:40:11,030 --> 00:40:13,590
people realized--

1070
00:40:13,590 --> 00:40:15,568
there's a pattern
called an ST elevation.

1071
00:40:15,568 --> 00:40:17,360
I'm not sure if you
guys talked about that.

1072
00:40:17,360 --> 00:40:20,225
This is a marker of
complete stoppage

1073
00:40:20,225 --> 00:40:21,350
of blood flow to the heart.

1074
00:40:21,350 --> 00:40:24,080
So muscle starts to die.

1075
00:40:24,080 --> 00:40:28,880
And then in the early 2000s, there
was a quality movement that

1076
00:40:28,880 --> 00:40:31,190
said, as soon as
anybody sees that, you

1077
00:40:31,190 --> 00:40:33,200
should get to somebody
doing something

1078
00:40:33,200 --> 00:40:35,700
about it within an
hour and a half or so.

1079
00:40:35,700 --> 00:40:38,180
And so the problem was that in
the old days and the old way

1080
00:40:38,180 --> 00:40:39,020
to do this--

1081
00:40:39,020 --> 00:40:40,395
and even this was
around the time

1082
00:40:40,395 --> 00:40:43,460
I was a resident--
you would have

1083
00:40:43,460 --> 00:40:45,440
to first call the cardiologist.

1084
00:40:45,440 --> 00:40:46,213
Wake him up.

1085
00:40:46,213 --> 00:40:46,880
They would come.

1086
00:40:46,880 --> 00:40:47,963
You'd send them the image.

1087
00:40:47,963 --> 00:40:49,060
They would look at it.

1088
00:40:49,060 --> 00:40:50,180
Then, they would
decide whether or not

1089
00:40:50,180 --> 00:40:51,830
this was the pattern
they were seeing,

1090
00:40:51,830 --> 00:40:54,140
and then they would activate
the lab, the cath lab.

1091
00:40:54,140 --> 00:40:56,870
They would come in, and you
were losing about an hour, hour

1092
00:40:56,870 --> 00:40:58,290
and a half in this process.

1093
00:40:58,290 --> 00:41:01,670
And so instead they decided
that automated systems could

1094
00:41:01,670 --> 00:41:05,270
be used to be able to
enable ambulance personnel

1095
00:41:05,270 --> 00:41:07,550
or emergency room docs,
so non-cardiologists,

1096
00:41:07,550 --> 00:41:09,042
to be able to say,
hey, look, this

1097
00:41:09,042 --> 00:41:10,250
is what we think is going on.

1098
00:41:10,250 --> 00:41:12,890
Let's bring the team in, and
so people would get mobilized.

1099
00:41:12,890 --> 00:41:14,390
People would come
to the hospital.

1100
00:41:14,390 --> 00:41:17,480
Nobody would do anything in
terms of starting the case,

1101
00:41:17,480 --> 00:41:20,598
until somebody confirmed it,
but already, the whole wheels

1102
00:41:20,598 --> 00:41:21,140
were turning.

1103
00:41:21,140 --> 00:41:22,730
And so you have
this triage system,

1104
00:41:22,730 --> 00:41:24,212
where you're making a decision.

1105
00:41:24,212 --> 00:41:25,670
You're not finalizing
the decision,

1106
00:41:25,670 --> 00:41:26,850
but you're speeding things up.

1107
00:41:26,850 --> 00:41:28,040
And so this is an
example where you

1108
00:41:28,040 --> 00:41:29,498
could imagine it's
important to try

1109
00:41:29,498 --> 00:41:31,070
to offload this to something.

1110
00:41:31,070 --> 00:41:33,363
So this is an
example, and there's

1111
00:41:33,363 --> 00:41:34,530
going to be false positives.

1112
00:41:34,530 --> 00:41:36,950
And people will laugh and mock
the emergency room doctors

1113
00:41:36,950 --> 00:41:38,972
and mock the ambulance
drivers and say, ah,

1114
00:41:38,972 --> 00:41:40,430
they don't know
what they're doing.

1115
00:41:40,430 --> 00:41:41,480
They don't have any experience.

1116
00:41:41,480 --> 00:41:43,130
But ultimately,
people were dying,

1117
00:41:43,130 --> 00:41:45,047
because they were waiting
for the cardiologist

1118
00:41:45,047 --> 00:41:47,150
to be available to read the ECG.

1119
00:41:47,150 --> 00:41:50,390
So you've got to think about
those in terms of places

1120
00:41:50,390 --> 00:41:51,930
where there may
be cost for delay.

1121
00:41:51,930 --> 00:41:52,430
OK.

1122
00:41:52,430 --> 00:41:54,420
So coming back to echoes.

1123
00:41:54,420 --> 00:41:54,920
OK.

1124
00:41:54,920 --> 00:41:56,253
So why does an echo get studied?

1125
00:41:56,253 --> 00:42:00,110
Because this is probably not
something that is typical.

1126
00:42:00,110 --> 00:42:04,310
It's a compilation
of videos, and there

1127
00:42:04,310 --> 00:42:06,680
are about 70 different videos
typically in the studies

1128
00:42:06,680 --> 00:42:08,690
that we do at the
centers that we're at.

1129
00:42:08,690 --> 00:42:10,400
And they're taken
over multiple cycles

1130
00:42:10,400 --> 00:42:12,860
and multiple different
views, and often it

1131
00:42:12,860 --> 00:42:15,382
takes somebody pretty skilled
to acquire those views.

1132
00:42:15,382 --> 00:42:17,090
And they take about
45 minutes to an hour

1133
00:42:17,090 --> 00:42:19,820
to gather that data,
multiple different views,

1134
00:42:19,820 --> 00:42:22,010
and the sonographer
is changing the depth

1135
00:42:22,010 --> 00:42:24,050
to zoom in on given structures.

1136
00:42:24,050 --> 00:42:26,090
And so you can understand
that there's already

1137
00:42:26,090 --> 00:42:27,830
somebody who was
already very experienced

1138
00:42:27,830 --> 00:42:30,350
in this process even collecting
the data which is a problem.

1139
00:42:30,350 --> 00:42:32,600
Because you need to take
them out of the picture,

1140
00:42:32,600 --> 00:42:35,490
because they're expensive to
be able to do those things.

1141
00:42:35,490 --> 00:42:38,170
So we were doing at
UCSF 12,000 to 15,000.

1142
00:42:38,170 --> 00:42:41,540
Brigham was probably a little
busier at 30,000 to 35,000.

1143
00:42:41,540 --> 00:42:44,300
Medicare back, in 2011, had
seven million of these performed,

1144
00:42:44,300 --> 00:42:47,630
and there's probably hundreds
of millions of these archived,

1145
00:42:47,630 --> 00:42:50,420
so lots of data.

1146
00:42:50,420 --> 00:42:54,050
So we published
a paper last year

1147
00:42:54,050 --> 00:42:57,440
trying to automate really all of
the main processes around this,

1148
00:42:57,440 --> 00:43:01,130
and part of the reason to do it all
is it doesn't help you to have

1149
00:43:01,130 --> 00:43:02,540
one little bit automated.

1150
00:43:02,540 --> 00:43:04,550
Because at the end
of the day, if you

1151
00:43:04,550 --> 00:43:06,140
have to have a
cardiologist doing

1152
00:43:06,140 --> 00:43:07,220
everything else
and a sonographer

1153
00:43:07,220 --> 00:43:09,262
doing everything else,
what have you really saved

1154
00:43:09,262 --> 00:43:10,900
by having one little step?

1155
00:43:10,900 --> 00:43:14,030
So the goal here was to start
from raw study, coming straight

1156
00:43:14,030 --> 00:43:16,410
off the machine, and
try to do everything.

1157
00:43:16,410 --> 00:43:18,110
And so that involves
sorting through all

1158
00:43:18,110 --> 00:43:20,660
these different views, coming
up with an empirical quality score

1159
00:43:20,660 --> 00:43:25,100
with it, segmenting all the
five primary views that we use,

1160
00:43:25,100 --> 00:43:27,080
directly detecting
some diseases,

1161
00:43:27,080 --> 00:43:29,032
and then computing
all the standard mass

1162
00:43:29,032 --> 00:43:31,240
and volume types of measurements
that come from this.

1163
00:43:31,240 --> 00:43:34,070
So we wanted to do it all,
and this was, I think,

1164
00:43:34,070 --> 00:43:37,880
it wasn't strikingly original in
the algorithms that were used.

1165
00:43:37,880 --> 00:43:40,358
But at the same time, it
was very bold for anybody

1166
00:43:40,358 --> 00:43:42,650
in the community to try to
take this on, and of course,

1167
00:43:42,650 --> 00:43:44,983
in general, all the backlash
you could imagine when you

1168
00:43:44,983 --> 00:43:46,460
try to do something like this.

1169
00:43:46,460 --> 00:43:49,712
I still hear it, but
there's excitement.

1170
00:43:49,712 --> 00:43:51,170
And certainly on
the industry side,

1171
00:43:51,170 --> 00:43:54,110
there's really excitement
in that this is feasible.

1172
00:43:54,110 --> 00:44:01,020
So I was running a biology
lab, back in 2016 or so,

1173
00:44:01,020 --> 00:44:02,130
and then decided--

1174
00:44:02,130 --> 00:44:06,720
so my cousin's husband is the
Dean of Engineering at Penn,

1175
00:44:06,720 --> 00:44:09,270
and I emailed him and said, do
you know anyone at Berkeley?

1176
00:44:09,270 --> 00:44:10,112
I live near there.

1177
00:44:10,112 --> 00:44:12,570
I have a very long commute,
and I was like closer to there.

1178
00:44:12,570 --> 00:44:13,653
Is anybody you know there?

1179
00:44:13,653 --> 00:44:14,760
So he's like, yeah.

1180
00:44:14,760 --> 00:44:16,410
I know Ruzena Bajcsy there.

1181
00:44:16,410 --> 00:44:18,840
She used to be at Penn,
and I know Alyosha Efros.

1182
00:44:18,840 --> 00:44:21,970
And so he just emailed them
and said, can you meet?

1183
00:44:21,970 --> 00:44:22,998
[INAUDIBLE]

1184
00:44:22,998 --> 00:44:24,540
And so I met some
of them, and then I

1185
00:44:24,540 --> 00:44:26,260
tried to find some people
who were willing to work.

1186
00:44:26,260 --> 00:44:28,950
So I just spent a day a week
there for about two years,

1187
00:44:28,950 --> 00:44:30,720
just hanging out,
writing code, and trying

1188
00:44:30,720 --> 00:44:32,530
to get this project
off the ground.

1189
00:44:32,530 --> 00:44:34,910
So we have a few
different institutions.

1190
00:44:34,910 --> 00:44:37,260
Jeff Zhang was a senior
undergraduate at the time.

1191
00:44:37,260 --> 00:44:40,020
He's at Illinois right
now as a graduate student.

1192
00:44:40,020 --> 00:44:43,170
It's interesting, because it's
hard to get grad student level

1193
00:44:43,170 --> 00:44:45,630
people excited over
stuff that's applications

1194
00:44:45,630 --> 00:44:51,080
of existing algorithms, but
they're happy to advise.

1195
00:44:51,080 --> 00:44:54,070
So I ended up having to write
a lot of the code myself.

1196
00:44:54,070 --> 00:44:55,572
And undergraduates
are, of course,

1197
00:44:55,572 --> 00:44:57,030
excited to do these
kind of things,

1198
00:44:57,030 --> 00:44:59,920
because it's better than
homework, and I can pay.

1199
00:44:59,920 --> 00:45:01,920
But I think, ultimately,
it's interesting to try

1200
00:45:01,920 --> 00:45:05,940
to find that sweet spot and
also find things that ultimately

1201
00:45:05,940 --> 00:45:09,627
could be interesting from an
algorithmic standpoint too.

1202
00:45:09,627 --> 00:45:11,460
So I'm trying to do
more of that these days.

1203
00:45:11,460 --> 00:45:13,230
OK.

1204
00:45:13,230 --> 00:45:15,330
So we aren't the first
to even do something

1205
00:45:15,330 --> 00:45:17,030
around classifying views.

1206
00:45:17,030 --> 00:45:18,780
So somebody already
had publish something,

1207
00:45:18,780 --> 00:45:20,860
but we wanted to be a little
bit more nuanced than that.

1208
00:45:20,860 --> 00:45:22,693
In that we wanted to
be able to distinguish,

1209
00:45:22,693 --> 00:45:25,590
for example, whether this
structure, the left ventricle,

1210
00:45:25,590 --> 00:45:26,220
is cut off.

1211
00:45:26,220 --> 00:45:28,387
Because we don't want to
measure it if it's cut off,

1212
00:45:28,387 --> 00:45:30,803
and we don't want to measure
the atrium if it's completely

1213
00:45:30,803 --> 00:45:31,380
cut off here.

1214
00:45:31,380 --> 00:45:33,390
So we wanted to be able
to have a classifier

1215
00:45:33,390 --> 00:45:35,432
able to distinguish between
some of those things.

1216
00:45:35,432 --> 00:45:37,740
It's not an easy task,
and a lot of these labels

1217
00:45:37,740 --> 00:45:41,400
were me riding the train in
my very long commute from East

1218
00:45:41,400 --> 00:45:44,040
Bay, in California, to UCSF.

1219
00:45:44,040 --> 00:45:47,340
And so I did a lot of labeling,
and I did a lot of segmentation

1220
00:45:47,340 --> 00:45:47,840
too.

1221
00:45:47,840 --> 00:45:49,175
So I could fly a lot.

1222
00:45:49,175 --> 00:45:50,550
And that's the
other thing that's

1223
00:45:50,550 --> 00:45:52,512
kind of interesting is
that you often need--

1224
00:45:52,512 --> 00:45:53,970
even to do the
grunt work-- you may

1225
00:45:53,970 --> 00:45:56,890
need somebody fairly specialized
to do it which is OK, but yeah,

1226
00:45:56,890 --> 00:45:58,723
so that ended up being
me for a lot of this.

1227
00:45:58,723 --> 00:46:01,390
So I traced a lot
of these images,

1228
00:46:01,390 --> 00:46:03,400
and then I got some
other people to help out.

1229
00:46:03,400 --> 00:46:05,900
But you're not going to get a
computer science undergraduate

1230
00:46:05,900 --> 00:46:08,448
to trace art structures
for you, nor are you

1231
00:46:08,448 --> 00:46:10,240
going to get them
excited about doing this.

1232
00:46:10,240 --> 00:46:11,760
So we didn't end up
having that much data,

1233
00:46:11,760 --> 00:46:13,885
and I think we could probably
get better than that.

1234
00:46:13,885 --> 00:46:17,190
But we had the five main views,
and we implemented a modified

1235
00:46:17,190 --> 00:46:18,600
version of the U-Net algorithm.

1236
00:46:18,600 --> 00:46:21,060
We imposed a bit of
a penalty to keep

1237
00:46:21,060 --> 00:46:24,480
this problem of, for example,
a little stray ventricle

1238
00:46:24,480 --> 00:46:25,500
being out there.

1239
00:46:25,500 --> 00:46:27,420
We imposed a penalty
to say, well,

1240
00:46:27,420 --> 00:46:30,022
if that's too far away
from the center, then

1241
00:46:30,022 --> 00:46:31,980
we're going to have the
loss function take that

1242
00:46:31,980 --> 00:46:32,670
into account.

1243
00:46:32,670 --> 00:46:37,130
That helped somewhat, but so
that was our approach to--

1244
00:46:37,130 --> 00:46:38,587
this is a pretty
substantial deal

1245
00:46:38,587 --> 00:46:40,170
to be able to do all
these things that

1246
00:46:40,170 --> 00:46:42,120
normally would be very tedious.

1247
00:46:42,120 --> 00:46:44,310
And as a result, when we
start to analyze things,

1248
00:46:44,310 --> 00:46:47,340
we can segment every single
frame of every single video.

1249
00:46:47,340 --> 00:46:49,900
The typical echo reader will
take two frames and trace them.

1250
00:46:49,900 --> 00:46:50,400
That's it.

1251
00:46:50,400 --> 00:46:51,330
That's all you get.

1252
00:46:51,330 --> 00:46:53,910
So we can do everything over
every single cardiac cycle,

1253
00:46:53,910 --> 00:46:56,290
because there's amazing
variability from beat to beat.

1254
00:46:56,290 --> 00:46:58,560
And so it's silly
to think that that

1255
00:46:58,560 --> 00:47:02,020
should be the gold standard,
but that is the gold standard.

1256
00:47:02,020 --> 00:47:03,837
So we had thousands of echoes.

1257
00:47:03,837 --> 00:47:04,920
So that's the other thing.

1258
00:47:04,920 --> 00:47:07,740
So it turns out that it's
almost impossible to get access

1259
00:47:07,740 --> 00:47:09,930
to echoes, so I wrote a
keystroke encoder that

1260
00:47:09,930 --> 00:47:13,440
sat at the front end and just
mimicked me entering in studies

1261
00:47:13,440 --> 00:47:14,320
and downloading them.

1262
00:47:14,320 --> 00:47:15,862
So that was the only
way I could get them.

1263
00:47:15,862 --> 00:47:18,370
So I had about 30,000
studies built up over a year,

1264
00:47:18,370 --> 00:47:21,310
but there's no way
to do bulk download.

1265
00:47:21,310 --> 00:47:23,730
And so again, you've got
to do some grunt work to be

1266
00:47:23,730 --> 00:47:25,570
willing to play in this space.

1267
00:47:25,570 --> 00:47:29,495
So we had a fair
number of studies

1268
00:47:29,495 --> 00:47:30,870
we could use in
terms of where we

1269
00:47:30,870 --> 00:47:35,010
had measurements and decent
values in terms of that.

1270
00:47:35,010 --> 00:47:36,570
I think it's
interesting in terms

1271
00:47:36,570 --> 00:47:39,415
of thinking about how good one
can-- how close one can get.

1272
00:47:39,415 --> 00:47:41,040
And one of the things
we found is that,

1273
00:47:41,040 --> 00:47:43,890
when there were big deviations--
these are Bland-Altman plots--

1274
00:47:43,890 --> 00:47:46,170
almost always the
manual ones were wrong.

1275
00:47:46,170 --> 00:47:47,200
AUDIENCE: Why is that?

1276
00:47:47,200 --> 00:47:48,090
RAHUL DEO: Oh, OK.

1277
00:47:48,090 --> 00:47:48,590
OK.

1278
00:47:48,590 --> 00:47:53,340
So Bland-Altman plots, so people
don't like using correlations

1279
00:47:53,340 --> 00:47:54,185
in the medical--

1280
00:47:54,185 --> 00:47:56,310
so Bland and Altman published
a paper in the Lancet

1281
00:47:56,310 --> 00:47:59,190
about 30 years ago
complaining that correlations

1282
00:47:59,190 --> 00:48:01,830
and correlation coefficient are
ultimately not good metrics.

1283
00:48:01,830 --> 00:48:03,940
Because you could have
some substantial bias,

1284
00:48:03,940 --> 00:48:06,480
and really you want to know,
if this is the gold standard,

1285
00:48:06,480 --> 00:48:08,160
you need to get that value.

1286
00:48:08,160 --> 00:48:11,490
So it really is just
looking at differences

1287
00:48:11,490 --> 00:48:15,540
between, let's say, the
reference value and the,

1288
00:48:15,540 --> 00:48:18,360
let's say, automated
value, and then

1289
00:48:18,360 --> 00:48:20,758
plotting that against
the mean of the two.

1290
00:48:20,758 --> 00:48:21,300
So that's it.

1291
00:48:21,300 --> 00:48:23,850
I did it as percentages here,
but ultimately, it's just that.

1292
00:48:23,850 --> 00:48:27,330
It's that you're just
taking the mean of,

1293
00:48:27,330 --> 00:48:29,790
let's say, the left
ventricular volume.

1294
00:48:29,790 --> 00:48:33,450
You have a mean of the automated
versus the manually measured

1295
00:48:33,450 --> 00:48:36,318
one, and then you compare
what the difference is

1296
00:48:36,318 --> 00:48:37,860
of one minus the
other, and so you'll

1297
00:48:37,860 --> 00:48:39,150
be on one side or the other.

1298
00:48:39,150 --> 00:48:41,692
So ideally, you would just be
sitting perfectly on this line,

1299
00:48:41,692 --> 00:48:43,233
and then you're
going to look and see

1300
00:48:43,233 --> 00:48:45,760
whether or not you're clustered
on one side or the other.

1301
00:48:45,760 --> 00:48:48,060
So that's just
the typical thing.

1302
00:48:48,060 --> 00:48:50,280
People try to avoid
correlation coefficients,

1303
00:48:50,280 --> 00:48:52,890
because they kind of consider
them to be not really telling

1304
00:48:52,890 --> 00:48:53,970
you whether or not--

1305
00:48:53,970 --> 00:48:55,970
there really is a gold
standard, and there truly

1306
00:48:55,970 --> 00:48:59,450
is a value here, and you
want to be near that value.

1307
00:48:59,450 --> 00:49:04,670
And so that's the
standard for looking

1308
00:49:04,670 --> 00:49:06,980
at comparison of diagnostics.
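
[Editor's sketch, not from the lecture] The Bland-Altman comparison described above can be illustrated with a minimal Python example. The function name `bland_altman`, the sign convention (automated minus reference), and the 1.96 standard deviation limits of agreement are assumptions chosen for illustration; the speaker mentions plotting percentages, which this sketch does not do.

```python
from statistics import mean, stdev


def bland_altman(reference, automated):
    """Summarise agreement between two measurement methods.

    Returns the (mean, difference) point for every pair, the bias
    (average difference), and approximate 95% limits of agreement.
    """
    if len(reference) != len(automated) or len(reference) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    # x-axis of the plot: mean of the two methods for each study
    means = [(r + a) / 2 for r, a in zip(reference, automated)]
    # y-axis of the plot: automated minus reference (assumed sign convention)
    diffs = [a - r for r, a in zip(reference, automated)]
    bias = mean(diffs)
    spread = stdev(diffs)
    return {
        "points": list(zip(means, diffs)),
        "bias": bias,
        "lower": bias - 1.96 * spread,
        "upper": bias + 1.96 * spread,
    }


# Hypothetical left ventricular volumes in mL: the automated values track the
# reference perfectly (correlation of 1) yet are always 10 mL higher.
result = bland_altman([100, 120, 140, 160], [110, 130, 150, 170])
print(result["bias"], result["lower"], result["upper"])
```

In this made-up example the correlation coefficient would look perfect, while the bias term exposes a constant 10 mL offset, which is the point being made about why correlations alone are not trusted.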

1309
00:49:06,980 --> 00:49:09,080
So we had about 8,000 things.

1310
00:49:09,080 --> 00:49:11,417
The reviewers gave us a hard
time for the space up here,

1311
00:49:11,417 --> 00:49:13,250
and there are not that
many studies up here,

1312
00:49:13,250 --> 00:49:14,240
but ultimately, there are some.

1313
00:49:14,240 --> 00:49:16,490
And when we manually looked
at a bunch of them, always

1314
00:49:16,490 --> 00:49:17,990
the manual ones were just wrong.

1315
00:49:17,990 --> 00:49:20,360
Either there is a typo
or something like that,

1316
00:49:20,360 --> 00:49:23,657
so that was reassuring, but
we were sometimes very wrong.

1317
00:49:23,657 --> 00:49:25,490
And you'd find that the
places we'd be wrong

1318
00:49:25,490 --> 00:49:28,640
would be these ridiculously
complex congenital heart

1319
00:49:28,640 --> 00:49:32,390
studies that we had never
been given examples like that

1320
00:49:32,390 --> 00:49:33,360
before.

1321
00:49:33,360 --> 00:49:36,190
So that's a lesson to be learned
is that, sometimes, you're

1322
00:49:36,190 --> 00:49:39,382
going to be really off in
these sorts of approaches,

1323
00:49:39,382 --> 00:49:40,840
and you have to
think a little bit.

1324
00:49:40,840 --> 00:49:41,990
And what we ended
up doing is having

1325
00:49:41,990 --> 00:49:44,448
an iterative cycle, where we
would identify those and feed

1326
00:49:44,448 --> 00:49:46,550
them back and kind of
keep on doing that,

1327
00:49:46,550 --> 00:49:49,750
but that still needs
to be improved upon.

1328
00:49:49,750 --> 00:49:50,300
OK.

1329
00:49:50,300 --> 00:49:55,550
So function, again, there's a
couple of measures of function.

1330
00:49:55,550 --> 00:49:57,050
There's a company
that has something

1331
00:49:57,050 --> 00:49:58,970
out there in this
space, got FDA approved

1332
00:49:58,970 --> 00:50:00,720
for having an automated
ejection fraction.

1333
00:50:00,720 --> 00:50:02,780
So I think we're better
than their numbers,

1334
00:50:02,780 --> 00:50:04,405
overall, but yeah.

1335
00:50:04,405 --> 00:50:06,530
I think that that's just
one of those things you're

1336
00:50:06,530 --> 00:50:09,090
expected to be able to do.

1337
00:50:09,090 --> 00:50:11,500
And then here's a
problem that we run into.

1338
00:50:11,500 --> 00:50:15,330
So we're comparing to the
status quo which, like I said,

1339
00:50:15,330 --> 00:50:18,220
is one person tracing two
images and comparing them.

1340
00:50:18,220 --> 00:50:19,340
That's it.

1341
00:50:19,340 --> 00:50:24,290
So we're processing potentially
200, 300 different frames

1342
00:50:24,290 --> 00:50:29,210
per study and computing
median, smoothing across.

1343
00:50:29,210 --> 00:50:31,520
We're doing a whole
lot more than that.

1344
00:50:31,520 --> 00:50:34,820
So what do we do about that
in terms of the gold standard?

1345
00:50:34,820 --> 00:50:37,350
And if you just take
interobserver variability,

1346
00:50:37,350 --> 00:50:39,080
you're going to
have up to 8% to 9%

1347
00:50:39,080 --> 00:50:41,540
in absolute compared to
60% of the reference.

1348
00:50:41,540 --> 00:50:43,470
So that's horrible.

1349
00:50:43,470 --> 00:50:44,890
So what are you supposed to do?

1350
00:50:44,890 --> 00:50:46,460
And I think so one
thing people do

1351
00:50:46,460 --> 00:50:48,963
is they take multiple readers
and ask them to do that.

1352
00:50:48,963 --> 00:50:50,380
But this is like,
are you going

1353
00:50:50,380 --> 00:50:51,922
to get a bunch of
cardiologists to do

1354
00:50:51,922 --> 00:50:54,040
like 1,000 studies for you?

1355
00:50:54,040 --> 00:50:57,470
It's very hard to imagine
somebody doing that.

1356
00:50:57,470 --> 00:50:59,210
You could compare it
to another modality.

1357
00:50:59,210 --> 00:51:01,020
So we haven't done this yet,
but you could, for example,

1358
00:51:01,020 --> 00:51:02,840
compare it to MRI and
say whether or not

1359
00:51:02,840 --> 00:51:05,420
you're more consistent
with another modality.

1360
00:51:05,420 --> 00:51:07,100
And then this is
indirect, but you

1361
00:51:07,100 --> 00:51:09,038
can go to like
outcomes in a trial

1362
00:51:09,038 --> 00:51:10,830
and see whether or not
you do a better job.

1363
00:51:10,830 --> 00:51:12,493
So there are things you can do.

1364
00:51:12,493 --> 00:51:13,910
One of the things
we decided to do

1365
00:51:13,910 --> 00:51:16,880
is look for correlations
of structures

1366
00:51:16,880 --> 00:51:22,250
within a study itself
and say, well, the mass--

1367
00:51:22,250 --> 00:51:24,830
so we know that, for
example, thickened hearts

1368
00:51:24,830 --> 00:51:26,583
lead to larger
increases of pressure

1369
00:51:26,583 --> 00:51:27,750
and left atrial enlargement.

1370
00:51:27,750 --> 00:51:29,630
So we can look for correlations
between those things

1371
00:51:29,630 --> 00:51:31,270
and see whether we
do a better job.

1372
00:51:31,270 --> 00:51:33,920
I'd say, for the most
part, we're about on par

1373
00:51:33,920 --> 00:51:35,200
with everything that's there.

1374
00:51:35,200 --> 00:51:36,617
So I don't think
we're any better.

1375
00:51:36,617 --> 00:51:37,575
Sometimes we're better.

1376
00:51:37,575 --> 00:51:38,580
Sometimes we're worse.

1377
00:51:38,580 --> 00:51:40,400
And I think, for the most
part, this was another way

1378
00:51:40,400 --> 00:51:42,740
to try to get at this, because
we were stuck with this.

1379
00:51:42,740 --> 00:51:44,900
How do you work
with a gold standard

1380
00:51:44,900 --> 00:51:46,790
that ultimately I don't
think anybody really

1381
00:51:46,790 --> 00:51:49,130
trusts as a gold standard?

1382
00:51:49,130 --> 00:51:52,970
And this is a problem that
just has to keep on coming up.

1383
00:51:52,970 --> 00:51:54,560
This is just an
example of where you

1384
00:51:54,560 --> 00:51:58,070
could facilitate this idea
of low cost serial imaging

1385
00:51:58,070 --> 00:51:58,860
and point of care.

1386
00:51:58,860 --> 00:52:01,910
So these are patients who
are getting chemotherapy,

1387
00:52:01,910 --> 00:52:05,300
and so Herceptin-- not
herception, Herceptin,

1388
00:52:05,300 --> 00:52:06,365
it's like inception--

1389
00:52:10,340 --> 00:52:13,160
is an EGFR inhibitor that
causes cardiac toxicity,

1390
00:52:13,160 --> 00:52:15,170
and so people are
getting screening echoes.

1391
00:52:15,170 --> 00:52:17,353
So you could imagine,
if you make it easier

1392
00:52:17,353 --> 00:52:18,770
to acquire and
interpret that, all

1393
00:52:18,770 --> 00:52:20,420
you want to care about is
the function and the size.

1394
00:52:20,420 --> 00:52:21,550
So you can imagine
automating that.

1395
00:52:21,550 --> 00:52:23,420
So we just did this
as proof of concept

1396
00:52:23,420 --> 00:52:26,430
that you could imagine
doing something like this.

1397
00:52:26,430 --> 00:52:29,230
And for the last thing
I want to talk about--

1398
00:52:29,230 --> 00:52:31,010
or sorry, the last
thing in this space--

1399
00:52:31,010 --> 00:52:34,530
is that you could also imagine
directly detecting disease.

1400
00:52:34,530 --> 00:52:37,850
And so you have to say, well,
why is that even worthwhile?

1401
00:52:37,850 --> 00:52:38,389
Yes.

1402
00:52:38,389 --> 00:52:39,389
AUDIENCE: I was curious.

1403
00:52:39,389 --> 00:52:42,049
I guess it's going back
to the idea of if you look

1404
00:52:42,049 --> 00:52:45,375
at blended models
between human ground truth

1405
00:52:45,375 --> 00:52:54,650
and maybe a biological ground
truth, [INAUDIBLE] versus sort

1406
00:52:54,650 --> 00:52:57,500
of what you could get
from an MRI or something--

1407
00:52:57,500 --> 00:53:00,615
or maybe not necessarily an MRI,
but what you were saying based

1408
00:53:00,615 --> 00:53:03,240
on the underlying biology, or if
those two things are generally

1409
00:53:03,240 --> 00:53:03,860
kept separate?

1410
00:53:03,860 --> 00:53:05,720
RAHUL DEO: Yeah.

1411
00:53:05,720 --> 00:53:07,550
These are early days
for a lot of this,

1412
00:53:07,550 --> 00:53:10,040
and I think, anytime you make
anything more complicated,

1413
00:53:10,040 --> 00:53:12,320
then the readers will
give you a hard time,

1414
00:53:12,320 --> 00:53:13,450
but you can imagine that.

1415
00:53:13,450 --> 00:53:15,242
And especially, you
may want to tune things

1416
00:53:15,242 --> 00:53:18,080
to be able to be closer
to something like that.

1417
00:53:18,080 --> 00:53:20,348
So yeah, I think,
unfortunately, people

1418
00:53:20,348 --> 00:53:22,640
are pretty conservative in
terms of how they interpret,

1419
00:53:22,640 --> 00:53:24,890
but it does make some
sense that there's probably

1420
00:53:24,890 --> 00:53:26,060
something that--

1421
00:53:26,060 --> 00:53:29,900
Ideally, you want to be able to
have something that is useful,

1422
00:53:29,900 --> 00:53:33,020
and useful may not be exactly
the same thing as mimicking

1423
00:53:33,020 --> 00:53:34,250
what humans are doing.

1424
00:53:34,250 --> 00:53:35,630
So no, I think it's a good idea.

1425
00:53:35,630 --> 00:53:37,070
And I think that
this is going to be--

1426
00:53:37,070 --> 00:53:39,487
this next wave-- is going to
be thinking a little bit more

1427
00:53:39,487 --> 00:53:41,650
about that in terms
of like how do we

1428
00:53:41,650 --> 00:53:44,150
improve on what's going on
over there, rather than simply

1429
00:53:44,150 --> 00:53:46,940
dragging it back to that?

1430
00:53:46,940 --> 00:53:48,310
OK.

1431
00:53:48,310 --> 00:53:50,028
So there are multiple
rare diseases.

1432
00:53:50,028 --> 00:53:52,070
I used to have a clinic
that would focus on these,

1433
00:53:52,070 --> 00:53:53,653
and they tend to get
missed at centers

1434
00:53:53,653 --> 00:53:54,980
that don't see them that often.

1435
00:53:54,980 --> 00:53:57,140
So one place you
could imagine is

1436
00:53:57,140 --> 00:53:58,940
you can focus on trying
to pick those up,

1437
00:53:58,940 --> 00:54:00,590
and you could imagine, this
could be just surveillance

1438
00:54:00,590 --> 00:54:01,760
running in the background.

1439
00:54:01,760 --> 00:54:06,080
It doesn't have to be kind
of real time identification.

1440
00:54:06,080 --> 00:54:08,510
So there's a few
diseases where it's

1441
00:54:08,510 --> 00:54:10,430
very reasonable to
do these things,

1442
00:54:10,430 --> 00:54:11,490
where it's very obvious.

1443
00:54:11,490 --> 00:54:13,160
So this is a disease called
hypertrophic cardiomyopathy.

1444
00:54:13,160 --> 00:54:14,790
I used to see it in my clinic.

1445
00:54:14,790 --> 00:54:17,660
So abnormally thickened hearts,
leading cause of sudden death

1446
00:54:17,660 --> 00:54:18,560
in young athletes.

1447
00:54:18,560 --> 00:54:24,110
So Reggie Lewis, there's a bunch
of people who've died suddenly

1448
00:54:24,110 --> 00:54:25,970
from this condition.

1449
00:54:25,970 --> 00:54:28,760
Unstable heart rhythm,
sudden death, heart failure,

1450
00:54:28,760 --> 00:54:30,320
it runs in families,
and there are

1451
00:54:30,320 --> 00:54:32,390
things you can do,
if you identified it.

1452
00:54:32,390 --> 00:54:35,180
And so it's actually a fairly
easy task, in the sense

1453
00:54:35,180 --> 00:54:38,130
that it tends to
be quite obvious.

1454
00:54:38,130 --> 00:54:40,460
So we built the classification
model around this,

1455
00:54:40,460 --> 00:54:43,472
and we tried to understand
what it was doing in part.

1456
00:54:43,472 --> 00:54:45,680
And so we tried to do some
of these kind of attention

1457
00:54:45,680 --> 00:54:47,780
or saliency type things, and
they were very unsatisfying,

1458
00:54:47,780 --> 00:54:49,790
in part because I think there's
so many different features

1459
00:54:49,790 --> 00:54:50,895
across the whole image.

1460
00:54:50,895 --> 00:54:52,270
So you're just
getting this blob,

1461
00:54:52,270 --> 00:54:53,870
but I think maybe we just
weren't implementing it

1462
00:54:53,870 --> 00:54:54,370
correctly.

1463
00:54:54,370 --> 00:54:57,740
I'm not really sure, but you
have a left atrium gets bigger.

1464
00:54:57,740 --> 00:54:59,030
The heart gets thicker.

1465
00:54:59,030 --> 00:55:01,790
There's so many changes
across the image.

1466
00:55:01,790 --> 00:55:03,602
It was unsatisfying
in terms of that.

1467
00:55:03,602 --> 00:55:05,060
So we did something
simple and just

1468
00:55:05,060 --> 00:55:06,620
took the output of
the probabilities

1469
00:55:06,620 --> 00:55:08,510
and compared it to
some simple things

1470
00:55:08,510 --> 00:55:10,580
that we actually know
about these things

1471
00:55:10,580 --> 00:55:12,980
and found that there was
some degree of correlation.

1472
00:55:12,980 --> 00:55:16,520
But I would like to make
that a little bit better.

1473
00:55:16,520 --> 00:55:18,620
Cardiac amyloid, a very
popular disease for which

1474
00:55:18,620 --> 00:55:20,128
there are now therapies.

1475
00:55:20,128 --> 00:55:22,670
And so pharma is very interested
in identifying these people,

1476
00:55:22,670 --> 00:55:24,712
and they really get missed
at a pretty high rate.

1477
00:55:24,712 --> 00:55:26,990
So we built another
model for this.

1478
00:55:26,990 --> 00:55:29,420
Usually, we had about
250 or 300 cases

1479
00:55:29,420 --> 00:55:33,652
for each of these things and
maybe a few thousand controls.

1480
00:55:33,652 --> 00:55:35,360
And then this one's
a little interesting.

1481
00:55:35,360 --> 00:55:37,550
This is mitral valve prolapse.

1482
00:55:37,550 --> 00:55:41,830
So this is what a
prolapsing valve looks like.

1483
00:55:41,830 --> 00:55:45,980
If you imagine the plane of the
valve here, it buckles back.

1484
00:55:45,980 --> 00:55:50,425
So it does this,
and that's abnormal,

1485
00:55:50,425 --> 00:55:51,550
and this is a normal valve.

1486
00:55:51,550 --> 00:55:53,520
So you notice, it
doesn't buckle back in.

1487
00:55:53,520 --> 00:55:55,040
So it's a little interesting
in that there's really

1488
00:55:55,040 --> 00:55:57,260
only one part of the cardiac
cycle that would really

1489
00:55:57,260 --> 00:55:59,920
highlight this abnormality,
at least that's the way that--

1490
00:55:59,920 --> 00:56:01,910
so the way that
it's read clinically

1491
00:56:01,910 --> 00:56:04,490
is people wait for this one
part of the cardiac cycle

1492
00:56:04,490 --> 00:56:05,810
where it's buckled back.

1493
00:56:05,810 --> 00:56:07,440
They draw an
imaginary line across,

1494
00:56:07,440 --> 00:56:09,440
and they measure what the
displacement is there,

1495
00:56:09,440 --> 00:56:11,450
and so we built a
reasonable model focusing.

1496
00:56:11,450 --> 00:56:13,013
So we phased these
images and picked

1497
00:56:13,013 --> 00:56:14,930
the part of the cardiac
cycle that was relevant,

1498
00:56:14,930 --> 00:56:16,638
all in an automated
way and built a model

1499
00:56:16,638 --> 00:56:20,990
around that and pretty good, in
terms of being able to do that,

1500
00:56:20,990 --> 00:56:23,700
in terms of being
able to detect that.

1501
00:56:23,700 --> 00:56:24,200
Yes.

1502
00:56:24,200 --> 00:56:27,460
AUDIENCE: And so is this model
on images at a certain time?

1503
00:56:27,460 --> 00:56:28,687
Like can you just go back?

1504
00:56:28,687 --> 00:56:30,520
Because obviously, you
weren't doing videos.

1505
00:56:30,520 --> 00:56:31,170
Right?

1506
00:56:31,170 --> 00:56:32,630
RAHUL DEO: Well, so we
would take the whole video.

1507
00:56:32,630 --> 00:56:33,980
We were segmenting it.

1508
00:56:33,980 --> 00:56:36,920
We were phasing it, figuring
out what the part of the--

1509
00:56:36,920 --> 00:56:38,360
when was the end
systole in that,

1510
00:56:38,360 --> 00:56:41,330
and then using those as the--
so using a stack of those

1511
00:56:41,330 --> 00:56:42,377
to be able to classify.

1512
00:56:42,377 --> 00:56:44,210
AUDIENCE: So how do you
know the time point?

1513
00:56:44,210 --> 00:56:45,668
RAHUL DEO: Well,
that's what I'm saying.

1514
00:56:45,668 --> 00:56:47,653
So we were using the
variation in the volumes.

1515
00:56:47,653 --> 00:56:48,710
AUDIENCE: The
segmentation would allow

1516
00:56:48,710 --> 00:56:50,220
you to know the time point.

1517
00:56:50,220 --> 00:56:54,470
RAHUL DEO: Exactly, because so
a typical echo will have an ECG

1518
00:56:54,470 --> 00:56:56,300
to use to gate, but
the handhelds don't.

1519
00:56:56,300 --> 00:56:58,400
So we want to move away
from the things that

1520
00:56:58,400 --> 00:57:01,040
involve the fanciness and
all the bells and whistles.

1521
00:57:01,040 --> 00:57:03,278
We're trying to
use the image alone

1522
00:57:03,278 --> 00:57:04,820
to be able to tell
the cardiac cycle.

1523
00:57:04,820 --> 00:57:06,450
So that's how we did it.
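
[Editor's sketch, not from the lecture] The idea of finding the cardiac phase from the segmented volumes alone, without an ECG, can be sketched as follows. This assumes a per-frame left ventricular volume series is already available from segmentation, and it treats end systole as a local minimum of that series. The moving-average smoothing, the window size, and the function names are illustrative assumptions, not the speaker's implementation.

```python
def smooth(values, window=3):
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def end_systolic_frames(volumes, window=3):
    """Return frame indices where the smoothed volume hits a local minimum."""
    v = smooth(volumes, window)
    frames = []
    # Skip the first and last frame: a minimum needs a neighbour on each side.
    for i in range(1, len(v) - 1):
        if v[i] < v[i - 1] and v[i] <= v[i + 1]:
            frames.append(i)
    return frames


# Hypothetical volume trace (mL) covering two beats.
trace = [100, 80, 60, 50, 60, 80, 100, 80, 60, 50, 60, 80, 100]
print(end_systolic_frames(trace))
```

Frames around each detected index could then be stacked and passed to a classifier, which is roughly the workflow described in the answer above.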

1524
00:57:06,450 --> 00:57:07,870
Yes.

1525
00:57:07,870 --> 00:57:10,280
AUDIENCE: So you
mentioned handhelds.

1526
00:57:10,280 --> 00:57:12,780
With the ultrasounds
[INAUDIBLE],

1527
00:57:12,780 --> 00:57:14,090
are they different from these?

1528
00:57:14,090 --> 00:57:16,510
RAHUL DEO: They
look pretty similar.

1529
00:57:16,510 --> 00:57:19,340
We got some now, and
they look pretty similar

1530
00:57:19,340 --> 00:57:21,380
in terms of the
quality of the images,

1531
00:57:21,380 --> 00:57:23,810
and you can acquire
the very same view.

1532
00:57:23,810 --> 00:57:27,413
So I think we haven't shown that
we can do it off those, in part

1533
00:57:27,413 --> 00:57:29,330
because there just isn't
enough training data.

1534
00:57:29,330 --> 00:57:32,630
But they look pretty
nice, and I know at UCSF

1535
00:57:32,630 --> 00:57:35,080
and at Brigham, all the
fellows are using it.

1536
00:57:35,080 --> 00:57:38,330
It looks pretty much the same in
terms of the-- the transducers

1537
00:57:38,330 --> 00:57:40,450
are similar, and image
quality is very good.

1538
00:57:40,450 --> 00:57:41,450
Resolution is very good.

1539
00:57:41,450 --> 00:57:43,850
Frame rate probably doesn't
get up as high necessarily,

1540
00:57:43,850 --> 00:57:47,750
but for the most part, I don't
think it's that different.

1541
00:57:47,750 --> 00:57:50,640
So that is the next phase.

1542
00:57:50,640 --> 00:57:51,400
Yes.

1543
00:57:51,400 --> 00:57:52,793
AUDIENCE: Could you comment on--

1544
00:57:52,793 --> 00:57:54,993
so you mentioned how each
of these three examples

1545
00:57:54,993 --> 00:57:56,910
could be used within a
surveillance algorithm.

1546
00:57:56,910 --> 00:57:57,577
RAHUL DEO: Yeah.

1547
00:57:57,577 --> 00:57:59,680
AUDIENCE: Could you
comment on where

1548
00:57:59,680 --> 00:58:02,647
along this true positive,
false positive trade-off

1549
00:58:02,647 --> 00:58:04,480
you would actually be
realistic to use this?

1550
00:58:04,480 --> 00:58:04,880
RAHUL DEO: Yeah.

1551
00:58:04,880 --> 00:58:05,713
That's a good point.

1552
00:58:05,713 --> 00:58:07,880
I think it would vary for
every single one of those,

1553
00:58:07,880 --> 00:58:10,060
and you really want to have
some costs on what the--

1554
00:58:10,060 --> 00:58:14,500
so I would typically err on
the side of higher sensitivity

1555
00:58:14,500 --> 00:58:19,100
and dump it on the
cardiologists to be able to--

1556
00:58:19,100 --> 00:58:23,550
so I would work, but I think
you have to pick some--

1557
00:58:23,550 --> 00:58:25,152
let's say, you're
a product manager.

1558
00:58:25,152 --> 00:58:27,360
AUDIENCE: Just choose one
of these three, and maybe--

1559
00:58:27,360 --> 00:58:28,570
RAHUL DEO: OK.

1560
00:58:28,570 --> 00:58:29,860
Yeah.

1561
00:58:29,860 --> 00:58:32,440
So this is a pretty
rare disease.

1562
00:58:32,440 --> 00:58:36,770
So your priors are pretty low
in terms of these individuals.

1563
00:58:36,770 --> 00:58:39,760
And so I think you
probably would probably

1564
00:58:39,760 --> 00:58:46,330
want to err somewhere
along this area here,

1565
00:58:46,330 --> 00:58:50,110
and so just working
on what the--

1566
00:58:50,110 --> 00:58:53,830
so you probably will still
be a relatively high rate

1567
00:58:53,830 --> 00:58:56,120
of false positives
even in that space.

1568
00:58:56,120 --> 00:59:01,810
But I would argue that it would
take the treating cardiologist

1569
00:59:01,810 --> 00:59:04,850
potentially just a few minutes
to look at that study again,

1570
00:59:04,850 --> 00:59:06,850
and if you picked up one
of those patients, that

1571
00:59:06,850 --> 00:59:08,147
would be a big win.

1572
00:59:08,147 --> 00:59:10,480
So I think that the cost
probably wouldn't be that high,

1573
00:59:10,480 --> 00:59:13,290
and you just have
to make the case.

1574
00:59:13,290 --> 00:59:15,840
So therapy for
amyloid, for example,

1575
00:59:15,840 --> 00:59:18,150
this is a nice sharp
up stroke there.

1576
00:59:18,150 --> 00:59:21,552
There's new drugs
out there that are

1577
00:59:21,552 --> 00:59:23,260
sort of begging for
patients, and they're

1578
00:59:23,260 --> 00:59:25,138
having a real hard
time identifying them.

1579
00:59:25,138 --> 00:59:26,680
So you could imagine
again, it's sort

1580
00:59:26,680 --> 00:59:29,830
of a calculus based on
what the benefits would

1581
00:59:29,830 --> 00:59:32,020
be for that identification
and what burden you're

1582
00:59:32,020 --> 00:59:35,460
placing on the individuals to
have to over read something.

1583
00:59:35,460 --> 00:59:37,210
And you could probably
tune that depending

1584
00:59:37,210 --> 00:59:42,070
on what the disease is and
who you're pitching it to.

1585
00:59:42,070 --> 00:59:44,230
But you're right, you're
going to crush people

1586
00:59:44,230 --> 00:59:47,530
if like 1 in 100 ends up
being a true positive, then

1587
00:59:47,530 --> 00:59:50,015
you're not going
to get many fans.
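
[Editor's sketch, not from the lecture] The point about low priors and false positives can be made concrete with Bayes' rule. The prevalence, sensitivity, and specificity values below are hypothetical numbers picked for illustration; none of them come from the talk.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of flagged studies that truly have the disease."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    flagged = true_pos + false_pos
    if flagged == 0:
        raise ValueError("no studies would be flagged with these inputs")
    return true_pos / flagged


# Hypothetical rare disease: 2 in 1,000 studies affected, and a model
# operating at 90% sensitivity and 95% specificity.
ppv = positive_predictive_value(prevalence=0.002, sensitivity=0.90, specificity=0.95)
print(f"{ppv:.3f}")
```

With these assumed numbers, only a few percent of flagged studies would be true positives, so the operating point has to be weighed against how many extra reads the treating cardiologists will tolerate.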

1588
00:59:50,015 --> 00:59:50,515
Yes.

1589
00:59:50,515 --> 00:59:53,930
AUDIENCE: Could you comment
on whether, for example,

1590
00:59:53,930 --> 00:59:56,160
[INAUDIBLE] basis,
the ones that you're

1591
00:59:56,160 --> 00:59:59,670
able to predict very
well at that point

1592
00:59:59,670 --> 01:00:03,470
you just chose what
distinguishes the ones that

1593
01:00:03,470 --> 01:00:04,753
are defined well?

1594
01:00:04,753 --> 01:00:06,170
RAHUL DEO: So
that's a good point,

1595
01:00:06,170 --> 01:00:10,060
and I don't really
know in the sense

1596
01:00:10,060 --> 01:00:11,860
that I haven't
looked that closely.

1597
01:00:11,860 --> 01:00:17,110
But I'm going to guess, they're
very thick and very obvious

1598
01:00:17,110 --> 01:00:18,910
in that sort of sense.

1599
01:00:18,910 --> 01:00:22,295
So we have an ECG model that
may pick this up early.

1600
01:00:22,295 --> 01:00:23,920
What you want is
something to pick it up

1601
01:00:23,920 --> 01:00:26,230
when it's treatable, not
having something that's

1602
01:00:26,230 --> 01:00:27,460
ridiculously exaggerated.

1603
01:00:27,460 --> 01:00:29,800
So you may need multiple
modalities some of which

1604
01:00:29,800 --> 01:00:33,310
are more sensitive than others
that can catch earlier stage

1605
01:00:33,310 --> 01:00:34,827
disease to be able to do that.

1606
01:00:34,827 --> 01:00:36,910
So there are interesting
things about this disease

1607
01:00:36,910 --> 01:00:37,493
in particular.

1608
01:00:37,493 --> 01:00:40,770
So cataracts sometimes
happen before--

1609
01:00:40,770 --> 01:00:43,535
so ideally, the way you do
this is-- and I'm actually

1610
01:00:43,535 --> 01:00:45,160
consulting around
something like this--

1611
01:00:45,160 --> 01:00:49,780
you ideally want a mixture
of electronic health record,

1612
01:00:49,780 --> 01:00:52,870
something from other findings--
neuro findings, eye findings,

1613
01:00:52,870 --> 01:00:54,785
plus maybe something
cardiac plus

1614
01:00:54,785 --> 01:00:56,410
and have something
that ideally catches

1615
01:00:56,410 --> 01:00:58,660
the disease in the ideal
most treatable state.

1616
01:00:58,660 --> 01:01:00,130
And maybe echo's
not the best one,

1617
01:01:00,130 --> 01:01:04,070
and I think that we'll come
back to that at the end.

1618
01:01:04,070 --> 01:01:05,350
We have a little bit of time.

1619
01:01:05,350 --> 01:01:05,850
OK.

1620
01:01:08,260 --> 01:01:10,598
So UCSF is filing--

1621
01:01:10,598 --> 01:01:11,140
I don't know.

1622
01:01:11,140 --> 01:01:12,890
I don't think this is
actually patentable,

1623
01:01:12,890 --> 01:01:15,340
but they are filing
for a patent.

1624
01:01:15,340 --> 01:01:18,310
I'm just filling the paperwork
out today in terms of--

1625
01:01:18,310 --> 01:01:19,600
I don't know.

1626
01:01:19,600 --> 01:01:24,730
But my code is all
freely available anyway,

1627
01:01:24,730 --> 01:01:27,078
for academic, non-profit
use, and they're just

1628
01:01:27,078 --> 01:01:28,120
trying to make it better.

1629
01:01:28,120 --> 01:01:31,030
I think, ultimately, my
view as an academic here is

1630
01:01:31,030 --> 01:01:32,770
to try to show what's possible.

1631
01:01:32,770 --> 01:01:35,410
And then, if you want to
get a commercial product,

1632
01:01:35,410 --> 01:01:37,853
then you need people to
weigh in on the industry side

1633
01:01:37,853 --> 01:01:40,270
and make something pretty and
make it usable and all that.

1634
01:01:40,270 --> 01:01:42,010
But I think,
ultimately, I'm trying

1635
01:01:42,010 --> 01:01:44,830
to just show, hey, if we could
do this in a scalable way

1636
01:01:44,830 --> 01:01:46,540
and find out something
new, then you guys

1637
01:01:46,540 --> 01:01:48,430
can catch up and
do something that

1638
01:01:48,430 --> 01:01:51,050
ultimately can be deployed.

1639
01:01:51,050 --> 01:01:53,745
And what's interesting is I have
a collaborator in New Zealand.

1640
01:01:53,745 --> 01:01:55,120
There, they're
resource poor.

1641
01:01:55,120 --> 01:01:56,948
So they have a huge
backlog of patients.

1642
01:01:56,948 --> 01:01:58,490
They don't have
enough sonographers,

1643
01:01:58,490 --> 01:02:00,198
and they don't have
enough cardiologists.

1644
01:02:00,198 --> 01:02:02,410
So they're trying to
implement this super ultra

1645
01:02:02,410 --> 01:02:06,680
quick five-minute study
and then have automation.

1646
01:02:06,680 --> 01:02:10,857
And so they want our accuracy
to be a little bit better,

1647
01:02:10,857 --> 01:02:12,440
but I think they're
ready to roll out,

1648
01:02:12,440 --> 01:02:15,790
if we're able to get something
that has probably more training

1649
01:02:15,790 --> 01:02:16,290
data.

1650
01:02:16,290 --> 01:02:16,920
Yes.

1651
01:02:16,920 --> 01:02:18,453
Are you from New Zealand?

1652
01:02:18,453 --> 01:02:18,995
AUDIENCE: No.

1653
01:02:18,995 --> 01:02:23,228
I think you started talking
about the trade-off between

1654
01:02:23,228 --> 01:02:24,710
accuracy and--

1655
01:02:24,710 --> 01:02:27,674
so in academia, I get the
sense that they're always

1656
01:02:27,674 --> 01:02:29,173
chasing perfect accuracy.

1657
01:02:29,173 --> 01:02:29,840
RAHUL DEO: Yeah.

1658
01:02:29,840 --> 01:02:31,215
AUDIENCE: But as
you said, you're

1659
01:02:31,215 --> 01:02:35,430
not going to get rid of
cardiologists in the diagnosis.

1660
01:02:35,430 --> 01:02:37,630
So I have a
philosophical question

1661
01:02:37,630 --> 01:02:40,940
of are you chasing
the wrong thing?

1662
01:02:40,940 --> 01:02:45,243
Should we chase
perfect accuracy?

1663
01:02:45,243 --> 01:02:45,910
RAHUL DEO: Yeah.

1664
01:02:45,910 --> 01:02:48,500
So the question is around
what should our goals be?

1665
01:02:51,420 --> 01:02:57,470
So should we be just chasing
after a level of accuracy

1666
01:02:57,470 --> 01:03:00,620
that may be either very,
very difficult to attain?

1667
01:03:00,620 --> 01:03:03,800
And especially, if there's never
a scenario where there'll be

1668
01:03:03,800 --> 01:03:06,680
no clinician involved,
should we instead

1669
01:03:06,680 --> 01:03:08,450
be thinking about
something that gets good

1670
01:03:08,450 --> 01:03:09,590
enough to that next step?

1671
01:03:09,590 --> 01:03:11,215
And I think that's
a really good point.

1672
01:03:15,230 --> 01:03:16,430
And what's interesting is--

1673
01:03:16,430 --> 01:03:18,513
and also it's interesting
from the industry side--

1674
01:03:18,513 --> 01:03:21,260
is the field starts
with the mimicking mode,

1675
01:03:21,260 --> 01:03:24,200
because it's much harder
to change practice.

1676
01:03:24,200 --> 01:03:28,158
It's much easier to just pop
something in and say, hey,

1677
01:03:28,158 --> 01:03:29,950
I know you have to make
these measurements.

1678
01:03:29,950 --> 01:03:32,210
Let me make them for you,
and you could look at them

1679
01:03:32,210 --> 01:03:33,532
and see if you agree.

1680
01:03:33,532 --> 01:03:34,490
So that's what ECGs do.

1681
01:03:34,490 --> 01:03:34,990
Right?

1682
01:03:34,990 --> 01:03:38,100
So nobody these days is
measuring the QRS width.

1683
01:03:38,100 --> 01:03:38,990
Nobody does that.

1684
01:03:38,990 --> 01:03:39,878
That's just not done.

1685
01:03:39,878 --> 01:03:42,170
If you've got a number that's
absurd, you'll change it.

1686
01:03:42,170 --> 01:03:44,420
But for the most part, you're
like, it's close enough,

1687
01:03:44,420 --> 01:03:46,560
but you almost have
to start with that.

1688
01:03:46,560 --> 01:03:49,590
To do something
that's transformative

1689
01:03:49,590 --> 01:03:51,740
is very hard to do.

1690
01:03:51,740 --> 01:03:53,260
So I think something
that involves--

1691
01:03:53,260 --> 01:03:54,635
and I talked to
David about this.

1692
01:03:54,635 --> 01:03:57,800
It's sort of like the
man-machine interface is

1693
01:03:57,800 --> 01:03:59,780
fascinating to think
about how do we together

1694
01:03:59,780 --> 01:04:01,130
come up with something better?

1695
01:04:01,130 --> 01:04:04,310
But it's just much harder to
get that adopted, because it

1696
01:04:04,310 --> 01:04:07,190
requires buy-in in a way
that's different than just

1697
01:04:07,190 --> 01:04:10,610
you do my work for me, but
more that we come together

1698
01:04:10,610 --> 01:04:12,073
to do something better.

1699
01:04:12,073 --> 01:04:14,240
And I think that's going
to be interesting as to how

1700
01:04:14,240 --> 01:04:16,070
to chip away at that problem.

1701
01:04:20,270 --> 01:04:20,770
OK.

1702
01:04:20,770 --> 01:04:22,490
So a couple of
musings, then I'm going

1703
01:04:22,490 --> 01:04:24,948
to talk a little bit about One
Brave Idea, if we have time,

1704
01:04:24,948 --> 01:04:27,522
or I can stop and take
questions instead,

1705
01:04:27,522 --> 01:04:29,480
because it's a little
bit of a biology venture.

1706
01:04:29,480 --> 01:04:30,020
OK.

1707
01:04:30,020 --> 01:04:33,272
So I do think that we
should really look.

1708
01:04:33,272 --> 01:04:35,480
People give me a hard time
around echo, and I'm like,

1709
01:04:35,480 --> 01:04:37,100
well, ECG's been
around for a long time,

1710
01:04:37,100 --> 01:04:38,308
and there's automation there.

1711
01:04:38,308 --> 01:04:40,100
So let's think about
how it's used there,

1712
01:04:40,100 --> 01:04:41,575
and then see whether or not--

1713
01:04:41,575 --> 01:04:43,200
it's not as outlandish
as people think.

1714
01:04:43,200 --> 01:04:45,117
So I think a lot of these
routine measurements

1715
01:04:45,117 --> 01:04:48,625
are just going to be
done in an automated way.

1716
01:04:48,625 --> 01:04:51,000
Already in our software, you
can put out a little picture

1717
01:04:51,000 --> 01:04:53,060
and overlay the segmentation
on the original image

1718
01:04:53,060 --> 01:04:54,143
and say how good it looks.

1719
01:04:54,143 --> 01:04:54,830
So that's easy.

1720
01:04:54,830 --> 01:04:56,200
So you can do that.

1721
01:04:56,200 --> 01:04:59,540
And then this kind of idea
of point of care automated

1722
01:04:59,540 --> 01:05:01,880
diagnoses can make
some sense around

1723
01:05:01,880 --> 01:05:03,470
some emergency-type situations.

1724
01:05:03,470 --> 01:05:06,380
So maybe you need a
quick check of function.

1725
01:05:06,380 --> 01:05:08,030
Maybe you want to
know if they have

1726
01:05:08,030 --> 01:05:10,447
a lot of fluid around the
heart, and you don't necessarily

1727
01:05:10,447 --> 01:05:11,090
want to wait.

1728
01:05:11,090 --> 01:05:12,465
So those will be
the places where

1729
01:05:12,465 --> 01:05:15,088
there may be some
kind of innovations

1730
01:05:15,088 --> 01:05:16,880
around just getting
something done quickly.

1731
01:05:16,880 --> 01:05:18,380
And then you always
have somebody checking

1732
01:05:18,380 --> 01:05:19,980
in the background,
layer on, a little

1733
01:05:19,980 --> 01:05:21,860
the heart attack
thing I showed you,

1734
01:05:21,860 --> 01:05:23,730
and I think this problem
in echo is there.

1735
01:05:23,730 --> 01:05:26,287
And so if you need
skilled people

1736
01:05:26,287 --> 01:05:28,370
to be able to acquire the
data in the first place,

1737
01:05:28,370 --> 01:05:31,190
you're stuck, because
they can read an echo.

1738
01:05:31,190 --> 01:05:33,720
A really good sonographer can
read the whole study for you.

1739
01:05:33,720 --> 01:05:35,750
So if you already have
that person involved

1740
01:05:35,750 --> 01:05:38,090
in the pipeline,
then it's really hard

1741
01:05:38,090 --> 01:05:41,683
to introduce a big advance.

1742
01:05:41,683 --> 01:05:43,850
So you need to figure out
how to take a primary care

1743
01:05:43,850 --> 01:05:46,320
doc off the street, put
a machine in their hand,

1744
01:05:46,320 --> 01:05:48,320
and let them get the image
and then automate all

1745
01:05:48,320 --> 01:05:49,550
the interpretation for them.

1746
01:05:49,550 --> 01:05:52,610
And so until you can task
shift into that space,

1747
01:05:52,610 --> 01:05:55,850
you're stuck with having still
too high a level of skill.

1748
01:05:55,850 --> 01:05:58,170
So there are these companies
that are in the space now,

1749
01:05:58,170 --> 01:06:00,280
and there's a few
that are trying.

1750
01:06:00,280 --> 01:06:03,440
It's easy to imagine, if you
can train a neural network

1751
01:06:03,440 --> 01:06:06,170
to classify a view,
you could get it to--

1752
01:06:06,170 --> 01:06:07,948
this gets to this
idea of registration

1753
01:06:07,948 --> 01:06:10,490
a little bit-- you can recognize
if you're off by 10 degrees,

1754
01:06:10,490 --> 01:06:11,510
or if you need a translation.

1755
01:06:11,510 --> 01:06:13,635
You could just train a
model to be able to do that.

1756
01:06:13,635 --> 01:06:15,830
So I think that's already
happening right now.

1757
01:06:15,830 --> 01:06:19,100
So it's a question as to whether
that will get adopted or not,

1758
01:06:19,100 --> 01:06:20,720
but I think that,
ultimately, if you

1759
01:06:20,720 --> 01:06:24,320
want to get shifting towards
sort of less skilled personnel,

1760
01:06:24,320 --> 01:06:26,460
you need to do
something in that space.

1761
01:06:26,460 --> 01:06:26,960
OK.

1762
01:06:26,960 --> 01:06:28,793
So this is where it
gets a little bit harder

1763
01:06:28,793 --> 01:06:31,940
is to think about how to make
stuff and elevate medicine

1764
01:06:31,940 --> 01:06:34,810
beyond what we're doing.

1765
01:06:34,810 --> 01:06:36,320
And this gets back
to this problem

1766
01:06:36,320 --> 01:06:38,460
I mentioned is, at
the end of the day,

1767
01:06:38,460 --> 01:06:41,990
you can't find
new uses for echo,

1768
01:06:41,990 --> 01:06:43,700
unless the data is
already there for you

1769
01:06:43,700 --> 01:06:45,450
to be able to show
that there's more value

1770
01:06:45,450 --> 01:06:48,110
than there currently is, sort
of this chicken and egg thing.

1771
01:06:48,110 --> 01:06:51,830
So in some sense, what I
hope to introduce is some way

1772
01:06:51,830 --> 01:06:54,180
that we can get much
bigger data sets,

1773
01:06:54,180 --> 01:06:57,310
and they don't have to
be 100 video data sets.

1774
01:06:57,310 --> 01:06:59,160
They can be three
video data sets,

1775
01:06:59,160 --> 01:07:01,005
but we want to be
able to figure out

1776
01:07:01,005 --> 01:07:02,880
how to enable more and
more of these studies.

1777
01:07:02,880 --> 01:07:04,752
So then you can sort
of imagine learning

1778
01:07:04,752 --> 01:07:05,960
many more complicated things.

1779
01:07:05,960 --> 01:07:07,863
You want to track
people over time.

1780
01:07:07,863 --> 01:07:09,530
You want to look at
treatment responses.

1781
01:07:09,530 --> 01:07:11,780
So you've got to look at
where the money is already

1782
01:07:11,780 --> 01:07:13,550
and see who could do this.

1783
01:07:13,550 --> 01:07:15,290
So pharma companies
are interested,

1784
01:07:15,290 --> 01:07:17,930
because they have
these phase II trials.

1785
01:07:17,930 --> 01:07:19,910
They may only have three
months or six months

1786
01:07:19,910 --> 01:07:22,850
to show some benefit
for a drug, and they're

1787
01:07:22,850 --> 01:07:24,770
really interested in
seeing whether there's

1788
01:07:24,770 --> 01:07:26,950
differences after a month,
two months, three months, four

1789
01:07:26,950 --> 01:07:27,450
months.

1790
01:07:27,450 --> 01:07:29,420
So that may be a
place where you get--

1791
01:07:29,420 --> 01:07:31,380
and they're being frugal,
but they have money.

1792
01:07:31,380 --> 01:07:32,797
So you could
imagine, if you could

1793
01:07:32,797 --> 01:07:37,940
introduce this pipeline in there
and just have handheld, simple,

1794
01:07:37,940 --> 01:07:40,850
quick to acquire, far
more frequency, and you

1795
01:07:40,850 --> 01:07:43,620
show a treatment response, and
that's kind of transformative

1796
01:07:43,620 --> 01:07:43,790
then.

1797
01:07:43,790 --> 01:07:44,810
Because then, you
could imagine, that

1798
01:07:44,810 --> 01:07:46,560
can get rolled out in
practice after that.

1799
01:07:46,560 --> 01:07:48,902
So you need somebody to
bankroll this to start with,

1800
01:07:48,902 --> 01:07:51,110
and then you could imagine,
once you have a use case,

1801
01:07:51,110 --> 01:07:53,030
then you could imagine
it getting much more.

1802
01:07:53,030 --> 01:07:54,710
And this idea of
surveillance, you

1803
01:07:54,710 --> 01:07:57,210
could imagine that would be
very doable, that you could just

1804
01:07:57,210 --> 01:07:58,695
have something taking--

1805
01:07:58,695 --> 01:08:01,070
The problem is, you can't even
get the data in the archives

1806
01:08:01,070 --> 01:08:02,600
anyway, but let's
say you can get that.

1807
01:08:02,600 --> 01:08:04,820
You could just have this
system looking for amyloid,

1808
01:08:04,820 --> 01:08:06,740
looking for whatever,
and that would be a win

1809
01:08:06,740 --> 01:08:09,200
too is to be able to imagine
doing something like that.

1810
01:08:09,200 --> 01:08:11,720
It's not putting any pressure
on the clinical workflow.

1811
01:08:11,720 --> 01:08:13,107
It's not making
anybody look bad.

1812
01:08:13,107 --> 01:08:15,440
I think, ultimately, it's
trying to just figure out if--

1813
01:08:15,440 --> 01:08:17,609
well, maybe somebody
may be looking bad

1814
01:08:17,609 --> 01:08:19,250
if they miss
something, but yeah.

1815
01:08:19,250 --> 01:08:23,450
I think it is just trying
to identify individuals.

1816
01:08:23,450 --> 01:08:25,910
And so this is an area
I think that's hard,

1817
01:08:25,910 --> 01:08:27,529
and so this kind
of idea, this is

1818
01:08:27,529 --> 01:08:29,779
where I started a little
bit, around this kind of idea

1819
01:08:29,779 --> 01:08:31,880
of this disease
subclassification and risk

1820
01:08:31,880 --> 01:08:32,810
models.

1821
01:08:32,810 --> 01:08:35,939
And so that's like more
sophisticated than anything

1822
01:08:35,939 --> 01:08:36,439
we're doing.

1823
01:08:36,439 --> 01:08:39,040
I think we're pretty crude
at this kind of stuff,

1824
01:08:39,040 --> 01:08:42,260
but one of the
challenges is people just

1825
01:08:42,260 --> 01:08:46,550
aren't interested in new
categories or new risk models,

1826
01:08:46,550 --> 01:08:50,890
if they don't have some way
that they can change practice.

1827
01:08:50,890 --> 01:08:54,319
And that becomes more
difficult, because then you

1828
01:08:54,319 --> 01:08:56,420
need to not only
introduce the model,

1829
01:08:56,420 --> 01:08:58,640
you need to show
how incorporating

1830
01:08:58,640 --> 01:09:01,700
that model in some way is
able to either identify

1831
01:09:01,700 --> 01:09:03,080
people who respond.

1832
01:09:03,080 --> 01:09:04,680
It always comes
down to therapies

1833
01:09:04,680 --> 01:09:05,597
at the end of the day.

1834
01:09:05,597 --> 01:09:08,870
So can you tell me some subclass
of people who will do better

1835
01:09:08,870 --> 01:09:10,670
on this drug, which
means that you

1836
01:09:10,670 --> 01:09:13,399
have to have trial data that
has all those people with all

1837
01:09:13,399 --> 01:09:14,167
that data.

1838
01:09:14,167 --> 01:09:16,250
And unfortunately, because
echoes are so expensive

1839
01:09:16,250 --> 01:09:19,402
and places like the Brigham
charge like $3,000 per echo,

1840
01:09:19,402 --> 01:09:20,819
then you only have
like 100 people

1841
01:09:20,819 --> 01:09:23,111
who have an echo in a trial
or 300 people have an echo.

1842
01:09:23,111 --> 01:09:26,819
You have a 5,000 person trial,
and 5% of them have an echo.

1843
01:09:26,819 --> 01:09:29,700
So you need to change the way
that gets done, because you're

1844
01:09:29,700 --> 01:09:33,270
massively underpowered to be
able to detect anything that's

1845
01:09:33,270 --> 01:09:36,630
sort of a subgroup
within that kind of work.

1846
01:09:36,630 --> 01:09:39,300
So yeah, unfortunately,
the research pace of things

1847
01:09:39,300 --> 01:09:42,510
outpaces the change in
practice in terms of the space,

1848
01:09:42,510 --> 01:09:46,319
until we're able to enable
more data collection.

1849
01:09:46,319 --> 01:09:47,760
So I can stop there.

1850
01:09:47,760 --> 01:09:50,355
I was going to talk about
blood cells in slides.

1851
01:09:50,355 --> 01:09:52,163
PROFESSOR: We can
take some questions.

1852
01:09:52,163 --> 01:09:52,830
RAHUL DEO: Yeah.

1853
01:09:52,830 --> 01:09:53,040
Yeah.

1854
01:09:53,040 --> 01:09:53,250
Yeah.

1855
01:09:53,250 --> 01:09:53,479
OK.

1856
01:09:53,479 --> 01:09:54,354
Why don't we do that.

1857
01:09:57,110 --> 01:09:57,610
Yes.

1858
01:10:00,370 --> 01:10:04,480
AUDIENCE: When CT
reconstruction started,

1859
01:10:04,480 --> 01:10:08,510
I remember seeing some papers
where people said, well,

1860
01:10:08,510 --> 01:10:11,690
we know roughly what the
anatomy should look like,

1861
01:10:11,690 --> 01:10:14,930
and so we can fill
in missing details.

1862
01:10:14,930 --> 01:10:18,902
In those days, the
slices were run before,

1863
01:10:18,902 --> 01:10:22,073
and so they would hallucinate
what the structure looked like.

1864
01:10:22,073 --> 01:10:22,740
RAHUL DEO: Yeah.

1865
01:10:22,740 --> 01:10:25,730
AUDIENCE: And of course, that
has the benefit of giving you

1866
01:10:25,730 --> 01:10:28,100
a better model, but
it also does risk

1867
01:10:28,100 --> 01:10:30,690
that it's hallucinated data.

1868
01:10:30,690 --> 01:10:34,810
Have you guys tried doing
that with some of the--

1869
01:10:34,810 --> 01:10:35,560
RAHUL DEO: Yeah.

1870
01:10:35,560 --> 01:10:36,500
That's a great point.

1871
01:10:36,500 --> 01:10:37,630
So OK.

1872
01:10:37,630 --> 01:10:40,780
So the question was
so cardiac imaging has

1873
01:10:40,780 --> 01:10:43,920
a very long history, and so
there was a period of time

1874
01:10:43,920 --> 01:10:45,820
where there's these
kind of active modelers

1875
01:10:45,820 --> 01:10:48,370
around morphologies
of the heart.

1876
01:10:48,370 --> 01:10:50,710
And so people had these
models around what

1877
01:10:50,710 --> 01:10:53,480
the heart should look like
from many, many, many studies.

1878
01:10:53,480 --> 01:10:55,480
And they were using that,
back at the time, when

1879
01:10:55,480 --> 01:10:59,560
you had these relatively coarse
multi-slice scanners for a CT,

1880
01:10:59,560 --> 01:11:02,800
they would reconstruct
the 3D image of the heart

1881
01:11:02,800 --> 01:11:06,040
based on some pre-existing
geometric model for what

1882
01:11:06,040 --> 01:11:07,310
the heart should look like.

1883
01:11:07,310 --> 01:11:08,650
And there's, of course,
a benefit to that,

1884
01:11:08,650 --> 01:11:10,317
but some risk in the
sense that somebody

1885
01:11:10,317 --> 01:11:12,555
may be very different in
the space that's missing.

1886
01:11:12,555 --> 01:11:14,680
And so the question is
whether those kind of priors

1887
01:11:14,680 --> 01:11:17,560
can be introduced
in some way, and it

1888
01:11:17,560 --> 01:11:23,290
hasn't been straightforward
as to how to do that.

1889
01:11:23,290 --> 01:11:25,357
Whenever you look at
these ridiculously poor

1890
01:11:25,357 --> 01:11:27,190
segmentations, you're
like, this is idiotic.

1891
01:11:27,190 --> 01:11:29,470
We should be able to
introduce some of that,

1892
01:11:29,470 --> 01:11:33,940
and I've seen people, for
example, put an autoencoder.

1893
01:11:33,940 --> 01:11:35,450
That's not exactly
getting at it,

1894
01:11:35,450 --> 01:11:36,992
but it's actually
getting at it somewhat
getting it somewhat

1895
01:11:36,992 --> 01:11:38,740
with these coarser features.

1896
01:11:38,740 --> 01:11:40,960
But no, I think
in terms of using

1897
01:11:40,960 --> 01:11:43,203
some degree of
geometric priors, I

1898
01:11:43,203 --> 01:11:45,370
think I may have seen some
literature in that space.

1899
01:11:45,370 --> 01:11:46,880
We haven't tried anything there.

1900
01:11:46,880 --> 01:11:49,440
We don't have any data to
do that, unfortunately,

1901
01:11:49,440 --> 01:11:52,090
and I suspect,
yeah, I just don't

1902
01:11:52,090 --> 01:11:53,884
know how difficult that is.

1903
01:11:53,884 --> 01:11:56,104
AUDIENCE: You mentioned
that you don't

1904
01:11:56,104 --> 01:12:01,300
want to see a small additional
atrium off at a distance.

1905
01:12:01,300 --> 01:12:03,113
So that's, in a way,
building in knowledge.

1906
01:12:03,113 --> 01:12:03,780
RAHUL DEO: Yeah.

1907
01:12:03,780 --> 01:12:04,280
No.

1908
01:12:04,280 --> 01:12:06,385
I remember when I was
starting in this space.

1909
01:12:06,385 --> 01:12:07,510
I was like this is idiotic.

1910
01:12:07,510 --> 01:12:08,480
Why can't we do this?

1911
01:12:08,480 --> 01:12:10,188
Why don't we have some
way of doing that?

1912
01:12:10,188 --> 01:12:13,270
We couldn't find at that
time any architectures that

1913
01:12:13,270 --> 01:12:16,030
were straightforward
to be able to do that,

1914
01:12:16,030 --> 01:12:20,150
but I'm sure there is
something in that space.

1915
01:12:20,150 --> 01:12:23,200
And we didn't also have the
data for those priors ourselves.

1916
01:12:23,200 --> 01:12:29,400
There's a long history of
these de novo heart modelers

1917
01:12:29,400 --> 01:12:31,715
that exist out there from
Oxford and the New Zealand

1918
01:12:31,715 --> 01:12:33,090
group for that
matter who've been

1919
01:12:33,090 --> 01:12:35,907
doing some of this kind
of multi-scale modeling.

1920
01:12:35,907 --> 01:12:37,740
It will be interesting
to see whether or not

1921
01:12:37,740 --> 01:12:40,440
there is anybody who pushes
forward in that space,

1922
01:12:40,440 --> 01:12:41,650
or is it just more data?

1923
01:12:41,650 --> 01:12:44,250
I think that's
always that tension.

1924
01:12:51,012 --> 01:12:52,950
AUDIENCE: Can I ask
about ultrasounds?

1925
01:12:52,950 --> 01:12:54,300
RAHUL DEO: Yeah.

1926
01:12:54,300 --> 01:12:56,008
AUDIENCE: You didn't
show us ultrasounds.

1927
01:12:56,008 --> 01:12:56,560
Right?

1928
01:12:56,560 --> 01:12:57,220
RAHUL DEO: Yeah, I did.

1929
01:12:57,220 --> 01:12:58,280
AUDIENCE: Oh, you did?

1930
01:12:58,280 --> 01:12:58,420
RAHUL DEO: Yeah.

1931
01:12:58,420 --> 01:12:59,450
The echoes are ultrasounds.

1932
01:12:59,450 --> 01:13:01,830
AUDIENCE: Oh, OK, but that's
really expensive ultrasound.

1933
01:13:01,830 --> 01:13:02,330
Right?

1934
01:13:02,330 --> 01:13:04,193
Like there are
cheaper ultrasounds

1935
01:13:04,193 --> 01:13:06,110
that you could imagine
that you constantly do.

1936
01:13:06,110 --> 01:13:06,610
Right?

1937
01:13:06,610 --> 01:13:07,790
RAHUL DEO: Yeah.

1938
01:13:07,790 --> 01:13:11,210
So there is a company
that just came out

1939
01:13:11,210 --> 01:13:14,210
with the $2,000 handheld
ultrasound, the subscription

1940
01:13:14,210 --> 01:13:15,930
model.

1941
01:13:15,930 --> 01:13:16,430
Yeah.

1942
01:13:16,430 --> 01:13:19,880
So I think that Philips
has a handheld device

1943
01:13:19,880 --> 01:13:24,150
around the $8,000 mark, so
$2,000 is getting quite cheap.

1944
01:13:24,150 --> 01:13:27,650
So that's I think the
space for handheld devices.

1945
01:13:27,650 --> 01:13:29,940
AUDIENCE: We're talking about
resource-poor countries.

1946
01:13:29,940 --> 01:13:30,280
RAHUL DEO: Yeah.

1947
01:13:30,280 --> 01:13:32,405
AUDIENCE: In a developing
country, where maybe they

1948
01:13:32,405 --> 01:13:35,240
have very few doctors per
population kind of thing.

1949
01:13:35,240 --> 01:13:38,130
What kind of imaging
might be useful

1950
01:13:38,130 --> 01:13:41,390
that we could then apply
computer vision algorithms to?

1951
01:13:41,390 --> 01:13:43,940
RAHUL DEO: I think ultrasound
is that sweet spot.

1952
01:13:43,940 --> 01:13:48,250
It has versatility, and
its cost is about where--

1953
01:13:48,250 --> 01:13:50,000
and I'm sure those
companies rent it out
companies rented it out

1954
01:13:50,000 --> 01:13:52,590
for much lower cost in
those kinds of places too.

1955
01:13:52,590 --> 01:13:54,840
We're putting together-- or
I put together-- actually,

1956
01:13:54,840 --> 01:13:55,460
it may not have been funded.

1957
01:13:55,460 --> 01:13:56,120
I'm not sure.

1958
01:13:56,120 --> 01:13:59,450
But looking at
sub-Saharan Africa

1959
01:13:59,450 --> 01:14:02,420
and collaborating with
one of the Brigham doctors

1960
01:14:02,420 --> 01:14:05,068
who travels out to
sub-Saharan Africa

1961
01:14:05,068 --> 01:14:07,610
and looking to try to build some
of these automated detection

1962
01:14:07,610 --> 01:14:09,860
type of things in that space.

1963
01:14:09,860 --> 01:14:12,650
So no, I think there is
definite interest in that,

1964
01:14:12,650 --> 01:14:17,450
and then there may be a much
bigger win there than the stuff

1965
01:14:17,450 --> 01:14:18,380
I'm proposing.

1966
01:14:18,380 --> 01:14:20,338
But yeah, no, I think
that's a very good point,

1967
01:14:20,338 --> 01:14:21,218
and that would be--

1968
01:14:21,218 --> 01:14:22,260
it's also, it's portable.

1969
01:14:22,260 --> 01:14:24,260
You could have a
phone-based thing.

1970
01:14:24,260 --> 01:14:29,126
So it's actually very
attractive from that standpoint.

1971
01:14:29,126 --> 01:14:30,043
PROFESSOR: [INAUDIBLE]

1972
01:14:30,043 --> 01:14:30,918
RAHUL DEO: All right.

1973
01:14:30,918 --> 01:14:33,251
I feel like I'm changing the
topic substantially but not

1974
01:14:33,251 --> 01:14:33,751
totally.

1975
01:14:33,751 --> 01:14:34,310
OK.

1976
01:14:34,310 --> 01:14:39,320
So this is that slide I showed,
and I pitched it in a way

1977
01:14:39,320 --> 01:14:41,410
to try to motivate you
to think of ultrasound.

1978
01:14:41,410 --> 01:14:42,930
But I'm not sure
ultrasound really

1979
01:14:42,930 --> 01:14:45,680
achieves all these things, in
the sense I wouldn't call it

1980
01:14:45,680 --> 01:14:48,410
the greatest biological tool
to get at underlying disease

1981
01:14:48,410 --> 01:14:49,850
pathways.

1982
01:14:49,850 --> 01:14:52,070
Some of these things may
be late, like David said,

1983
01:14:52,070 --> 01:14:54,190
or maybe not so reversible.

1984
01:14:54,190 --> 01:14:58,580
So we've been given this One
Brave Idea thing $85 million

1985
01:14:58,580 --> 01:15:02,570
now to make some dent in
a specific disease, so

1986
01:15:02,570 --> 01:15:05,510
coronary artery disease
or coronary heart disease.

1987
01:15:05,510 --> 01:15:07,047
It's that arrogant
tech thing, where

1988
01:15:07,047 --> 01:15:08,630
you just dump a lot
of money somewhere

1989
01:15:08,630 --> 01:15:10,910
and think you're going
to solve all problems.

1990
01:15:10,910 --> 01:15:12,950
And happy to take
it, but I think

1991
01:15:12,950 --> 01:15:14,175
that there are some problems.

1992
01:15:14,175 --> 01:15:15,800
So this is what I
wanted to do, so I've

1993
01:15:15,800 --> 01:15:18,230
wanted to do this for
probably the last five, six

1994
01:15:18,230 --> 01:15:19,880
years, before I
even started here,

1995
01:15:19,880 --> 01:15:23,505
and this has motivated me
in part for quite a while.

1996
01:15:23,505 --> 01:15:24,630
And so here's our problems.

1997
01:15:24,630 --> 01:15:25,000
OK.

1998
01:15:25,000 --> 01:15:27,500
So we're studying heart disease,
so coronary artery disease

1999
01:15:27,500 --> 01:15:30,950
or coronary heart disease is
the arteries in the heart.

2000
01:15:30,950 --> 01:15:32,120
You can't get at those.

2001
01:15:32,120 --> 01:15:33,410
So you can't do any biology.

2002
01:15:33,410 --> 01:15:35,285
You can't do the stuff
the cancer people do--

2003
01:15:35,285 --> 01:15:36,120
you can't biopsy that.

2004
01:15:36,120 --> 01:15:37,575
You can't do anything there.

2005
01:15:37,575 --> 01:15:39,200
So you're stuck with
the thing that you

2006
01:15:39,200 --> 01:15:42,470
want to get at is inaccessible.

2007
01:15:42,470 --> 01:15:45,020
I talked about how a lot of
the imaging is expensive,

2008
01:15:45,020 --> 01:15:48,080
but all that other omic
stuff is really expensive too.

2009
01:15:48,080 --> 01:15:50,980
So that's going to
be not so possible,

2010
01:15:50,980 --> 01:15:54,920
and you're not going to be able
to do serial $1,000 proteomics

2011
01:15:54,920 --> 01:15:55,670
on people either.

2012
01:15:55,670 --> 01:15:57,680
That's not happening
anytime soon.

2013
01:15:57,680 --> 01:16:01,040
And then everything I talked
about, we were woefully

2014
01:16:01,040 --> 01:16:02,690
inadequate in terms
of sample size,

2015
01:16:02,690 --> 01:16:04,730
especially if we
want to characterize

2016
01:16:04,730 --> 01:16:06,833
underlying complex
biological processes.

2017
01:16:06,833 --> 01:16:09,125
So we expect we're going to
need high dimensional data,

2018
01:16:09,125 --> 01:16:10,875
and we're going to
need huge sample sizes.

2019
01:16:10,875 --> 01:16:12,617
There's Vladimir
Vapnik over there.

2020
01:16:12,617 --> 01:16:13,950
And then here's another problem.

2021
01:16:13,950 --> 01:16:14,450
OK?

2022
01:16:14,450 --> 01:16:16,490
So this stuff takes time.

2023
01:16:16,490 --> 01:16:17,720
These diseases take time.

2024
01:16:17,720 --> 01:16:20,085
So if I introduce a
new assay right now,

2025
01:16:20,085 --> 01:16:21,710
how am I going to
show that any of this

2026
01:16:21,710 --> 01:16:22,970
is going to be beneficial?

2027
01:16:22,970 --> 01:16:25,350
Because this disease
develops over 10 to 20 years.

2028
01:16:25,350 --> 01:16:27,517
So I'm not going to talk
about the solution to that,

2029
01:16:27,517 --> 01:16:29,330
well, a little bit.

2030
01:16:29,330 --> 01:16:29,870
OK.

2031
01:16:29,870 --> 01:16:32,630
So one of the issues with
a lot of the data that's

2032
01:16:32,630 --> 01:16:35,010
out there is it's not
particularly expressive.

2033
01:16:35,010 --> 01:16:37,460
It's a lot of just
the same clinical stuff,

2034
01:16:37,460 --> 01:16:38,690
the same imaging stuff.

2035
01:16:38,690 --> 01:16:42,680
So all these big studies, these
billion dollar big studies,

2036
01:16:42,680 --> 01:16:45,262
ultimately just have
echoes and MRIs and maybe

2037
01:16:45,262 --> 01:16:46,970
a little bit of
genetics, but they really

2038
01:16:46,970 --> 01:16:48,920
don't have stuff
that is this low cost

2039
01:16:48,920 --> 01:16:51,303
expressive biological
stuff that we ideally

2040
01:16:51,303 --> 01:16:52,220
want to be able to do.

2041
01:16:52,220 --> 01:16:55,250
So this is really expensive
and makes $85 million look

2042
01:16:55,250 --> 01:16:57,800
like a joke, and
it's not all that

2043
01:16:57,800 --> 01:16:59,820
rich in terms of complexity.

2044
01:16:59,820 --> 01:17:02,520
So we wanted to do
something different,

2045
01:17:02,520 --> 01:17:05,240
and so this is the crazy thing.

2046
01:17:05,240 --> 01:17:08,340
We're focusing on
circulating cells,

2047
01:17:08,340 --> 01:17:11,270
and so this is a compromise.

2048
01:17:11,270 --> 01:17:12,950
And there's a
reasonably good case

2049
01:17:12,950 --> 01:17:15,250
to be made for
their involvement.

2050
01:17:15,250 --> 01:17:17,270
So there's lots
of data to suggest

2051
01:17:17,270 --> 01:17:19,910
that these are causal mediators
of coronary artery disease

2052
01:17:19,910 --> 01:17:21,270
or coronary heart disease.

2053
01:17:21,270 --> 01:17:24,920
So you can find
them in the plaques.

2054
01:17:24,920 --> 01:17:26,720
So patients who have
autoimmune diseases

2055
01:17:26,720 --> 01:17:29,390
certainly have accelerated
forms of atherosclerosis.

2056
01:17:29,390 --> 01:17:30,175
There are drugs.

2057
01:17:30,175 --> 01:17:31,550
There's a drug
called canakinumab

2058
01:17:31,550 --> 01:17:35,390
that inhibits IL-1 beta
secretion from macrophages,

2059
01:17:35,390 --> 01:17:38,360
and this has mortality benefit
in coronary artery disease.

2060
01:17:38,360 --> 01:17:40,503
There are mutations in
the white blood cell

2061
01:17:40,503 --> 01:17:42,920
population themselves that are
associated with early heart

2062
01:17:42,920 --> 01:17:43,730
attack.

2063
01:17:43,730 --> 01:17:46,340
So there's a lot there,
and this has been going--

2064
01:17:46,340 --> 01:17:47,840
and there's plenty
of mouse models

2065
01:17:47,840 --> 01:17:49,340
that show that if
you make mutations

2066
01:17:49,340 --> 01:17:51,075
only in the white
blood cell compartment,

2067
01:17:51,075 --> 01:17:53,700
that you will completely change
the disease course itself.

2068
01:17:53,700 --> 01:17:56,450
So there's a good
amount of data out there

2069
01:17:56,450 --> 01:17:58,940
to suggest that there is an
informative kind of cell type

2070
01:17:58,940 --> 01:17:59,600
there.

2071
01:17:59,600 --> 01:18:01,015
It's accessible.

2072
01:18:01,015 --> 01:18:02,390
There's lots of
predictive models

2073
01:18:02,390 --> 01:18:04,515
already there that could
be done with some of this,

2074
01:18:04,515 --> 01:18:07,010
and they express many of
the genes that are involved.

2075
01:18:07,010 --> 01:18:10,468
And there's a window on many
of these biological processes.

2076
01:18:10,468 --> 01:18:13,010
So we're focusing on computer
vision approaches to this data.

2077
01:18:13,010 --> 01:18:15,050
So we decided, if we
can't do the omic stuff,

2078
01:18:15,050 --> 01:18:16,940
because it costs too
much, we're going

2079
01:18:16,940 --> 01:18:20,240
to take slides and
have tens of thousands

2080
01:18:20,240 --> 01:18:21,850
of cells per individual.

2081
01:18:21,850 --> 01:18:23,600
And then we can introduce
fluorescent dyes

2082
01:18:23,600 --> 01:18:27,350
that can focus on lots
of different organelles.

2083
01:18:27,350 --> 01:18:30,860
And then we can potentially
expand the phenotypic space

2084
01:18:30,860 --> 01:18:32,780
by adding all kinds
of perturbations

2085
01:18:32,780 --> 01:18:35,540
that can be able to
unmask attributes

2086
01:18:35,540 --> 01:18:38,600
of people that may not even be
relatively there at baseline.

2087
01:18:38,600 --> 01:18:41,017
And I think I've been empowered
by the computer vision

2088
01:18:41,017 --> 01:18:43,100
experience with the echo
stuff, and I'm like, hey,

2089
01:18:43,100 --> 01:18:44,310
I can do this.

2090
01:18:44,310 --> 01:18:46,370
I can train these models.

2091
01:18:46,370 --> 01:18:49,790
So we're in a position
now where we can--

2092
01:18:49,790 --> 01:18:52,010
this stuff costs a few
dollars per person.

2093
01:18:52,010 --> 01:18:55,250
It's cheap, and
you can just keep

2094
01:18:55,250 --> 01:18:56,662
on expanding phenotypic space.

2095
01:18:56,662 --> 01:18:57,620
You can bring in drugs.

2096
01:18:57,620 --> 01:18:59,287
You can bring in
whatever you want here,

2097
01:18:59,287 --> 01:19:02,300
and you're still in
that dollars type range.

2098
01:19:02,300 --> 01:19:05,960
So we just piggy-back,
and we just hover around--

2099
01:19:05,960 --> 01:19:07,550
just a couple of
research assistants

2100
01:19:07,550 --> 01:19:09,380
were hovering around clinics.

2101
01:19:09,380 --> 01:19:11,180
And we can do
thousands of patients

2102
01:19:11,180 --> 01:19:13,340
a month, so tens of
thousands of patients a year.

2103
01:19:13,340 --> 01:19:18,410
So we can get into a deep
learning sample size here,

2104
01:19:18,410 --> 01:19:21,710
and so we want
these primary assays

2105
01:19:21,710 --> 01:19:23,570
to be low cost,
reproducible, expressive,

2106
01:19:23,570 --> 01:19:24,830
ideally responsive to therapy.

2107
01:19:24,830 --> 01:19:27,740
So that's this space here,
and there's lots of stuff

2108
01:19:27,740 --> 01:19:28,656
that we have.

2109
01:19:28,656 --> 01:19:31,470
We have all the medical record
data on all these people,

2110
01:19:31,470 --> 01:19:33,810
and we can selectively
do somatic sequencing.

2111
01:19:33,810 --> 01:19:35,130
We can do genome associations.

2112
01:19:35,130 --> 01:19:36,270
We have all ECG data.

2113
01:19:36,270 --> 01:19:38,160
We have selective
positron emission data.

2114
01:19:38,160 --> 01:19:39,960
So it's lots of
additional stuff,

2115
01:19:39,960 --> 01:19:42,390
and we want to be able
to walk our cheap assay

2116
01:19:42,390 --> 01:19:45,000
towards those things
that are more expensive

2117
01:19:45,000 --> 01:19:47,757
but for which there's
much more historical data.

2118
01:19:47,757 --> 01:19:49,590
So that's what I do
with my life these days,

2119
01:19:49,590 --> 01:19:51,240
and the time problem
has been solved.

2120
01:19:51,240 --> 01:19:54,570
Because we found a collaborator at
MGH who has 3 1/2 million

2121
01:19:54,570 --> 01:19:57,870
of these records in terms of
cell counting and cytometer

2122
01:19:57,870 --> 01:19:59,860
data going back for
about three years.

2123
01:19:59,860 --> 01:20:03,390
So we should be able to get
some decent events in that time.

2124
01:20:03,390 --> 01:20:06,072
I need to build a document
classification model for 3 1/2

2125
01:20:06,072 --> 01:20:08,280
million records and decide
whether they have coronary

2126
01:20:08,280 --> 01:20:11,580
heart disease, but sounds
like that's doable.

2127
01:20:11,580 --> 01:20:13,920
We're fearless in this space.

2128
01:20:13,920 --> 01:20:15,960
And then they also
have 13 million images,

2129
01:20:15,960 --> 01:20:18,412
so hundreds of thousands
of people's worth of slides.

2130
01:20:18,412 --> 01:20:20,370
So we can at the very
least, get decent weights

2131
01:20:20,370 --> 01:20:22,530
for transfer learning
from some of this data,

2132
01:20:22,530 --> 01:20:25,730
and we're doing this for
acute heart attack patients.

2133
01:20:25,730 --> 01:20:29,140
So yeah, so this is what
I'm doing, ultimately,

2134
01:20:29,140 --> 01:20:32,760
and so it's this bridge between
existing imaging, existing

2135
01:20:32,760 --> 01:20:36,660
conventional medical
data, and this low cost,

2136
01:20:36,660 --> 01:20:39,030
expressive, serial type
of stuff that's ultimately

2137
01:20:39,030 --> 01:20:42,090
hoping to expand phenotypic
space and keep the cost down.

2138
01:20:42,090 --> 01:20:44,670
I think all my lessons from
working with expensive imaging

2139
01:20:44,670 --> 01:20:47,300
data have motivated me to build
something around this space.

2140
01:20:47,300 --> 01:20:50,890
So this is my-- it's
my baby right now.

2141
01:20:50,890 --> 01:20:53,550
And so lots of things for
people to be involved in,

2142
01:20:53,550 --> 01:20:58,030
if they want to, and these are
some of the funding sources.

2143
01:20:58,030 --> 01:20:58,530
All right.

2144
01:20:58,530 --> 01:20:59,340
Thank you.

2145
01:20:59,340 --> 01:21:02,690
[APPLAUSE]