1
00:00:00,000 --> 00:00:02,522
SPEAKER: The following content
is provided under a Creative

2
00:00:02,522 --> 00:00:03,650
Commons license.

3
00:00:03,650 --> 00:00:06,600
Your support will help MIT
OpenCourseWare continue to

4
00:00:06,600 --> 00:00:10,030
offer high quality educational
resources for free.

5
00:00:10,030 --> 00:00:12,815
To make a donation, or to view
additional material from

6
00:00:12,815 --> 00:00:16,550
hundreds of MIT courses, visit
MIT OpenCourseWare at

7
00:00:16,550 --> 00:00:17,800
ocw.mit.edu.

8
00:00:22,370 --> 00:00:24,670
PROFESSOR: OK, I want to review
a little bit what we said

9
00:00:24,670 --> 00:00:29,550
about detection at the end of
last hour, last hour and a

10
00:00:29,550 --> 00:00:35,620
half, because we were going
through it relatively quickly.

11
00:00:35,620 --> 00:00:38,570
Detection is a very
funny subject.

12
00:00:38,570 --> 00:00:43,850
Particularly the way that we
do it here because we go

13
00:00:43,850 --> 00:00:47,250
through a bunch of very,
very simple steps.

14
00:00:47,250 --> 00:00:48,620
Everything looks trivial--

15
00:00:48,620 --> 00:00:51,920
I hope it looks trivial-- as
we go, at least after you

16
00:00:51,920 --> 00:00:54,970
think about it for a while, and
you work with it for a while,

17
00:00:54,970 --> 00:00:58,010
you will come back and look at
it and you will say, "yes, in

18
00:00:58,010 --> 00:01:01,450
fact that is trivial." Nothing
when you look at it for the

19
00:01:01,450 --> 00:01:04,170
first time is trivial.

20
00:01:04,170 --> 00:01:07,460
And the kind of detection
problems that we're

21
00:01:07,460 --> 00:01:11,460
interested in is--

22
00:01:11,460 --> 00:01:15,280
to start out with-- we want to
look just at binary detection.

23
00:01:15,280 --> 00:01:19,490
We're sending a binary signal,
it's going to go through some

24
00:01:19,490 --> 00:01:24,350
signal encoder, which is the
kind of channel encoder we've

25
00:01:24,350 --> 00:01:26,270
been thinking about.

26
00:01:26,270 --> 00:01:32,460
It's going to go through a
baseband modulator, a baseband

27
00:01:32,460 --> 00:01:34,510
to passband modulator.

28
00:01:34,510 --> 00:01:37,440
It's going to have white
Gaussian noise, or some kind

29
00:01:37,440 --> 00:01:39,230
of noise, added to it.

30
00:01:39,230 --> 00:01:42,090
It's going to come out
the other end.

31
00:01:42,090 --> 00:01:44,900
We come back from passband
to baseband.

32
00:01:44,900 --> 00:01:47,610
We then go through a baseband
demodulator.

33
00:01:47,610 --> 00:01:50,330
We then sample at that point.

34
00:01:50,330 --> 00:01:53,920
And, the point is, when you're
all done with all of that,

35
00:01:53,920 --> 00:02:00,870
what you've done is you've
started out sending either

36
00:02:00,870 --> 00:02:07,220
plus a or minus a as a one-
dimensional numerical signal.

37
00:02:07,220 --> 00:02:11,640
And when you're all through,
there's some one-dimensional

38
00:02:11,640 --> 00:02:15,320
number that comes out, v. And
on the basis of that one-

39
00:02:15,320 --> 00:02:18,850
dimensional number, v, you're
supposed to guess whether the

40
00:02:18,850 --> 00:02:21,250
output is zero or one.

41
00:02:21,250 --> 00:02:23,640
Now, one of the things that
we're doing right now is we're

42
00:02:23,640 --> 00:02:28,660
simplifying the problem in the
sense that we're not looking

43
00:02:28,660 --> 00:02:32,060
at a sequence of inputs coming
in, and we're not looking at a

44
00:02:32,060 --> 00:02:34,560
sequence of outputs
coming out.

45
00:02:34,560 --> 00:02:37,750
We're only looking at a single
input coming in.

46
00:02:37,750 --> 00:02:41,190
In other words, you build this
piece of communication

47
00:02:41,190 --> 00:02:44,680
equipment, you get it all tuned
up, you get it into

48
00:02:44,680 --> 00:02:45,690
steady state.

49
00:02:45,690 --> 00:02:47,910
You send one bit.

50
00:02:47,910 --> 00:02:49,400
You receive something.

51
00:02:49,400 --> 00:02:52,300
You try to guess at the
receiver, what was sent.

52
00:02:52,300 --> 00:02:55,710
And at that point you tear the
whole thing down and you wait

53
00:02:55,710 --> 00:02:58,470
a year until you've set
it up perfectly again.

54
00:02:58,470 --> 00:03:00,480
You send another bit.

55
00:03:00,480 --> 00:03:03,250
And we're not going to worry at
all about what happens with

56
00:03:03,250 --> 00:03:05,840
the sequence, we're only
going to worry about

57
00:03:05,840 --> 00:03:07,660
this one shot problem.

58
00:03:07,660 --> 00:03:14,950
You sort of have some kind of
clue that if you send the

59
00:03:14,950 --> 00:03:19,100
whole sequence of bits, in a
system like this, and you

60
00:03:19,100 --> 00:03:21,950
don't have intersymbol
interference, and the noise is

61
00:03:21,950 --> 00:03:25,000
white, so it's sort of
independent from time to time.

62
00:03:25,000 --> 00:03:29,210
You sort of have a clue that
you're going to get the same

63
00:03:29,210 --> 00:03:32,330
answer whether you send the
sequence of data or whether

64
00:03:32,330 --> 00:03:35,290
you just send a single bit.

65
00:03:35,290 --> 00:03:39,730
And we're going to show later
that that, in fact, is true.

66
00:03:39,730 --> 00:03:42,810
But for the time being we want
to understand what's going on,

67
00:03:42,810 --> 00:03:45,610
and to understand what's going
on we take this simplest

68
00:03:45,610 --> 00:03:49,350
possible case where there's
only one bit that's being

69
00:03:49,350 --> 00:03:50,600
transmitted.

70
00:03:53,130 --> 00:03:56,120
It's the question, "Are we going
to destroy ourselves in

71
00:03:56,120 --> 00:04:00,750
the next five years or not?" And
this question is important

72
00:04:00,750 --> 00:04:05,340
to most of us, and at the output
we find out, in fact,

73
00:04:05,340 --> 00:04:07,530
whether we're going to destroy
ourselves or not.

74
00:04:07,530 --> 00:04:10,600
So it's one bit, but it's
one important bit.

75
00:04:10,600 --> 00:04:13,760
OK, why are we doing
things this way?

76
00:04:13,760 --> 00:04:18,330
Want to tell you a little story
about the first time I

77
00:04:18,330 --> 00:04:20,650
really talked to
Claude Shannon.

78
00:04:20,650 --> 00:04:25,170
I was a young member of the
faculty at that time, and I

79
00:04:25,170 --> 00:04:27,360
was working on a problem
which I thought was

80
00:04:27,360 --> 00:04:29,460
really a neat problem.

81
00:04:29,460 --> 00:04:31,830
It was interesting
theoretically.

82
00:04:31,830 --> 00:04:34,550
It was important practically.

83
00:04:34,550 --> 00:04:37,950
And I thought, "gee, I finally
have something I can go to

84
00:04:37,950 --> 00:04:41,020
this great man and talk to him
about." So I screwed up my

85
00:04:41,020 --> 00:04:43,440
courage for about two days.

86
00:04:43,440 --> 00:04:47,340
Finally I saw his door open and
him sitting there, so I

87
00:04:47,340 --> 00:04:50,640
went in and started to tell
him about this problem.

88
00:04:50,640 --> 00:04:54,570
He's a very kind person and he
listened very patiently.

89
00:04:54,570 --> 00:04:57,510
And after about 15 minutes
he said, "My god!

90
00:04:57,510 --> 00:04:59,400
I'm just sort of lost
with all of this.

91
00:04:59,400 --> 00:05:01,990
There's so much stuff going
on in this problem.

92
00:05:01,990 --> 00:05:05,950
Can't we simplify it a little
bit by throwing out this kind

93
00:05:05,950 --> 00:05:09,190
of practical constraint you've
put on it?" I said, "yeah, I

94
00:05:09,190 --> 00:05:13,040
guess so." So we threw that
out and then we went on for

95
00:05:13,040 --> 00:05:15,390
a while longer, and then he
said, "My god, I'm still

96
00:05:15,390 --> 00:05:18,240
terribly confused about
this whole thing.

97
00:05:18,240 --> 00:05:21,210
Why don't we simplify it
in some other way?"

98
00:05:21,210 --> 00:05:23,170
And this went on for
about an hour.

99
00:05:23,170 --> 00:05:26,050
As I say, he was a
very patient guy.

100
00:05:26,050 --> 00:05:29,500
And at the end of an hour I was
getting really depressed.

101
00:05:29,500 --> 00:05:31,920
Because here was this beautiful
problem that I

102
00:05:31,920 --> 00:05:35,720
thought was going to make me
famous, give me tenure, do all

103
00:05:35,720 --> 00:05:37,140
these neat things.

104
00:05:37,140 --> 00:05:39,290
And here he'd reduced
the thing to a

105
00:05:39,290 --> 00:05:41,750
totally trivial toy problem.

106
00:05:41,750 --> 00:05:42,930
And we looked at it.

107
00:05:42,930 --> 00:05:45,130
And we said, yes this is
a trivial toy problem.

108
00:05:45,130 --> 00:05:46,500
This is the answer.

109
00:05:46,500 --> 00:05:48,840
The problem is solved.

110
00:05:48,840 --> 00:05:51,590
But so what?

111
00:05:51,590 --> 00:05:53,920
And then he suggested putting
some of those

112
00:05:53,920 --> 00:05:56,590
constraints back in again.

113
00:05:56,590 --> 00:05:59,370
And as we started putting the
constraints back in, one-by-

114
00:05:59,370 --> 00:06:03,280
one, we saw that each time we
put a new constraint in--

115
00:06:03,280 --> 00:06:06,160
since we understood the problem
and its simplest

116
00:06:06,160 --> 00:06:09,830
form-- putting the constraint
in, it was still simple.

117
00:06:09,830 --> 00:06:13,060
And by the time we built the
whole thing back up again, it

118
00:06:13,060 --> 00:06:16,160
was clear what the answer was.

119
00:06:16,160 --> 00:06:22,410
OK, in other words, what theory
means is really solving

120
00:06:22,410 --> 00:06:23,680
these toy problems.

121
00:06:23,680 --> 00:06:26,670
And solving the toy
problems first.

122
00:06:26,670 --> 00:06:29,110
And in terms of practice, some
people think the most

123
00:06:29,110 --> 00:06:34,090
practical thing is
to be practical.

124
00:06:34,090 --> 00:06:38,100
And the whole point of this
course, and this particular

125
00:06:38,100 --> 00:06:42,890
subject of detection is a
wonderful example of this, the

126
00:06:42,890 --> 00:06:47,080
most practical thing is
to be theoretical.

127
00:06:47,080 --> 00:06:51,340
I mean, you need to add practice
to the theory, but

128
00:06:51,340 --> 00:06:54,260
the way you do things is you
start with a theory-- which

129
00:06:54,260 --> 00:06:57,000
means you start with the toy
problems, you build up from

130
00:06:57,000 --> 00:07:00,630
those toy problems, and after
you build up for a while,

131
00:07:00,630 --> 00:07:03,900
understanding what the practical
problem is also--

132
00:07:03,900 --> 00:07:05,620
you then understand
how to deal with

133
00:07:05,620 --> 00:07:07,200
the practical problem.

134
00:07:07,200 --> 00:07:10,100
And the practical engineer who
doesn't have any of that

135
00:07:10,100 --> 00:07:12,970
fundamental knowledge about
how to deal with these

136
00:07:12,970 --> 00:07:17,630
problems is always submerged
in a sea of complexity.

137
00:07:17,630 --> 00:07:21,080
Always doing simulations of
something that he doesn't, he

138
00:07:21,080 --> 00:07:23,090
or she doesn't understand.

139
00:07:23,090 --> 00:07:25,970
Always trying to interpret
something from it, but with

140
00:07:25,970 --> 00:07:29,440
just too many things going on
to have any idea of what it

141
00:07:29,440 --> 00:07:30,610
really means.

142
00:07:30,610 --> 00:07:34,600
OK, so that's why we're making
this trivial assumption here.

143
00:07:34,600 --> 00:07:37,480
We're only putting one bit in.

144
00:07:37,480 --> 00:07:39,520
We're ignoring what
happens all the

145
00:07:39,520 --> 00:07:41,090
way through the system.

146
00:07:41,090 --> 00:07:43,590
We only get one number out.

147
00:07:43,590 --> 00:07:46,880
We're going to assume that this
one number here is either

148
00:07:46,880 --> 00:07:49,730
plus or minus a, plus
the Gaussian

149
00:07:49,730 --> 00:07:51,840
random noise variable.

150
00:07:51,840 --> 00:07:55,090
And we're not quite sure why
it's going to be plus or minus

151
00:07:55,090 --> 00:07:57,960
a, plus a Gaussian noise random
variable, but we're

152
00:07:57,960 --> 00:07:59,980
going to assume that
for the time being.

153
00:07:59,980 --> 00:08:00,630
OK?

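As a minimal numerical sketch of this one-shot model, in Python, assuming illustrative values for a and for the noise standard deviation (nothing here is from the lecture itself):

import numpy as np

# One-shot model: a single bit h is mapped antipodally to plus or minus a,
# and a single Gaussian noise sample is added; a and sigma are assumptions.
rng = np.random.default_rng(0)
a, sigma = 1.0, 0.5

h = rng.integers(0, 2)          # the bit somebody chose at the input
u = a if h == 0 else -a         # 0 -> plus a, 1 -> minus a
v = u + sigma * rng.normal()    # the single one-dimensional observation
# the detector sees only v and must guess h from it
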
154
00:08:00,630 --> 00:08:06,610
So the detector observes the
sample values of the random

155
00:08:06,610 --> 00:08:11,750
variable for which this is the
sample value, and then guesses

156
00:08:11,750 --> 00:08:15,680
the value of this random
variable, h, which is what we

157
00:08:15,680 --> 00:08:16,780
call the input now.

158
00:08:16,780 --> 00:08:21,310
Because we view it from the
standpoint of the detector--

159
00:08:21,310 --> 00:08:24,080
the detector has two possible
hypotheses--

160
00:08:24,080 --> 00:08:27,140
one is that a zero was
sent, and the other

161
00:08:27,140 --> 00:08:28,960
that a one was sent.

162
00:08:28,960 --> 00:08:33,690
And on the basis of this
observation, you take first

163
00:08:33,690 --> 00:08:36,430
the hypothesis zero and you
say, "Is this a reasonable

164
00:08:36,430 --> 00:08:39,840
hypothesis?" Then you look at
the hypothesis one, say, "Is

165
00:08:39,840 --> 00:08:46,480
this a reasonable hypothesis?"
And then you guess whether you

166
00:08:46,480 --> 00:08:49,990
think zero was more likely or
one is more likely, given this

167
00:08:49,990 --> 00:08:52,070
observation that you've had.

168
00:08:52,070 --> 00:08:56,700
So what the detector has,
at this point, is a full

169
00:08:56,700 --> 00:09:01,180
statistical characterization
of the entire problem.

170
00:09:01,180 --> 00:09:03,090
Namely, you have a model
of the problem.

171
00:09:03,090 --> 00:09:06,340
You understand every probability
in the universe

172
00:09:06,340 --> 00:09:09,340
that might have any
effect on this.

173
00:09:09,340 --> 00:09:13,240
And what might have any
effect on this--

174
00:09:13,240 --> 00:09:16,360
as far as the way we've
set up the problem--

175
00:09:16,360 --> 00:09:20,060
is only the question of what
are the probabilities that

176
00:09:20,060 --> 00:09:22,145
you're going to send
one or the other of

177
00:09:22,145 --> 00:09:25,540
these signals here?

178
00:09:25,540 --> 00:09:28,100
And conditional on each
of these, what are the

179
00:09:28,100 --> 00:09:30,780
probabilities of this
random variable

180
00:09:30,780 --> 00:09:32,730
appearing at the output?

181
00:09:32,730 --> 00:09:35,130
Because you have to base your
decision only on this.

182
00:09:35,130 --> 00:09:37,040
So all of the probabilities
only give you

183
00:09:37,040 --> 00:09:39,150
this one simple thing.

184
00:09:39,150 --> 00:09:43,310
Hypothesis testing, decision
making, decoding, all mean the

185
00:09:43,310 --> 00:09:43,850
same thing.

186
00:09:43,850 --> 00:09:45,900
They mean exactly
the same thing.

187
00:09:51,970 --> 00:09:54,740
And they're just done
by different people.

188
00:09:54,740 --> 00:09:58,140
OK, so what that says is we're
assuming the detector uses a

189
00:09:58,140 --> 00:10:01,050
known probability model.

190
00:10:01,050 --> 00:10:03,200
And in designing the detector,
you know what that

191
00:10:03,200 --> 00:10:05,890
probability model is.

192
00:10:05,890 --> 00:10:09,050
It might not be the right
probability model, and one of

193
00:10:09,050 --> 00:10:13,410
the things that many people
interested in detection study

194
00:10:13,410 --> 00:10:17,920
is the question of when you
think the probability model is

195
00:10:17,920 --> 00:10:21,620
one thing and it's actually
something else, how well does

196
00:10:21,620 --> 00:10:22,880
the detection work?

197
00:10:22,880 --> 00:10:25,690
It's a little like the
Lempel-Ziv algorithm that we

198
00:10:25,690 --> 00:10:31,040
studied earlier for doing
source coding.

199
00:10:31,040 --> 00:10:34,090
Which is, how do you do source
coding when you don't know

200
00:10:34,090 --> 00:10:35,930
what the probabilities are?

201
00:10:35,930 --> 00:10:38,450
And we found the best way to
study that, of course, was to

202
00:10:38,450 --> 00:10:41,470
first find out how to do source
encoding when you did

203
00:10:41,470 --> 00:10:43,460
know what the probabilities
were.

204
00:10:43,460 --> 00:10:46,150
So we're doing the
same thing here.

205
00:10:46,150 --> 00:10:49,090
We assume the detector is
designed to maximize the

206
00:10:49,090 --> 00:10:51,820
probability of guessing
correctly.

207
00:10:51,820 --> 00:10:54,240
In other words, it's trying
to minimize the

208
00:10:54,240 --> 00:10:56,160
probability of error.

209
00:10:56,160 --> 00:10:59,940
We call that a MAP detector--
maximum a posteriori

210
00:10:59,940 --> 00:11:03,890
probability decoding.

211
00:11:03,890 --> 00:11:06,830
You can try to do
other things.

212
00:11:06,830 --> 00:11:10,600
You can say that there's a cost
of one kind of error, and

213
00:11:10,600 --> 00:11:13,420
there's another cost of
another kind of error.

214
00:11:13,420 --> 00:11:16,780
I mean, if you're doing medical
testing or something.

215
00:11:16,780 --> 00:11:20,350
If you guess wrong in one way,
you tell the patient there's

216
00:11:20,350 --> 00:11:23,380
nothing wrong with them, the
patient goes out, drops dead

217
00:11:23,380 --> 00:11:25,120
the next day.

218
00:11:25,120 --> 00:11:27,430
And you don't care about that,
of course, but you care about

219
00:11:27,430 --> 00:11:30,730
the fact that the patient is
going to sue the hospital for

220
00:11:30,730 --> 00:11:32,600
100 million dollars and you're
going to lose your

221
00:11:32,600 --> 00:11:34,880
job because of it.

222
00:11:34,880 --> 00:11:37,860
So there's a big cost to
guessing wrong in that way.

223
00:11:40,810 --> 00:11:45,960
But for now, we're not going
to bother about the costs.

224
00:11:45,960 --> 00:11:49,970
One of the things that you'll
see when we get all done is

225
00:11:49,970 --> 00:11:51,640
that putting in cost
doesn't make the

226
00:11:51,640 --> 00:11:53,970
problem any harder, really.

227
00:11:53,970 --> 00:11:57,310
You really wind up with the
same kind of problem.

228
00:11:57,310 --> 00:11:59,660
OK, so h is the random
variable that will be

229
00:11:59,660 --> 00:12:02,250
detected, and v is the
random variable

230
00:12:02,250 --> 00:12:03,590
that's going to be observed.

231
00:12:03,590 --> 00:12:05,650
The experiment is performed.

232
00:12:05,650 --> 00:12:10,140
Some sample value of v is
observed, and some sample

233
00:12:10,140 --> 00:12:13,520
value of the hypothesis
has actually happened.

234
00:12:13,520 --> 00:12:16,320
In other words, what has
happened is you prepared the

235
00:12:16,320 --> 00:12:18,020
whole system.

236
00:12:18,020 --> 00:12:24,350
Then at the input end to the
whole system, the input to the

237
00:12:24,350 --> 00:12:28,610
channel, somebody has chosen
a one or a zero without the

238
00:12:28,610 --> 00:12:30,920
knowledge of the receiver.

239
00:12:30,920 --> 00:12:34,140
That one or zero has been sent
through this whole system, the

240
00:12:34,140 --> 00:12:38,330
receiver has observed some
output, v, so in fact we're

241
00:12:38,330 --> 00:12:41,320
now dealing with the
sample values of

242
00:12:41,320 --> 00:12:42,100
two different things.

243
00:12:42,100 --> 00:12:45,510
The sample value of the input,
which is h, the sample value

244
00:12:45,510 --> 00:12:48,880
of the output, which is v, and
in terms of the sample value

245
00:12:48,880 --> 00:12:52,440
of the output, we're trying to
guess what the sample value of

246
00:12:52,440 --> 00:12:54,620
the input is.

247
00:12:54,620 --> 00:13:00,630
OK, an error then occurs if,
after the output chooses a

248
00:13:00,630 --> 00:13:06,030
particular hypothesis as its
guess, and that hypothesis,

249
00:13:06,030 --> 00:13:08,620
then, is a function of
what it receives.

250
00:13:08,620 --> 00:13:12,410
In other words, after you
receive something, what the

251
00:13:12,410 --> 00:13:17,530
detector has to do is somehow
map what gets received, which

252
00:13:17,530 --> 00:13:22,100
is some number, into
either zero or one.

253
00:13:22,100 --> 00:13:24,250
It's like what a
quantizer does.

254
00:13:24,250 --> 00:13:28,340
Namely, it maps the whole
region into two

255
00:13:28,340 --> 00:13:30,360
different sub-regions.

256
00:13:30,360 --> 00:13:32,680
Some things are mapped into
zero, some things

257
00:13:32,680 --> 00:13:36,020
are mapped into one.

258
00:13:36,020 --> 00:13:39,220
This H hat then becomes a random
variable, but is a

259
00:13:39,220 --> 00:13:43,400
random variable that is a
function of what's received.

260
00:13:43,400 --> 00:13:46,260
So we have one random variable,
H, which is what

261
00:13:46,260 --> 00:13:47,620
actually happened.

262
00:13:47,620 --> 00:13:50,720
There's another random variable,
H hat, which is what

263
00:13:50,720 --> 00:13:53,390
the detector guesses
has happened.

264
00:13:53,390 --> 00:13:58,030
This is an unusual random
variable, because it's not

265
00:13:58,030 --> 00:14:01,070
determined ahead of time.

266
00:14:01,070 --> 00:14:04,170
It's determined only in terms
of what you decide your

267
00:14:04,170 --> 00:14:06,020
detection rule is going to be.

268
00:14:06,020 --> 00:14:09,190
This is a random variable that
you have some control over.

269
00:14:09,190 --> 00:14:10,940
These other random variables
you have no

270
00:14:10,940 --> 00:14:14,540
control over at all.

271
00:14:14,540 --> 00:14:17,290
So that's the random variable
we're going to choose.

272
00:14:17,290 --> 00:14:20,330
And, in fact, what we're going
to do is we're going to say

273
00:14:20,330 --> 00:14:24,690
what we want to do is this MAP
decoding, maximum a posteriori

274
00:14:24,690 --> 00:14:27,710
probability decoding, where
we're trying to minimize the

275
00:14:27,710 --> 00:14:29,710
probability of screwing up.

276
00:14:29,710 --> 00:14:35,470
And we don't care whether we
make an error of one kind or

277
00:14:35,470 --> 00:14:38,590
make an error of
the other kind.

278
00:14:38,590 --> 00:14:44,840
OK, is that formulation of the
problem crystal clear?

279
00:14:44,840 --> 00:14:48,940
Anybody have any questions
about it?

280
00:14:48,940 --> 00:14:54,730
I mean, the easiest way to get
screwed up with detection is

281
00:14:54,730 --> 00:14:58,140
at a certain point to be going
through, studying a detection

282
00:14:58,140 --> 00:15:00,700
problem, and then you
suddenly realize you

283
00:15:00,700 --> 00:15:03,180
don't understand what--

284
00:15:03,180 --> 00:15:06,590
you don't understand what the
whole problem is about.

285
00:15:06,590 --> 00:15:12,650
OK, let's assume we do know
what the problem is, then.

286
00:15:12,650 --> 00:15:15,760
In principle it's simple.

287
00:15:15,760 --> 00:15:19,190
Given a particular observed
value, what we're going to do

288
00:15:19,190 --> 00:15:22,240
is we're going to calculate
the-- what we call the a

289
00:15:22,240 --> 00:15:25,850
posteriori probability, the
probability given that

290
00:15:25,850 --> 00:15:29,010
particular sample value
of the observation--

291
00:15:29,010 --> 00:15:32,120
we're going to calculate the
probability that what went

292
00:15:32,120 --> 00:15:35,340
into the system is a zero
and what went into

293
00:15:35,340 --> 00:15:38,790
the system is a one.

294
00:15:38,790 --> 00:15:39,390
OK?

295
00:15:39,390 --> 00:15:44,930
This is the probability that
j is the sample value of H,

296
00:15:44,930 --> 00:15:50,040
conditional on what
we observed.

297
00:15:50,040 --> 00:15:54,780
OK, if you can calculate this
quantity, it tells you if I

298
00:15:54,780 --> 00:15:56,030
decide that--

299
00:15:58,310 --> 00:16:02,870
if I guess that H is equal to
j-- this in fact tells me that

300
00:16:02,870 --> 00:16:06,950
this is the probability
that guess is correct.

301
00:16:06,950 --> 00:16:10,180
And if this is the probability
that guess is correct, and I

302
00:16:10,180 --> 00:16:12,570
want to maximize my probability
of guessing

303
00:16:12,570 --> 00:16:15,350
correctly, what do I do?

304
00:16:15,350 --> 00:16:22,500
Well, what I do is my MAP
rule is arg max of this

305
00:16:22,500 --> 00:16:23,820
probability.

306
00:16:23,820 --> 00:16:30,830
And "arg max" means

307
00:16:30,830 --> 00:16:33,670
instead of trying to maximize
this quantity over something,

308
00:16:33,670 --> 00:16:37,870
what we're doing is trying to
find the value of j, which

309
00:16:37,870 --> 00:16:39,050
maximizes this.

310
00:16:39,050 --> 00:16:42,870
In other words, we calculate
this for each value of j, and

311
00:16:42,870 --> 00:16:46,330
then we pick the j for which
this quantity is largest.

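In symbols, the rule just described is, as a sketch (with p_H|V denoting the a posteriori probability):

\hat{h}(v) \;=\; \arg\max_{j \in \{0, 1\}} \; p_{H|V}(j \mid v)
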
312
00:16:46,330 --> 00:16:49,890
In other words, we maximize this
but we're not interested

313
00:16:49,890 --> 00:16:52,860
in the maximum value of
it at this point.

314
00:16:52,860 --> 00:16:55,420
We're interested in it later
because that's the probability

315
00:16:55,420 --> 00:16:56,820
that we're choosing correctly.

316
00:16:56,820 --> 00:17:01,100
What we're interested in, for
now, is what is the hypothesis

317
00:17:01,100 --> 00:17:02,820
that we're going to guess.

318
00:17:02,820 --> 00:17:03,480
OK?

319
00:17:03,480 --> 00:17:08,570
So this probability of being
correct is going to be this

320
00:17:08,570 --> 00:17:11,730
probability for this
maximal j.

321
00:17:11,730 --> 00:17:16,125
And when we average over v we
get the overall probability of

322
00:17:16,125 --> 00:17:17,690
being correct.

323
00:17:17,690 --> 00:17:21,840
There's a theorem which is
stated in the notes, which is

324
00:17:21,840 --> 00:17:25,120
one of the more trivial theorems
you can think of,

325
00:17:25,120 --> 00:17:28,240
which says that if you do the
best thing for every sample

326
00:17:28,240 --> 00:17:32,780
point you, in fact, have done
the best thing on average.

327
00:17:32,780 --> 00:17:35,050
I think that's pretty clear.

328
00:17:35,050 --> 00:17:39,605
If you, well I mean you can read
it if you want a formal

329
00:17:39,605 --> 00:17:44,650
proof, but if you do the best
thing all the time, then it's

330
00:17:44,650 --> 00:17:48,150
the overall best thing.

331
00:17:48,150 --> 00:17:57,570
OK, so that's the general
idea of detection.

332
00:17:57,570 --> 00:18:01,590
And in doing this we have to
be able to calculate these

333
00:18:01,590 --> 00:18:04,820
probabilities, so that's
the only constraint.

334
00:18:04,820 --> 00:18:07,940
These are probabilities, which
means that this set of

335
00:18:07,940 --> 00:18:12,080
hypotheses is discrete.

336
00:18:12,080 --> 00:18:16,460
If you have an uncountably
infinite number of hypotheses,

337
00:18:16,460 --> 00:18:19,070
at that point you're dealing
with an estimation problem.

338
00:18:19,070 --> 00:18:22,790
Because you don't have any
chance in hell of getting the

339
00:18:22,790 --> 00:18:25,700
right answer, exactly
the right answer.

340
00:18:25,700 --> 00:18:28,990
And therefore you have to
have some criterion for

341
00:18:28,990 --> 00:18:30,040
how close you are.

342
00:18:30,040 --> 00:18:32,280
And that's what's important
in estimation.

343
00:18:32,280 --> 00:18:36,090
And here what's important is
really, do we guess right or

344
00:18:36,090 --> 00:18:37,300
don't we guess right.

345
00:18:37,300 --> 00:18:38,850
And we don't, we don't care.

346
00:18:38,850 --> 00:18:41,130
There aren't any near
misses here.

347
00:18:41,130 --> 00:18:45,330
You either get it on the
nose or you don't.

348
00:18:45,330 --> 00:18:47,820
OK, so we want to study
binary detection now

349
00:18:47,820 --> 00:18:48,690
to start off with.

350
00:18:48,690 --> 00:18:51,420
We want to trivialize the
problem because even that

351
00:18:51,420 --> 00:18:53,980
problem we just stated
is too hard.

352
00:18:53,980 --> 00:18:56,140
So we're going to trivialize
it in two ways.

353
00:18:56,140 --> 00:18:58,960
We're going to assume that there
are only two hypotheses.

354
00:18:58,960 --> 00:19:01,670
That it's a binary detection
problem.

355
00:19:01,670 --> 00:19:05,370
And we're also going to assume
that it's Gaussian noise.

356
00:19:05,370 --> 00:19:10,380
And that will make it sort of
transparent what's happening.

357
00:19:10,380 --> 00:19:13,250
So H takes the values
zero or one.

358
00:19:13,250 --> 00:19:15,880
And we'll call the probabilities
with which it

359
00:19:15,880 --> 00:19:18,610
takes those values
P zero and P one.

360
00:19:18,610 --> 00:19:20,790
These are called a priori
probabilities.

361
00:19:20,790 --> 00:19:23,610
In other words, these are the
probabilities that the

362
00:19:23,610 --> 00:19:27,780
hypothesis takes the value zero
or one before seeing any

363
00:19:27,780 --> 00:19:29,680
observation.

364
00:19:29,680 --> 00:19:33,070
And the probabilities after you
see the observation are

365
00:19:33,070 --> 00:19:35,940
called a posteriori
probabilities.

366
00:19:35,940 --> 00:19:38,870
In other words, probabilities
after the observation and the

367
00:19:38,870 --> 00:19:41,420
probabilities before
the observation.

368
00:19:41,420 --> 00:19:45,170
Up until about 1950
statisticians used to argue

369
00:19:45,170 --> 00:19:50,150
terribly about whether it was
valid to assume a priori

370
00:19:50,150 --> 00:19:52,600
probabilities.

371
00:19:52,600 --> 00:19:56,150
And as you can see by thinking
about it a little bit, the

372
00:19:56,150 --> 00:19:59,630
problem they were facing was
they couldn't separate the

373
00:19:59,630 --> 00:20:02,680
problem of choosing a
mathematical model and

374
00:20:02,680 --> 00:20:07,730
analyzing it from the problem
of figuring out whether the

375
00:20:07,730 --> 00:20:10,330
model was valid or not.

376
00:20:10,330 --> 00:20:15,090
And at that point people
studying in that area had not

377
00:20:15,090 --> 00:20:18,100
gotten to the point where they
could say, "Well, maybe I

378
00:20:18,100 --> 00:20:21,660
ought to analyze the problem for
different models, and then

379
00:20:21,660 --> 00:20:24,630
after I understand what happens
for different models I

380
00:20:24,630 --> 00:20:27,470
then ought to go back because
I'll know what's important to

381
00:20:27,470 --> 00:20:33,140
find out in the real problem."
But up until that time, there

382
00:20:33,140 --> 00:20:36,200
was just fighting
among everyone.

383
00:20:36,200 --> 00:20:39,310
Bayes was the person who decided
you really ought to

384
00:20:39,310 --> 00:20:41,600
assume that there's a
model to start with.

385
00:20:41,600 --> 00:20:44,040
And he developed most
of detection

386
00:20:44,040 --> 00:20:46,810
theory at an early time.

387
00:20:46,810 --> 00:20:50,070
And people used to think that
Bayes was a terrible fraud.

388
00:20:50,070 --> 00:20:53,530
Because in fact he was using
models of the problem rather

389
00:20:53,530 --> 00:20:56,710
than nothing.

390
00:20:56,710 --> 00:20:59,430
But anyway, that's
where we were.

391
00:21:02,310 --> 00:21:06,910
We're also going to assume that,
after we get all through

392
00:21:06,910 --> 00:21:09,230
with modulation and
demodulation, and we really

393
00:21:09,230 --> 00:21:11,740
want to look at a general
problem here.

394
00:21:11,740 --> 00:21:16,950
There's one discrete random
variable, H, one analog random

395
00:21:16,950 --> 00:21:20,800
variable, v, which has a
probability density. What

396
00:21:20,800 --> 00:21:25,200
we want to assume is that
there's a probability density

397
00:21:25,200 --> 00:21:29,800
that we know, which is the
probability density of the

398
00:21:29,800 --> 00:21:34,050
observation conditional
on the hypothesis.

399
00:21:34,050 --> 00:21:37,860
We're assuming that we know
this, and we know this--

400
00:21:37,860 --> 00:21:41,160
we call these things
likelihoods--

401
00:21:41,160 --> 00:21:46,430
and in most communication
problems anyway, it's far

402
00:21:46,430 --> 00:21:49,650
easier to get your hand on these
likelihoods than it is

403
00:21:49,650 --> 00:21:53,260
to get your hand the a
posteriori probabilities,

404
00:21:53,260 --> 00:21:55,510
which you're really
interested in.

405
00:21:55,510 --> 00:22:01,390
So we find these likelihoods,
we can find the marginal

406
00:22:01,390 --> 00:22:05,070
density of the observation.

407
00:22:05,070 --> 00:22:10,720
Which is just the
weighted sum.

408
00:22:10,720 --> 00:22:13,790
The probability that the
hypothesis is zero times a

409
00:22:13,790 --> 00:22:18,200
conditional probability of the
observation and so forth.

410
00:22:21,440 --> 00:22:24,570
So we're going to assume that
those densities exist.

411
00:22:24,570 --> 00:22:26,150
We're going to assume
that we know them.

412
00:22:28,840 --> 00:22:35,840
And then with a great feat of
probability theory, we say the

413
00:22:35,840 --> 00:22:44,290
a posteriori probability is
equal to the a priori

414
00:22:44,290 --> 00:22:50,570
probability times the likelihood
divided by the

415
00:22:50,570 --> 00:22:53,820
marginal probability of v. OK?

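As a small sketch of that formula in Python (the Gaussian shape of the likelihoods and all parameter values are assumptions for concreteness; the Gaussian case is developed below):

from math import exp, pi, sqrt

def gaussian_pdf(x, mean, var):
    # density of a Gaussian random variable with the given mean and variance
    return exp(-(x - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)

def posterior(j, v, a=1.0, var=0.25, p0=0.5):
    # p(H = j | V = v) = p_j * f(v | j) / f(v), with 0 -> plus a, 1 -> minus a
    priors = (p0, 1.0 - p0)
    likes = (gaussian_pdf(v, a, var), gaussian_pdf(v, -a, var))
    marginal = priors[0] * likes[0] + priors[1] * likes[1]   # the weighted sum
    return priors[j] * likes[j] / marginal
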
416
00:22:53,820 --> 00:22:55,900
What was the first thing you
learn in probability when you

417
00:22:55,900 --> 00:22:58,340
started studying random
variables?

418
00:22:58,340 --> 00:23:02,680
It was probably this formula,
which says when you have a

419
00:23:02,680 --> 00:23:06,970
joint density of two random
variables you can

420
00:23:06,970 --> 00:23:08,370
write it in two ways.

421
00:23:08,370 --> 00:23:11,920
You can either write, "density
of one times density of two

422
00:23:11,920 --> 00:23:16,420
conditional on one is equal
to the density of two times

423
00:23:16,420 --> 00:23:20,510
density of one conditional on
two." And then you think about

424
00:23:20,510 --> 00:23:23,900
it a little bit and you say,
"A ha!" It doesn't matter

425
00:23:23,900 --> 00:23:26,890
whether the first one is the
density or whether it's the

426
00:23:26,890 --> 00:23:27,720
probability.

427
00:23:27,720 --> 00:23:29,640
You can deal with it
in the same way.

428
00:23:29,640 --> 00:23:32,390
And you get this formula,
which I hope is

429
00:23:32,390 --> 00:23:36,340
not unusual to you.

430
00:23:36,340 --> 00:23:40,310
OK, so our MAP decision rule
then, our MAP decision rule,

431
00:23:40,310 --> 00:23:45,720
remember, is to pick the more,
is to find the a posteriori

432
00:23:45,720 --> 00:23:48,860
probability which is
most probable.

433
00:23:48,860 --> 00:23:52,540
Because that is the probability
of being correct.

434
00:23:52,540 --> 00:23:56,530
So, in fact, if this probability
is bigger than

435
00:23:56,530 --> 00:24:01,260
this probability, this is the a
posteriori probability that

436
00:24:01,260 --> 00:24:03,200
H is equal to zero.

437
00:24:03,200 --> 00:24:06,810
This is the a posteriori
probability that

438
00:24:06,810 --> 00:24:08,290
H is equal to one.

439
00:24:08,290 --> 00:24:11,220
So we're just going to compare
those two and pick the larger.

440
00:24:11,220 --> 00:24:18,420
And if this one is larger than
this one, we pick our choice

441
00:24:18,420 --> 00:24:19,740
equal zero.

442
00:24:19,740 --> 00:24:22,850
And if it's smaller, we pick
our choice equal to one.

443
00:24:22,850 --> 00:24:27,240
So this is what MAP
detection is.

444
00:24:27,240 --> 00:24:30,670
Why did I make this greater
than or equal

445
00:24:30,670 --> 00:24:33,000
and this less than?

446
00:24:33,000 --> 00:24:36,320
Well, if we have densities,
it doesn't often make any

447
00:24:36,320 --> 00:24:37,730
difference.

448
00:24:37,730 --> 00:24:40,710
Strangely enough, it does
sometimes make a difference.

449
00:24:40,710 --> 00:24:43,280
Because sometimes you can have
a density, and the densities

450
00:24:43,280 --> 00:24:47,890
are the same for both of
these likelihoods.

451
00:24:47,890 --> 00:24:50,870
And you can find situations
where it's important.

452
00:24:50,870 --> 00:24:55,500
But when the two probabilities
are the same, the probability

453
00:24:55,500 --> 00:24:58,980
of being correct is the same in
both cases, so it doesn't

454
00:24:58,980 --> 00:25:01,750
make any difference what
you do when you

455
00:25:01,750 --> 00:25:04,250
have an equality here.

456
00:25:04,250 --> 00:25:06,580
And therefore we've just
made a decision.

457
00:25:06,580 --> 00:25:09,610
We've said, OK, what we're going
to do is whenever this

458
00:25:09,610 --> 00:25:13,950
is equal to this, we're
going to choose zero.

459
00:25:13,950 --> 00:25:18,020
If you prefer choosing
one, be my guest.

460
00:25:18,020 --> 00:25:20,390
All of your MAP error
probabilities will

461
00:25:20,390 --> 00:25:21,660
be exactly the same.

462
00:25:21,660 --> 00:25:23,580
Nothing will change.

463
00:25:23,580 --> 00:25:27,640
It just is easier to do the
same thing all the time.

464
00:25:27,640 --> 00:25:29,700
OK, well then we look at this
formula, and we say, "Well, I

465
00:25:29,700 --> 00:25:33,990
can simplify this a little
bit." If I take this

466
00:25:33,990 --> 00:25:39,700
likelihood and move it over to
this side, and if I take this

467
00:25:39,700 --> 00:25:43,010
marginal density and move it
over to this side, and if I

468
00:25:43,010 --> 00:25:45,970
take p zero and move it over
to this side, then the

469
00:25:45,970 --> 00:25:48,270
marginal densities cancel out.

470
00:25:48,270 --> 00:25:50,430
They had nothing to do
with the problem.

471
00:25:50,430 --> 00:25:54,080
And I wind up with a ratio
of the likelihoods.

472
00:25:54,080 --> 00:25:55,860
And what do you think
the ratio of the

473
00:25:55,860 --> 00:25:57,460
likelihoods is called?

474
00:25:57,460 --> 00:25:59,415
Somebody got the smart
idea of calling that

475
00:25:59,415 --> 00:26:02,930
a likelihood ratio.

476
00:26:02,930 --> 00:26:05,810
Somehow the people in statistics
were much better at

477
00:26:05,810 --> 00:26:09,640
generating notation than the
people in communication theory

478
00:26:09,640 --> 00:26:12,360
who have done just an abominable
job of choosing

479
00:26:12,360 --> 00:26:14,660
notation for things.

480
00:26:14,660 --> 00:26:17,500
But anyway, they call this
a likelihood ratio.

481
00:26:17,500 --> 00:26:21,230
And the rule then becomes: if
the likelihood ratio is

482
00:26:21,230 --> 00:26:24,180
greater than or equal to
the ratio of p1 to

483
00:26:24,180 --> 00:26:26,710
p0, we choose zero.

484
00:26:26,710 --> 00:26:29,030
And if it's less,
we choose one.

485
00:26:29,030 --> 00:26:32,220
And we call this ratio
the threshold.

486
00:26:32,220 --> 00:26:38,240
So in fact what this says is
binary MAP tests are always

487
00:26:38,240 --> 00:26:39,840
threshold tests.

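As a sketch, such a threshold test might be written like this in Python (likelihood0 and likelihood1 are placeholders for the two conditional densities; ties go to zero, as discussed above):

def map_decide(v, likelihood0, likelihood1, p0, p1):
    # compare the likelihood ratio f(v|0) / f(v|1) with the
    # threshold eta = p1 / p0; greater than or equal picks zero
    lam = likelihood0(v) / likelihood1(v)
    eta = p1 / p0
    return 0 if lam >= eta else 1
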
488
00:26:39,840 --> 00:26:44,610
And by a threshold test I mean
finds the likelihood ratio,

489
00:26:44,610 --> 00:26:47,520
compare the likelihood ratio
with the threshold-- the

490
00:26:47,520 --> 00:26:51,010
threshold in fact is this
ratio of a priori

491
00:26:51,010 --> 00:26:53,120
probabilities--

492
00:26:53,120 --> 00:26:55,520
and at that point you
have actually

493
00:26:55,520 --> 00:26:57,980
achieved the MAP test.

494
00:26:57,980 --> 00:27:02,850
In other words, you have done
something which actually, for

495
00:27:02,850 --> 00:27:06,950
real, minimizes the probability
of error.

496
00:27:06,950 --> 00:27:10,610
Maximizes the probability
of being correct.

497
00:27:10,610 --> 00:27:15,730
Well because of that, this thing
here, this likelihood

498
00:27:15,730 --> 00:27:19,820
ratio, is called a sufficient
statistic.

499
00:27:19,820 --> 00:27:22,830
And it's called a sufficient
statistic because you can do

500
00:27:22,830 --> 00:27:27,100
MAP decoding just by
knowing this number.

501
00:27:27,100 --> 00:27:27,420
OK?

502
00:27:27,420 --> 00:27:30,920
In other words, it says you
can calculate these

503
00:27:30,920 --> 00:27:33,770
likelihoods.

504
00:27:33,770 --> 00:27:36,410
You can find the ratio of
them-- which is this

505
00:27:36,410 --> 00:27:40,120
likelihood ratio-- and after you
know the likelihood ratio,

506
00:27:40,120 --> 00:27:42,810
you don't have to worry about
these likelihoods anymore.

507
00:27:42,810 --> 00:27:47,030
This is the only thing relevant
to the problem.

508
00:27:47,030 --> 00:27:50,420
Now this doesn't seem to be a
huge saving, because here

509
00:27:50,420 --> 00:27:53,270
we're dealing with two real
numbers-- well here we've

510
00:27:53,270 --> 00:27:55,820
reduced it to one real number--
which is something.

511
00:27:55,820 --> 00:27:59,140
When we start dealing with
vectors, when we start dealing

512
00:27:59,140 --> 00:28:02,830
with waveforms, this is
really a big thing.

513
00:28:02,830 --> 00:28:05,580
Because what you're
doing is reducing

514
00:28:05,580 --> 00:28:08,270
the vectors to numbers.

515
00:28:08,270 --> 00:28:11,540
And when you reduce a
countably infinite

516
00:28:11,540 --> 00:28:13,960
dimensional vector to a number,

517
00:28:13,960 --> 00:28:16,930
that's a big advantage.

518
00:28:16,930 --> 00:28:19,870
It also, in terms of the
communication problems we're

519
00:28:19,870 --> 00:28:24,270
facing, breaks up a detector
into two pieces in an

520
00:28:24,270 --> 00:28:25,900
interesting way.

521
00:28:25,900 --> 00:28:28,400
Namely, it says there are
things you do with the

522
00:28:28,400 --> 00:28:33,400
waveform in order to calculate what
this likelihood ratio is,

523
00:28:33,400 --> 00:28:36,490
and then after you find the
likelihood ratio you just

524
00:28:36,490 --> 00:28:39,150
forget about what the
waveform was and you

525
00:28:39,150 --> 00:28:40,630
deal only with that.

526
00:28:40,630 --> 00:28:43,390
What we're going to find out
is in this problem we were

527
00:28:43,390 --> 00:28:44,640
looking at here--

528
00:28:47,540 --> 00:28:49,390
we're going to find out
later when we look

529
00:28:49,390 --> 00:28:52,060
at the vector problem--

530
00:28:52,060 --> 00:28:59,010
in fact this thing here is in
fact the likelihood ratio if

531
00:28:59,010 --> 00:29:02,560
you make an observation out
at this point here.

532
00:29:02,560 --> 00:29:06,090
In other words, right at the
front end of the receiver,

533
00:29:06,090 --> 00:29:07,950
that's where you have all
the information you

534
00:29:07,950 --> 00:29:10,620
can possibly have.

535
00:29:10,620 --> 00:29:16,680
If you calculate likelihood
ratios at that point what

536
00:29:16,680 --> 00:29:20,640
you're going to do is to find
the likelihood ratio you're

537
00:29:20,640 --> 00:29:24,530
going to go through all this
stuff right here and wind up

538
00:29:24,530 --> 00:29:28,490
with that, which is something that
is proportional to the

539
00:29:28,490 --> 00:29:30,760
likelihood ratio right here.

540
00:29:30,760 --> 00:29:34,140
OK, so one of the things we're
doing right now is we're not

541
00:29:34,140 --> 00:29:35,360
looking at that problem.

542
00:29:35,360 --> 00:29:38,070
We're only looking at the
simpler problem, assuming a

543
00:29:38,070 --> 00:29:40,210
one-dimensional problem.

544
00:29:40,210 --> 00:29:42,990
But the reason we're looking
at it is that later we're

545
00:29:42,990 --> 00:29:45,970
going to show that this is, in
fact, the solution to the more

546
00:29:45,970 --> 00:29:48,730
general problem.

547
00:29:48,730 --> 00:29:53,040
Which was Shannon's idea
in the first place.

548
00:29:53,040 --> 00:29:56,630
Of, how do you solve the trivial
problem first and then

549
00:29:56,630 --> 00:30:02,520
see what the complicated
problem is.

550
00:30:02,520 --> 00:30:10,340
OK, so that's what we're trying
to do, summarized here

551
00:30:10,340 --> 00:30:14,870
for any binary detection problem
where the observation

552
00:30:14,870 --> 00:30:18,520
has a sample value of
a random something.

553
00:30:18,520 --> 00:30:21,180
Namely, a random vector, a
random process, a random

554
00:30:21,180 --> 00:30:24,730
variable, a complex variable,
a complex anything.

555
00:30:24,730 --> 00:30:27,770
Anything whatsoever, so long
as you can assign a

556
00:30:27,770 --> 00:30:30,950
probability density to it.

557
00:30:30,950 --> 00:30:34,540
You calculate the likelihood
ratio, which is this ratio

558
00:30:34,540 --> 00:30:38,250
here, so long as you have
densities to even talk about.

559
00:30:38,250 --> 00:30:42,340
The MAP rule is to compare this
likelihood ratio with the

560
00:30:42,340 --> 00:30:45,460
threshold eta-- which is just
the ratio of the a priori

561
00:30:45,460 --> 00:30:47,980
probabilities--

562
00:30:47,980 --> 00:30:49,660
if this is greater
than or equal to

563
00:30:49,660 --> 00:30:51,780
that, you choose zero.

564
00:30:51,780 --> 00:30:54,270
Otherwise you choose one.

565
00:30:54,270 --> 00:30:58,000
The MAP rule, as I said before,
partitions this

566
00:30:58,000 --> 00:31:01,940
observation space
into two pieces.

567
00:31:01,940 --> 00:31:05,670
Into two segments.

568
00:31:05,670 --> 00:31:09,320
And one of those pieces
gets mapped into zero.

569
00:31:09,320 --> 00:31:11,850
One of the pieces gets
mapped into one.

570
00:31:11,850 --> 00:31:15,600
It's exactly like a
binary quantizer.

571
00:31:15,600 --> 00:31:19,440
Except the rule you use to
choose the quantization

572
00:31:19,440 --> 00:31:20,440
regions is different.

573
00:31:20,440 --> 00:31:25,430
But a quantizer maps a space
into a finite set of regions.

574
00:31:25,430 --> 00:31:29,200
And this detection rule does
exactly the same thing.

575
00:31:29,200 --> 00:31:32,270
And since the beginning of
information theory people have

576
00:31:32,270 --> 00:31:36,750
been puzzling over how to make
use of the correspondence

577
00:31:36,750 --> 00:31:41,020
between quantization on one
hand and detection on the

578
00:31:41,020 --> 00:31:42,410
other hand.

579
00:31:42,410 --> 00:31:44,300
And there are some
correspondences but they

580
00:31:44,300 --> 00:31:47,830
aren't all that good
most of the time.

581
00:31:47,830 --> 00:31:52,640
OK, so you get an error when
the actual hypothesis that

582
00:31:52,640 --> 00:31:57,290
occurred, namely the bit that
got sent was i, and the

583
00:31:57,290 --> 00:32:01,630
observation landed in
the other subset.

584
00:32:01,630 --> 00:32:03,960
We know that the MAP rule
minimizes the error

585
00:32:03,960 --> 00:32:05,130
probability.

586
00:32:05,130 --> 00:32:08,490
So you have a rule which you
can use for all binary

587
00:32:08,490 --> 00:32:12,390
detection problems so long
as you have the density.

588
00:32:12,390 --> 00:32:15,010
And if you don't have a density
you can generalize it

589
00:32:15,010 --> 00:32:18,560
without too much trouble.

590
00:32:18,560 --> 00:32:24,330
OK, so we want to look at the
problem in Gaussian noise.

591
00:32:24,330 --> 00:32:27,530
In particular we want to
look at it for 2PAM.

592
00:32:27,530 --> 00:32:32,720
In other words for a standard
PAM system, where zero gets

593
00:32:32,720 --> 00:32:37,670
mapped into plus a and one
gets mapped into minus a.

594
00:32:37,670 --> 00:32:40,850
This is often called antipodal
signaling because you're

595
00:32:40,850 --> 00:32:43,420
sending a plus something
and a minus something.

596
00:32:43,420 --> 00:32:46,600
They are at opposite ends
of the spectrum.

597
00:32:46,600 --> 00:32:49,760
You push them as far away as you
can, because as you push

598
00:32:49,760 --> 00:32:51,560
them further and further
away it requires

599
00:32:51,560 --> 00:32:52,930
more and more energy.

600
00:32:52,930 --> 00:32:54,780
So you use the energy
you have.

601
00:32:54,780 --> 00:32:57,880
You get them as far apart as you
can, and you hope that's

602
00:32:57,880 --> 00:32:58,940
going to help you.

603
00:32:58,940 --> 00:33:01,720
And we'll see that
it does help you.

604
00:33:01,720 --> 00:33:07,110
OK, so what you receive then,
we'll assume, is either plus

605
00:33:07,110 --> 00:33:08,020
or minus a--

606
00:33:08,020 --> 00:33:11,560
depending on which hypothesis
occurred--

607
00:33:11,560 --> 00:33:14,750
plus a Gaussian random
variable.

608
00:33:14,750 --> 00:33:19,040
And here's where the notation
of communication theorists

609
00:33:19,040 --> 00:33:22,030
rears its ugly head.

610
00:33:22,030 --> 00:33:27,450
We call the variance of this
random variable n0 over 2.

611
00:33:27,450 --> 00:33:31,900
I would prefer to call
it sigma squared but

612
00:33:31,900 --> 00:33:34,470
unfortunately you can't
fight city hall on

613
00:33:34,470 --> 00:33:36,410
something like this.

614
00:33:36,410 --> 00:33:40,510
And everybody talks about
n0 and n0 over 2.

615
00:33:40,510 --> 00:33:42,670
And you got to get used to it.

616
00:33:42,670 --> 00:33:45,260
So here's where we're starting
to get used to it.

617
00:33:45,260 --> 00:33:50,510
So that's the variance of this
noise random variable.

618
00:33:50,510 --> 00:33:54,460
OK, we're only going to send one
binary digit, H, so this

619
00:33:54,460 --> 00:33:57,850
is the only, this is the sole
problem we have to deal with.

620
00:33:57,850 --> 00:34:00,750
We've made a binary choice.

621
00:34:00,750 --> 00:34:04,110
Added one Gaussian random
variable to it.

622
00:34:04,110 --> 00:34:07,970
You observe the sum,
and you guess.

623
00:34:07,970 --> 00:34:12,330
So what are these likelihoods
in this case?

624
00:34:12,330 --> 00:34:16,130
Well, the likelihood if H is
equal zero, in other words if

625
00:34:16,130 --> 00:34:20,270
you're sending a plus a, the
likelihood is just a Gaussian

626
00:34:20,270 --> 00:34:24,350
density shifted over by a.

627
00:34:24,350 --> 00:34:27,620
And if you're sending, on
the other hand, a one--

628
00:34:27,620 --> 00:34:30,600
which means you're sending minus
a-- you have a Gaussian

629
00:34:30,600 --> 00:34:33,420
density shifted over
the other way.

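Written out, with variance n0 over 2 so that 2 sigma squared equals n0, the two likelihoods are:

f_{V|H}(v \mid 0) = \frac{1}{\sqrt{\pi n_0}} \, e^{-(v-a)^2/n_0},
\qquad
f_{V|H}(v \mid 1) = \frac{1}{\sqrt{\pi n_0}} \, e^{-(v+a)^2/n_0}
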
630
00:34:33,420 --> 00:34:36,200
Let me show you a
picture of that.

631
00:34:40,530 --> 00:34:44,080
We'll come back to analyze more
things about the picture

632
00:34:44,080 --> 00:34:46,080
in a little bit, so don't
worry about most of the

633
00:34:46,080 --> 00:34:47,330
picture at this point.

634
00:34:49,820 --> 00:34:55,030
OK, this is the likelihood
probability density of the

635
00:34:55,030 --> 00:34:58,410
output given that
you sent a zero.

636
00:34:58,410 --> 00:35:00,380
Namely, that you sent plus a.

637
00:35:00,380 --> 00:35:02,940
So we have a Gaussian density--
this bell shaped

638
00:35:02,940 --> 00:35:05,620
curve-- centered
around plus a.

639
00:35:05,620 --> 00:35:09,750
If you sent a one, you're
sending minus a-- one gets

640
00:35:09,750 --> 00:35:13,230
mapped into minus a-- and you
have the same bell shaped

641
00:35:13,230 --> 00:35:16,560
curve centered around minus a.

642
00:35:16,560 --> 00:35:21,560
If you receive any particular
value of v, namely suppose you

643
00:35:21,560 --> 00:35:25,220
receive the value of v here,
you calculate these two

644
00:35:25,220 --> 00:35:26,160
likelihoods.

645
00:35:26,160 --> 00:35:27,490
One of them is this.

646
00:35:27,490 --> 00:35:29,850
One of them is that.

647
00:35:29,850 --> 00:35:33,200
You compare them, the ratio with
the threshold, and you

648
00:35:33,200 --> 00:35:35,860
make your choice.

649
00:35:35,860 --> 00:35:39,910
OK, so let's go back to do it.

650
00:35:39,910 --> 00:35:42,730
To do the arithmetic.

651
00:35:42,730 --> 00:35:45,080
Here are the two likelihoods.

652
00:35:45,080 --> 00:35:47,860
You take the ratio of
these two things.

653
00:35:47,860 --> 00:35:51,180
When you take the ratio
of them, what happens?

654
00:35:51,180 --> 00:35:52,560
And this sort of always
happens in

655
00:35:52,560 --> 00:35:54,980
these Gaussian problems.

656
00:35:54,980 --> 00:35:57,320
These terms cancel out.

657
00:35:57,320 --> 00:35:58,330
Well it always happens in these

658
00:35:58,330 --> 00:36:00,150
additive Gaussian problems.

659
00:36:00,150 --> 00:36:02,810
These terms cancel out.

660
00:36:02,810 --> 00:36:06,150
You take a ratio of
two exponentials.

661
00:36:06,150 --> 00:36:08,520
You just get the difference.

662
00:36:08,520 --> 00:36:12,600
So the likelihood ratio--
this divided by this--

663
00:36:12,600 --> 00:36:18,140
is then e to the minus (v minus
a) squared over n0,

664
00:36:18,140 --> 00:36:21,550
plus (v plus a) squared over n0.

665
00:36:21,550 --> 00:36:21,900
OK?

666
00:36:21,900 --> 00:36:26,940
Because normally the Gaussian
density is something divided

667
00:36:26,940 --> 00:36:30,830
by two sigma squared, and sigma
squared here is n0 over

668
00:36:30,830 --> 00:36:32,580
2, so the 2's cancel out.

669
00:36:32,580 --> 00:36:35,560
One nice thing about the
notation anyway, you get rid

670
00:36:35,560 --> 00:36:37,260
of one factor of two in it.

671
00:36:37,260 --> 00:36:39,760
Well so you have this
minus this.

672
00:36:39,760 --> 00:36:43,190
When you take the difference
of these two things the v

673
00:36:43,190 --> 00:36:46,480
squareds cancel out.

674
00:36:46,480 --> 00:36:50,130
Because one of these things is
in the numerator, the other

675
00:36:50,130 --> 00:36:51,970
one was in the denominator.

676
00:36:51,970 --> 00:36:55,200
So you have this term
comes through as is.

677
00:36:55,200 --> 00:36:56,640
This is--

678
00:36:56,640 --> 00:36:58,830
you're dividing by this--

679
00:36:58,830 --> 00:37:01,800
so when you multiply this
turns into a plus sign.

680
00:37:01,800 --> 00:37:05,930
So the v squared here cancels
out with the v squared here.

681
00:37:05,930 --> 00:37:09,680
The a squared here cancels out
with the a squared here.

682
00:37:09,680 --> 00:37:12,810
And it's only the inner product
term that survives

683
00:37:12,810 --> 00:37:14,490
this whole thing.

684
00:37:14,490 --> 00:37:16,750
And here you have plus 2va.

685
00:37:16,750 --> 00:37:18,810
Here you have plus 2va.

686
00:37:18,810 --> 00:37:24,920
So you wind up with e to
the 4av divided by n0.

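Written out, the cancellation just described is:

\Lambda(v) = \frac{e^{-(v-a)^2/n_0}}{e^{-(v+a)^2/n_0}}
= e^{\left[(v+a)^2 - (v-a)^2\right]/n_0}
= e^{4av/n_0}
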
687
00:37:24,920 --> 00:37:28,060
Which is very nice, because
what it says is this

688
00:37:28,060 --> 00:37:31,140
likelihood, which is what
determines everything in the

689
00:37:31,140 --> 00:37:35,210
world, is just a scalar
multiple of the observation.

690
00:37:37,740 --> 00:37:40,870
And that that's going to
simplify things a fair amount.

691
00:37:40,870 --> 00:37:43,170
It's why that picture comes
out as simply as it does.

692
00:37:47,500 --> 00:37:52,760
OK, so to do a little more
of the arithmetic.

693
00:37:52,760 --> 00:37:57,220
This is the likelihood here,
e to the 4av over n0.

694
00:37:57,220 --> 00:38:02,340
So our rule is you compare
this likelihood to the

695
00:38:02,340 --> 00:38:05,690
threshold-- which is p1 over
p0, which we call eta--

696
00:38:05,690 --> 00:38:09,590
and you look at that for a while
and you say, "Gee, this

697
00:38:09,590 --> 00:38:12,500
is going to be much easier
to deal with.

698
00:38:12,500 --> 00:38:15,630
Instead of looking at the
likelihood ratio, I look at

699
00:38:15,630 --> 00:38:18,530
the log likelihood ratio."

700
00:38:18,530 --> 00:38:21,950
And people who deal with
Gaussian problems a lot, you

701
00:38:21,950 --> 00:38:25,220
never hear them talk about
likelihood ratios, you always

702
00:38:25,220 --> 00:38:28,890
hear them talk about log
likelihood ratios.

703
00:38:28,890 --> 00:38:33,200
And you can find one from the
other, so either one is

704
00:38:33,200 --> 00:38:34,110
equally good.

705
00:38:34,110 --> 00:38:36,830
In other words, the log
likelihood ratio is a

706
00:38:36,830 --> 00:38:40,110
sufficient statistic, because
you can calculate the

707
00:38:40,110 --> 00:38:43,790
likelihood ratio from it.

708
00:38:43,790 --> 00:38:46,040
So this is a sufficient
statistic.

709
00:38:46,040 --> 00:38:49,110
It's equal to 4av over n0.

710
00:38:49,110 --> 00:38:52,200
And when this is greater than
or equal to the log of the

711
00:38:52,200 --> 00:38:54,410
threshold, you go this way.

712
00:38:54,410 --> 00:38:56,960
When it's less than,
you go this way.

713
00:38:56,960 --> 00:39:05,170
So when you then multiply by n0
over 4a, your decision rule

714
00:39:05,170 --> 00:39:07,330
is you just look at
the observation.

715
00:39:07,330 --> 00:39:13,710
You compare it with n0 times
log of eta divided by 4a.

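Putting those steps together, a minimal sketch of the whole 2PAM MAP detector in Python (the parameters are whatever the model supplies; only the rule itself comes from the derivation above):

from math import log

def detect_2pam(v, a, n0, p0, p1):
    # MAP rule for 2PAM in Gaussian noise of variance n0/2: compare the
    # observation itself with the threshold (n0 / (4a)) * ln(p1 / p0)
    threshold = n0 * log(p1 / p0) / (4 * a)
    return 0 if v >= threshold else 1    # 0 means plus a was sent

With equal a priori probabilities the threshold is zero, and the rule reduces to taking the sign of v.
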
716
00:39:13,710 --> 00:39:18,330
And at this point we can go back
to this picture and sort

717
00:39:18,330 --> 00:39:22,770
of sort out what all
of it means.

718
00:39:22,770 --> 00:39:26,800
Because this point here
is now the threshold.

719
00:39:26,800 --> 00:39:29,640
It's n0 times log of
eta divided--

720
00:39:32,680 --> 00:39:34,230
By 4a.

721
00:39:34,230 --> 00:39:37,090
That's what we said the
threshold had to be.

722
00:39:37,090 --> 00:39:41,090
So we have these two Gaussian
curves now.

723
00:39:41,090 --> 00:39:43,440
Why do we have to go back and
look at these Gaussian curves?

724
00:39:43,440 --> 00:39:48,640
I told you that once we
calculated the likelihood

725
00:39:48,640 --> 00:39:52,230
ratio we could forget
about the curves.

726
00:39:52,230 --> 00:39:55,480
So why do I want to put
the curves back in?

727
00:39:55,480 --> 00:39:57,710
Well because I want to calculate
the probability of

728
00:39:57,710 --> 00:40:01,150
error at this point.

729
00:40:01,150 --> 00:40:01,900
OK?

730
00:40:01,900 --> 00:40:04,810
And it's easier to calculate the
probability of error if,

731
00:40:04,810 --> 00:40:08,150
in fact, I draw the curve for
myself and I look at

732
00:40:08,150 --> 00:40:10,380
what's going on.

733
00:40:10,380 --> 00:40:12,420
So here's the threshold.

734
00:40:12,420 --> 00:40:18,220
Here's the density when H equals
one is the correct

735
00:40:18,220 --> 00:40:20,000
hypothesis.

736
00:40:20,000 --> 00:40:23,250
The probability of error is the
probability that when I

737
00:40:23,250 --> 00:40:29,970
send one, the observation is
going to be a random variable

738
00:40:29,970 --> 00:40:36,460
with this probability density
and if it's a wild case, and I

739
00:40:36,460 --> 00:40:41,190
got an enormous value of noise,
positive noise, the

740
00:40:41,190 --> 00:40:44,310
noise is going to push me over
that threshold there and I'm

741
00:40:44,310 --> 00:40:45,860
going to make a mistake.

742
00:40:45,860 --> 00:40:49,010
So, in fact, the probability
of error-- conditional on

743
00:40:49,010 --> 00:40:50,570
sending one--

744
00:40:50,570 --> 00:40:55,050
is just this probability of that
little space in there.

745
00:40:55,050 --> 00:40:55,350
OK?

746
00:40:55,350 --> 00:40:59,360
Which is the probability that
I'm going to say that zero

747
00:40:59,360 --> 00:41:02,330
occurred when, in fact,
one occurred.

748
00:41:02,330 --> 00:41:06,830
So that's my probability of
error when one occurs.

749
00:41:06,830 --> 00:41:09,660
What's the probability of
error when zero occurs?

750
00:41:09,660 --> 00:41:11,350
Well it's the same analysis.

751
00:41:11,350 --> 00:41:13,520
When zero occurs--

752
00:41:13,520 --> 00:41:16,620
namely when the correct
hypothesis is zero--

753
00:41:16,620 --> 00:41:21,850
the output, v, follows this
probability density here.

754
00:41:21,850 --> 00:41:27,410
And I'm going to screw up if
the noise carries me beyond

755
00:41:27,410 --> 00:41:29,860
this point here.

756
00:41:29,860 --> 00:41:33,680
So you can see what the
threshold is doing now.

757
00:41:33,680 --> 00:41:38,150
I mean, when you choose a
threshold which is positive it

758
00:41:38,150 --> 00:41:43,560
makes it much harder to screw
up when you send a minus a.

759
00:41:43,560 --> 00:41:45,580
It makes it much easier
to screw up when

760
00:41:45,580 --> 00:41:47,750
you send a plus a.

761
00:41:47,750 --> 00:41:50,690
But you see that's what we
wanted to do, because the

762
00:41:50,690 --> 00:41:58,230
threshold was positive in this
case, because p1 was so much

763
00:41:58,230 --> 00:42:00,300
larger than p0.

764
00:42:00,300 --> 00:42:03,250
And because p1 is so much
larger than p0--

765
00:42:03,250 --> 00:42:06,800
p1 happens almost
all the time--

766
00:42:06,800 --> 00:42:10,320
and therefore you would normally
almost always choose hypothesis one

767
00:42:10,320 --> 00:42:13,860
without looking at v. Which says
you want to push it over

768
00:42:13,860 --> 00:42:16,730
that way a little bit.

769
00:42:16,730 --> 00:42:21,630
OK, when you calculate this
probability of error, it's the

770
00:42:21,630 --> 00:42:26,650
probability of the tail of a
Gaussian random variable.

771
00:42:26,650 --> 00:42:33,790
So you define this tail
function, q of x, as the

772
00:42:33,790 --> 00:42:36,710
complementary distribution
function of a

773
00:42:36,710 --> 00:42:38,900
normal random variable.

774
00:42:38,900 --> 00:42:41,710
It's the integral from x to
infinity of one over the

775
00:42:41,710 --> 00:42:48,750
square root of 2pi, e to the
minus z squared over 2.

776
00:42:48,750 --> 00:42:51,890
I guess this would make better
sense if this were a z--

777
00:42:51,890 --> 00:42:54,510
one and--

778
00:42:54,510 --> 00:42:55,810
oh, no.

779
00:42:55,810 --> 00:42:57,800
No, I did it right
the first time.

780
00:42:57,800 --> 00:43:01,380
That's an x, because x is the
limit in there, you see.

781
00:43:01,380 --> 00:43:06,250
So I'm calculating all of the
probability density that's off

782
00:43:06,250 --> 00:43:08,550
to the right of x.
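
As a sketch, this tail function is available through the standard error function (Python; math.erfc is in the standard library):

import math

def Q(x):
    # Q(x) = integral from x to infinity of (1/sqrt(2 pi)) e^{-z^2/2} dz
    #      = 0.5 * erfc(x / sqrt(2))
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(Q(0.0))   # 0.5: half the density lies to the right of zero
print(Q(3.0))   # roughly 1.3e-3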

783
00:43:08,550 --> 00:43:13,900
And the probability of error
when H is equal to one is this

784
00:43:13,900 --> 00:43:17,640
probability-- which looks like
it's the tail on the negative

785
00:43:17,640 --> 00:43:20,120
side but if you think about
it a little bit, since the

786
00:43:20,120 --> 00:43:24,480
Gaussian curve is symmetric, you
can also look at it as a q

787
00:43:24,480 --> 00:43:32,070
function where now, when you have
this equal to zero,

788
00:43:32,070 --> 00:43:37,720
this corresponds to changing
this plus to a minus here and

789
00:43:37,720 --> 00:43:40,280
that's the only change.
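
Reconstructing the two tails from the definitions above, with threshold T = N_0 ln(eta)/(4a) and sigma = sqrt(N_0/2):

\[
\Pr\{e \mid H{=}1\} = Q\!\Big(\frac{a+T}{\sigma}\Big)
= Q\!\Big(\sqrt{\tfrac{2a^2}{N_0}} + \frac{\ln\eta}{2\sqrt{2a^2/N_0}}\Big),
\qquad
\Pr\{e \mid H{=}0\} = Q\!\Big(\frac{a-T}{\sigma}\Big),
\]

which is the same expression with the plus changed to a minus.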

790
00:43:40,280 --> 00:43:46,080
OK, so this looks a
little ugly and it

791
00:43:46,080 --> 00:43:49,410
looks a little strange.

792
00:43:49,410 --> 00:43:52,000
I mean you can sort
of interpret

793
00:43:52,000 --> 00:43:55,180
this part of it here--

794
00:43:55,180 --> 00:43:57,780
I can interpret this
part if I'm using

795
00:43:57,780 --> 00:44:00,580
maximum likelihood decoding--

796
00:44:00,580 --> 00:44:04,390
maximum likelihood is MAP
decoding where the threshold

797
00:44:04,390 --> 00:44:05,270
is equal to one.

798
00:44:05,270 --> 00:44:08,360
In other words, it's where
you're assuming that the

799
00:44:08,360 --> 00:44:13,340
hypothesis is equally likely to
be zero or one-- a priori--

800
00:44:13,340 --> 00:44:17,330
which is a good assumption
almost always in communication

801
00:44:17,330 --> 00:44:22,680
because we work very hard in
doing source coding to make

802
00:44:22,680 --> 00:44:27,500
those binary digits equally
likely zero or one.

803
00:44:27,500 --> 00:44:29,420
And there are other reasons
for choosing maximum

804
00:44:29,420 --> 00:44:30,320
likelihood.

805
00:44:30,320 --> 00:44:34,020
If you don't know anything about
the probabilities it's a

806
00:44:34,020 --> 00:44:37,910
good assumption in a sort
of a max/min sense.

807
00:44:37,910 --> 00:44:42,130
It sort of limits how much you
can screw up by having the

808
00:44:42,130 --> 00:44:47,390
wrong probability, so it's a
very robust choice also.

809
00:44:47,390 --> 00:44:52,690
OK, but now this, we're taking
the ratio of a with the square

810
00:44:52,690 --> 00:44:54,580
root of n0.

811
00:44:54,580 --> 00:44:59,110
Well the square root of n0 over
2 is really the standard

812
00:44:59,110 --> 00:45:00,820
deviation of the noise.

813
00:45:00,820 --> 00:45:03,770
So what we're doing is comparing
the amount of input

814
00:45:03,770 --> 00:45:10,000
we've put in with the standard
deviation of the noise.

815
00:45:10,000 --> 00:45:11,260
Now does that make any sense?

816
00:45:11,260 --> 00:45:13,950
The probability of error
depends on that ratio?

817
00:45:13,950 --> 00:45:16,600
Well yeah, it makes a
whole lot of sense.

818
00:45:16,600 --> 00:45:19,540
Because if, for example, I
wanted to look at this problem

819
00:45:19,540 --> 00:45:22,450
in a different scaling
system--

820
00:45:22,450 --> 00:45:26,440
if this is volts and I want to
look at it in millivolts--

821
00:45:26,440 --> 00:45:32,020
I'm going to divide a by, I'm
going to multiply a by 1000.

822
00:45:32,020 --> 00:45:35,670
I'm going to multiply the
standard deviation of the

823
00:45:35,670 --> 00:45:37,430
noise by 1000.

824
00:45:37,430 --> 00:45:39,660
Because one of the things we
always do here-- the way we

825
00:45:39,660 --> 00:45:42,390
choose n0 over 2--

826
00:45:42,390 --> 00:45:46,210
n0 over 2 is sort of a
meaningless quantity.

827
00:45:46,210 --> 00:45:52,990
It's the noise energy in one
degree of freedom in the

828
00:45:52,990 --> 00:45:56,920
scaling reference that we're
using for the data.

829
00:45:56,920 --> 00:45:57,890
OK?

830
00:45:57,890 --> 00:46:00,190
And that's the only definition
you can come up with that

831
00:46:00,190 --> 00:46:01,640
makes any sense.

832
00:46:01,640 --> 00:46:06,420
I mean you scale the data in
whatever way you please, and

833
00:46:06,420 --> 00:46:12,580
when we've gone from baseband
to passband, we in fact have

834
00:46:12,580 --> 00:46:19,140
multiplied the energy in the
input by a factor of two.

835
00:46:19,140 --> 00:46:21,510
And therefore, because of
that, we're going to--

836
00:46:21,510 --> 00:46:26,190
n0 at passband is going to be a
square root of 2 bigger than

837
00:46:26,190 --> 00:46:27,870
it is at baseband.

838
00:46:27,870 --> 00:46:31,150
If you don't like that,
live with it.

839
00:46:31,150 --> 00:46:33,140
That's the way it is.

840
00:46:33,140 --> 00:46:36,030
Nobody will change n0,
no matter who wants

841
00:46:36,030 --> 00:46:37,910
them to change it.

842
00:46:37,910 --> 00:46:40,870
OK, so this term make sense.

843
00:46:40,870 --> 00:46:48,080
It's the ratio of the signal
amplitude to the standard

844
00:46:48,080 --> 00:46:50,130
deviation of the noise.

845
00:46:50,130 --> 00:46:53,720
And that should be the only way
that the signal amplitude

846
00:46:53,720 --> 00:46:57,500
or the standard deviation of the
noise enters in, because

847
00:46:57,500 --> 00:47:00,250
it's really the ratio that
has to be important.

848
00:47:00,250 --> 00:47:04,070
Why this crazy term?

849
00:47:04,070 --> 00:47:09,410
Well if you look at the curve
you can sort of see why it is.

850
00:47:09,410 --> 00:47:13,300
The threshold test is comparing
the likelihood ratio

851
00:47:13,300 --> 00:47:15,250
of this curve with
the likelihood

852
00:47:15,250 --> 00:47:17,310
ratio of this curve.

853
00:47:17,310 --> 00:47:27,150
What's going to happen as
a gets very, very large?

854
00:47:27,150 --> 00:47:32,240
You move a out, and the thing
that's happening then is this

855
00:47:32,240 --> 00:47:36,460
curve-- which is now coming down
in a modest way here-- if

856
00:47:36,460 --> 00:47:39,370
you move a way out here, you're
going to have almost

857
00:47:39,370 --> 00:47:40,970
nothing there.

858
00:47:40,970 --> 00:47:44,750
And it's going to be going
down very fast.

859
00:47:44,750 --> 00:47:46,690
It's going to be going
down very fast

860
00:47:46,690 --> 00:47:50,470
relative to its magnitude.

861
00:47:50,470 --> 00:47:53,960
In other words the bigger
a gets, the bigger this

862
00:47:53,960 --> 00:47:57,850
difference is going to be
for any given threshold.

863
00:47:57,850 --> 00:48:03,370
And that's why you get a over
square root of n0 here.

864
00:48:03,370 --> 00:48:06,130
And here you get exactly
the opposite thing.

865
00:48:06,130 --> 00:48:09,400
That's because for a given
threshold, as this signal to

866
00:48:09,400 --> 00:48:17,730
noise ratio gets bigger, this
threshold term becomes almost

867
00:48:17,730 --> 00:48:20,560
totally unimportant.

868
00:48:20,560 --> 00:48:23,270
I mean you get so much
information out of the reading

869
00:48:23,270 --> 00:48:30,470
you're making, because it's
so reliable, that having a

870
00:48:30,470 --> 00:48:34,240
threshold is almost completely
irrelevant.

871
00:48:34,240 --> 00:48:35,990
And therefore you can sort
of forget about it.

872
00:48:35,990 --> 00:48:40,100
If a is very large, this
term is zilch.

873
00:48:40,100 --> 00:48:41,350
OK?

874
00:48:43,240 --> 00:48:45,770
So if you want to have reliable
communication and you

875
00:48:45,770 --> 00:48:49,000
use a large signal to
noise ratio to get it, that's

876
00:48:49,000 --> 00:48:55,320
another reason for forgetting
about whether the threshold is

877
00:48:55,320 --> 00:48:58,000
one or something else.

878
00:48:58,000 --> 00:49:02,520
And we would certainly like to
deal with problems where the

879
00:49:02,520 --> 00:49:05,790
threshold is equal to one
because most people can

880
00:49:05,790 --> 00:49:10,820
remember q of a signal
to noise ratio.

881
00:49:10,820 --> 00:49:14,340
I don't know anybody who can
remember this formula.

882
00:49:14,340 --> 00:49:18,330
I'm sure there's some people,
but I don't think anybody who

883
00:49:18,330 --> 00:49:20,940
works in the communication
field ever thinks

884
00:49:20,940 --> 00:49:22,390
about this at all.

885
00:49:22,390 --> 00:49:25,020
Except the first time they
derive it and they say, "Oh,

886
00:49:25,020 --> 00:49:29,160
that's very nice." And then they
promptly forget about it.

887
00:49:29,160 --> 00:49:31,420
The only reason I think
about it more is I

888
00:49:31,420 --> 00:49:33,790
teach the course sometimes.

889
00:49:33,790 --> 00:49:35,780
Otherwise I would forget
about it, too.

890
00:49:41,730 --> 00:49:44,200
OK, which is what this says.

891
00:49:44,200 --> 00:49:48,390
For communication we assume
p0 is equal to p1.

892
00:49:48,390 --> 00:49:50,920
So we assume that eta
is equal to one.

893
00:49:50,920 --> 00:49:54,330
So the probability of error,
which is also the probability

894
00:49:54,330 --> 00:49:58,100
of error when H is equal to
one-- in other words when a

895
00:49:58,100 --> 00:50:01,230
one actually enters the
communication system-- is

896
00:50:01,230 --> 00:50:04,890
equal to the probability of
error when H is equal to 0.

897
00:50:04,890 --> 00:50:08,980
In other words these two tails
here, when the threshold is

898
00:50:08,980 --> 00:50:12,340
equal to one you set the
threshold right there.

899
00:50:12,340 --> 00:50:15,540
The probability of this tail
is clearly equal to the

900
00:50:15,540 --> 00:50:16,960
probability of this tail.

901
00:50:16,960 --> 00:50:19,400
Just by symmetry.

902
00:50:19,400 --> 00:50:22,280
So these two error probabilities
are the same.

903
00:50:22,280 --> 00:50:26,540
And in fact they are just
q of a over the square

904
00:50:26,540 --> 00:50:29,660
root of n0 over 2.

905
00:50:29,660 --> 00:50:31,870
It's nice to put this
in terms of energy.

906
00:50:31,870 --> 00:50:34,400
We said before that energy is
sort of important in the

907
00:50:34,400 --> 00:50:36,800
communication field.

908
00:50:36,800 --> 00:50:40,330
So we call e sub b the energy
per bit that we're

909
00:50:40,330 --> 00:50:42,900
spending to send data.

910
00:50:42,900 --> 00:50:45,020
I mean don't worry about the
fact that we're only sending

911
00:50:45,020 --> 00:50:48,150
one bit and then we're tearing
the communication system down.

912
00:50:48,150 --> 00:50:51,300
Because pretty soon we're going
to send multiple bits.

913
00:50:51,300 --> 00:50:55,830
But the amount of energy we're
spending sending this one bit

914
00:50:55,830 --> 00:50:57,970
is a squared.

915
00:50:57,970 --> 00:51:01,270
At least back in this frame of
reference that we're looking

916
00:51:01,270 --> 00:51:03,640
at now, where we're just looking
at this discrete

917
00:51:03,640 --> 00:51:08,980
signal and a single
noise variable.

918
00:51:08,980 --> 00:51:14,010
And n0 over 2 is the noise
variance of this particular

919
00:51:14,010 --> 00:51:20,960
random variable, z, so when we
write this out in terms of Eb,

920
00:51:20,960 --> 00:51:27,010
which is a squared, it looks
like this-- it's 2Eb over n0.

921
00:51:27,010 --> 00:51:31,970
So the probability of error for
this binary communication

922
00:51:31,970 --> 00:51:36,210
problem is just q of the square
root of 2Eb over n0.

923
00:51:36,210 --> 00:51:39,750
Which is a formula that
you want to remember.
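
A quick numerical sketch of the formula (Python; the Eb/N0 values in dB are illustrative assumptions):

import math

def Q(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pr_error(eb_over_n0_db):
    # Pr(e) = Q(sqrt(2 Eb/N0)) for binary antipodal signals, ML detection
    ratio = 10.0 ** (eb_over_n0_db / 10.0)
    return Q(math.sqrt(2.0 * ratio))

for db in (0, 5, 10):
    print(db, pr_error(db))   # about 7.9e-2, 6.0e-3, 3.9e-6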

924
00:51:39,750 --> 00:51:47,850
It's the error probability for
binary detection when n0 over

925
00:51:47,850 --> 00:51:55,520
2 is the noise energy on this
one degree of freedom and Eb

926
00:51:55,520 --> 00:51:57,680
is the amount of energy you're
spending on this

927
00:51:57,680 --> 00:52:00,660
one degree of freedom.

928
00:52:00,660 --> 00:52:02,960
You will see about
50 variations of

929
00:52:02,960 --> 00:52:06,090
this as we go on.

930
00:52:06,090 --> 00:52:11,220
If you try to remember this
fundamental definition, it'll

931
00:52:11,220 --> 00:52:14,210
save you a lot of agony.

932
00:52:14,210 --> 00:52:20,170
Even so, everybody I know who
deals with this kind of thing

933
00:52:20,170 --> 00:52:23,440
always screws up the
factors of two.

934
00:52:23,440 --> 00:52:27,200
And finally when they get all
done, they try to figure out

935
00:52:27,200 --> 00:52:30,020
from common sense or from
something else, what the

936
00:52:30,020 --> 00:52:32,660
factors of two ought to be.

937
00:52:32,660 --> 00:52:35,850
And they reduce their
probability of error to about

938
00:52:35,850 --> 00:52:38,790
a quarter after they're all done
with doing all of that.

939
00:52:42,870 --> 00:52:46,490
OK, so we spent a lot
of time analyzing

940
00:52:46,490 --> 00:52:50,010
binary antipodal signals.

941
00:52:50,010 --> 00:52:53,430
What about the binary non
antipodal signals?

942
00:52:53,430 --> 00:52:57,530
This is a beautiful example of
Shannon's idea of studying the

943
00:52:57,530 --> 00:52:58,910
simplest cases first.

944
00:53:01,660 --> 00:53:05,500
If you have two signals, one of
which is b and one of which

945
00:53:05,500 --> 00:53:14,330
is b prime, and they can be put
anywhere on the real line.

946
00:53:14,330 --> 00:53:17,140
And what I've done, because I
didn't want to plot this whole

947
00:53:17,140 --> 00:53:21,440
picture again, is I just took
the zero out, and I replaced

948
00:53:21,440 --> 00:53:25,650
the zero by the point halfway
between these two points which

949
00:53:25,650 --> 00:53:28,530
is b plus b prime over 2.

950
00:53:28,530 --> 00:53:33,040
And then we look at it and we
say what happens if you have

951
00:53:33,040 --> 00:53:38,500
an arbitrary set of two points
anywhere on the real line?

952
00:53:38,500 --> 00:53:44,900
Well when I send this point the
likelihood, conditional on

953
00:53:44,900 --> 00:53:48,350
this point being sent, is
this Gaussian curve

954
00:53:48,350 --> 00:53:50,470
centered on b prime.

955
00:53:50,470 --> 00:53:54,640
When I send b the likelihood
is a Gaussian

956
00:53:54,640 --> 00:53:57,310
curve centered on b.

957
00:53:57,310 --> 00:54:02,210
And it is in fact the same curve
that we drew before.

958
00:54:02,210 --> 00:54:06,870
If in fact I replaced zero with
a center point between

959
00:54:06,870 --> 00:54:10,930
these two, which is b
plus b prime over 2.

960
00:54:10,930 --> 00:54:28,500
And if I then define a as this
distance in here, the

961
00:54:28,500 --> 00:54:32,140
probability of error that I
calculated before is the same

962
00:54:32,140 --> 00:54:34,280
as it was before.
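
In symbols (a reconstruction using the center point defined above): with c = (b + b')/2 and a = |b - b'|/2, the maximum likelihood error probability is still

\[
\Pr\{e\} = Q\!\Big(\frac{a}{\sqrt{N_0/2}}\Big),
\]

so only the half-distance a between the two points matters, not where the pair sits on the line.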

963
00:54:34,280 --> 00:54:40,050
Now I would suggest to all of
you that you try to find the

964
00:54:40,050 --> 00:54:46,700
probability of error for this
system here, just not using

965
00:54:46,700 --> 00:54:49,910
what we've already done, and
just writing out the

966
00:54:49,910 --> 00:54:53,160
likelihood ratios as a function
of an arbitrary b

967
00:54:53,160 --> 00:54:58,370
prime and an arbitrary b and
finding a likelihood ratio,

968
00:54:58,370 --> 00:55:00,640
and calculating through
all of that.

969
00:55:00,640 --> 00:55:04,760
And most of you are capable of
calculating through all of it.

970
00:55:04,760 --> 00:55:07,930
But when you do so, you will
get a god awful looking

971
00:55:07,930 --> 00:55:12,530
formula, which just looks
totally messy.

972
00:55:12,530 --> 00:55:17,360
And by looking at the formula
you are not going to be able

973
00:55:17,360 --> 00:55:20,540
to realize that what's going
on is what we see here from

974
00:55:20,540 --> 00:55:22,670
the picture.

975
00:55:22,670 --> 00:55:25,230
And the only reason we figured
out what was going on from the

976
00:55:25,230 --> 00:55:27,610
picture is we already
solved the problem

977
00:55:27,610 --> 00:55:29,370
for the simplest case.

978
00:55:29,370 --> 00:55:35,400
OK so it never makes any sense
in this problem to look at

979
00:55:35,400 --> 00:55:37,120
this general case.

980
00:55:37,120 --> 00:55:40,300
You always want to say the
general case is just a special

981
00:55:40,300 --> 00:55:43,360
case of a special case.

982
00:55:43,360 --> 00:55:48,110
Where you just have to define
things slightly differently.

983
00:55:48,110 --> 00:55:51,650
OK, so we're going to do the
center point, then, I mean it

984
00:55:51,650 --> 00:55:54,280
might be a pilot tone.

985
00:55:54,280 --> 00:55:58,600
It might be any other non
information bearing signal.

986
00:55:58,600 --> 00:56:00,900
In other words we're sending
the one bit.

987
00:56:00,900 --> 00:56:03,470
Sometimes for some reason or
other, you need to get

988
00:56:03,470 --> 00:56:04,470
synchronization.

989
00:56:04,470 --> 00:56:07,470
You need to get other things
in a communication system.

990
00:56:07,470 --> 00:56:09,500
And for that reason, you
send other things.

991
00:56:09,500 --> 00:56:12,260
We'll talk about a lot
of those later.

992
00:56:12,260 --> 00:56:15,500
But they don't change the error
probability at all.

993
00:56:15,500 --> 00:56:19,370
The error probability is
determined solely by the

994
00:56:19,370 --> 00:56:21,910
distance between these two
points which we call 2a.

995
00:56:25,820 --> 00:56:28,660
So probability of error remains
the same in terms of

996
00:56:28,660 --> 00:56:30,310
this distance.

997
00:56:30,310 --> 00:56:33,020
The energy per bit
now changes.

998
00:56:33,020 --> 00:56:36,080
The energy per bit is
the energy here

999
00:56:36,080 --> 00:56:37,630
plus the energy here.

1000
00:56:37,630 --> 00:56:40,270
Which in fact is the energy
in the center

1001
00:56:40,270 --> 00:56:43,990
point plus a squared.

1002
00:56:43,990 --> 00:56:46,390
I mean we've done that in a
number of contexts, the way to

1003
00:56:46,390 --> 00:56:52,990
find the energy in a binary
random variable is to take the

1004
00:56:52,990 --> 00:56:56,650
energy in the center point
plus the energy in the

1005
00:56:56,650 --> 00:56:58,070
difference.

1006
00:56:58,070 --> 00:57:02,040
It's the same as finding the
fluctuation plus the

1007
00:57:02,040 --> 00:57:02,900
square of the mean.

1008
00:57:02,900 --> 00:57:05,530
It's that same underlying
idea.
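
A small sketch of that decomposition (Python; the two signal points b and b' are assumed values):

import math

b, b_prime = 3.0, 1.0         # assumed signal points
c = (b + b_prime) / 2.0       # center point: the non-information part
a = abs(b - b_prime) / 2.0    # half-distance: what sets Pr(e)

eb = c**2 + a**2              # energy per bit = center energy + a squared
assert abs(eb - (b**2 + b_prime**2) / 2.0) < 1e-12

loss_db = 10.0 * math.log10(eb / a**2)   # cost relative to antipodal
print(eb, loss_db)            # with c == a this would be about 3 db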

1009
00:57:05,530 --> 00:57:10,280
So any time you use non
antipodal and you shift things

1010
00:57:10,280 --> 00:57:13,840
off the mean, you can
see what's going

1011
00:57:13,840 --> 00:57:15,840
on very, very easily.

1012
00:57:15,840 --> 00:57:17,580
You waste energy.

1013
00:57:17,580 --> 00:57:21,170
I mean it might not be wasted,
you might have to waste it for

1014
00:57:21,170 --> 00:57:22,500
some reason.

1015
00:57:22,500 --> 00:57:25,360
But as far as a communication
is concerned you're simply

1016
00:57:25,360 --> 00:57:26,830
wasting it.

1017
00:57:26,830 --> 00:57:30,960
So your energy per bit changes,
but your probability

1018
00:57:30,960 --> 00:57:33,310
of error remains the same.

1019
00:57:33,310 --> 00:57:38,450
Because of that, you get a very
clear cut idea of what

1020
00:57:38,450 --> 00:57:42,040
it's costing you to send
that pilot tone.

1021
00:57:42,040 --> 00:57:45,990
Because in fact what you've done
is to just increase this

1022
00:57:45,990 --> 00:57:50,480
energy, which we talk about
in terms of db.

1023
00:57:50,480 --> 00:57:54,340
If c is equal to a in this case
which, as we'll see, is a

1024
00:57:54,340 --> 00:57:57,910
common thing that happens in a
lot of systems, what you've

1025
00:57:57,910 --> 00:57:59,590
lost is a factor of three db.

1026
00:57:59,590 --> 00:58:02,970
Because you're using twice as
much energy which is three db

1027
00:58:02,970 --> 00:58:05,850
more energy, than you have
to use for the pure

1028
00:58:05,850 --> 00:58:07,590
communication.

1029
00:58:07,590 --> 00:58:12,040
So it's costing you three db to
do whatever silly thing you

1030
00:58:12,040 --> 00:58:14,960
want to do for synchronization
or something else.

1031
00:58:14,960 --> 00:58:18,560
Which is why people work very
hard to try to send signals

1032
00:58:18,560 --> 00:58:20,850
which carry their own
synchronization

1033
00:58:20,850 --> 00:58:23,010
information in them.

1034
00:58:23,010 --> 00:58:25,030
And we will talk about that more
when we get to wireless

1035
00:58:25,030 --> 00:58:26,280
and things like that.

1036
00:58:31,100 --> 00:58:31,960
OK.

1037
00:58:31,960 --> 00:58:35,800
Let's go on to real antipodal
vectors in

1038
00:58:35,800 --> 00:58:39,030
white Gaussian noise.

1039
00:58:39,030 --> 00:58:42,160
And again, let me point out to
you again that one of the

1040
00:58:42,160 --> 00:58:45,500
remarkable things about
detection theory is once you

1041
00:58:45,500 --> 00:58:51,940
understand detection for
antipodal binary signals and

1042
00:58:51,940 --> 00:58:54,510
Gaussian noise, everything
else just follows along.

1043
00:58:58,260 --> 00:59:02,410
OK, so here what we're going to
do is to assume that under

1044
00:59:02,410 --> 00:59:05,840
the hypothesis H equals zero--
in other words conditional on

1045
00:59:05,840 --> 00:59:09,570
a zero entering the
communication system--

1046
00:59:09,570 --> 00:59:14,700
what we're going to send is not
a single degree, is not

1047
00:59:14,700 --> 00:59:16,960
something in a single
degree of freedom.

1048
00:59:16,960 --> 00:59:19,470
But we're actually going
to send a vector.

1049
00:59:19,470 --> 00:59:23,520
And you can think of that if you
want to as sending a way

1050
00:59:23,520 --> 00:59:27,930
form and breaking up the
waveform into an orthonormal

1051
00:59:27,930 --> 00:59:31,440
expansion and a1 to
ak as being the

1052
00:59:31,440 --> 00:59:33,940
coefficients in that expansion.

1053
00:59:33,940 --> 00:59:35,560
So we're going to use
several degrees of

1054
00:59:35,560 --> 00:59:37,870
freedom to send one signal.

1055
00:59:37,870 --> 00:59:38,210
Yes?

1056
00:59:38,210 --> 00:59:42,750
AUDIENCE: [INAUDIBLE]

1057
00:59:42,750 --> 00:59:46,270
PROFESSOR: I'm only
sending one bit.

1058
00:59:46,270 --> 00:59:50,100
And on Wednesday I'm going to
talk about what happens when

1059
00:59:50,100 --> 00:59:54,400
we want to send multiple bits
or when we want to send a

1060
00:59:54,400 --> 00:59:57,770
large number of signals in one
degree of freedom, or all

1061
00:59:57,770 --> 01:00:01,040
those cases, multiple
hypotheses.

1062
01:00:01,040 --> 01:00:03,640
You know what's going
to happen?

1063
01:00:03,640 --> 01:00:06,750
It's going to turn out to be
a trivial problem again.

1064
01:00:06,750 --> 01:00:10,560
Multiple hypotheses are no
harder than just binary

1065
01:00:10,560 --> 01:00:12,560
hypotheses.

1066
01:00:12,560 --> 01:00:15,720
So again, once you understand
the simplest case, all

1067
01:00:15,720 --> 01:00:18,660
Gaussian problems turn
out to be solved.

1068
01:00:18,660 --> 01:00:24,240
Just with minor variations
and minor tweaks.

1069
01:00:24,240 --> 01:00:26,790
OK, so I have antipodal
vectors.

1070
01:00:26,790 --> 01:00:30,020
One vector is a1 to a sub k.

1071
01:00:30,020 --> 01:00:34,290
Under the other hypothesis we're
going to send minus a,

1072
01:00:34,290 --> 01:00:36,550
which is the opposite vector.

1073
01:00:36,550 --> 01:00:40,030
So if we're dealing with two
dimensional space with

1074
01:00:40,030 --> 01:00:43,140
coordinates here and here,
if I send this I'm

1075
01:00:43,140 --> 01:00:44,360
going to send this.

1076
01:00:44,360 --> 01:00:47,330
If I send that I'm going to
send that, and so forth.

1077
01:00:47,330 --> 01:00:51,030
As the opposite alternative.

1078
01:00:51,030 --> 01:00:58,650
The likelihood ratio is then,
the probability of vector v

1079
01:00:58,650 --> 01:01:02,140
conditional on sending this.

1080
01:01:02,140 --> 01:01:06,780
I'm assuming here that the noise
is IID and each noise

1081
01:01:06,780 --> 01:01:12,650
variable has mean zero and
variance n0 over 2.

1082
01:01:12,650 --> 01:01:16,550
Namely, we're pretending we're
communication people here,

1083
01:01:16,550 --> 01:01:18,430
using an n0 over 2 here.

1084
01:01:18,430 --> 01:01:21,790
So the conditional density--

1085
01:01:21,790 --> 01:01:27,090
the likelihood of this output
vector given zero-- is just

1086
01:01:27,090 --> 01:01:32,350
this density of z shifted
over by a.

1087
01:01:32,350 --> 01:01:37,580
So it's what we've talked about
as the Gaussian density.

1088
01:01:37,580 --> 01:01:46,540
Just this, which is just the
energy in v minus a, is what

1089
01:01:46,540 --> 01:01:49,040
that turns out to be.
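
Written out, for k IID noise components each of variance N_0/2 (a reconstruction from the definitions above):

\[
f(\mathbf{v} \mid H{=}0) = (\pi N_0)^{-k/2}
\exp\!\Big(-\frac{\|\mathbf{v}-\mathbf{a}\|^2}{N_0}\Big).
\]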

1090
01:01:49,040 --> 01:01:55,460
So the log likelihood ratio is
the ratio of this quantity to

1091
01:01:55,460 --> 01:01:58,270
the density where H
is equal to one.

1092
01:01:58,270 --> 01:02:06,180
And when H is equal to one, what
happens is the same thing

1093
01:02:06,180 --> 01:02:07,390
as happened before.

1094
01:02:07,390 --> 01:02:14,040
H equals one makes this sign turn
into a plus sign.

1095
01:02:14,040 --> 01:02:17,550
So when I look at the log
likelihood ratio, I want to

1096
01:02:17,550 --> 01:02:23,480
take the ratio of this quantity
to the same quantity

1097
01:02:23,480 --> 01:02:26,110
with a plus put into it.

1098
01:02:26,110 --> 01:02:29,250
And when I take the log of that,
what happens is I get

1099
01:02:29,250 --> 01:02:32,990
this term minus the opposite
term of the opposite side.

1100
01:02:32,990 --> 01:02:38,930
So I have minus the norm squared
of v minus a plus the

1101
01:02:38,930 --> 01:02:43,650
norm squared of v
plus a over n0.

1102
01:02:43,650 --> 01:02:46,080
And again, if you multiply
this out, the v squareds

1103
01:02:46,080 --> 01:02:47,130
cancel out.

1104
01:02:47,130 --> 01:02:48,950
The a squareds cancel out.

1105
01:02:48,950 --> 01:02:51,460
And you just get the inner
product terms.

1106
01:02:51,460 --> 01:02:54,060
And strangely enough you get the
same formula that you got

1107
01:02:54,060 --> 01:02:58,130
before, almost, except here you
have the inner product of

1108
01:02:58,130 --> 01:03:02,140
v with a instead of just the
product of v times a.
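
In symbols, the same cancellation as in the scalar case:

\[
\mathrm{LLR}(\mathbf{v})
= \frac{\|\mathbf{v}+\mathbf{a}\|^2 - \|\mathbf{v}-\mathbf{a}\|^2}{N_0}
= \frac{4\,\langle \mathbf{v},\mathbf{a}\rangle}{N_0}.
\]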

1109
01:03:02,140 --> 01:03:06,440
So in fact we just have a slight
generalization of the

1110
01:03:06,440 --> 01:03:09,140
thing that we did before.

1111
01:03:09,140 --> 01:03:21,980
In other words, the
scalar product is

1112
01:03:21,980 --> 01:03:25,100
a sufficient statistic.

1113
01:03:25,100 --> 01:03:27,850
Now what does that tell you?

1114
01:03:27,850 --> 01:03:30,900
It tells you how to
build a detector.

1115
01:03:30,900 --> 01:03:31,350
OK?

1116
01:03:31,350 --> 01:03:34,770
It tells you when you have a
vector detection problem, the

1117
01:03:34,770 --> 01:03:40,440
thing that you want to do is to
take this vector, v, that

1118
01:03:40,440 --> 01:03:47,260
you have and form the inner
product of v times a.

1119
01:03:47,260 --> 01:03:51,350
If in fact v is a waveform
and a is a waveform,

1120
01:03:51,350 --> 01:03:52,600
what do you do then?

1121
01:03:56,140 --> 01:03:58,550
Well the first thing you
do is to think of

1122
01:03:58,550 --> 01:04:01,730
v as being a vector--

1123
01:04:01,730 --> 01:04:04,150
where it's the vector of
coefficients-- in the

1124
01:04:04,150 --> 01:04:10,470
expansion for that waveform
and a in the same way.

1125
01:04:10,470 --> 01:04:13,670
You look at what the inner
product is then, and then you

1126
01:04:13,670 --> 01:04:16,860
say, "well what does that
correspond to when I deal with

1127
01:04:16,860 --> 01:04:19,550
L2 waveforms?" What's the inner

1128
01:04:19,550 --> 01:04:21,700
product for L2 waveforms?

1129
01:04:26,620 --> 01:04:28,550
It's the integral of the product
of the waveforms.

1130
01:04:31,260 --> 01:04:32,930
And how do you form the
integral of the

1131
01:04:32,930 --> 01:04:35,770
product of the waveforms?

1132
01:04:35,770 --> 01:04:38,520
You take this waveform here.

1133
01:04:38,520 --> 01:04:44,770
You turn it around and you call
it a matched filter to a.

1134
01:04:44,770 --> 01:04:46,890
And you take the received
waveform.

1135
01:04:46,890 --> 01:04:50,280
You pass it through the matched
filter for a, and you look at

1136
01:04:50,280 --> 01:04:51,530
the output for it.
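
In discrete time that receiver is just a correlation with the pulse (Python sketch; the pulse samples and the noise level n0 = 0.5 are assumptions):

import numpy as np

a = np.array([0.5, 1.0, 0.5])   # assumed samples of the transmitted pulse
rng = np.random.default_rng(0)
z = rng.normal(0.0, 0.5, size=a.size)   # IID noise, variance n0/2 = 0.25
v = -a + z                       # received vector when H = 1 is sent

statistic = np.dot(v, a)         # <v, a>: matched filter output, sampled at the right time
decision = 0 if statistic >= 0 else 1   # ML decision (log eta = 0)
print(statistic, decision)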

1137
01:04:53,960 --> 01:05:01,820
Now, let's go back and look at
what all of this was doing.

1138
01:05:01,820 --> 01:05:04,560
And for now let's forget
about the baseband to

1139
01:05:04,560 --> 01:05:05,730
the passband business.

1140
01:05:05,730 --> 01:05:10,070
Let's just look at this part
here because it's a little

1141
01:05:10,070 --> 01:05:11,670
easier to see this first.

1142
01:05:21,040 --> 01:05:23,540
So this comes in here.

1143
01:05:23,540 --> 01:05:25,120
Now remember what
we were saying

1144
01:05:25,120 --> 01:05:26,530
when we studied Nyquist.

1145
01:05:26,530 --> 01:05:32,095
We said a neat thing to do was
to use a square root of the

1146
01:05:32,095 --> 01:05:34,640
Nyquist pulse at the
transmitter.

1147
01:05:34,640 --> 01:05:37,040
When you use a square root of
the Nyquist pulse at the

1148
01:05:37,040 --> 01:05:41,920
transmitter, what you have is
orthogonality between the

1149
01:05:41,920 --> 01:05:44,140
pulse and all of its shifts.

1150
01:05:44,140 --> 01:05:46,760
Well now we don't much care
about the orthogonality

1151
01:05:46,760 --> 01:05:49,820
between the pulse and all of the
shifts because we're only

1152
01:05:49,820 --> 01:05:52,930
sending this one bit anyway.

1153
01:05:52,930 --> 01:05:54,990
But it sort of looks like we're
going to be able to put

1154
01:05:54,990 --> 01:05:58,180
that back in in a nice
convenient way.

1155
01:05:58,180 --> 01:06:02,000
So we're sending this one pulse,
p of t, and what

1156
01:06:02,000 --> 01:06:04,150
did we do in this baseband
demodulator?

1157
01:06:06,670 --> 01:06:11,050
We passed this through another
filter, q of t, which was the

1158
01:06:11,050 --> 01:06:14,620
matched filter to p of t.

1159
01:06:14,620 --> 01:06:19,220
What's our optimal detector
for maximum likelihood?

1160
01:06:19,220 --> 01:06:22,490
It's to take whatever this
waveform was, pass it through

1161
01:06:22,490 --> 01:06:23,550
the matched filter.

1162
01:06:23,550 --> 01:06:26,120
In other words, to calculate
that inner product we just

1163
01:06:26,120 --> 01:06:28,990
talked about.

1164
01:06:28,990 --> 01:06:30,570
OK?

1165
01:06:30,570 --> 01:06:34,090
So in fact when we were looking
at the Nyquist problem

1166
01:06:34,090 --> 01:06:37,990
and worrying about intersymbol
interference, in fact

1167
01:06:37,990 --> 01:06:41,590
what we were doing was also
doing the first part of an

1168
01:06:41,590 --> 01:06:44,390
optimal MAP detector.

1169
01:06:44,390 --> 01:06:48,760
And at this point what comes
out of here is a single

1170
01:06:48,760 --> 01:06:54,740
number, v, which in fact now is
the inner product of this

1171
01:06:54,740 --> 01:06:57,330
waveform at this point.

1172
01:06:57,330 --> 01:07:01,450
With the waveform,
a, that we sent.

1173
01:07:01,450 --> 01:07:02,860
OK?

1174
01:07:02,860 --> 01:07:05,800
In other words, we started out
by saying, "let's suppose that

1175
01:07:05,800 --> 01:07:08,280
what we have here is a number.

1176
01:07:08,280 --> 01:07:12,390
What's the optimal detector to
build?" And then we go on and

1177
01:07:12,390 --> 01:07:15,850
say, "OK, let's suppose we
look at the problem here.

1178
01:07:15,850 --> 01:07:19,730
What's the optimal detector to
build now?" And the optimal

1179
01:07:19,730 --> 01:07:23,850
detector to build now at this
point is this matched filter

1180
01:07:23,850 --> 01:07:25,100
to this input waveform.

1181
01:07:27,610 --> 01:07:30,770
Followed by the inner product
here-- which is what the match

1182
01:07:30,770 --> 01:07:35,250
filter does for us-- followed
by our binary antipodal

1183
01:07:35,250 --> 01:07:37,620
detector again.

1184
01:07:37,620 --> 01:07:40,080
OK?

1185
01:07:40,080 --> 01:07:44,770
So by studying the problem at
this point, we now understand

1186
01:07:44,770 --> 01:07:46,100
what happens at this point.

1187
01:07:51,050 --> 01:07:53,760
And do I have time to show you
what happens at this point?

1188
01:07:53,760 --> 01:07:55,380
I don't know.

1189
01:07:55,380 --> 01:07:56,630
Let me--

1190
01:08:03,270 --> 01:08:07,710
let's not do that at
least right now--

1191
01:08:07,710 --> 01:08:10,520
let's look at the picture of
this that we get when we just

1192
01:08:10,520 --> 01:08:15,200
look at the problem when
we have two dimensions.

1193
01:08:15,200 --> 01:08:18,940
So we're either going to
transmit a vector, a, or we're

1194
01:08:18,940 --> 01:08:20,910
going to transmit a
vector, minus a.

1195
01:08:20,910 --> 01:08:23,100
And think of this in
two dimensions.

1196
01:08:23,100 --> 01:08:26,050
When we transmit the
vector, a, we have

1197
01:08:26,050 --> 01:08:28,100
two dimensional noise.

1198
01:08:28,100 --> 01:08:31,490
We've already pointed out that
two dimensional Gaussian noise

1199
01:08:31,490 --> 01:08:32,750
has circular symmetry.

1200
01:08:32,750 --> 01:08:35,530
Spherical symmetry in an
arbitrary number of dimensions.

1201
01:08:35,530 --> 01:08:38,740
So what happens is you get
these equal probability

1202
01:08:38,740 --> 01:08:42,930
regions which are spreading out
like when you drop a rock

1203
01:08:42,930 --> 01:08:45,360
into a pool of water.

1204
01:08:45,360 --> 01:08:49,240
You see all of these things
spreading out in circles.

1205
01:08:49,240 --> 01:08:56,120
And you then say, "OK, what's
this inner product going to

1206
01:08:56,120 --> 01:09:01,330
correspond to?" Finding the
inner product and comparing it

1207
01:09:01,330 --> 01:09:03,480
with a threshold.

1208
01:09:03,480 --> 01:09:06,940
Well you can see geometrically
what's going to happen here.

1209
01:09:06,940 --> 01:09:11,090
You're trying to do maximum
likelihood.

1210
01:09:11,090 --> 01:09:13,230
And we already know we're
supposed to calculate the

1211
01:09:13,230 --> 01:09:16,260
inner product, so what the inner
product is going to do

1212
01:09:16,260 --> 01:09:19,250
is take whatever v that we
receive-- it's going to

1213
01:09:19,250 --> 01:09:25,920
project it on to this line
between 0 and a.

1214
01:09:25,920 --> 01:09:30,020
So if I got a v here I'm going
to project it down to here.

1215
01:09:30,020 --> 01:09:32,210
And then what I'm going to do
is I'm going to compare the

1216
01:09:32,210 --> 01:09:34,600
distance from here to there
with the distance

1217
01:09:34,600 --> 01:09:36,680
from here to there.

1218
01:09:36,680 --> 01:09:41,160
Which says first project, then
do the old decision in a one

1219
01:09:41,160 --> 01:09:42,510
dimensional way.

1220
01:09:42,510 --> 01:09:47,910
Now geometrically, this distance
squared is equal to

1221
01:09:47,910 --> 01:09:50,850
this distance squared plus
this distance squared.

1222
01:09:50,850 --> 01:09:53,880
And this distance squared is
equal to the same distance

1223
01:09:53,880 --> 01:09:56,920
squared plus this distance
squared.

1224
01:09:56,920 --> 01:10:00,980
So whatever you decide to do in
terms of these distances,

1225
01:10:00,980 --> 01:10:04,280
you will also decide to do in
terms of these distances.
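
In symbols, writing v_perp for the component of v perpendicular to the line through minus a and a (a reconstruction of the picture):

\[
\|\mathbf{v}-\mathbf{a}\|^2 = \|\mathbf{v}_\perp\|^2 + d_+^2,
\qquad
\|\mathbf{v}+\mathbf{a}\|^2 = \|\mathbf{v}_\perp\|^2 + d_-^2,
\]

where d_+ and d_- are the distances from the projected point to a and to minus a; the perpendicular term is common to both, so only the projection matters.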

1226
01:10:04,280 --> 01:10:07,940
Which also means that the
maximum likelihood regions

1227
01:10:07,940 --> 01:10:10,470
that you're going to develop,
or in fact the maximum a

1228
01:10:10,470 --> 01:10:15,480
posteriori probability regions
are simply planes.

1229
01:10:15,480 --> 01:10:18,520
Which are perpendicular
to the line between

1230
01:10:18,520 --> 01:10:21,330
minus a and plus a.

1231
01:10:21,330 --> 01:10:22,520
OK?

1232
01:10:22,520 --> 01:10:24,650
So if you're doing maximum
likelihood you just form a

1233
01:10:24,650 --> 01:10:27,330
plane halfway between
these two points.

1234
01:10:27,330 --> 01:10:27,560
Yeah?

1235
01:10:27,560 --> 01:10:42,100
AUDIENCE: [UNINTELLIGIBLE]

1236
01:10:42,100 --> 01:10:44,990
PROFESSOR: We got the error
probability just by first

1237
01:10:44,990 --> 01:10:48,820
doing the projection and then
turning it into this scale or

1238
01:10:48,820 --> 01:10:50,500
problem again.

1239
01:10:50,500 --> 01:10:51,830
So in fact the error
probability--

1240
01:10:51,830 --> 01:10:52,770
What?

1241
01:10:52,770 --> 01:10:55,870
AUDIENCE: [UNINTELLIGIBLE]

1242
01:10:55,870 --> 01:10:58,110
PROFESSOR: The probability of
error is just the probability

1243
01:10:58,110 --> 01:10:59,940
of error in the projection.

1244
01:10:59,940 --> 01:11:02,860
Did I write it down someplace?

1245
01:11:02,860 --> 01:11:04,770
Oh yeah, I did write it down.

1246
01:11:08,100 --> 01:11:14,000
But I wrote it down, well
I sort of cheated.

1247
01:11:14,000 --> 01:11:16,980
It's in the notes.

1248
01:11:16,980 --> 01:11:21,330
I mean the likelihood ratio is
just an inner product here

1249
01:11:21,330 --> 01:11:23,410
which is a number.

1250
01:11:23,410 --> 01:11:26,290
And when you find the error
probability, you just use the

1251
01:11:26,290 --> 01:11:30,030
same q formula that
we used before.

1252
01:11:30,030 --> 01:11:32,360
And in place of a
you substitute--

1253
01:11:36,700 --> 01:11:42,800
in place of a you substitute
the norm of the vector a.

1254
01:11:42,800 --> 01:11:45,180
Which is the corresponding
quantity.

1255
01:11:45,180 --> 01:12:01,080
So it's q of the norm of a
divided by the square root of n0 over 2.

1256
01:12:01,080 --> 01:12:02,510
OK?

1257
01:12:02,510 --> 01:12:07,780
So that's the maximum likelihood
error probability.

1258
01:12:07,780 --> 01:12:08,010
OK?

1259
01:12:08,010 --> 01:12:11,870
In other words, nothing
new has happened here.

1260
01:12:11,870 --> 01:12:15,900
You just go through the match
filter and then you do this

1261
01:12:15,900 --> 01:12:19,470
same one dimensional problem
that we've already

1262
01:12:19,470 --> 01:12:21,620
figured out how to do.

1263
01:12:21,620 --> 01:12:26,090
I think I'm going to stop there
and we'll do the complex

1264
01:12:26,090 --> 01:12:33,410
case which really corresponds to
what happens after baseband

1265
01:12:33,410 --> 01:12:35,160
to passband and then passband
to baseband.