That's what our model can compute for us. Unfortunately, that's not really what we want. We're not really interested in the probability of some speech. We want the probability of a particular word sequence, given that we've been given this speech and turned it into a sequence of MFCCs: given that speech.

So that's what we're after: given the speech, tell me how likely it is that it was "one", how likely it is that it was "two", how likely it is that it was "three". Compute all of those and pick the biggest one. So really, what we want to compute is this thing here: the probability of a word, of it being a particular word, given the speech. The bar just means "given".

But the hidden Markov model does not compute that. We could think of other models that might directly compute it, some discriminative models, fancy things, but our HMM does not compute it. Rather, the HMM is a generative model: it generates speech, and it therefore computes the probability of that speech, not the probability of the word. So it computes this thing here, and we're going to have to get one of these things from the other.

We can do that fairly straightforwardly. There's a very simple law of probability. I'm going to state it, just tell you it's true, and we'll look intuitively at why it's reasonable. The joint probability of two things, where one of them is informative about the other, is equal to the probability of one of the things (it doesn't matter which one), multiplied by the conditional probability of the other thing, now that we know the first one's value. If these two things were independent, that would just reduce to P(O) times P(W), because P(O|W) would just be P(O): if they were completely independent, knowing W wouldn't change the probability of O. However, that's not the case here: they are covariant, they depend on each other, one is informative about the other. So we can write that equation down; I hope that's intuitively reasonable.

And O and W are just letters: we could write A, B, C, any letters we want. So we can just swap O and W, it's just notation, and by symmetry write this other thing down. Now, the probability of O and W is exactly equal to the probability of W and O; we've just swapped things around. The probability of one thing being this and the other being that is equal to the probability of those two things stated the other way round: it doesn't change the probability. Because of that, these two things are exactly equal to each other, and therefore this term is exactly equal to this term. Rearrange that, and we get this lovely equation down here.

This is so important that it's got a name: it's called Bayes' rule, and it magically tells us how to compute what we need to compute, in terms of the thing the HMM can compute plus some other terms. That one we know what to do with; we'll see in a minute exactly how to compute it, our HMM will do that. But where on earth are we going to get P(W) and P(O) on their own from?
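The slides being pointed at in the video aren't reproduced here, but written out in the W and O notation the lecture uses, the manipulation being described is roughly the following:

\begin{align}
  P(O, W) &= P(O)\,P(W \mid O)              && \text{product rule, one way round} \\
  P(W, O) &= P(W)\,P(O \mid W)              && \text{product rule, the other way round} \\
  P(O)\,P(W \mid O) &= P(W)\,P(O \mid W)    && \text{since } P(O, W) = P(W, O) \\
  P(W \mid O) &= \frac{P(O \mid W)\,P(W)}{P(O)}  && \text{Bayes' rule}
\end{align}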
We'll deal with those things. So there's the full thing spelled out. Let's give these terms some special names. This thing here is a conditional probability. So the HMM actually computes a conditional probability: the probability of an observation sequence, in other words a sequence of MFCC vectors; that's O. It computes the probability of that, given that you give me a model. The model's got a label, and the label is this W. So it's this conditional probability that the HMM gives us.
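As a rough sketch of how that conditional probability is then used to "compute all of those and pick the biggest one", the decision rule looks something like the Python below; the word set, likelihoods and priors are made up purely for illustration, not taken from the video.

# Each word has its own HMM, which supplies P(O | W) for the utterance O.
likelihood = {"one": 1.2e-45, "two": 7.9e-47, "three": 3.4e-46}   # P(O | W) from each HMM (made up)
prior = {"one": 0.4, "two": 0.3, "three": 0.3}                    # P(W), e.g. from a language model (made up)

# P(O) is the same whichever word we hypothesise, so it rescales every
# candidate equally and can be dropped when we only want the argmax.
score = {w: likelihood[w] * prior[w] for w in likelihood}
recognised = max(score, key=score.get)
print(recognised)   # "one", with these made-up numbers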
Bayes’ rule
Now that we have correctly stated that the HMM computes P(O|W), we realise that actually we need to compute P(W|O). Bayes' rule comes to the rescue.