Token passing

This is a really nice way to understand, and to implement, dynamic programming in HMMs.

Okay, so let's work that through now with an algorithm called Token Passing. We're going to introduce the idea of tokens: tokens are just partial paths.
So path A could be a token.
Okay, so hopefully you've played some board games that have tokens, maybe checkers (draughts), something like that. Imagine little tokens.
That's a conceptual thing, but it might even be an object in the code called a Token.
A token is a partial path.
It makes a little journey.
So at the point where paths A and B meet, imagine two tokens bumping into each other.
They're just going to have a fight, and the one with the best probability is going to win; the other one is just going to be deleted.
So tokens are partial paths.
And what do they need to know? They just need to know the log probability, so they can be compared with other tokens: a little number stored on them, and that number is updated every time they make a transition and every time they emit an observation.
How is it updated? We just multiply by the probability of the transition and multiply by the probability of the emission; because we're doing it in logs, we actually just add those log probabilities.
And if we want to be able to look at a token that pops out of the end and ask "which model did you come through?", then we might also record the sequence of states or models that the token went through.
It might actually record the path, although for a single, isolated model we don't even need that.
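As a sketch of what that object might look like (my own minimal Python, not anything from the course code), a token only needs its log probability and, optionally, a record of where it has been:

```python
from dataclasses import dataclass, field

@dataclass
class Token:
    # Log probability of the partial path this token represents.
    log_prob: float = 0.0               # log(1) = 0: starting probability one
    # Sequence of states (or models) passed through; only needed if we want
    # to ask the winning token where it came from.
    history: list = field(default_factory=list)

    def transition(self, log_a):
        """Cross an arc: add the log transition probability."""
        self.log_prob += log_a

    def emit(self, log_b):
        """Emit an observation: add the log emission probability."""
        self.log_prob += log_b
```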
So here's token passing.
We need a model, so let's just draw a little model.
So we're ready to go with token passing. Here's a hidden Markov model.
It's a bit different to the ones in the lab.
We can draw pictures of it; that's the best way of picturing it.
And there are these transition probabilities. Hopefully you've already had a look inside the model files that you're training in the lab: they have those transitions stored as a matrix, and this model has this transition matrix.
It's got a couple of funny ones here.
We wouldn't normally use these; that's just to show that we've got extra arcs.
That one is this number here, and that one is this number here.
So that's our model.
It's got probabilities on all the transitions, and in the states there are multivariate Gaussian distributions, multivariate because of the dimension of the observation vectors.
Don't forget that they're represented by a mean and a standard deviation.
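As a concrete sketch with toy numbers of my own (not the model on the slide), we could store the transition probabilities as a matrix and the Gaussians as a mean and standard deviation per dimension. Following the HTK-style convention used in the lab, the matrix also has rows and columns for the non-emitting dummy states that are about to be introduced:

```python
import math

# Toy numbers, not the slide's model. State 0 = dummy entry, states 1..3
# are emitting, state 4 = dummy exit, so a 3-emitting-state model has a
# 5x5 transition matrix a[i][j] = P(state j next | state i now).
a = [
    [0.0, 1.0, 0.0, 0.0, 0.0],   # entry -> first emitting state
    [0.0, 0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.5],   # last emitting state -> exit
    [0.0, 0.0, 0.0, 0.0, 0.0],   # exit: no outgoing transitions
]

# One mean and one standard deviation per dimension of the observation
# vector, for each emitting state (diagonal covariance).
means = {1: [0.0, 1.0], 2: [2.0, -1.0], 3: [4.0, 0.5]}
stds  = {1: [1.0, 1.0], 2: [0.8,  1.2], 3: [1.0, 0.7]}

def log_emission(state, obs):
    # Diagonal-covariance Gaussian: the log density is a sum over dimensions.
    return sum(-0.5 * math.log(2 * math.pi * s * s) - (x - m) ** 2 / (2 * s * s)
               for x, m, s in zip(obs, means[state], stds[state]))
```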
There's our complete model, and here's what we're going to do.
We're going to put a dummy start state on the beginning, and a dummy end state on the end.
We'll see why that's very useful later, when we try to connect models together to make models of sequences of words, or models of sequences of phones. And then we're going to run the algorithm.
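To illustrate why the dummies help, here's a hypothetical sketch (my own state numbering, arcs abridged): two word models glued into a sequence. Each model keeps its own dummy entry and exit, and the glue is just an arc from one exit to the next entry; since dummy states emit nothing, no observation is consumed while crossing it.

```python
# Two word models and the glue between them; emitting arcs abridged.
word_one = {0: [(1, 1.0)], 3: [(4, 0.5)]}   # entry 0 ... exit 4
word_two = {5: [(6, 1.0)], 8: [(9, 0.5)]}   # same shape, states renumbered
glue     = {4: [(5, 1.0)]}                  # "one" exit -> "two" entry
```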
So here the animation starts. Put a token at the beginning, in this dummy start state, and the token says: "My path is empty.
I haven't done anything yet.
And my probability is one. I'm the only token there ever was; the only thing possible at this point is me, and I've got a probability of one."
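In code, that starting condition is just one token in the dummy start state, with log probability zero (that is, probability one) and an empty path; a sketch reusing the hypothetical Token class from above:

```python
# One token in the dummy start state (state 0): probability one, empty path.
tokens = {0: Token(log_prob=0.0, history=[])}   # log(1) = 0
```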
Okay, the algorithm is extremely simple.
Every time frame, we just turn this handle, blindly.
The first thing the handle does: all the tokens make one step forwards, from wherever they are.
They have to go along an arc.
There's only one arc for this token to go along, so it'll go here and end up here.
Now, the time counter.
Let's say there's a time minus one, before anything has really happened, and this is time zero.
Being a computer scientist, right, I'm going to start from zero, not one.
This is the first real time step: we emit an observation.
That's observation o at time t = 0.
This arc here, since it's the only arc leaving, let's say it has a probability of one on it.
So we multiply our token's probability by one, it's still one, and we get here.
This state has got a Gaussian in it.
It's a multivariate Gaussian, because of the dimension of the observation vector.
We're given this observation, this thing we're trying to recognise.
So we just look it up and read off the probability (remember how to do that), and multiply the token's probability by the probability that the Gaussian generated this observation vector. The token's probability is now less than one. Good. And then we turn the handle again.
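In the log domain, that whole turn of the handle, for one token crossing one arc, is just two additions. A sketch, reusing the hypothetical Token class and log_emission function from above:

```python
import math

def advance(token, transition_prob, state, obs):
    """One step for one token: cross an arc into an emitting state, then
    emit the current observation vector there. The two multiplies become
    two additions because we work in log probabilities."""
    return Token(
        log_prob=token.log_prob
                 + math.log(transition_prob)    # probability of the arc
                 + log_emission(state, obs),    # probability of the emission
        history=token.history + [state],
    )
```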
This token has to leave the state, so it can either go this way or this way, and what we're going to do is explore all paths in parallel.
We're going to clone the token: make two copies of it and send one along one arc and one along the other.
Okay, the two possibilities are two parallel ways of doing things. Like here, as we leave a point in the grid, we could go up, diagonally, or across: the token just copies itself.
One copy goes each of the three possible ways.
Here, a copy of the token goes each of the possible ways.
Here there are two.
And off we go along those arcs.
So one copy of the token goes this way: it's going to end up back in this state.
Another copy of the token goes this way, into the next state. And now the time counter has ticked on: the time counter is at one, one turn of the handle, and we now generate observation vector o at t = 1.
Now, we generate the same observation vector from this state and from this state, at the same time.
But in general, the Gaussians in those two states will be different, and the probabilities on these arcs will be different, so these tokens now have a different probability from each other.
This one represents the probability that the model generated the first two observation vectors both from the first emitting state.
This token represents the probability that the first vector was emitted by the first emitting state and the second vector by the second emitting state: two different ways this model could generate the same observation sequence.
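Here's a sketch of that cloning step for a whole turn of the handle, using the toy pieces above: every token sends one copy along every outgoing arc, and only arrivals in emitting states pay an emission probability (the dummies emit nothing). The arcs table is read straight off the toy transition matrix:

```python
import math

# Outgoing arcs per state, read off the toy transition matrix `a` above.
arcs = {i: [(j, p) for j, p in enumerate(row) if p > 0.0]
        for i, row in enumerate(a)}

def propagate(tokens, arcs, obs):
    """Clone every token along every outgoing arc, in parallel.
    `tokens` maps state -> Token; returns a list of (state, Token) arrivals."""
    arrivals = []
    for state, token in tokens.items():
        for next_state, prob in arcs.get(state, []):
            clone = Token(token.log_prob + math.log(prob),
                          token.history + [next_state])
            if next_state in means:            # dummy states emit nothing
                clone.log_prob += log_emission(next_state, obs)
            arrivals.append((next_state, clone))
    return arrivals
```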
What we do is just turn the handle, over and over again.
So now we've got tokens in these two states, and they'll generate the next observation.
(Unfortunately, I used a different counting system there: that's time two.)
And we just generate our whole observation sequence.
We just keep doing that.
And every time two tokens meet (tokens are partial paths), so two tokens, one token comes in along here and another token comes in along here, we compare them.
Who's the winner? We eliminate everyone except the winner.
There's no point sending the losing tokens on through the model: it's just a waste of computation, because whatever one token can do, the other could do just as well.
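The fight itself is just a comparison: among all tokens arriving in the same state, keep the best and delete the rest. A sketch, and note it handles any number of arrivals, not just two:

```python
def recombine(arrivals):
    """Keep only the most probable token arriving in each state."""
    best = {}
    for state, token in arrivals:
        if state not in best or token.log_prob > best[state].log_prob:
            best[state] = token
    return best
```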
The observation sequence is given; we've got it in advance. It's a given: this big O.
And time is given too. This handle is the clock.
It's a little digital clock.
Every time we turn it around, we go forward in time by one frame, one time step.
So all the tokens first try to generate this one; we turn the handle, and then they all try to generate this one.
Eventually, we've generated the whole observation sequence.
So the clock on the algorithm, the handle going round, is this discrete time clock ticking through the frames.
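So the outer loop of the algorithm really is just that clock. A sketch, where `observations` stands for the given sequence of feature vectors, and `propagate` and `recombine` are the hypothetical helpers above:

```python
# The handle is the clock: one turn per frame of the given observation sequence.
tokens = {0: Token(0.0, [])}               # probability one, empty path
for t, obs in enumerate(observations):     # t ticks through the frames
    tokens = recombine(propagate(tokens, arcs, obs))
```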
But what the tokens are saying is that there are many different ways this model could have got up to frame four, frame five, frame six, and the tokens that are present at any one time are all of those possibilities. Okay, so here's that point again in terms of token passing: tokens bump into each other.
It looks exactly the same as before: paths A and B.
A and B bump into each other, and we only need to keep the most probable one.
Eliminate whichever is not the most probable.
There might be more than two, if there's more than two ways of arriving in this state. And we just keep on doing that, and eventually, out of the end of the model, a token pops out.
Now, as we turn the handle on this model, and the model's got three emitting states, we'll start getting tokens popping out after frame three. But there are still more observations to generate.
So any token that pops out of the end prematurely has failed to generate the whole observation sequence: it just evaporates; it doesn't count.
Keep turning the handle.
Tokens keep going through and popping out of the end.
Just as we've generated exactly all of the observation sequence, a token pops out of the end, and that's the winner.
It's successfully gone through the whole model and generated all of the observations. Tokens that are still back in state three, having generated all of the observation sequence, they lose.
That's like popping out of the sides of these grids: popping out of this edge, you lose, or popping out of the other edge, you lose.
So either you spent too long early on in the model and never made it to the end, or you got to the end too soon.
Neither of those things is valid: you have to arrive in the top right-hand corner.
You have to get through the model and generate exactly all of the observations.
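Putting all the hypothetical pieces above together, here is one complete run on a toy observation sequence. The winner is whichever token can take the exit arc exactly when the frames run out: tokens that reached the exit dummy earlier have no outgoing arcs, so they evaporate on the next turn, and tokens still stuck inside the model at the end are never counted.

```python
import math

frames = [[0.3, 0.9], [1.8, -0.7], [2.2, -1.1], [3.9, 0.4]]   # toy 2-D frames

tokens = {0: Token(0.0, [])}                  # start: probability one
for obs in frames:                            # turn the handle once per frame
    tokens = recombine(propagate(tokens, arcs, obs))
    # premature arrivals in the exit dummy evaporate next turn: no outgoing arcs

# One final, silent move into the exit dummy. Only tokens that can take the
# exit arc *now* have generated exactly the whole observation sequence.
EXIT = 4
winner = None
for state, token in tokens.items():
    for next_state, prob in arcs.get(state, []):
        if next_state == EXIT:
            lp = token.log_prob + math.log(prob)
            if winner is None or lp > winner.log_prob:
                winner = Token(lp, token.history + [EXIT])

if winner is not None:
    print(winner.log_prob, winner.history)    # best score and state sequence
```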
