Summary – pruning

By removing unlikely partial paths (tokens), we can make recognition much faster.

One final thing to say, something you have hopefully already discovered from doing the practical: in a large model (and your models were large), most of the tokens, most of the time, will be very unlikely. They will be off in parts of the language model, among very different words, with nothing to do with the acoustics. They will go around and around, and we will keep crunching the numbers for them, and they will keep bumping into more likely tokens and being deleted.

We could save a lot of computation by making an approximation: just throw away tokens that look like they have no chance of ever winning. If a token's probability falls too low, we simply discard it.

The most common way of doing that is called beam search. At every iteration, somewhere in the network there will be a current token that is the most probable. It might not be the eventual winner; it is just currently doing the best. Anything that is worse than that best token, by some margin called the beam, will be deleted. So that is the idea of a beam width below the best token. If we make that beam really tight, we can make things go really fast, but we risk throwing away a token that actually would have gone on to win; we just didn't know that at this point.
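
To make the idea concrete, here is a minimal sketch of beam pruning in Python. It is illustrative only: the Token class, the beam_prune function, and the choice of representing path scores as accumulated log probabilities are assumptions for this example, not taken from any particular toolkit.

from dataclasses import dataclass

@dataclass
class Token:
    log_prob: float      # accumulated log probability of this partial path
    history: tuple = ()  # words hypothesised so far (illustrative only)

def beam_prune(tokens, beam_width):
    """Delete any token whose log probability is more than
    `beam_width` below the current best token's log probability."""
    if not tokens:
        return tokens
    best = max(t.log_prob for t in tokens)
    threshold = best - beam_width
    return [t for t in tokens if t.log_prob >= threshold]

# Example: with a beam of 10 (in log units), the token at -25.0
# is pruned because it is more than 10 worse than the best (-12.0).
tokens = [Token(-12.0), Token(-18.5), Token(-25.0)]
survivors = beam_prune(tokens, beam_width=10.0)
print([t.log_prob for t in survivors])  # [-12.0, -18.5]

Because probabilities are multiplied along a path, decoders typically work in the log domain, so the beam is a fixed log-probability margin below the best token at each time step. Tightening beam_width prunes more aggressively: recognition gets faster, but the risk grows of a search error, where the token that would have gone on to win is deleted early.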
