A generative model of sequences

How to generate a sequence of observations from our model.

This video just has a plain transcript, not time-aligned to the video.
THIS IS AN UNCORRECTED AUTOMATIC TRANSCRIPT. IT MAY BE CORRECTED LATER IF TIME PERMITS.
Right.
Where are we going? We're going to get to a model that can generate sequences of Mel-frequency cepstral coefficients (MFCCs) and can therefore assign a probability to any unknown sequence of such MFCC vectors.
And then we'll compare the probabilities from each of our models for the different classes and see which one is highest.
We'll assign that label. So, where are we so far?
We've got a Gaussian that can generate a single MFCC vector.
That's okay.
We worked out how to calculate the probability: we just read it off the curve. That works for one dimension, two dimensions, any number of dimensions.
We thought initially about how we might generate a sequence of observations: we just repeatedly generate from the Gaussian. We spotted that, and we're just going to state that these have this special statistical property: they're independent and they have an identical distribution.
So what we need to do now is just formalise that a little and write it in very simple probability notation.
What is the probability, then, of a sequence of things happening? Can we compute the probability of several things happening based on the probabilities of the individual things? We're going to start with the simplest possible case.
And that's the case where the individual things are statistically independent from each other.
We're actually going to end up making this assumption for MFCC vectors, even though it's not true.
We'll fix it up a bit later.
This is so convenient.
This is such a nice way of thinking about data.
So let's start with some other events that are clearly independent of each other.
Okay, so every morning I get up particularly early and it's always dark. I open my sock drawer and I don't bother looking: I just stick my hand in and pull out a pair of socks.
Okay.
I only have five different colours of socks.
I'm very boring in my dress sense.
I have a drawer full of socks, and they're uniformly distributed across these five colours.
One of the colours is blue.
Okay, so would anyone like to tell me: what is the probability of me wearing blue socks tomorrow? It's dark.
I just reach in: five different colours.
Randomly sample.
20%, okay.
Or 1/5.
However you want to say it: one fifth, nought point two, 20%.
Express it how you like. Okay, that's fine, right? Does that depend on the weather?
It was dark, so let's think about the chance of it raining tomorrow.
Okay, let's just make a very simple probabilistic model of winter weather in Edinburgh.
It doesn't rain too much of the time.
Let's say it rains one day every three days.
So let's just say the probability of it raining tomorrow has nothing to do with today or the day after: a really naive model of the weather.
So what's the probability of it raining tomorrow? 33%.
One third, or 0.33. Okay, now, what's the probability of both things happening together: that it rains and I wear blue socks? Does one of them depend on the other? They're completely independent.
So would anyone like to propose what the probability of both things happening is, given that there's a one-third chance of it raining and a 1/5 chance that I'll be wearing blue socks? One over fifteen? Yeah. So you just multiply the two things together.
Okay, 1/15. Are people really comfortable with that prediction?
14 out of 15 times, something else will happen.
For example, it won't rain and I'll wear blue socks, or it will rain and I won't wear blue socks.
It will rain and I wear red socks.
If we add all of those things together, that will be 14 out of 15 times, but one out of 15 times both things will happen.
Okay, so think about our belief state, in our brains.
It's not about right now, because it's today, not tomorrow.
Our belief state is that 1/15 of us believes that these two things will happen tomorrow, and the rest of us believes all these other combinations.
So we're maintaining a distribution over all the things that could happen tomorrow.
We don't know, and we don't need to choose between them.
We're just going to maintain that all of them are possible.
The probability of this particular one is 1/15.
So that's the probability.
So it's very intuitive and reasonable.
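The socks-and-weather calculation can be checked with a minimal sketch like the one below. It assumes the probabilities stated in the lecture (1/5 for each sock colour, 1/3 for rain). The colour names other than blue and red are invented for illustration, as are the variable names.

```python
from fractions import Fraction
from itertools import product

# Assumed colour names: the lecture only names blue and red among the five.
sock_colours = ["blue", "red", "green", "black", "grey"]
p_sock = {colour: Fraction(1, 5) for colour in sock_colours}  # uniform over colours
p_rain = {"rain": Fraction(1, 3), "dry": Fraction(2, 3)}      # rains one day in three

# Independence: the joint probability of each pair is the product of the two
# individual probabilities.
joint = {(x, y): p_rain[x] * p_sock[y] for x, y in product(p_rain, p_sock)}

p_target = joint[("rain", "blue")]
p_everything_else = sum(p for outcome, p in joint.items()
                        if outcome != ("rain", "blue"))

print(p_target)             # 1/15
print(p_everything_else)    # 14/15
print(sum(joint.values()))  # 1
```

The dictionary `joint` is the "belief state" over all ten combinations: nothing is chosen, every outcome just keeps its probability.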
So let's just write that down.
Let's use some notation to get over the fear factor of the notation.
So let's give some notation to these things.
So we use big capital letters to denote events: random variables that have outcomes.
So X is the random variable "it will rain tomorrow".
And X can take the values of yes, it rains, or no, it doesn't rain.
And Y is going to be the random variable that's the colour of my socks.
And that can take the colours red, blue, and so on: five different values.
Some notation: we've already decided that the probability that Y equals blue equals one in five.
Happy with this notation? Not too scary? Apologies to anyone who's done this before. So we can write expressions like the probability of some random variable. A random variable is a variable that can take multiple values with some distribution.
You might want to make us to discuss model off writing things like this.
So big letters are random variables that take multiple values.
Small letters are particular instances: particular values they can take.
And so what we just did intuitively, you already understand. We already decided this.
It's a perfectly reasonable equation.
This notation here, the comma: turn this into the word "and".
Okay, translate it into a sentence.
This is the probability of X and Y.
And if X and Y are independent, in other words knowing the value of one doesn't tell us anything about the value of the other, then it is just the product of the probability of X and the probability of Y.
We can stick in the probability of X equals raining and Y equals blue, multiply them together, and we get our answer: 1/15.
Okay, we'll do something slightly trickier with probability later.
Not much trickier, but for now, let's go with this one.
This is a nice equation.
This says we can compute the probability of two things happening just by multiplying the independent probabilities of each of them happening.
That's great.
That's great.
It's simple maths, and it looks like a nice sort of thing that someone might want to do computation with.
Let's use it to compute the probability of a sequence.
Here's a Gaussian.
I'm drawing the Gaussians in one dimension.
Maybe that's one cepstral coefficient, but they're really multivariate Gaussians.
Remember the big, colourful, hill-shaped thing? They really have a dimension of 13 or 39.
We're going to draw them in one dimension because it's the only thing I can draw.
And that's the probability density function of this variable.
And so let's generate a sequence of things with some Gaussians.
And here's some speech.
It's a word.
We're trying to recognise the word. We've already said that speech changes: the spectrum changes as we go through time, but it changes relatively slowly. So a reasonable way of generating a sequence of observations that would correspond to a speech signal would be that the distribution of the MFCCs, in other words the spectral envelope, is roughly constant for a little while.
It's from the same statistical distribution.
So maybe up to this point, it's all coming from this one distribution.
If this is the beginning of a phoneme, then after that point it changes a bit, and it comes from a slightly different distribution.
After that point, it changes again and comes from this other distribution.
So a reasonable model for generating a sequence of things for something like speech, which has a slowly changing spectral envelope through time, would be to use a Gaussian to generate a few frames of speech.
These frames come just every 10 milliseconds.
So we generate a bit that corresponds to this frame, generate a bit, generate a bit, and then decide it's changed enough that the statistical properties have changed, at which point we switch to a second distribution, or a third distribution.
So within short regions of time, it seems reasonable to generate from a single Gaussian.
In other words, the distribution's constant.
These samples are independent and identically distributed.
It's like saying that the average spectrum is constant and varies about that by the same amount, just for this bit of time.
And then it changes, and it changes, and it changes.
So our questions are going to be: how many frames do we generate from a Gaussian before moving on to a different Gaussian, to a different distribution? And in what order do we go through these Gaussians? And that's what the Markov model is going to do for us.
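As a rough illustration of this generation scheme, the sketch below draws a one-dimensional sequence from a fixed series of Gaussians, holding each one for a fixed number of frames. The means, standard deviations and frame counts are invented. In the lecture, the durations and the ordering are exactly what the Markov model is left to decide, so fixing them by hand here is an assumption made only for the example.

```python
import random

def generate_sequence(segments, seed=0):
    """Generate 1-D observations from a list of (mean, std, n_frames) segments."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observations = []
    for mean, std, n_frames in segments:
        # Within a segment, every frame is an independent draw from the same
        # Gaussian: independent and identically distributed.
        observations.extend(rng.gauss(mean, std) for _ in range(n_frames))
    return observations

# Hypothetical parameters: three Gaussians used for 4, 6 and 3 frames, in order.
segments = [(-2.0, 0.5, 4), (0.0, 0.5, 6), (3.0, 0.5, 3)]
frames = generate_sequence(segments)
print(len(frames))  # 13 frames, i.e. 130 ms at one frame every 10 ms
```

Real MFCC frames would be 13- or 39-dimensional vectors; one dimension is used here for the same reason as in the lecture's drawing.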
We also then need to be able to compute the probability of this sequence.
So we've got an observation sequence: that's observation one, that's observation two, that's observation three.
The whole sequence we're going to call big O, and I want to get the probability of the whole observation sequence having been generated by these particular Gaussians.
I'm going to make the assumption that we made about my socks and the weather, this radically simplifying assumption that turns the equation into a really simple equation: that the probability of this thing coming from a Gaussian does not depend on any of this other stuff.
So we can just compute it independently at this moment in time, and the probability of this thing doesn't depend on any of this stuff.
We can compute that independently in time.
This is this thing.
Note that the decorrelation earlier was within the vector for this frame.
Okay, they're related assumptions. We decorrelated within the features because we didn't want to model covariance, because it has a lot of extra parameters. For similar reasons, in the abstract sense, we're also going to assume that this is not correlated with this, because we don't want to have to put parameters in our model, a model that we're going to train on the features, just to deal with that.
In fact, I think those parameters would radically change the model.
We wouldn't then get the hidden Markov model. We'll see when we get to the hidden Markov model.
It's so simple, so nice, that we're going to go to extreme lengths to be able to use it.
I do like it so much. So these things are conditionally independent, given the model generating them.
One thing doesn't depend on the other. From before, we can just write the probability of big O.
That's just the probability of o1 and o2 and o3 and so on (the comma is just turned into the word "and"), which is just the probabilities of them all independently multiplied together. It was like the socks and the weather.
So we're going to say that they're statistically independent.
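Under this independence assumption, the probability of the whole sequence is the product of the per-frame values read off each Gaussian curve. The sketch below shows that for one-dimensional observations, summing logs rather than multiplying so that long sequences do not underflow. The observation values, the two candidate models and the frame-to-Gaussian assignment are all invented for the example. Note that the values read off the curve are densities rather than probabilities.

```python
import math

def gaussian_log_pdf(x, mean, std):
    """Log of the 1-D Gaussian density at x (the value read off the curve)."""
    return (-0.5 * math.log(2 * math.pi * std ** 2)
            - (x - mean) ** 2 / (2 * std ** 2))

def sequence_log_prob(observations, gaussians):
    """Sum of per-frame log densities; gaussians[t] is the (mean, std) for frame t."""
    return sum(gaussian_log_pdf(o, mean, std)
               for o, (mean, std) in zip(observations, gaussians))

# Hypothetical data and models: model A switches Gaussian over time, model B does not.
observations = [-1.9, -2.2, 0.1, 0.3, 2.8]
model_a = [(-2.0, 0.5)] * 2 + [(0.0, 0.5)] * 2 + [(3.0, 0.5)]
model_b = [(1.0, 0.5)] * 5

log_p_a = sequence_log_prob(observations, model_a)
log_p_b = sequence_log_prob(observations, model_b)

# Classification as described at the start: pick the model with the highest value.
best = "A" if log_p_a > log_p_b else "B"
print(best)  # A
```

Here the assignment of frames to Gaussians is assumed to be known in advance, which sidesteps the question of how many frames each Gaussian generates.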
Is it true for speech? Certainly not, but we're getting familiar with this general theme.
Now, we find a model that is good for computation.
It's easy to learn from data.
It's efficient.
It has nice mathematical properties.
We'll massage our problem until it looks like the sort of thing we can do with this model.
We've already done it with the features.
We chose the Gaussian, and then we did quite a lot of unusual things to the features to make them look Gaussian.
If we weren't using a Gaussian, we wouldn't need to do that.
So we're going to make this massive simplifying assumption.
