Variance and covariance of the Gaussian

Modelling multivariate data requires more parameters, especially if the dimensions are correlated.

This video just has a plain transcript, not time-aligned to the video.
So this form of the equation - this equation here - generalises to any number of dimensions, and it's got two parameters.
It's got an average - a mean - and it's got a deviation around that average.
Let's see what happens to those parameters as we go to higher numbers of dimensions.
Let's draw a very, very simple one in two dimensions: the average is here - the mean - and this is two-dimensional.
So the mean is a vector with two numbers: a mean along one dimension and a mean along the other dimension.
This distribution here is spread evenly: it's circular, and in higher dimensions we'd say it was spherical.
In other words, the variance in all directions is the same.
So that means we could just store one number for that: sigma.
We could store just one number - the standard deviation - for every direction, because the deviation is the same.
So there are three parameters to this evenly-spread, circular Gaussian, and three is a nice small number.
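As a rough sketch of that idea (not part of the video; the numbers are made up purely for illustration), a spherical 2-D Gaussian in numpy/scipy needs only three numbers:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Spherical 2-D Gaussian: the spread is the same in every direction,
# so 3 numbers are enough - two for the mean, one shared standard deviation.
mean = np.array([1.0, 2.0])   # mean along each dimension (illustrative values)
sigma = 0.5                   # one standard deviation, used in every direction

# The covariance matrix is just sigma^2 times the identity.
cov = (sigma ** 2) * np.eye(2)

samples = multivariate_normal(mean=mean, cov=cov).rvs(size=1000)
print(samples.shape)          # (1000, 2): a circular cloud around the mean
```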
What about this distribution here?
The underlying data is now no longer spread evenly along all the different dimensions: they differ.
The variance in this dimension is smaller than the variance in this dimension.
So what do we need to store now?
We still need to store the mean, and the mean is a vector of two numbers.
We now also need to store the variance in both directions.
So sigma is also going to be a vector of two numbers: the variance one way and the variance the other way.
So this distribution is a little more complicated than the previous one.
We now need to store four numbers to represent it, not three numbers.
It's got more complicated.
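Again as an illustrative sketch (not from the video), the diagonal-variance case stores one variance per dimension - four numbers in 2-D:

```python
import numpy as np
from scipy.stats import multivariate_normal

# 2-D Gaussian with a different variance along each dimension, but no
# correlation between them: 2 mean values + 2 variances = 4 parameters.
mean = np.array([1.0, 2.0])        # 2 numbers for the mean
variances = np.array([0.2, 1.5])   # 2 numbers, one variance per axis (made up)

# Diagonal covariance matrix: the off-diagonal (covariance) terms are zero.
cov = np.diag(variances)

samples = multivariate_normal(mean=mean, cov=cov).rvs(size=1000)
print(np.var(samples, axis=0))     # roughly [0.2, 1.5]
```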
How about data that's distributed like this?
There's now some correlation between the two values: data points that have a high value in this dimension, on average, tend to also have a high value in this other dimension.
They're correlated - not independent any more, in the statistical sense.
So now we still need to store the mean, and the mean is still a vector of two dimensions.
We also need to store the variance in these two directions, which might be different, so the variance looks like it might be two-dimensional.
But we also need to store some additional things, and those are these covariances: how much does one dimension co-vary with the other?
In other words, how skewed is this distribution against the axes?
So the variance is actually going to become a matrix of four numbers, although it's a symmetrical matrix, so it only has three independent values.
So we've got even more numbers to store.
This case is a particularly nasty case because of this covariance here.
Let's change this symbol, because in general when the variance becomes a matrix we change the letter: it's a big Sigma now, to imply that it's a big matrix.
This has got a lot more parameters than the previous one.
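Here is a hedged sketch of the full-covariance case (again not from the video; the values are invented): the big Sigma is a symmetric matrix, so in 2-D it carries three independent values on top of the two means:

```python
import numpy as np

# Full-covariance 2-D Gaussian: big Sigma is symmetric, so it holds
# 3 independent values here - two variances plus one covariance.
mean = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.8],      # var(x),    cov(x, y)
                  [0.8, 1.5]])     # cov(y, x), var(y)

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean, Sigma, size=5000)

# The sample covariance recovers the tilted, skewed shape of the cloud.
print(np.cov(samples, rowvar=False))
```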
And imagine what happens when we go above two dimensions.
We might go to three dimensions, four dimensions, ... 39 dimensions.
This matrix is going to get bigger and bigger.
There's the three-dimensional version: it's got nine things in it.
So the number of parameters in the variance goes up like the square of the dimension.
You can imagine a model that's operating in a high-dimensional space - maybe a magnitude spectrum with 1000 points in it.
This matrix is then 1000 by 1000.
That's a very large number - of the order of a million parameters - in the matrix.
That's a very bad thing, because to estimate the parameters we need data points, and - let's just keep this intuitive - the more parameters there are, the more data points we're going to need to estimate those parameters reliably.
So distributions that look like this, where there is correlation between the different dimensions, are really bad news, because our probability distribution will have a lot more parameters to estimate.
That means we need a lot more data to estimate the parameters, and the number of parameters goes up like the square of the dimension.
That's a very bad thing.
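To make the parameter count concrete, here is a small sketch (mine, not the video's) counting the mean plus the independent covariance entries for a few dimensionalities:

```python
# Parameters of a full-covariance Gaussian in D dimensions:
# D for the mean, plus D*(D+1)/2 independent values in the symmetric
# covariance matrix - so the count grows like the square of the dimension.
def gaussian_param_count(d):
    return d + d * (d + 1) // 2

for d in [2, 3, 39, 1000]:
    print(d, gaussian_param_count(d))
# 2 -> 5, 3 -> 9, 39 -> 819, 1000 -> 501500
# (the 1000 x 1000 matrix itself has a million entries, about half of
#  which are independent)
```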
So we're going to try and avoid this situation at all costs.
In statistics, we call it correlation when one value tends to vary with another value: this is positive correlation.
If everything were twisted the other way, there might be negative correlation.
We're going to use the probability term, which is essentially the same thing: that's covariance - they co-vary.
When one varies, the other tends to vary with it in sympathy: they go up and down together, moving in the same direction.
They're highly correlated.
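As a final illustrative sketch (not from the video), the covariance and correlation of two variables that vary "in sympathy" can be checked directly with numpy:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = 0.9 * x + 0.3 * rng.normal(size=2000)    # y tends to move with x

print(np.cov(x, y))        # positive off-diagonal term: positive covariance
print(np.corrcoef(x, y))   # correlation close to +1

y_neg = -0.9 * x + 0.3 * rng.normal(size=2000)
print(np.corrcoef(x, y_neg))   # close to -1: "twisted the other way"
```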
