The Gaussian as a model of data

Instead of storing data, we will distill it into a model: a probability density function.

This video just has a plain transcript, not time-aligned to the video.
THIS IS AN UNCORRECTED AUTOMATIC TRANSCRIPT. IT MAY BE CORRECTED LATER IF TIME PERMITS.
We're now going to switch to thinking not of samples of things.
So here we have samples of things: the template is an actual recording of a "one".
The unknown is an actual recording of something that may or may not be a one.
We're measuring distance between samples.
There's no model.
We actually stored data.
When we come to do pattern recognition, we have that stored data ready to go.
We're going to get rid of the data and instead fit a statistical model to it and just keep a statistical model.
We're going to distil everything we need to know about the data into a model and then throw the data away.
So it'll just be training data for estimating the model.
So how might we store a model? What might a model of something look like? Well, so far, it's very naive.
It's just the actual examples. That's not very efficient.
It's going to be a very large thing to store.
It's very hard to do computation with, so we might then start distilling that information down into just the things we need to know to make classification decisions.
So instead of storing actual points, we might, for example, bin them into values and store their histogram, or we might fit a parametric function to their distribution.
And that might be something like a Gaussian.
This is, in fact, what we're going to do now.
It could fit the distribution, so we're going to use this Gaussian distribution, and the Gaussian distribution is the mother of all probability distributions.
It has the nicest mathematical properties.
It's the most simple to do computation with.
It's got a very small number of parameters.
There's only two numbers we need to represent the Gaussian, and that's the average.
That's this middle point here.
We call that mu.
That's the Greek letter mu.
It's just "m" because "mean" starts with "m".
Nothing fancier than that.
And the spread around the mean: that's going to be the standard deviation.
We're going to call that sigma. Sigma starts with "s": "s" for standard deviation.
There's nothing clever about Greek letters. We're just trying to look fancy by using them.
It's the standard notation. We're going to store the mean and the standard deviation.
We might equivalently, instead of storing sigma, store sigma squared, and that's called the variance.
Variance, standard deviation: one is just the square of the other.
There's just two numbers that describe that curve.
That looks good because we only have to store two numbers instead of a whole cloud of data points.
So how might we fit the probability density function to some data, and what is it going to help us with? So that's the equation for the Gaussian.
Don't worry if you don't understand that, but I do encourage you to go away and think about how this equation works.
Put some values in it and see what numbers you get out, for example.
So this is the parameter of interest.
That might be the energy in one of the bins of the FFT.
It might be the fourth element of our feature vector.
It's going to be a number. It's a variable called x.
The mean is the average, mu.
When x equals mu, this term goes to zero.
When this term goes to zero, this whole thing has a maximum value, and that's where we are here.
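As a rough illustration of distilling a cloud of data points into two numbers, here is a minimal Python sketch. The transcript does not say how the parameters are estimated, so this assumes the usual maximum-likelihood estimates (the sample mean, and the mean squared deviation from it). The function name fit_gaussian and the data values are made up for the example.

```python
import math


def fit_gaussian(samples):
    """Estimate (mu, variance) of a 1-D Gaussian from a list of numbers.

    Assumption: maximum-likelihood estimates, so the variance divides by n
    (not n - 1). The lecture does not specify which estimator it uses.
    """
    n = len(samples)
    mu = sum(samples) / n
    variance = sum((x - mu) ** 2 for x in samples) / n
    return mu, variance


# Hypothetical training data: one feature measured on five recordings.
data = [4.0, 5.0, 6.0, 5.0, 5.0]

mu, variance = fit_gaussian(data)
sigma = math.sqrt(variance)  # standard deviation is the square root of the variance

print("mean:", mu)
print("variance:", variance)
print("standard deviation:", sigma)
```

Once mu and the variance (or sigma) have been computed, the list of data points is no longer needed: the two numbers are the model.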
So for data points that are near the average of the distribution, the probability is the highest.
That's intuitively reasonable.
So things near the centre of the distribution have a high probability, and the further away we go from the mean, in this direction or this direction, the lower the probability goes.
Now technically, this is called a probability density.
It's not actually a probability, but for the purposes of this course, we can gloss over this subtle difference.
For now, just think of it as probability.
Or: how likely is it that the value x belongs to this distribution? We can see that sigma occurs in the sigma squared form here, which is often why we store sigma squared, calling it the variance. Equivalently, we can store sigma, and call it the standard deviation.
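To try putting some values into the equation, here is a small Python sketch. The equation itself is not reproduced in the transcript, so this assumes the standard univariate Gaussian density, p(x) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2). The function name gaussian_pdf and the parameter values are illustrative only.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Probability density of a 1-D Gaussian with mean mu and std dev sigma."""
    variance = sigma ** 2
    # This term is zero when x equals mu, which is where the density peaks.
    exponent = -((x - mu) ** 2) / (2.0 * variance)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * variance)


mu, sigma = 0.0, 1.0  # illustrative parameter values

peak = gaussian_pdf(mu, mu, sigma)         # at the mean
left = gaussian_pdf(mu - 1.0, mu, sigma)   # one standard deviation below
right = gaussian_pdf(mu + 1.0, mu, sigma)  # one standard deviation above
far = gaussian_pdf(mu + 3.0, mu, sigma)    # three standard deviations away

# A narrow Gaussian: the density at the mean is larger than 1,
# which is one way to see that a density is not a probability.
narrow_peak = gaussian_pdf(0.0, 0.0, 0.1)

print(peak, left, right, far, narrow_peak)
```

The value is highest at the mean, equal at the same distance on either side, and falls off further away. The narrow case gives a value above 1, which could not happen for a true probability.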
