Using Gaussians to perform classification

We can construct a classifier from two Gaussian probability density functions.


This video just has a plain transcript, not time-aligned to the video.
THIS IS AN UNCORRECTED AUTOMATIC TRANSCRIPT. IT MAY BE CORRECTED LATER IF TIME PERMITS.
So here's a Gaussian, okay, just like that one, but in two dimensions, and this is the probability over these two dimensions.
Probability density.
Okay, now it's not convenient to draw these fancy 3D diagrams, as colourful as they look.
So we're just going to do a bird's eye view: we're going to look down on it like it was a map, and hopefully you can all read maps, to some extent.
We're going to just draw the contour lines.
Okay.
Remember, one way of representing 3D things like a landscape is from above, with contour lines of equal height.
I'm going to draw them with contour lines, and I'm just going to draw one contour line.
So here we are now, looking down on top of these two-dimensional Gaussians.
And this distance here is the standard deviation.
So that's the standard deviation of green things.
And that's the standard deviation of blue things.
Now what we can do, we're going to classify this unknown point.
We're going to do it this way.
Here's the thing we're trying to classify.
It's about here.
For the point we're trying to classify, we decide: is it more likely that it's green or blue?
We're gonna work out the probability of it being green, so we know it's got a feature vector.
There's its two-dimensional feature vector.
We just plug that feature vector into the formula for the Gaussian.
So here's the formula for the Gaussian, and that just generalises to any number of dimensions, and we get a probability value.
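The formula shown at this point is not reproduced in the transcript. Assuming it is the standard multivariate Gaussian density, it can be written as below, where the symbols are assumptions chosen for illustration: x is the D-dimensional feature vector, μ is the class mean vector, and Σ is the class covariance matrix.

```latex
p(\mathbf{x}) = \frac{1}{(2\pi)^{D/2}\,|\boldsymbol{\Sigma}|^{1/2}}
\exp\!\left( -\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu})^{\top}
\boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right)
```

With D = 1 this reduces to the familiar one-dimensional Gaussian with mean μ and variance σ².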
That's the probability of it being green. Then we get the probability of it being blue, and we just compare those two numbers.
And in this particular case, the probability of being green is going to be higher than the probability of being blue.
So we're going to classify that as a green thing.
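The procedure just described (plug the feature vector into each class's Gaussian, then compare the two values) might be sketched as follows. The means, covariances and test point are hypothetical values invented for illustration, not the ones drawn in the video, and the function names are likewise assumptions.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate Gaussian probability density evaluated at point x."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mean.shape[0]
    diff = x - mean
    # Normalising constant: sqrt((2*pi)^d * |cov|)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    # Quadratic form diff^T cov^{-1} diff, computed without an explicit inverse
    exponent = -0.5 * diff @ np.linalg.solve(cov, diff)
    return np.exp(exponent) / norm

# Hypothetical class models (assumed values): green is wide, blue is narrow.
CLASSES = {
    "green": {"mean": [0.0, 0.0], "cov": [[4.0, 0.0], [0.0, 4.0]]},
    "blue": {"mean": [5.0, 5.0], "cov": [[1.0, 0.0], [0.0, 1.0]]},
}

def classify(x):
    """Return the label whose Gaussian gives x the highest density, plus all densities."""
    densities = {
        label: gaussian_pdf(x, model["mean"], model["cov"])
        for label, model in CLASSES.items()
    }
    return max(densities, key=densities.get), densities

label, densities = classify([1.0, 1.5])
print(label, densities)
```

Nothing here computes a class boundary: the decision comes purely from comparing the two density values at the point in question.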
Let's do a simpler version of it in one dimension, just to make that really clear.
So let's just imagine we're doing things in this dimension.
We can draw along the top, so the Gaussian of the blue things looks like this, it's quite narrow, and then the Gaussian of the green things.
Let's do things in the correct colours.
Let's just take this x dimension and do it back in one dimension.
Let's do the density function of blue things.
This is quite a peaky distribution, like that.
The distribution of green things has its mean here, and it looks something like this: much wider.
Now we want to classify some unknown value of X.
So along comes a value X.
Is it more likely to be green or blue? So maybe the value is here.
That's x1, and we just go and see which of the curves is higher, and the highest curve is blue.
So it's a blue thing.
Maybe there's an x here, x2. Is x2 green or blue?
The highest curve is green, so it's a green thing.
We can see that where these two cross, there's just an implicit boundary between the two classes.
So these two probability distributions form a very simple form of classifier, and the classifier just has a boundary, and effectively all this classifier says is:
if x is less than this value, it's blue;
if it's more than this value, it's green.
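A minimal one-dimensional sketch of this idea is given below. The means and standard deviations are hypothetical (a narrow blue Gaussian and a wider green one with a larger mean), and the helper names are assumptions. The classifier itself only compares two density values; the bisection search is included purely to show where the implied boundary sits.

```python
import math

def gaussian_pdf_1d(x, mean, std):
    """One-dimensional Gaussian probability density at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical (mean, standard deviation) pairs, assumed for illustration.
BLUE = (2.0, 0.5)   # narrow, peaky distribution
GREEN = (5.0, 1.5)  # much wider distribution, mean further to the right

def classify_1d(x):
    """Pick whichever curve is higher at x."""
    p_blue = gaussian_pdf_1d(x, *BLUE)
    p_green = gaussian_pdf_1d(x, *GREEN)
    return "blue" if p_blue > p_green else "green"

def find_boundary(lo, hi, tol=1e-9):
    """Bisection search for the crossing point, assuming lo is classified
    blue and hi is classified green."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if classify_1d(mid) == "blue":
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Search between the two means, where the curves cross.
boundary = find_boundary(BLUE[0], GREEN[0])
print(classify_1d(2.2), classify_1d(4.0), boundary)
```

The search is restricted to the interval between the two means, which is the boundary pictured here. Because the two widths differ, the wider curve can overtake the narrow one again far out in the tail, which is one more reason to compare densities directly rather than hard-code a threshold.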
That boundary is implied by these two probability distributions, and the same is true in two dimensions.
So there's some implied class boundary here, between these two things.
It's not going to be exactly straight; it's something like this.
And this boundary is implied by these two distributions.
We never draw this boundary.
We never need to write it down. All we need to do is, for any point in space,
compute the probability of it being green and the probability of it being blue,
compare the two numbers, and whichever's highest is what we classify this point as.