Now we're going to do what we said we would do at the beginning. We're going to engineer features that don't have this property. It's okay if they have this property or indeed, if we're really lucky, this property. But in general, this is good enough, because here sigma is a vector, and that's just linear in the number of dimensions. And mu is a vector; that's always a vector with the same number of dimensions. So this case is good enough, and we're going to try and engineer features that have this property.

Now, our features so far have been very naive features. In other words, when we go and get a bit of speech, 25 milliseconds of speech, we transform it into a vector of features. We've been pretending that those features are just the magnitude spectrum. The magnitude spectrum might have thousands of points in it. So here is a spectrum: that's frequency, and that's the magnitude. Let's just assume it's smooth. Forget about harmonics for now and imagine the spectrum looks like this. So our feature vector is just the values that describe this curve, and those are written into a vector: that one goes in there, that one goes in there.

So our feature vector is the magnitude spectrum. In the spectrum, the energy at this particular frequency here tends to go up and down at the same time as the energy just below it and just above it: they're highly correlated. They're not completely independent, because this is a smooth, continuous curve, the spectral envelope. So they're highly covariant. So the magnitude spectrum exhibits this bad property of covariance, and we're going to get rid of that. At the same time, we're going to do a few other nice things to it that will make it even smaller and even less correlated, and therefore a better fit to the Gaussian probability density function.

So covariance is a problem, because it increases the number of parameters in our Gaussian as the square of the dimension of the Gaussian. That's a very bad property. Anything that goes up with the square of something doesn't scale very well. And the more parameters we have, the more data we'll need. Data is always sparse. There's always too little data. We're always pushing our models to the limit of the data.

Right, so let's plug that in and replace the local distance measure with this probability measure. That's good. It's better because it accounts for variance. However, it's got this bad property that we must have independent dimensions in the feature vector: no correlation. We're going to get rid of some correlation. So these FFT coefficients, or the magnitudes of the FFT coefficients, are no good. They covary, because energy in one region of the spectrum tends to all go up or down together. So neighbouring frequency bins go up and down together. So we're going to get rid of that covariance. We're going to perform some transformations on the vector to decorrelate its dimensions.

So we're onto another core concept, and this is perhaps a little bit tricky. So let's try and motivate it carefully as we go along. The thing we're trying to do most of all is to come up with a feature vector that has two nice properties. One: it has as few dimensions as possible but still captures all the important information. If we can use 12 dimensions instead of 1000 dimensions, that's great. We'll need less data to estimate the model. The model will be smaller. It'll be faster. It's a win-win situation. So we want vectors that are as small as possible. And second, and just as important, is that the elements of the vector do not covary. So, in a statistical sense, across lots of different frames of the data, as one goes up and down, the others don't systematically go up and down with it in sympathy. They go up and down independently, so all the dimensions are independent.
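
To make the two points in the transcript concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the lecturer's own code: the helper `gaussian_param_count` is a hypothetical name, and the "spectra" are synthetic smooth curves built from a few broad bumps rather than real speech. It counts the parameters of one Gaussian with a full versus a diagonal covariance matrix, then checks that neighbouring bins of a smooth spectrum are strongly correlated across frames.

```python
import numpy as np

def gaussian_param_count(dim, diagonal):
    """Number of free parameters in one Gaussian of the given dimension."""
    mean_params = dim
    if diagonal:
        cov_params = dim                    # one variance per dimension
    else:
        cov_params = dim * (dim + 1) // 2   # symmetric full covariance matrix
    return mean_params + cov_params

# Illustrative sizes only (the transcript mentions 12 versus 1000 dimensions).
for dim in (12, 1000):
    print(dim,
          gaussian_param_count(dim, diagonal=True),
          gaussian_param_count(dim, diagonal=False))

# Toy "smooth spectra" (an assumption, not real speech): random envelopes built
# from a few broad bumps, so neighbouring bins rise and fall together.
rng = np.random.default_rng(0)
n_frames, n_bins, n_bumps = 500, 64, 5
freqs = np.linspace(0.0, 1.0, n_bins)
centres = np.linspace(0.1, 0.9, n_bumps)
basis = np.exp(-0.5 * ((freqs[None, :] - centres[:, None]) / 0.15) ** 2)
weights = rng.normal(size=(n_frames, n_bumps))
spectra = weights @ basis                    # shape (n_frames, n_bins)

# Correlation between every pair of bins, measured across frames.
corr = np.corrcoef(spectra, rowvar=False)
neighbour_corr = float(np.mean(np.diag(corr, k=1)))
print("mean correlation between adjacent bins:", round(neighbour_corr, 3))
```

The diagonal count grows linearly with the dimension while the full-covariance count grows with its square, which is the scaling problem described above. The adjacent-bin correlation comes out close to 1 for these synthetic envelopes.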
Requirements
We are going to use Gaussian pdfs, and that places some requirements on the properties of the features that we will model.
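
As a worked form of that requirement, here is the standard multivariate Gaussian written out for the case where the covariance matrix is diagonal. The symbols are assumptions introduced for illustration: $D$ is the number of feature dimensions, and $\mu_d$ and $\sigma_d^2$ are the mean and variance of dimension $d$. With no covariance between dimensions, the density factorises into a product of one-dimensional Gaussians, so only $2D$ parameters are needed.

```latex
% Assumes \Sigma = \mathrm{diag}(\sigma_1^2, \dots, \sigma_D^2), i.e. no covariance between dimensions
p(\mathbf{x})
  = \mathcal{N}(\mathbf{x};\, \boldsymbol{\mu}, \boldsymbol{\Sigma})
  = \prod_{d=1}^{D} \frac{1}{\sqrt{2\pi\sigma_d^2}}
    \exp\!\left( -\frac{(x_d - \mu_d)^2}{2\sigma_d^2} \right)
```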