Overfitting a GMM
- This topic has 4 replies, 3 voices, and was last updated 6 years, 10 months ago by Simon.
-
February 22, 2018 at 18:08 #9067
I am wondering whether it is possible to overfit a GMM by adding a lot of Gaussians (e.g. a bizarre situation where you have as many Gaussians as data points).
Will the GMM overfit in any case? Or would it just overcomplicate the model without giving any improvement?
Thank you in advance
-
February 22, 2018 at 20:45 #9068
A good question +1
-
February 24, 2018 at 18:55 #9072
If there are as many Gaussian mixture components in a GMM as there are data points, then we would expect each component to model a single data point. The mean of each component would be equal to the value of the corresponding data point.
What will the variance of each mixture component be?
-
February 25, 2018 at 01:03 #9074
The variance is 0, as each component only has a single data point.
I would say that it does then overfit, as it is too specific to the data? (And it does not model any possible variance.)
-
February 25, 2018 at 10:47 #9075
Correct, the variance would be zero. That’s a serious problem, because such a model places all of its probability mass at the exact positions of the data points and none anywhere else. Zero variance is also numerically impossible: evaluating a Gaussian density involves dividing by the variance, so we cannot compute with such a model.
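To make the numerical problem concrete, here is a minimal Python sketch. The function name `gaussian_log_pdf` and the data value are made up for illustration. It evaluates the log density of a univariate Gaussian whose mean sits exactly on one data point, for progressively smaller variances, and then tries a variance of exactly zero.

```python
import math

def gaussian_log_pdf(x, mean, variance):
    """Log density of a univariate Gaussian evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + (x - mean) ** 2 / variance)

x = 1.3  # hypothetical training point; the component mean sits exactly on it

for variance in (1.0, 1e-2, 1e-4, 1e-8):
    on_point = gaussian_log_pdf(x, mean=x, variance=variance)
    nearby = gaussian_log_pdf(x + 0.1, mean=x, variance=variance)
    print(f"variance={variance:g}  log p(x)={on_point:.2f}  log p(x+0.1)={nearby:.2f}")

# With a variance of exactly zero the density cannot be evaluated at all.
try:
    gaussian_log_pdf(x, mean=x, variance=0.0)
except (ValueError, ZeroDivisionError) as err:
    print("variance=0 fails with", type(err).__name__)
```

The log density at the data point itself keeps growing as the variance shrinks. Once the variance is much smaller than the squared distance to the mean, a point only 0.1 away becomes extremely unlikely.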
But overfitting will probably occur long before we get to the point where there are as many mixture components as there are data points. It will happen as soon as the model starts to assign too much probability mass in the small regions around the observed data points and not enough mass to as-yet-unseen values that may occur in the test set.
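One way to see this effect is to compare the average log-likelihood of the training data with that of held-out data. The sketch below is only an illustration under simplifying assumptions: the data values are invented, the model is not trained with EM, and it simply places one equal-weight component on each training point with a shared variance that is then shrunk by hand. The name `gmm_avg_log_likelihood` is hypothetical.

```python
import math

def gmm_avg_log_likelihood(points, means, variance):
    """Average log-likelihood under an equal-weight GMM with one shared variance."""
    log_weight = -math.log(len(means))
    total = 0.0
    for x in points:
        log_terms = [
            log_weight
            - 0.5 * (math.log(2.0 * math.pi * variance) + (x - m) ** 2 / variance)
            for m in means
        ]
        # Log-sum-exp keeps the sum stable when individual terms underflow.
        peak = max(log_terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return total / len(points)

train = [-1.2, -0.4, 0.1, 0.7, 1.5]   # invented training values
held_out = [-0.9, 0.3, 1.1]           # invented unseen values

for variance in (1.0, 1e-1, 1e-2, 1e-4, 1e-6):
    train_ll = gmm_avg_log_likelihood(train, train, variance)
    test_ll = gmm_avg_log_likelihood(held_out, train, variance)
    print(f"variance={variance:g}  train={train_ll:.2f}  held-out={test_ll:.2f}")
```

With very small variances the training likelihood looks excellent while the held-out likelihood collapses, because almost no probability mass is left between the training points.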
The problems of small (including zero) variances in a model can be mitigated by setting a variance floor (e.g., not allowing the variance of any mixture component to go below 1% of the variance of the data as a whole). Using a variance floor is good practice because it avoids the numerical problems of very small (or zero) variances, and offers a partial solution to overfitting.
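A variance floor of this kind might look like the following sketch for the univariate case. The names `apply_variance_floor` and `floor_fraction` are hypothetical, the data values are invented, and the default of 0.01 simply mirrors the 1% example above. Real toolkits have their own mechanisms and defaults.

```python
from statistics import pvariance

def apply_variance_floor(component_variances, data, floor_fraction=0.01):
    """Clamp each component variance to at least floor_fraction * global variance."""
    floor = floor_fraction * pvariance(data)  # variance of the data as a whole
    return [max(v, floor) for v in component_variances]

data = [0.0, 1.0, 2.0, 3.0, 4.0]          # invented data
component_variances = [0.0, 0.001, 0.5]   # two components have collapsed

floored = apply_variance_floor(component_variances, data)
print(floored)
```

Only the components whose variance fell below the floor are changed; the component with variance 0.5 is left alone.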