Overfitting a GMM

This topic has 4 replies, 3 voices, and was last updated 7 years, 3 months ago by Simon.

Author

Posts
- February 22, 2018 at 18:08 #9067
  Ariadna S
  Tutor
  I am wondering whether it is possible to overfit a GMM by adding a lot of Gaussians (e.g. a bizarre situation where you have as many Gaussians as data points).
  Will GMM overfit in any case? Or would it just overcomplicate the model without giving any improvement?
  
  Thank you in advance
- February 22, 2018 at 20:45 #9068
  Shijie Yao
  Student
  A good question +1
- February 24, 2018 at 18:55 #9072
  Simon
  Professor
  If there are as many Gaussian mixture components in a GMM as there are data points, then we would expect each component to model a single data point. The mean of each component would be equal to the value of the corresponding data point.
  
  What will the variance of each mixture component be?
- February 25, 2018 at 01:03 #9074
  Ariadna S
  Tutor
  The variance is 0, as you only have a single data point.
  I would say that it does then overfit, as it is too data-specific? (And it is not modelling any possible variance.)
- February 25, 2018 at 10:47 #9075
  Simon
  Professor
  Correct – the variance would be zero. That’s a serious problem, because such a model will assign zero probability mass everywhere except at the exact positions of the data points. Zero variance is also numerically impossible: evaluating the Gaussian density requires dividing by the variance, so we cannot compute with such a model.
  
  But overfitting will probably occur long before we get to the point where there are as many mixture components as there are data points. It will happen as soon as the model starts to assign too much probability mass to the small regions around the observed data points and not enough mass to as-yet-unseen values that may occur in the test set.
  
  The problems of small (including zero) variances in a model can be mitigated by setting a variance floor (e.g., not allowing the variance of any mixture component to go below 1% of the variance of the data as a whole). Using a variance floor is good practice because it avoids the numerical problems of very small (or zero) variances, and offers a partial solution to overfitting.
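
The variance floor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the thread: it uses 1-D synthetic data, the extreme case of one equally weighted component per training point, and a near-zero variance of 1e-6 standing in for exact zero (which cannot be evaluated). The helper names `apply_variance_floor` and `gmm_log_likelihood` are hypothetical; the 1% fraction is the example figure from the post.

```python
import numpy as np

def apply_variance_floor(variances, data, fraction=0.01):
    """Clamp each variance to at least `fraction` of the overall data variance."""
    floor = fraction * np.var(data)
    return np.maximum(variances, floor)

def gmm_log_likelihood(x, weights, means, variances):
    """Average per-point log-likelihood of 1-D data under a GMM."""
    x = np.asarray(x, dtype=float)[:, None]                   # shape (N, 1)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances)         # shape (K,)
    log_comp = log_norm - 0.5 * (x - means) ** 2 / variances  # shape (N, K)
    log_joint = np.log(weights) + log_comp
    # log-sum-exp over components, to avoid underflow for tiny variances
    m = log_joint.max(axis=1, keepdims=True)
    log_px = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    return float(np.mean(log_px))

# Synthetic data (an assumption made for this illustration)
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=50)
test = rng.normal(0.0, 1.0, size=50)

# Extreme case: one component per training point, mean equal to that point
means = train.copy()
weights = np.full(train.size, 1.0 / train.size)
tiny = np.full(train.size, 1e-6)             # stand-in for "zero" variance
floored = apply_variance_floor(tiny, train)  # floor at 1% of data variance

train_ll_tiny = gmm_log_likelihood(train, weights, means, tiny)
test_ll_tiny = gmm_log_likelihood(test, weights, means, tiny)
train_ll_floored = gmm_log_likelihood(train, weights, means, floored)
test_ll_floored = gmm_log_likelihood(test, weights, means, floored)

print("near-zero variance: train", train_ll_tiny, "test", test_ll_tiny)
print("floored variance:   train", train_ll_floored, "test", test_ll_floored)
```

With near-zero variances the model scores the training points very highly and the held-out points very poorly, which is the overfitting pattern described in the thread. Applying the floor trades some training likelihood for a much better held-out likelihood. In a real system the floor would be applied after each re-estimation of the variances during training.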