Features in Chapter 8-Holmes&Holmes

This topic has 1 reply, 2 voices, and was last updated 4 years, 3 months ago by Simon.

- November 12, 2020 at 10:47 #13038
  Elisa G
  Student
  Hello there,
  
  I am reading about feature vectors and frames in Holmes&Holmes (Ch.8). At the end of page 110, they say:
  
  “Those features of the acoustic signal that are determined by the phonetic properties should obviously be given more weight in the distance calculation.”
  
  I don’t fully know how to interpret ‘weight’ here.
  
  What I mainly understand is that they want to extract important phonetic properties rather than other less relevant features (e.g. silence, noise, etc). Is that correct?
  
  Also, do we already know how to differentiate the two in the distance calculation, e.g. in dynamic time warping?
  
  Thank you!
- November 15, 2020 at 09:45 #13066
  Simon
  Professor
  They are pointing to a problem with simple distance metrics such as the Euclidean distance. This metric assumes all dimensions of the feature vector are equally important and simply sums up the squared differences between corresponding elements in the two vectors being compared.
  
  This is sensitive to the scale of each element.
  
  Take filterbank energies as our feature vector, and suppose that, in general and on average across all the data, the amount of energy in the 2nd filter is around 10 times larger than that in the 11th filter. (Look at a typical speech magnitude spectrum to see why this could be the case.)
  
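  Because the Euclidean distance squares each difference, the 2nd element of the feature vector will contribute far more to the total distance than the 11th element: around 100 times as much if its differences are also about 10 times larger. It is effectively being treated as more important.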
  
  One solution to this would be to weight the elements as we sum them up in the Euclidean distance, to balance their contributions according to how important we think they are.
  
  This is precisely what the Gaussian distribution does for us: it is what the standard deviation parameter is for. This scales each dimension of the distance calculation according to the amount of variability we see along that dimension for the class we are modelling.
  
  Out of scope for this course, but something you will see in the literature, is a scaled Euclidean distance called the Mahalanobis distance. That is the same form that appears in the exponent of the Gaussian equation.
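
A minimal sketch of the scaling described in this reply, not code from the course or from Holmes & Holmes. It assumes a diagonal-covariance Gaussian (one standard deviation per dimension), and the two-dimensional "filterbank" values, `mu`, `sigma`, and all function names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np


def euclidean_sq(x, y):
    """Plain squared Euclidean distance: every dimension is weighted equally."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sum(d ** 2))


def scaled_euclidean_sq(x, y, sigma):
    """Squared distance after dividing each difference by that dimension's
    standard deviation (the diagonal-covariance case of the squared
    Mahalanobis distance)."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sum((d / np.asarray(sigma, dtype=float)) ** 2))


def diag_gaussian_log_pdf(x, mu, sigma):
    """Log density of a Gaussian with independent dimensions.
    The exponent is -0.5 times the scaled squared distance to the mean."""
    x, mu, sigma = (np.asarray(a, dtype=float) for a in (x, mu, sigma))
    log_norm = -0.5 * x.size * np.log(2.0 * np.pi) - np.sum(np.log(sigma))
    return float(log_norm - 0.5 * scaled_euclidean_sq(x, mu, sigma))


# Hypothetical class model: the first dimension has 10 times the scale
# of the second, both in its mean and in its variability.
mu = np.array([100.0, 10.0])
sigma = np.array([20.0, 2.0])

# A frame that is exactly one standard deviation from the mean in each dimension.
x = mu + sigma

print(euclidean_sq(x, mu))                # 404.0: 400 from dimension 1, 4 from dimension 2
print(scaled_euclidean_sq(x, mu, sigma))  # 2.0: each dimension contributes 1
print(diag_gaussian_log_pdf(x, mu, sigma))
```

In the unscaled distance the first dimension supplies 400 of the 404, even though the frame is equally "unusual" in both dimensions. After dividing by the standard deviations, each dimension contributes the same amount, which is the balancing effect the reply attributes to the Gaussian.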