CART for letter-to-sound
October 13, 2016 at 00:16 #5467
I have two questions, both related to entropy.

a) In the video “Worked example 1 – letter-to-sound”, which shows the classification of the sound of the letter “a”, it seems that we only care about the first question (which was Does letter n=”r”?), and then we finish the decision tree by randomly choosing questions until all the labelled items are sorted. Do I have that right? If yes, does that mean that entropy is only related to the split?

b) Why do we choose the split that creates the lower entropy? Is it because a split that creates higher entropy would take more time or space when computing the decision tree?
October 16, 2016 at 20:32 #5481
In the video, the question Does letter n=”r”? is just one of many possible questions that we try when splitting the root node. One of those questions will reduce entropy more than the others; that one is placed in the tree, and the data are permanently partitioned down the “Yes” and “No” branches.
Then we recurse. That means that we simply apply precisely the same procedure separately to each of the two child nodes that we have just created; then we do the same for their child nodes, and so on until we decide to stop.
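To make the recursion concrete, here is a minimal sketch in Python. The function names, the representation of a question as a yes/no predicate, and the stopping rule are illustrative assumptions on my part, not the actual code behind the video:

    import math
    from collections import Counter

    def entropy(labels):
        """Shannon entropy (in bits) of the predictee values at a node."""
        total = len(labels)
        return sum(-(n / total) * math.log2(n / total)
                   for n in Counter(labels).values())

    def split_entropy(data, question):
        """Entropy after the split: each child's entropy, weighted by the
        fraction of the data sent down that branch."""
        yes = [label for x, label in data if question(x)]
        no = [label for x, label in data if not question(x)]
        return (len(yes) / len(data) * entropy(yes)
                + len(no) / len(data) * entropy(no))

    def build_tree(data, questions):
        """Greedily place the best question, partition the data, recurse."""
        labels = [label for _, label in data]
        # Stop when the node is pure: every item has the same predictee value.
        if entropy(labels) == 0.0:
            return labels[0]
        best = min(questions, key=lambda q: split_entropy(data, q))
        yes_part = [(x, l) for x, l in data if best(x)]
        no_part = [(x, l) for x, l in data if not best(x)]
        # Also stop if even the best question fails to separate the data;
        # the leaf then predicts the majority label.
        if not yes_part or not no_part:
            return Counter(labels).most_common(1)[0][0]
        return {"question": best,
                "yes": build_tree(yes_part, questions),
                "no": build_tree(no_part, questions)}

Here a “question” is any function from a data point to True or False; assuming each data point were a dictionary of predictors, the question in the video might be written as lambda x: x["letter n"] == "r".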
Why is entropy used to choose the best question?
Think about the goal of the tree: it is to make predictions. In other words, we want to partition the data in a way that makes the value of the predictee less random, and therefore more predictable.
Entropy is a measure of how unpredictable a random variable is. The random variable here is the predictee. We partition the data in the way that makes the distribution of the predictee as non-uniform as possible.
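In symbols (this is the standard definition, not notation taken from the video): if the predictee at a node takes values \(v\) with probabilities \(p_v\), the entropy of that node is

\[ H = -\sum_v p_v \log_2 p_v \]

which is largest when the distribution is uniform and zero when only one value remains. A candidate question is scored by the entropy of the two child nodes it creates, weighted by the proportion of data sent down each branch.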
Ideally, we want all data points within each partition to have the same value for the predictee. That would mean zero entropy.
If we can’t achieve that, then we choose the split that has the lowest possible entropy.
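Here is the arithmetic for two candidate questions at one node. The labels and counts are invented purely for illustration, and entropy is the same function sketched above:

    import math

    def entropy(labels):
        # Shannon entropy (in bits) of a list of labels.
        total = len(labels)
        return sum(-(labels.count(v) / total) * math.log2(labels.count(v) / total)
                   for v in set(labels))

    # Parent node: 4 items labelled /ae/ and 4 labelled /ax/ -> 1.0 bit.
    parent = ["ae"] * 4 + ["ax"] * 4
    print(entropy(parent))                                        # 1.0

    # Question A separates the two labels perfectly: both children are pure,
    # so the weighted entropy is 0.5 * 0.0 + 0.5 * 0.0 = 0.0 bits.
    print(0.5 * entropy(["ae"] * 4) + 0.5 * entropy(["ax"] * 4))  # 0.0

    # Question B sends half of each label down each branch: both children
    # still have entropy 1.0 bits, so this split achieves no reduction.
    mixed = ["ae"] * 2 + ["ax"] * 2
    print(0.5 * entropy(mixed) + 0.5 * entropy(mixed))            # 1.0

Question A would be chosen: its children are already pure, which is exactly the zero-entropy ideal described above.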