The root node

All of the training data starts at the root node.

Place all your data at the root node and compute the entropy. This is a measure of how unpredictable the value of the predictee is at this point: the lower the entropy, the more predictable the value. I’ll start you off: the value “BREAK” occurs 12 times out of 34, so its probability is 12/34, which is about 0.35. The value “NO BREAK” occurs 22 times, so its probability is 22/34, which is about 0.65. Now compute the entropy using “- sum of p log p”, summing over the two values. Make sure you know how to do this for yourself first. Then, to save time, you could use this entropy calculator.
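Once you have done the calculation by hand, a few lines of Python can check your answer. This is only a minimal sketch, not part of the original exercise: the function name `entropy` and the variable names are made up for illustration, and it assumes logarithms to base 2 so that the result is in bits.

```python
import math


def entropy(counts, base=2.0):
    """Entropy of a discrete distribution, given raw counts per value.

    Assumption: log base 2 by default, so the result is in bits.
    """
    counts = list(counts)
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c == 0:
            continue  # a value that never occurs contributes nothing
        p = c / total  # relative frequency used as the probability
        h -= p * math.log(p, base)  # accumulate "- sum of p log p"
    return h


# Counts of the predictee values at the root node, taken from the text above.
root_counts = {"BREAK": 12, "NO BREAK": 22}
root_entropy = entropy(root_counts.values())
print(f"Entropy at the root node: {root_entropy:.3f} bits")
```

The same function can be reused at any other node: pass in the counts of “BREAK” and “NO BREAK” for the data points that reach that node.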

Video to be added after the lecture…