CIND119 Module 4 - How To Build A Decision Tree
CIND119 Module 4 - How To Build A Decision Tree
1
From Data to Decision Tree
1 YES NO YES NO NO
2 YES NO NO NO NO
3 YES YES NO YES YES
4 NO YES NO YES YES
5 NO NO NO YES YES
6 NO YES NO YES YES
2
How to Use a Decision Tree
3
Hunt’s Algorithm & Example 1
4
Hunt’s Algorithm & Example 2
5
Impurity Measure: GINI Index
𝑗
• is the relative frequency of class at
branch .
• The relative frequency of class at is
( 13 ) +( 23 ) ]=0.44
2 2
the number of datapoints that belong to 𝐺𝐼𝑁𝐼 ( 𝑀𝑎𝑟𝑟𝑖𝑒𝑑=𝑌𝐸𝑆)=1 −[
class at branch divided by the total
( 33 ) =0.0
2
𝐺𝐼𝑁𝐼 ( 𝑀𝑎𝑟𝑟𝑖𝑒𝑑=𝑁𝑂 )=1 −
number of datapoints at branch
6
References
• Hunt, J.W. & Szymanski, T.G. (1977, May).
A fast algorithm for computing longest common subsequences. Communications
of the ACM, 20(5), 350–353. Retrieved from
https://doi.org/10.1145/359581.359603