A Complete Guide to Decision Tree Formation and Interpretation in Machine Learning

Table of Contents

  • Important Definitions
  • Homogeneous and Heterogeneous Data
  • Entropy and Information Gain
  • Gini and Gini Impurity
  • Reduction in Standard Deviation
  • Different Nodes in a Decision Tree
  • Different Algorithms used to build a Decision Tree
  • Decision Tree for Classification
  • Building a decision tree for the given data using Entropy and Information Gain
  • Building a decision tree for the given data using Gini and Gini Index
  • Complete code to build a decision tree in Python
  • Decision Tree for Regression
  • Build a Decision Tree using the Standard Deviation Reduction strategy

Important Definitions

Homogeneous and Heterogeneous Data Nodes

Understanding the homogeneous nature of data nodes is very important when it comes to decision trees, because it is directly related to the predictive power of the tree you build. If you pick a random data item from a fully homogeneous node, it is a no-brainer to tell which class it belongs to. Got the idea? If not, let me explain what is meant by homogeneous and heterogeneous data nodes (a small purity check is sketched after the figure below).

Figure: Homogeneous and heterogeneous data nodes in a decision tree | Source: www.ashutoshtripathi.com
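
To make the idea concrete, here is a minimal Python sketch of a purity check; the helper name is_homogeneous is just illustrative:

```python
def is_homogeneous(labels):
    """A node is homogeneous when every sample in it carries the same class label."""
    return len(set(labels)) == 1

print(is_homogeneous(["Yes", "Yes", "Yes"]))  # True: prediction is trivial
print(is_homogeneous(["Yes", "No", "Yes"]))   # False: the node needs further splitting
```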

Entropy and Information Gain

Entropy is a measure of the randomness in your data. More randomness means higher entropy, which makes it harder to draw any conclusion or to classify the data. Let's relate this to the homogeneity concept explained above: if a data node is heterogeneous, it is hard to draw any conclusion about it, which corresponds to higher entropy.

Figure: Entropy of homogeneous data nodes in a decision tree | Source: www.ashutoshtripathi.com
Figure: Entropy in a decision tree | Source: www.ashutoshtripathi.com

For a node S containing classes i = 1, …, c, entropy is defined as:

E(S) = −Σᵢ pᵢ log₂(pᵢ)

where pᵢ is the proportion of samples in S that belong to class i. Entropy is 0 for a fully homogeneous node and maximal when all classes are equally represented.

Figure: The outlook dataset used in the decision tree examples | Source: www.ashutoshtripathi.com

Entropy Calculation

Figure: Entropy calculation for different data nodes in a decision tree | Source: www.ashutoshtripathi.com
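
As a rough sketch of this calculation, the function below computes the entropy of a node from its class labels; the 9 Yes / 5 No counts assume the classic play-tennis outlook dataset pictured above:

```python
import math

def entropy(labels):
    # E(S) = sum over classes of -p_i * log2(p_i)
    total = len(labels)
    return sum(-(labels.count(c) / total) * math.log2(labels.count(c) / total)
               for c in set(labels))

print(round(entropy(["Yes"] * 9 + ["No"] * 5), 3))  # 0.94
print(entropy(["Yes"] * 4))                         # 0.0 for a homogeneous node
```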
Information gain is the reduction in entropy achieved by splitting node S on attribute A:

IG(S, A) = E(S) − Σᵥ (|Sᵥ| / |S|) · E(Sᵥ)

where Sᵥ is the subset of S in which attribute A takes the value v. The attribute with the highest information gain is chosen for the split.

Figure: Information gain infographic for the decision tree | Source: www.ashutoshtripathi.com
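
Continuing the sketch, information gain for a categorical attribute can be computed as below; the dictionary layout of the rows and the exact Yes/No counts per outlook value are assumptions matching the classic play-tennis example:

```python
import math

def entropy(labels):
    total = len(labels)
    return sum(-(labels.count(c) / total) * math.log2(labels.count(c) / total)
               for c in set(labels))

def information_gain(rows, attribute, target):
    # IG(S, A) = E(S) - sum over values v of (|S_v| / |S|) * E(S_v)
    gain = entropy([row[target] for row in rows])
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

rows = ([{"Outlook": "Sunny", "Play": p} for p in ["Yes"] * 2 + ["No"] * 3]
        + [{"Outlook": "Overcast", "Play": p} for p in ["Yes"] * 4]
        + [{"Outlook": "Rain", "Play": p} for p in ["Yes"] * 3 + ["No"] * 2])
print(round(information_gain(rows, "Outlook", "Play"), 3))  # 0.247
```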

There is another criterion for splitting a node: Gini and Gini impurity. Let's understand more about the Gini criterion.

Gini and Gini Impurity

Gini is calculated as the sum of the squared probabilities of success and failure, Gini = p² + q², and Gini impurity is its complement, 1 − (p² + q²).

Figure: Gini formula in a decision tree | Source: www.ashutoshtripathi.com
Figure: Gini calculation for the outlook column in a decision tree | Source: www.ashutoshtripathi.com
Figure: Gini impurity calculation example | Source: www.ashutoshtripathi.com
Figure: Weighted Gini impurity in a decision tree example | Source: www.ashutoshtripathi.com
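
As a sketch, Gini impurity and its weighted version for a candidate split can be computed as follows; the per-branch Yes/No counts again assume the play-tennis outlook split:

```python
def gini_impurity(labels):
    # Gini impurity = 1 - (p^2 + q^2), generalized to 1 - sum(p_i^2) for many classes
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

# Branches after splitting on outlook: Sunny, Overcast, Rain
branches = [["Yes"] * 2 + ["No"] * 3, ["Yes"] * 4, ["Yes"] * 3 + ["No"] * 2]
n = sum(len(b) for b in branches)

print(round(gini_impurity(branches[0]), 3))                            # 0.48
print(round(sum(len(b) / n * gini_impurity(b) for b in branches), 3))  # 0.343 weighted
```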

Reduction in Variance or Standard Deviation Reduction (SDR)

Entropy, information gain, and Gini impurity are used for classification problems. So what about regression? Analogous to information gain, which is a reduction in entropy, we use a reduction in variance or a reduction in standard deviation for regression problems in a decision tree:

SDR = SD(S) − Σᵥ (|Sᵥ| / |S|) · SD(Sᵥ)

The attribute whose split produces the largest standard deviation reduction is chosen.

Figure: Outlook dataset with a numeric target for the regression example | Source: Google Images
Figure: Weighted standard deviation calculation in decision tree regression | Source: www.ashutoshtripathi.com
Figure: Standard deviation reduction in a regression decision tree | Source: www.ashutoshtripathi.com
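
A sketch of the calculation: the target values below are hypothetical "hours played" numbers often used to illustrate SDR, grouped by the three outlook values:

```python
import statistics

def sdr(parent, subsets):
    # SDR = SD(parent) - sum(|subset| / |parent| * SD(subset))
    n = len(parent)
    weighted = sum(len(s) / n * statistics.pstdev(s) for s in subsets)
    return statistics.pstdev(parent) - weighted

sunny    = [25, 30, 35, 38, 48]
overcast = [46, 43, 52, 44]
rain     = [45, 52, 23, 30, 46]
print(round(sdr(sunny + overcast + rain, [sunny, overcast, rain]), 2))  # 1.66
```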

Understanding the Different Nodes in a Decision Tree

  1. Root Node: It represents the entire population or sample, and it is further divided into two or more homogeneous sets. If information gain is the split criterion, the attribute with the maximum information gain is used for the root split.
  2. Decision Node: When a sub-node splits into further sub-nodes, it is called a decision node. It is where the tree decides whether a sample goes down the left subtree or the right subtree.
  3. Leaf / Terminal Node: Nodes that do not split further are called leaf or terminal nodes.
  4. Branch / Sub-Tree: A subsection of the entire tree is called a branch or sub-tree.
  5. Parent and Child Node: A node that is divided into sub-nodes is called the parent of those sub-nodes, and the sub-nodes are its children.
  6. Pruning: Removing the sub-nodes of a decision node is called pruning; you can think of it as the opposite of splitting. It is used for hyperparameter tuning in decision trees.

Different Algorithms used to build a Decision Tree

Decision trees are used for both types of problems, classification and regression, so different algorithms are used depending on the problem type: ID3 and C4.5 split on entropy and information gain, while CART uses the Gini index for classification and variance reduction for regression.
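
The standard CART implementation in scikit-learn exposes the split measure through its criterion parameter; a minimal sketch (the parameter values here are illustrative, not a recommendation):

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: "gini" is the default; "entropy" splits on information gain
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)

# Regression: "squared_error" (recent scikit-learn versions) minimizes variance,
# i.e. it picks the split with the largest variance reduction
reg = DecisionTreeRegressor(criterion="squared_error", max_depth=3, random_state=0)
```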

Decision Tree for Classification Problems

Building a decision tree for the given data using Entropy and Information Gain

Please watch the video explaining decision tree formation and splitting using entropy and information gain.

Video: Decision tree formation, step-by-step explanation

Building a decision tree for the given data using Gini and Gini Index

Please watch the video which explains decision tree formation using Gini and the Gini index.

Complete Code to build Decision Tree for classification using Python

Please refer to my post on a step-by-step guide to building a decision tree using Python.
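
That post walks through the details; as a minimal, self-contained sketch using scikit-learn's bundled iris dataset (the hyperparameter values are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```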

Decision Tree for Regression

Please refer to my post on a step-by-step guide to building a decision tree for regression.
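
Again, the post has the full walkthrough; a minimal sketch with scikit-learn's bundled diabetes dataset (hyperparameters are illustrative):

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

reg = DecisionTreeRegressor(criterion="squared_error", max_depth=4, random_state=42)
reg.fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, reg.predict(X_test)))
```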
