What Are Decision Trees And How Do They Work? (From Scratch)  Summary and Q&A
TL;DR
Decision trees are a visual representation of a decision-making process, where each node represents a condition and the condition's outcome determines which branch is followed next. Decision trees can be used for both classification and regression problems.
Questions & Answers
Q: How are decision trees structured?
Decision trees consist of nodes, which represent conditions, and branches, which represent the outcomes based on those conditions. They start with a root node and split into decision and leaf nodes.
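This structure can be captured in a minimal node class. The field names below are illustrative, not from the source: an internal node stores a condition (feature index and threshold) with two child subtrees, while a leaf stores only the predicted value.

```python
class Node:
    """One node of a decision tree (sketch; field names are illustrative)."""

    def __init__(self, feature=None, threshold=None, left=None, right=None, value=None):
        self.feature = feature      # index of the feature tested at this node
        self.threshold = threshold  # condition: go left if x[feature] <= threshold
        self.left = left            # subtree for samples satisfying the condition
        self.right = right          # subtree for the remaining samples
        self.value = value          # predicted class (set only on leaf nodes)

    def is_leaf(self):
        return self.value is not None
```

The root is simply the first decision node; a tree is then just nested `Node` objects.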
Q: How is impurity measured in decision trees?
Impurity in decision trees can be measured using metrics such as Gini impurity and entropy. Gini impurity is the sum over classes of each class probability multiplied by one minus that probability (equivalently, 1 − Σ pᵢ²). Entropy is similar, but each probability is multiplied by its logarithm: −Σ pᵢ log₂ pᵢ.
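Both metrics follow directly from the class proportions at a node. A minimal sketch of the two formulas:

```python
import math
from collections import Counter

def gini(labels):
    # Gini impurity: sum over classes of p * (1 - p), i.e. 1 - sum(p^2).
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def entropy(labels):
    # Entropy: -sum over classes of p * log2(p).
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())
```

A pure node (one class only) scores 0 under both metrics; an even two-class split scores 0.5 for Gini and 1.0 for entropy.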
Q: How does a decision tree handle categorical variables?
Categorical variables can be handled by encoding each category as a numerical value (e.g., label encoding). The decision tree can then split on these codes like any other feature.
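One simple encoding, sketched below with an illustrative helper (not from the source), maps each distinct category to an integer code:

```python
def encode_categories(values):
    # Label encoding sketch: assign each distinct category an integer code.
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping
```

Note that label encoding imposes an arbitrary order on the categories; one-hot encoding is a common alternative when that order would mislead the splits.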
Q: How is the best split chosen in a decision tree?
The best split in a decision tree is chosen based on the reduction in impurity. The split that results in the largest reduction in impurity is selected, as it provides the most information gain.
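For a single numeric feature, this search can be sketched by trying every candidate threshold and keeping the one with the largest impurity reduction. The function below is a minimal illustration using Gini impurity, with the child impurity weighted by branch size:

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def best_split(xs, ys):
    # Try every threshold on one numeric feature; keep the split with
    # the largest impurity reduction (information gain).
    best_gain, best_t = 0.0, None
    parent = gini(ys)
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # skip degenerate splits that leave one side empty
        child = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        gain = parent - child
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain
```

With multiple features, the same loop simply runs per feature and the best (feature, threshold) pair overall is chosen.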
Q: How is a decision tree built?
Decision trees are built by recursively splitting the data based on conditions that reduce impurity. The process continues until a certain stopping criterion is met, such as reaching a maximum tree depth or a minimum number of samples per leaf.
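The recursion can be sketched as follows, assuming a single numeric feature and Gini impurity; the dict-based tree layout and the parameter names (`max_depth`, `min_samples`) are illustrative, not from the source:

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def build(xs, ys, depth=0, max_depth=3, min_samples=2):
    # Stop when the node is pure or a stopping criterion is met;
    # the resulting leaf predicts the majority class.
    if depth >= max_depth or len(ys) < min_samples or gini(ys) == 0.0:
        return {"leaf": Counter(ys).most_common(1)[0][0]}
    # Otherwise, find the threshold with the largest impurity reduction.
    best = None
    for t in sorted(set(xs)):
        left = [(x, y) for x, y in zip(xs, ys) if x <= t]
        right = [(x, y) for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        child = (len(left) * gini([y for _, y in left]) +
                 len(right) * gini([y for _, y in right])) / len(ys)
        gain = gini(ys) - child
        if best is None or gain > best[0]:
            best = (gain, t, left, right)
    if best is None or best[0] <= 0:
        return {"leaf": Counter(ys).most_common(1)[0][0]}
    _, t, left, right = best
    return {"threshold": t,
            "left": build([x for x, _ in left], [y for _, y in left],
                          depth + 1, max_depth, min_samples),
            "right": build([x for x, _ in right], [y for _, y in right],
                           depth + 1, max_depth, min_samples)}
```

Each recursive call receives only the samples routed to its branch, so the subtrees are built on progressively purer subsets.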
Q: How does a decision tree handle missing values?
Decision trees can handle missing values by routing them down the branch taken by the majority of training samples at that node, or by using surrogate splits, which fall back on conditions over correlated features when the primary feature is missing.
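Majority-branch routing can be sketched as below. The dict layout (`"feature"`, `"threshold"`, `"majority"`, `"leaf"`) is illustrative, and the `"majority"` field is assumed to have been recorded during training as the branch carrying more samples:

```python
def route(node, x):
    # Walk the tree; when the tested feature is missing (None/absent),
    # follow the branch that carried most training samples ("majority").
    if "leaf" in node:
        return node["leaf"]
    v = x.get(node["feature"])
    if v is None:
        branch = node["majority"]  # assumed precomputed during training
    else:
        branch = "left" if v <= node["threshold"] else "right"
    return route(node[branch], x)
```

Surrogate splits (as in CART) would instead store backup conditions per node and try them in order before falling back to the majority branch.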
Summary & Key Takeaways

Decision trees are a visual representation of decision-making processes, with nodes representing conditions and branches representing outcomes based on those conditions.

The probability of a sample belonging to a certain class can be calculated at each node, based on the number of samples in each class.
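Concretely, the per-class probability at a node is just the class count divided by the total number of samples reaching that node, as in this small sketch:

```python
from collections import Counter

def class_probabilities(labels):
    # Probability of each class at a node = class count / total samples.
    n = len(labels)
    return {cls: count / n for cls, count in Counter(labels).items()}
```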

Impurity in decision trees represents the mixture of classes at a node, and it can be measured using metrics such as Gini impurity or entropy.

When building a decision tree, impurity is minimized greedily: at each node, the condition that reduces impurity the most is chosen.

The basic building blocks of a decision tree are decision nodes, which split samples based on a condition, and leaf nodes, which hold the final predicted class.
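Prediction follows directly from these two node types, as in this sketch (the dict layout with `"feature"`, `"threshold"`, and `"leaf"` keys is illustrative): decision nodes test their condition and pick a branch, and the walk ends when a leaf returns the predicted class.

```python
def predict(node, x):
    # Walk from the root: decision nodes choose a branch via their
    # condition; a leaf node returns the final predicted class.
    while "leaf" not in node:
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node["leaf"]
```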