Decision Tree Algorithm

The Decision Tree algorithm, like Naive Bayes, is based on conditional probabilities. Unlike Naive Bayes, Decision Trees generate rules. A rule is a conditional statement that can easily be understood by humans and easily used within a database to identify a set of records.

Decision Tree scoring is especially fast. The tree structure, created during the model build, is used for a series of simple tests (typically 2-7). Each test is based on a single predictor and is a membership test: either IN or NOT IN a list of values (categorical predictor), or LESS THAN or EQUAL TO some value (numeric predictor).
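The scoring walk described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Oracle Data Mining's implementation: the Node class, the score function, and the example predictors (age, region) and target values are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a hypothetical binary decision tree."""
    predictor: Optional[str] = None        # name of the single predictor tested here
    values: Optional[frozenset] = None     # categorical test: value IN this list
    threshold: Optional[float] = None      # numeric test: value <= threshold
    if_true: Optional["Node"] = None       # child followed when the test passes
    if_false: Optional["Node"] = None      # child followed when the test fails
    prediction: Optional[str] = None       # set only on leaf nodes


def score(node, record):
    """Walk from the root to a leaf; return (prediction, number of tests run)."""
    tests = 0
    while node.prediction is None:
        value = record[node.predictor]
        if node.values is not None:
            passed = value in node.values      # IN / NOT IN a list of values
        else:
            passed = value <= node.threshold   # LESS THAN or EQUAL TO some value
        node = node.if_true if passed else node.if_false
        tests += 1
    return node.prediction, tests


# Hypothetical two-level tree: a numeric test on age, then a categorical test on region.
tree = Node(
    predictor="age",
    threshold=30,
    if_true=Node(prediction="low"),
    if_false=Node(
        predictor="region",
        values=frozenset({"north", "east"}),
        if_true=Node(prediction="high"),
        if_false=Node(prediction="low"),
    ),
)

result = score(tree, {"age": 45, "region": "north"})
```

Each record costs only as many comparisons as the depth of the leaf it reaches, which is why scoring stays fast even for large tables.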

During the model build, the Decision Tree algorithm must repeatedly find the most efficient way to split a set of cases (records) into two child nodes. Oracle Data Mining offers two homogeneity metrics, gini and entropy, for calculating the splits. The default metric is gini.
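As a rough illustration of how a homogeneity metric drives the choice of split, the following Python sketch computes the textbook gini and entropy impurity of a set of target values and picks the numeric threshold whose two child nodes have the lowest weighted impurity. It assumes the standard definitions of both metrics and a simple exhaustive search; the function names and sample data are hypothetical, and the actual Oracle Data Mining build procedure may differ.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: minus the sum of p * log2(p) over class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_impurity(left, right, metric):
    """Impurity of a two-way split, weighted by the size of each child node."""
    n = len(left) + len(right)
    return (len(left) / n) * metric(left) + (len(right) / n) * metric(right)


def best_numeric_split(values, labels, metric=gini):
    """Try every threshold t (value <= t) and keep the one with the lowest impurity."""
    best = None
    # The largest value is skipped so that the right child is never empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        impurity = split_impurity(left, right, metric)
        if best is None or impurity < best[1]:
            best = (t, impurity)
    return best


# Hypothetical cases: one numeric predictor and a binary target.
ages = [22, 25, 30, 41, 52, 60]
responded = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_numeric_split(ages, responded, metric=gini)
```

On this sample both metrics select the same threshold, because splitting at age <= 30 leaves two perfectly homogeneous child nodes. On less cleanly separated data the two metrics can rank candidate splits differently.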