Automatic Design of Decision-Tree Induction Algorithms (SpringerBriefs in Computer Science)

Language: English

Pages: 176

ISBN: 3319142305

Format: PDF / Kindle (mobi) / ePub


Presents a detailed study of the major design components that constitute a top-down decision-tree induction algorithm, including split criteria, stopping criteria, pruning, and approaches for dealing with missing values. Whereas the strategy still employed nowadays is to apply a 'generic' decision-tree induction algorithm regardless of the data, the authors argue for the benefits that a bias-fitting strategy could bring to decision-tree induction, in which the ultimate goal is the automatic generation of a decision-tree induction algorithm tailored to the application domain of interest. To this end, they discuss how one can effectively discover the most suitable set of components of decision-tree induction algorithms for a wide variety of applications through the paradigm of evolutionary computation, following the emergence of a novel field called hyper-heuristics.
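To make the component-based view concrete, here is a minimal Python sketch of a top-down induction loop whose split criterion, stopping criterion, and pruning step are passed in as interchangeable components. All names (induce_tree, split_criterion, and so on) are illustrative assumptions, not code from the book.

from collections import Counter

# Minimal sketch: a top-down decision-tree inducer whose design components
# (split criterion, stopping criterion, pruning) are pluggable functions.
# All names are illustrative, not taken from the book.
def induce_tree(rows, labels, split_criterion, stopping_criterion, prune, depth=0):
    if stopping_criterion(labels, depth):
        return Counter(labels).most_common(1)[0][0]        # leaf: majority class
    attr, threshold = split_criterion(rows, labels)        # chosen split test
    left = [i for i, r in enumerate(rows) if r[attr] <= threshold]
    right = [i for i, r in enumerate(rows) if r[attr] > threshold]
    if not left or not right:                              # degenerate split: stop
        return Counter(labels).most_common(1)[0][0]
    node = {"attr": attr, "threshold": threshold,
            "left": induce_tree([rows[i] for i in left], [labels[i] for i in left],
                                split_criterion, stopping_criterion, prune, depth + 1),
            "right": induce_tree([rows[i] for i in right], [labels[i] for i in right],
                                 split_criterion, stopping_criterion, prune, depth + 1)}
    return prune(node, rows, labels)                       # pruning is a component too

Swapping in a different split_criterion (say, information gain versus the Gini index) changes the induction bias without touching the loop, which is exactly the degree of freedom a hyper-heuristic can search over.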

"Automatic Design of Decision-Tree Induction Algorithms" would be highly useful for machine learning and evolutionary computation students and researchers alike.

Rails Crash Course: A No-Nonsense Guide to Rails Development

Classical And Quantum Computing With C++ And Java Simulations

Elements of Automata Theory

Testing Computer Software (2nd Edition)

Table of contents (excerpt):
6.2 Aggregation Schemes
6.3 Experimental Evaluation
6.3.1 Results for the Balanced Meta-Training Set
6.3.2 Results for the Imbalanced Meta-Training Set
6.3.3 Experiments with the Best-Performing Strategy
6.4 Chapter Remarks
References

induction algorithm tailored to a variety of distinct data sets. The goal is to evolve an algorithm capable of being robust across different data sets. To this end, we make use of 67 publicly-available data sets from the UCI machine-learning repository [8] (see Table 5.14). As in the homogeneous approach, we also randomly divided the 67 data sets into two groups: parameter optimisation and experiments. The 27 data sets in the parameter optimisation group are used for tuning the evolution parameters
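A minimal sketch of the random partition just described, assuming the data sets are referenced by name (the actual names are in Table 5.14, not reproduced here); the 27/40 split matches the counts in the text.

import random

# Hypothetical placeholder names for the 67 UCI data sets (see Table 5.14).
uci_datasets = ["dataset_%02d" % i for i in range(1, 68)]

random.seed(0)                            # fix the seed so the split is reproducible
shuffled = random.sample(uci_datasets, len(uci_datasets))
parameter_optimisation = shuffled[:27]    # 27 sets: tuning the evolution parameters
experiments = shuffled[27:]               # remaining 40 sets: final experiments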


number of strategies, so one could see how they cope with different optimisation goals.

7.2.3 Automatic Selection of the Meta-Training Set

We employed a methodology that randomly selected data sets to be part of the meta-training set. Since the performance of the evolved decision-tree algorithm is directly related to the data sets that belong to the meta-training set, we believe an intelligent and automatic strategy for selecting a proper meta-training set would be beneficial to the final user. For

which is actually dividing GMI by the joint entropy of ai and y. Clearly CAIR(ai, X, y) ≥ 0, since both GMI and the joint entropy are greater than or equal to zero. In fact, 0 ≤ CAIR(ai, X, y) ≤ 1, with CAIR(ai, X, y) = 0 when ai and y are totally independent and CAIR(ai, X, y) = 1 when they are totally dependent. The term redundancy in CAIR comes from the fact that one may discretize a continuous attribute into intervals in such a way that the class-attribute interdependence is kept intact
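To make the definition concrete, the sketch below computes CAIR from a contingency table of attribute intervals versus classes, under the assumption that GMI here reduces to the standard class-attribute mutual information; the function name and example are ours, not the book's.

import numpy as np

def cair(joint_counts):
    # joint_counts[v, c] = number of instances in attribute interval v with class c.
    # Returns a value in [0, 1]: 0 for total independence, 1 for total dependence.
    p = joint_counts / joint_counts.sum()          # joint distribution p(v, c)
    pv = p.sum(axis=1, keepdims=True)              # marginal over intervals
    pc = p.sum(axis=0, keepdims=True)              # marginal over classes
    nz = p > 0                                     # skip zero cells to avoid log(0)
    gmi = np.sum(p[nz] * np.log2(p[nz] / (pv * pc)[nz]))   # mutual information
    joint_entropy = -np.sum(p[nz] * np.log2(p[nz]))
    return gmi / joint_entropy

# A perfectly dependent attribute gives CAIR = 1.0:
print(cair(np.array([[10, 0], [0, 10]])))          # -> 1.0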
