Journal: Methods of Information in Medicine
Subtitle: A journal stressing, for more than 50 years, the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care
ISSN: 0026-1270

Focus Theme: Recent Developments in Boosting Methodology
Guest Editors: M. Schmid, T. Hothorn

DOI: http://dx.doi.org/10.3414/ME11-02-0020
Issue: 2012 (Vol. 51), Issue 2
Pages: 162-167

Multi-class HingeBoost: Method and Application to the Classification of Cancer Types Using Gene Expression Data

Z. Wang (1)

(1) Department of Research, Connecticut Children’s Medical Center, Department of Pediatrics, University of Connecticut School of Medicine, Hartford, Connecticut, USA

Keywords

Classification, variable selection, regression trees, boosting, smoothing splines

Summary

Background: Multi-class molecular cancer classification has potentially important clinical implications. Such applications require statistical methods that accurately classify cancer types using a small subset of the thousands of genes in the data.

Objectives: This paper presents a new functional gradient descent boosting algorithm that directly extends the HingeBoost algorithm from the binary case to the multi-class case without reducing the original problem to multiple binary problems.

Methods: The proposed HingeBoost minimizes a multi-class hinge loss by boosting. It has good theoretical properties: it implements the Bayes decision rule and provides a unifying framework for both equal and unequal misclassification costs. Furthermore, we propose Twin HingeBoost, which has better feature selection behavior than HingeBoost by reducing the number of ineffective covariates. Simulated data, benchmark data and two cancer gene expression data sets are used to evaluate the performance of the proposed approach.
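As a rough illustration of the kind of objective involved, the sketch below evaluates one widely used multi-class hinge loss (the multicategory SVM form with sum-to-zero scores) and the negative subgradient that a functional gradient descent step would fit with a base learner. The abstract does not state the exact loss, so this form is an assumption, and all function and parameter names (`center`, `multiclass_hinge_loss`, `negative_subgradient`, `predict`, `cost_row`) are hypothetical, not taken from the paper.

```python
# Minimal sketch, NOT the paper's implementation. Assumed loss for one sample:
#   sum over j != label of cost_row[j] * max(0, scores[j] + 1)
# with class scores constrained to sum to zero.

def center(scores):
    """Shift scores so they sum to zero (the assumed constraint)."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]

def multiclass_hinge_loss(scores, label, cost_row=None):
    """Loss for one sample; cost_row[j] is the (hypothetical) cost of
    predicting class j when the true class is `label`. Defaults to equal costs."""
    k = len(scores)
    if cost_row is None:
        cost_row = [1.0] * k
    return sum(cost_row[j] * max(0.0, scores[j] + 1.0)
               for j in range(k) if j != label)

def negative_subgradient(scores, label, cost_row=None):
    """Negative subgradient of the loss with respect to each class score.
    A boosting step would fit a base learner to these values."""
    k = len(scores)
    if cost_row is None:
        cost_row = [1.0] * k
    return [0.0 if j == label or scores[j] + 1.0 <= 0.0 else -cost_row[j]
            for j in range(k)]

def predict(scores):
    """Classify by the largest score."""
    return max(range(len(scores)), key=lambda j: scores[j])
```

With three classes and equal costs, the scores `[2.0, -1.0, -1.0]` give zero loss for true class 0, while uninformative scores `[0.0, 0.0, 0.0]` give a loss of 2.0. Passing an unequal `cost_row` reweights the penalty for each wrong class, which is one simple way to picture the unequal-cost setting mentioned above.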

Results: In simulations and on the benchmark data, the multi-class HingeBoost generated accurate predictions compared with the alternative methods, especially with high-dimensional covariates. In two cancer classification problems using gene expression data, the multi-class HingeBoost also produced predictions that were more accurate than, or comparable to, those of the alternative methods.

Conclusions: This work has shown that HingeBoost provides a powerful tool for multi-class classification problems. In many applications, classification accuracy and feature selection behavior can be further improved by using Twin HingeBoost.
