Abstract
As data become increasingly heterogeneous, multiple kernel learning methods may help to classify them. To overcome the restriction to a (multiple but) finite choice of kernels, we propose a novel method of ‘infinite’ kernel combinations for learning problems, using infinite and semi-infinite optimization. Considering all infinitesimally fine convex combinations of kernels from an infinite kernel set, the margin is maximized subject to an infinite number of constraints with a compact index set and an additional (Riemann–Stieltjes) integral constraint arising from the combinations. After a parametrization in the space of probability measures, we obtain a semi-infinite programming problem. We analyse regularity conditions (reduction ansatz) and discuss the type of density functions in the constraints and the resulting bilevel optimization problem. The proposed approach is implemented with the conceptual reduction method and tested on homogeneous and heterogeneous data; on the heterogeneous data it yields a better accuracy than single-kernel learning. We analyse the structure of the problems obtained and discuss structural frontiers, trade-offs and research challenges.
Acknowledgements
The authors cordially thank the three anonymous referees for their constructive criticism and Professors E. Anderson, U. Çapar, M. Goberna, and J. Shawe-Taylor for their valuable advice. This study was partially undertaken at the Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey, and the Faculty of Engineering and Natural Sciences, Sabancı University, Istanbul, Turkey.
Notes
Communication with Professor Eddie J. Anderson.
The matrix A = (aᵢⱼ) is strictly diagonally dominant if |aᵢᵢ| > Σ_{j≠i} |aᵢⱼ| for every row i.
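The row-wise condition above can be sketched as a short check; this is an illustrative helper (the function name and list-of-lists representation are assumptions, not from the paper):

```python
def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum_{j != i} |a_ij| for every row i of a square matrix A,
    given as a list of lists."""
    n = len(A)
    for i in range(n):
        off_diagonal = sum(abs(A[i][j]) for j in range(n) if j != i)
        if abs(A[i][i]) <= off_diagonal:
            return False
    return True
```

For example, [[3, 1], [1, 4]] satisfies the condition in both rows, while [[1, 2], [2, 1]] fails it in the first row.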
The weak topology on X is defined via its continuous dual space X*, which consists of all linear functionals from X into ℝ (or ℂ) that are continuous with respect to the strong topology.
Discussion with Professor Werner Römisch.
Available from http://archive.ics.uci.edu/ml/.