Academic Paper

Multiple instance learning.
Document Type
Book Review
Author
Herrera, Francisco (E-GRAN-CAI); Ventura, Sebastián (E-COR-C); Bello, Rafael (CB-LAVI-CIF); Cornelis, Chris (B-UGENT-AMS); Zafra, Amelia (E-COR-INA); Sánchez-Tarragó, Dánel (CB-LAVI-NDM); Vluymans, Sarah (B-UGENT-AMS)
Source
Subject
62 Statistics -- 62H Multivariate analysis
  62H30 Classification and discrimination; cluster analysis

62 Statistics -- 62J Linear inference, regression
  62J02 General nonlinear regression

68 Computer science -- 68T Artificial intelligence
  68T05 Learning and adaptive systems

68 Computer science -- 68W Algorithms
  68W40 Analysis of algorithms

94 Information and communication, circuits
  94-01 Instructional exposition

94 Information and communication, circuits -- 94A Communication, information
  94A08 Image processing
Language
English
Abstract
Preface: "Multiple instance learning (MIL) is a recent learning framework that has become very popular lately. In this framework, objects are represented as sets of feature vectors (or bags, in MIL terminology). This kind of representation is well suited for certain problems, such as the prediction of structure-activity relationships, image classification, document categorization or the prediction of protein binding sites. In fact, MIL provides a much more natural representation than the one used in classical machine learning, where a single feature vector is used per object.

"The first papers on MIL appeared in the early nineties. Their main interest is solving classification problems where input data are represented as multiple instances. This field, known as multiple instance classification (MIC), is the most popular subparadigm of MIL, but not the only one. In recent years, papers have also appeared on multiple instance regression (multi-instance learning with a continuous output) and multi-instance clustering. This book aims to present a general and comprehensible overview of the MIL paradigm, providing a formal definition of the framework and covering the subparadigms it comprises, the most relevant algorithms and the most representative applications.

"The book is divided into three main parts. The first part (Chaps. 1-3) introduces the most important concepts of the discipline that will be necessary to understand the remainder of the book. Chapter 1 contains some introductory concepts on knowledge discovery in databases, data preprocessing, and data mining, whereas Chap. 2 introduces the multiple instance learning paradigm from a descriptive perspective. The first part finishes with a chapter focused on MIC. Besides including a formal definition of the problem and a taxonomy for MIC algorithms, it also carries out a study of two important issues, namely distance metrics and alternative learning hypotheses.

"The second part of the book (Chaps. 4-7) provides an exhaustive review of the different MIL algorithms. Chapters 4 and 5 describe the most important classification algorithms, following the taxonomy introduced in Chap. 3. Chapter 6 introduces multiple instance regression, the other main task in supervised learning. Chapter 7 covers two unsupervised learning tasks in the MIL framework: clustering and association rule mining.

"The last part of the book (Chaps. 8-10) deals with other recent areas. Data reduction for MIL is addressed in Chap. 8. Chapter 9 discusses the problem of learning with imbalanced multi-instance data. Finally, Chap. 10 introduces multiple instance multiple label learning, a new learning framework that combines MIL with multi-label learning.

"The target audience for this book is anyone interested in a good understanding of this important paradigm of machine learning, as well as a deep description of the current state of the art in the discipline. Practitioners in the industry and enterprise should find new insights and possibilities in the breadth of the topics covered. Researchers and data scientists in universities, research centers, and companies could appreciate this comprehensive review and uncover new ideas for productive research efforts."
