Boosting for Comparison-Based Learning

International Joint Conference on Artificial Intelligence (IJCAI), 2019.

Keywords:
1-nearest neighbour algorithm, van der Maaten, triplet comparison, explicit representation, ordinal embedding

Abstract:

We consider the problem of classification in a comparison-based setting: given a set of objects, we only have access to triplet comparisons of the form "$x_i$ is closer to object $x_j$ than to object $x_k$." In this paper we introduce TripletBoost, a new method that can learn a classifier just from such triplet comparisons. The m…

Introduction
Highlights
  • In the past few years the problem of comparison-based learning has attracted growing interest in the machine learning community [Agarwal et al., 2007; Jamieson and Nowak, 2011; Tamuz et al., 2011; Tschopp et al., 2011; Van Der Maaten and Weinberger, 2012; Heikinheimo and Ukkonen, 2013; Amid and Ukkonen, 2015; Kleindessner and Luxburg, 2015; Jain et al., 2016; Haghiri et al., 2017; Kazemi et al., 2018]
  • The motivation is to relax the assumption that an explicit representation of the objects or a distance metric between pairs of examples are available
  • In this paper we focus on triplet comparisons of the form "object $x_i$ is closer to object $x_j$ than to object $x_k$", that is, on relations of the form $d(x_i, x_j) < d(x_i, x_k)$ where $d$ is an unknown metric (a minimal sketch of such triplet queries appears after this list)
  • We address the problem of classification with noisy triplets that have been obtained in a passive manner: the examples lie in an unknown metric space, not necessarily Euclidean, and the answers to the triplet comparisons can be noisy
  • To deal with this problem one can try to first recover an explicit representation of the examples, a task that can be solved by ordinal embedding approaches [Agarwal et al., 2007; Van Der Maaten and Weinberger, 2012; Terada and von Luxburg, 2014; Jain et al., 2016], and then apply standard machine learning approaches
  • We consider three different regimes where 1%, 5% or 10% of the triplets are available, and three noise levels, the lowest being 0%. [Figure panels: (a) Gisette, Metric: Euclidean, varying noise; (b) Gisette, Metric: Euclidean, Proportion of Triplets: 5%; (c) Moons, Metric: Euclidean, Proportion of Triplets: 5%]
  • To put the results obtained in the triplet setting in perspective, we consider two methods that use the original Euclidean representations of the data, the 1-nearest neighbour algorithm (1NN) and AdaBoost.SAMME (SAMME) [Hastie et al., 2009]
Methods
  • The authors propose an empirical evaluation of TripletBoost. The authors consider six datasets of varying scales and four baselines.

    Baselines.
  • The authors use tSTE [Van Der Maaten and Weinberger, 2012] to embed the triplets in a Euclidean space, followed by the 1-nearest neighbour algorithm for classification (a sketch of this baseline is given after this list).
  • To the best of the authors' knowledge, TripletBoost is the only method able to do classification using only passively obtained triplets.
  • To put the results obtained in the triplet setting in perspective, the authors consider two methods that use the original Euclidean representations of the data, the 1-nearest neighbour algorithm (1NN) and AdaBoost.SAMME (SAMME) [Hastie et al., 2009]
Results
  • Since the authors are in a multi-label setting, they would like to predict how relevant each genre is for a new movie rather than a single genre.
  • To obtain such a quantity, the authors ignore the arg max in Equation (5) of the main paper, yielding a classifier H(m, y) that predicts the weight of genre y for movie m (a sketch of the resulting form is given below).
Conclusion
  • In this paper the authors proposed TripletBoost to address the problem of comparison-based classification.
  • The authors derived a new lower bound showing that, to avoid learning a random predictor, at least Ω(n√n) triplets are needed.
  • In practice the authors have shown that, given a sufficient amount of triplets, the method is competitive with state-of-the-art methods and is quite resistant to noise.
Summary
  • Introduction:

    In the past few years the problem of comparison-based learning has attracted growing interest in the machine learning community [Agarwal et al., 2007; Jamieson and Nowak, 2011; Tamuz et al., 2011; Tschopp et al., 2011; Van Der Maaten and Weinberger, 2012; Heikinheimo and Ukkonen, 2013; Amid and Ukkonen, 2015; Kleindessner and Luxburg, 2015; Jain et al., 2016; Haghiri et al., 2017; Kazemi et al., 2018].
  • The authors assume that the answers to the triplet comparisons can be noisy
  • To deal with this problem one can try to first recover an explicit representation of the examples, a task that can be solved by ordinal embedding approaches [Agarwal et al., 2007; Van Der Maaten and Weinberger, 2012; Terada and von Luxburg, 2014; Jain et al., 2016], and then apply standard machine learning approaches.
  • To the best of the authors' knowledge, for the case of passively obtained triplets, this problem has not yet been solved in the literature
  • Objectives:

    Given the triplets in T and the label information of all points, the goal is to learn a classifier.
  • Methods:

    The authors propose an empirical evaluation of TripletBoost. The authors consider six datasets of varying scales and four baselines.

    Baselines.
  • The authors use tSTE [Van Der Maaten and Weinberger, 2012] to embed the triplets in a Euclidean space, followed by the 1-nearest neighbour algorithm for classification.
  • To the best of the authors' knowledge, TripletBoost is the only method able to do classification using only passively obtained triplets.
  • To put the results obtained in the triplet setting in perspective, the authors consider two methods that use the original Euclidean representations of the data, the 1-nearest neighbour algorithm (1NN) and AdaBoost.SAMME (SAMME) [Hastie et al., 2009]
  • Results:

    Since the authors are in a multi-label setting, they would like to predict how relevant each genre is for a new movie rather than a single genre.
  • To obtain such a quantity, the authors ignore the arg max in Equation (5) of the main paper, yielding a classifier H(m, y) that predicts the weight of genre y for movie m.
  • Conclusion:

    In this paper the authors proposed TripletBoost to address the problem of comparison-based classification.
  • The authors derived a new lower bound showing that, to avoid learning a random predictor, at least Ω(n√n) triplets are needed.
  • In practice the authors have shown that, given a sufficient amount of triplets, the method is competitive with state-of-the-art methods and is quite resistant to noise.
Tables
  • Table 1: Summary of the different datasets
Funding
  • Ulrike von Luxburg acknowledges funding by the DFG through the Institutional Strategy of the University of Tübingen (DFG, ZUK 63) and the Cluster of Excellence EXC 2064/1, project number 390727645.
Study subjects and analysis
datasets: 6
We propose an empirical evaluation of TripletBoost. We consider six datasets of varying scales and four baselines.

Baselines. First, we consider an embedding approach.

datasets: 6
Datasets and performance measure. We consider six datasets: Iris, Moons, Gisette, Cod-rna, MNIST, and kMNIST. For each dataset we generate some triplets as in Equation (1) using three metrics: the Euclidean, Cosine, and Cityblock distances (details provided in Appendix C).

users: 6040
As a proof of concept we considered the MovieLens 1M dataset [Harper and Konstan, 2016]. It contains 1 million ratings from 6040 users on 3706 movies. We used the users’ ratings to obtain some triplets about the movies and TripletBoost to learn a classifier able to predict the genres of a new movie (more details are given in Appendix D).

References
  • Sameer Agarwal, Josh Wills, Lawrence Cayton, Gert Lanckriet, David Kriegman, and Serge Belongie. Generalized non-metric multidimensional scaling. In Artificial Intelligence and Statistics, 2007.
  • Nir Ailon. An active learning algorithm for ranking from pairwise preferences with an almost optimal query complexity. Journal of Machine Learning Research, 13(Jan), 2012.
  • Ehsan Amid and Antti Ukkonen. Multiview triplet embedding: Learning attributes in multiple maps. In International Conference on Machine Learning, 2015.
  • Aurélien Bellet, Amaury Habrard, and Marc Sebban. Metric learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 9(1), 2015.
  • Leo Breiman. Prediction games and arcing algorithms. Neural Computation, 11(7), 1999.
  • Tarin Clanuwat, Mikel Bober-Irizar, Asanobu Kitamoto, Alex Lamb, Kazuaki Yamamoto, and David Ha. Deep learning for classical Japanese literature. arXiv preprint arXiv:1812.01718v1, 2018.
  • Dheeru Dua and Efi Karra Taniskidou. UCI machine learning repository, 2017.
  • Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 1997.
  • Wei Gao and Zhi-Hua Zhou. On the doubt about margin explanation of boosting. Artificial Intelligence, 203, 2013.
  • Isabelle Guyon, Steve Gunn, Asa Ben-Hur, and Gideon Dror. Result analysis of the NIPS 2003 feature selection challenge. In Neural Information Processing Systems, 2005.
  • Siavash Haghiri, Debarghya Ghoshdastidar, and Ulrike von Luxburg. Comparison-based nearest neighbor search. In Artificial Intelligence and Statistics, 2017.
  • F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems, 5(4), 2016.
  • Trevor Hastie, Saharon Rosset, Ji Zhu, and Hui Zou. Multi-class AdaBoost. Statistics and Its Interface, 2(3), 2009.
  • Hannes Heikinheimo and Antti Ukkonen. The crowd-median algorithm. In First AAAI Conference on Human Computation and Crowdsourcing, 2013.
  • Lalit Jain, Kevin G. Jamieson, and Rob Nowak. Finite sample prediction and recovery bounds for ordinal embedding. In Neural Information Processing Systems, 2016.
  • Kevin G. Jamieson and Robert D. Nowak. Low-dimensional embedding using adaptively selected ordinal data. In Conference on Communication, Control, and Computing, 2011.
  • Daniel M. Kane, Shachar Lovett, Shay Moran, and Jiapeng Zhang. Active classification with comparison queries. In Foundations of Computer Science, 2017.
  • Ehsan Kazemi, Lin Chen, Sanjoy Dasgupta, and Amin Karbasi. Comparison based learning from weak oracles. arXiv preprint arXiv:1802.06942v1, 2018.
  • Matthäus Kleindessner and Ulrike Luxburg. Dimensionality estimation without distances. In Artificial Intelligence and Statistics, 2015.
  • Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner, et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 1998.
  • F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2011.
  • Robert E. Schapire and Yoav Freund. Boosting: Foundations and Algorithms. MIT Press, 2012.
  • Robert E. Schapire and Yoram Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37(3), 1999.
  • Robert E. Schapire and Yoram Singer. BoosTexter: A boosting-based system for text categorization. Machine Learning, 39(2-3), 2000.
  • Robert E. Schapire, Yoav Freund, Peter Bartlett, and Wee Sun Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, 1998.
  • Omer Tamuz, Ce Liu, Serge Belongie, Ohad Shamir, and Adam Tauman Kalai. Adaptively learning the crowd kernel. In International Conference on Machine Learning, 2011.
  • Yoshikazu Terada and Ulrike von Luxburg. Local ordinal embedding. In International Conference on Machine Learning, 2014.
  • Dominique Tschopp, Suhas Diggavi, Payam Delgosha, and Soheil Mohajer. Randomized algorithms for comparison-based search. In Neural Information Processing Systems, 2011.
  • Andrew V. Uzilov, Joshua M. Keegan, and David H. Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinformatics, 7(1), 2006.
  • Laurens Van Der Maaten and Kilian Weinberger. Stochastic triplet embedding. In Machine Learning for Signal Processing, 2012.
  • Liwei Wang, Masashi Sugiyama, Zhaoxiang Jing, Cheng Yang, Zhi-Hua Zhou, and Jufu Feng. A refined margin analysis for boosting algorithms via equilibrium margin. Journal of Machine Learning Research, 12(Jun), 2011.
  • This lower bound does not contradict existing results [Ailon, 2012; Jamieson and Nowak, 2011; Jain et al., 2016]. They were developed in the different context of triplet recovery, where the goal is not classification, but to predict the outcome of unobserved triplet questions. For example, it has been shown that to exactly recover all the triplets, the number of passively available triplets should scale in Ω(n³) [Jamieson and Nowak, 2011]. Similarly, Jain et al. [2016] derive a finite error bound for approximate recovery of the Euclidean Gram matrix. Our bound shows that, in a classification setting, it might be possible to do better than that. To complete the picture, one would need to derive an upper bound on the number of triplets necessary for good classification accuracy.
  • As a proof of concept we considered the MovieLens 1M dataset [Harper and Konstan, 2016]. This dataset contains 1 million ratings from 6040 users on 3706 movies, and each movie has one or several genres (there are 18 genres in total). To demonstrate the interest of our approach we proposed (i) to use the users’ ratings to obtain some triplets of the form "movie m_i is closer to movie m_j than to movie m_k", and (ii) to use TripletBoost to learn a classifier predicting the genres of the movies.