Kernel Trick Embedded Gaussian Mixture Model

ALGORITHMIC LEARNING THEORY, PROCEEDINGS, (2003): 159-174

Cited by: 34 | Views: 93
Indexed in: WOS, SCOPUS, EI

Abstract

In this paper, we present a kernel trick embedded Gaussian Mixture Model (GMM), called kernel GMM. The basic idea is to embed kernel trick into EM algorithm and deduce a parameter estimation algorithm for GMM in feature space. Kernel GMM could be viewed as a Bayesian Kernel Method. Compared with most classical kernel methods, the proposed...
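For orientation, a minimal NumPy sketch of the standard EM updates for a GMM with diagonal covariances in input space is given below; the paper's contribution is to carry out such updates implicitly in a kernel-induced feature space, and its exact kernelized update equations (its Table 2) are not reproduced here. The toy data and all names in this sketch are illustrative assumptions, not the authors' code.

    import numpy as np

    def em_gmm(X, K=2, iters=50, seed=0):
        """Standard EM loop for a GMM with diagonal covariances (input space)."""
        rng = np.random.default_rng(seed)
        N, D = X.shape
        means = X[rng.choice(N, K, replace=False)]   # init means from random samples
        variances = np.ones((K, D))
        weights = np.full(K, 1.0 / K)
        for _ in range(iters):
            # E-step: responsibilities r[n, k] = p(component k | x_n)
            log_p = (
                -0.5 * np.sum((X[:, None, :] - means[None]) ** 2 / variances[None], axis=2)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
                + np.log(weights)[None, :]
            )
            log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
            r = np.exp(log_p)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: re-estimate weights, means and variances from responsibilities
            Nk = r.sum(axis=0)
            weights = Nk / N
            means = (r.T @ X) / Nk[:, None]
            variances = (r.T @ X**2) / Nk[:, None] - means**2 + 1e-6
        return weights, means, variances

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2.0, 1.0, (200, 2)), rng.normal(3.0, 1.0, (200, 2))])
    print(em_gmm(X)[0])   # mixing weights, roughly [0.5, 0.5] on this toy data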

Introduction
  • The kernel trick is an efficient method for nonlinear data analysis, first popularized by the Support Vector Machine (SVM) [18] (a minimal illustration of the kernel trick follows this list).
  • In many cases, one needs to obtain risk-minimization results and to incorporate prior knowledge, both of which can be provided within a Bayesian probabilistic framework.
  • This motivates combining the kernel trick with Bayesian methods, an approach called the Bayesian Kernel Method [16].
  • Since the Bayesian Kernel Method works in a probabilistic framework, it can realize Bayesian optimal decisions and estimate confidence or reliability with probabilistic criteria such as Maximum-A-Posteriori (MAP) estimation [5].
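The kernel trick mentioned above can be illustrated with a short sketch: inner products in a high-dimensional feature space are computed through a kernel function, without ever constructing the feature map explicitly. The RBF kernel, its parameter, and the toy data below are illustrative assumptions, not choices taken from this paper.

    import numpy as np

    def rbf_kernel(X, Y, gamma=0.5):
        """Gram matrix K[i, j] = exp(-gamma * ||x_i - y_j||^2) = phi(x_i) . phi(y_j)."""
        sq_dists = (
            np.sum(X**2, axis=1)[:, None]
            + np.sum(Y**2, axis=1)[None, :]
            - 2.0 * X @ Y.T
        )
        return np.exp(-gamma * sq_dists)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))          # 100 samples, 5 input dimensions
    K = rbf_kernel(X, X)                   # 100 x 100 Gram matrix in feature space
    print(K.shape, np.allclose(K, K.T))    # symmetric, positive semi-definite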
Highlights
  • The kernel trick is an efficient method for nonlinear data analysis, first popularized by the Support Vector Machine (SVM) [18]
  • We review background knowledge including the kernel trick, the Gaussian Mixture Model estimated by the EM algorithm, and the Bayesian Kernel Method
  • We present a kernel Gaussian Mixture Model (kGMM) and derive a parameter estimation algorithm by embedding the kernel trick into the EM algorithm (a rough sketch of the idea follows this list)
  • We adopt a Monte Carlo sampling technique to speed up the kernel Gaussian Mixture Model on large-scale problems, making it more practical and efficient
  • Our future work will focus on incorporating prior knowledge such as invariance into the kernel Gaussian Mixture Model and enriching its applications
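One hedged way to convey the idea of a mixture model operating in a kernel-induced feature space is to project the data with kernel PCA and then fit an ordinary GMM on the projections. This is only an approximation for illustration and is not the kGMM algorithm itself, whose EM updates work with the kernel matrix directly; the scikit-learn calls, kernel parameters, and toy data below are assumptions of this sketch.

    import numpy as np
    from sklearn.decomposition import KernelPCA
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # Two concentric noisy rings: a nonlinear structure a plain GMM handles poorly.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=400)
    radii = np.where(rng.random(400) < 0.5, 1.0, 3.0)
    X = np.c_[radii * np.cos(angles), radii * np.sin(angles)]
    X += 0.1 * rng.normal(size=X.shape)

    # Project into a kernel feature space, then run a standard GMM there.
    Z = KernelPCA(n_components=2, kernel="rbf", gamma=1.0).fit_transform(X)
    gmm = GaussianMixture(n_components=2, covariance_type="diag", random_state=0).fit(Z)
    print(gmm.predict_proba(Z).shape)   # posterior responsibilities, shape (400, 2)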
Conclusion
  • Computational cost and speed-up techniques on large-scale problems: when the kernel trick is employed, the computational cost of kernel eigen-decomposition based methods is dominated by the eigen-decomposition step.
  • If the sample size N is not very large (e.g. N ≤ 1,000), obtaining the full eigen-decomposition is not a problem; beyond that, a sampling-based approximation becomes attractive (a sketch follows this list).
  • Compared with most classical kernel methods, kGMM can solve problems in a probabilistic framework.
  • It can tackle nonlinear problems better than the traditional GMM.
  • The authors' future work will focus on incorporating prior knowledge such as invariance into kGMM and enriching its applications.
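The cost argument above can be made concrete with a small sketch: eigen-decomposing the full N x N kernel matrix costs O(N^3), whereas sampling m << N landmark points reduces the decomposition to an m x m problem whose eigenvectors can then be extended to all N points. The Nystrom-style extension shown here is a generic illustration of such Monte Carlo speed-ups, not necessarily the exact sampling scheme used in the paper.

    import numpy as np

    def rbf_kernel(X, Y, gamma=0.5):
        d = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
        return np.exp(-gamma * d)

    rng = np.random.default_rng(0)
    N, m = 5000, 200                            # full size vs. number of sampled landmarks
    X = rng.normal(size=(N, 3))

    idx = rng.choice(N, size=m, replace=False)  # Monte Carlo landmark subset
    K_mm = rbf_kernel(X[idx], X[idx])           # small m x m kernel matrix
    K_nm = rbf_kernel(X, X[idx])                # N x m cross-kernel block

    # Eigen-decompose only the m x m block (O(m^3) instead of O(N^3)) ...
    eigvals, eigvecs = np.linalg.eigh(K_mm)
    eigvals = np.clip(eigvals, 1e-12, None)
    # ... then extend to all N points (Nystrom extension, up to a sqrt(m/N) scaling).
    approx_vecs = K_nm @ eigvecs / eigvals
    print(approx_vecs.shape)                    # (5000, 200)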
Tables
  • Table 1: Notation List
  • Table 2: Parameter Estimation Algorithm for kGMM
  • Table 3: Comparison results on USPS data set
References
  • Achlioptas, D., McSherry, F. and Schölkopf, B.: Sampling techniques for kernel methods. In Advances in Neural Information Processing Systems (NIPS) 14, MIT Press, Cambridge MA (2002)
  • Bilmes, J. A.: A Gentle Tutorial on the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models. Technical Report ICSI-TR-97-021, UC Berkeley (1997)
  • Bishop, C. M.: Neural Networks for Pattern Recognition. Oxford University Press (1995)
  • Dahmen, J., Keysers, D., Ney, H. and Güld, M. O.: Statistical Image Object Recognition using Mixture Densities. Journal of Mathematical Imaging and Vision, 14(3) (2001) 285-296
  • Duda, R. O., Hart, P. E. and Stork, D. G.: Pattern Classification, 2nd Edition. John Wiley & Sons, New York (2001)
  • Everitt, B. S.: An Introduction to Latent Variable Models. Chapman and Hall, London (1984)
  • Bach, F. R. and Jordan, M. I.: Kernel Independent Component Analysis. Journal of Machine Learning Research, 3 (2002) 1-48
  • Gestel, T. V., Suykens, J. A. K., Lanckriet, G., Lambrechts, A., De Moor, B. and Vandewalle, J.: Bayesian framework for least squares support vector machine classifiers, Gaussian processes and kernel Fisher discriminant analysis. Neural Computation, 15(5) (2002) 1115-1148
  • Herbrich, R., Graepel, T. and Campbell, C.: Bayes Point Machines: Estimating the Bayes Point in Kernel Space. In Proceedings of the International Joint Conference on Artificial Intelligence Workshop on Support Vector Machines (1999) 23-27
  • Kwok, J. T.: The Evidence Framework Applied to Support Vector Machines. IEEE Transactions on Neural Networks, Vol. 11 (2000) 1162-1173
  • Mika, S., Rätsch, G., Weston, J., Schölkopf, B. and Müller, K. R.: Fisher discriminant analysis with kernels. In IEEE Workshop on Neural Networks for Signal Processing IX (1999) 41-48
  • Mjolsness, E. and DeCoste, D.: Machine Learning for Science: State of the Art and Future Prospects. Science, Vol. 293 (2001)
  • Roberts, S. J.: Parametric and Non-Parametric Unsupervised Cluster Analysis. Pattern Recognition, Vol. 30, No. 2 (1997) 261-272
  • Schölkopf, B., Smola, A. J. and Müller, K. R.: Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, 10(5) (1998) 1299-1319
  • Schölkopf, B., Mika, S., Burges, C. J. C., Knirsch, P., Müller, K. R., Rätsch, G. and Smola, A.: Input Space vs. Feature Space in Kernel-Based Methods. IEEE Transactions on Neural Networks, Vol. 10, No. 5 (1999) 1000-1017
  • Schölkopf, B. and Smola, A. J.: Learning with Kernels: Support Vector Machines, Regularization and Beyond. MIT Press, Cambridge MA (2002)
  • Tipping, M. E.: Sparse Bayesian Learning and the Relevance Vector Machine. Journal of Machine Learning Research (2001)
  • Vapnik, V.: The Nature of Statistical Learning Theory, 2nd Edition. Springer-Verlag, New York (1997)
  • Williams, C. and Seeger, M.: Using the Nyström Method to Speed Up Kernel Machines. In T. K. Leen, T. G. Dietterich and V. Tresp, editors, Advances in Neural Information Processing Systems (NIPS) 13, MIT Press, Cambridge MA (2001)
  • Shawe-Taylor, J., Williams, C., Cristianini, N. and Kandola, J.: On the Eigenspectrum of the Gram Matrix and Its Relationship to the Operator Eigenspectrum. In N. Cesa-Bianchi et al. (Eds.): ALT 2002, LNAI 2533, Springer-Verlag, Berlin Heidelberg (2002) 23-40
  • Ng, A. Y., Jordan, M. I. and Weiss, Y.: On Spectral Clustering: Analysis and an Algorithm. In Advances in Neural Information Processing Systems (NIPS) 14, MIT Press, Cambridge MA (2002)
  • Moghaddam, B. and Pentland, A.: Probabilistic visual learning for object representation. IEEE Transactions on PAMI, Vol. 19, No. 7 (1997) 696-710