Challenges with EM in application to weakly identifiable mixture models.
arXiv: Statistics Theory (2019)
Abstract
We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d. samples are known to have lower accuracy than the classical $n^{-\frac{1}{2}}$ error. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We first demonstrate via simulation studies a broad range of over-specified mixture models for which the EM algorithm converges very slowly, both in one and higher dimensions. We then provide a complete analytical characterization of this behavior for fitting data generated from a multivariate standard normal distribution using a two-component Gaussian mixture with varying location and scale parameters. Our results reveal distinct regimes in the convergence behavior of EM as a function of the dimension $d$. In the multivariate setting ($d \geq 2$), when the covariance matrix is constrained to be a multiple of the identity matrix, the EM algorithm converges in order $(n/d)^{\frac{1}{2}}$ steps and returns estimates that are at a Euclidean distance of order $(n/d)^{-\frac{1}{4}}$ and $(nd)^{-\frac{1}{2}}$ from the true location and scale parameters, respectively. In the univariate setting ($d = 1$), on the other hand, the EM algorithm converges in order $n^{\frac{3}{4}}$ steps and returns estimates that are at a Euclidean distance of order $n^{-\frac{1}{8}}$ and $n^{-\frac{1}{4}}$ from the true location and scale parameters, respectively. Establishing the slow rates in the univariate setting requires a novel localization argument with two stages, each involving an epoch-based argument applied to a different surrogate EM operator at the population level. We also give multivariate ($d \geq 2$) examples, involving more general covariance matrices, that exhibit the same slow rates as the univariate case.
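To make the over-specified setting concrete, the following minimal sketch (not the authors' code) runs sample EM for a symmetric two-component location-scale fit $\frac{1}{2}N(\theta, \sigma^2 I_d) + \frac{1}{2}N(-\theta, \sigma^2 I_d)$ on data drawn from the true model $N(0, I_d)$, where the symmetric parameterization and all variable names are assumptions made for illustration; one can watch $\|\theta\|$ and $|\sigma^2 - 1|$ shrink only very slowly across iterations, in line with the rates described above.

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

def em_step(X, theta, sigma2):
    """One EM update for the over-specified symmetric mixture
    0.5 * N(theta, sigma2 * I_d) + 0.5 * N(-theta, sigma2 * I_d)."""
    n, d = X.shape
    # E-step: posterior weight of the +theta component; the sigmoid
    # form follows from the symmetry of the two components.
    w = expit(2.0 * (X @ theta) / sigma2)
    # M-step: closed-form updates for the location and common scale.
    theta_new = ((2.0 * w - 1.0)[:, None] * X).mean(axis=0)
    resid = (np.sum(X**2, axis=1) + np.sum(theta_new**2)
             - 2.0 * (2.0 * w - 1.0) * (X @ theta_new))
    sigma2_new = resid.mean() / d
    return theta_new, sigma2_new

rng = np.random.default_rng(0)
n, d = 100_000, 2
X = rng.standard_normal((n, d))        # true model: N(0, I_d)
theta, sigma2 = np.full(d, 0.5), 1.5   # arbitrary initialization
for t in range(2000):
    theta, sigma2 = em_step(X, theta, sigma2)
print(np.linalg.norm(theta), abs(sigma2 - 1.0))
```

Repeating the run with $d = 1$ versus $d \geq 2$ (and varying $n$) is one way to observe empirically the dimension-dependent regimes the paper characterizes analytically.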