Adversarially learned likelihood-ratio
2018
Abstract
We link the reverse KL divergence with adversarial learning. This insight enables learning to synthesize realistic samples in two settings: (i) given a set of samples from the true distribution, an adversarially learned likelihood ratio and a new entropy bound are used to learn a GAN model that improves synthesized-sample quality relative to previous GAN variants; (ii) given an unnormalized distribution, a reference-based framework is proposed to learn to draw samples, naturally yielding an adversarial scheme to amortize MCMC/SVGD samples. Experimental results show the improved performance of the derived algorithms.

1 BACKGROUND ON THE REVERSE KL DIVERGENCE

Target Distribution. Assume we are given a set of samples $\mathcal{D} = \{x_i\}_{i=1,\dots,N}$, with each sample assumed drawn i.i.d. from an unknown distribution $q(x)$. For $x \in \mathcal{X}$, let $S_q \subset \mathcal{X}$ denote the support of $q$, i.e., $S_q$ is the smallest subset of $\mathcal{X}$ for which $\int_{S_q} q(x)\,dx = 1$ (or $\int_{S_q} q(x)\,dx = 1 - \epsilon$, for $\epsilon \to 0$). Let $\bar{S}_q$ denote the complement of $S_q$, i.e., $S_q \cup \bar{S}_q = \mathcal{X}$ and $S_q \cap \bar{S}_q = \emptyset$.

Model Distribution. We desire a model $p_\theta(x)$ from which one can efficiently draw samples that approximate samples from $q(x)$, implemented as $x = f_\theta(z)$ with $z \sim p(z)$, where $p(z)$ is a distribution that is easy to sample from, and $f_\theta(z)$ is a deterministic nonlinear function with parameters $\theta$ to be learned. Similarly, let $S_{p_\theta}$ denote the support of $p_\theta$, with $S_{p_\theta} \cup \bar{S}_{p_\theta} = \mathcal{X}$ and $S_{p_\theta} \cap \bar{S}_{p_\theta} = \emptyset$.

The reverse KL divergence between these two distributions is
$$\mathrm{KL}(p_\theta(x)\,\|\,q(x)) = \mathbb{E}_{p_\theta(x)} \log\left[\frac{p_\theta(x)}{q(x)}\right] = -h(p_\theta(x)) - \mathbb{E}_{p_\theta(x)} \log q(x). \quad (1)$$

• The first term is the differential entropy $h(p_\theta(x))$, which encourages $p_\theta(x)$ to spread over as wide a support as possible.
• The second term can be further decomposed as
$$\mathbb{E}_{p_\theta(x)} \log q(x) = \int_{S_{p_\theta} \cap S_q} p_\theta(x) \log q(x)\,dx + \int_{S_{p_\theta} \cap \bar{S}_q} p_\theta(x) \log q(x)\,dx,$$
where $\int_{S_{p_\theta} \cap \bar{S}_q} p_\theta(x) \log q(x)\,dx$ introduces a strong negative penalty, since $\log q(x) \to -\infty$ outside $S_q$. Hence $S_{p_\theta} \cap \bar{S}_q = \emptyset$ is encouraged, implying $S_{p_\theta} \subseteq S_q$. When $S_{p_\theta} \subset S_q$, "mode collapse" is manifested. (A toy numerical illustration of this behavior is given at the end of this excerpt.)

It can be seen that the goals of the two terms in the reverse KL objective complement each other. We advocate minimizing $\mathrm{KL}(p_\theta \| q)$ as a promising approach to learning a model $p_\theta(x)$ that characterizes $q(x)$. Below, we discuss two distinct setups for learning $p_\theta(x)$: when only a sample set from $q(x)$ is available (Section 2), and when only an unnormalized density form of $q(x)$ is available (Section 3).

2 LEARNING WITH SAMPLES

Learning $p_\theta(x)$ from a set of samples drawn from $q(x)$ is exactly the problem setup of Generative Adversarial Networks (GANs) (Goodfellow et al., 2014). One may train a $\psi$-parameterized discriminator $g_\psi(x)$ to estimate the likelihood ratio $\log(p_\theta(x)/q(x))$ (Kanamori et al., 2010; Mohamed & Lakshminarayanan, 2016; Mescheder et al., 2016; Gutmann & Hyvärinen, 2010):
$$\hat{\psi} = \arg\max_\psi \left[ \mathbb{E}_{p(z)} \log \sigma(g_\psi(f_\theta(z))) + \mathbb{E}_{q(x)} \log(1 - \sigma(g_\psi(x))) \right], \quad (2)$$
where $\sigma(\cdot)$ is the sigmoid function; at the optimum, $\sigma(g_{\hat{\psi}}(x)) = p_\theta(x)/(p_\theta(x) + q(x))$, so $g_{\hat{\psi}}(x) = \log(p_\theta(x)/q(x))$. In contrast to the maximum likelihood setup, which minimizes $\mathrm{KL}(q(x)\,\|\,p_\theta(x))$ …
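To make the estimator in Eq. (2) concrete, below is a minimal PyTorch sketch of the ratio-estimation step. The network architectures, the Gaussian prior for $p(z)$, and the names (Generator, Discriminator, ratio_step) are illustrative assumptions, not the paper's implementation; the discriminator outputs a raw logit, and the two logistic terms of Eq. (2) are maximized via binary cross-entropy.

```python
# Minimal sketch (assumed architecture/names) of the ratio-estimation step in Eq. (2).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):           # x = f_theta(z)
    def __init__(self, z_dim=16, x_dim=2, h=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, h), nn.ReLU(),
                                 nn.Linear(h, x_dim))
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):       # g_psi(x): raw logit = estimated log p_theta(x)/q(x)
    def __init__(self, x_dim=2, h=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU(),
                                 nn.Linear(h, 1))
    def forward(self, x):
        return self.net(x).squeeze(-1)

def ratio_step(g_psi, f_theta, x_real, opt_psi, z_dim=16):
    """One ascent step on Eq. (2): push sigma(g_psi) -> 1 on model samples, -> 0 on data."""
    z = torch.randn(x_real.size(0), z_dim)      # assumed prior p(z) = N(0, I)
    x_fake = f_theta(z).detach()                # theta is held fixed while updating psi
    logits_fake = g_psi(x_fake)
    logits_real = g_psi(x_real)
    # Minimizing this BCE sum over psi = maximizing
    # E_{p(z)} log sigma(g(f(z))) + E_q log(1 - sigma(g(x))) from Eq. (2).
    loss = (F.binary_cross_entropy_with_logits(logits_fake, torch.ones_like(logits_fake))
            + F.binary_cross_entropy_with_logits(logits_real, torch.zeros_like(logits_real)))
    opt_psi.zero_grad()
    loss.backward()
    opt_psi.step()
    return loss.item()

# usage (hypothetical):
# f_theta, g_psi = Generator(), Discriminator()
# opt_psi = torch.optim.Adam(g_psi.parameters(), lr=1e-4)
# loss = ratio_step(g_psi, f_theta, x_batch, opt_psi)
```

Note the label convention matches Eq. (2): model samples are the "positive" class, so the optimal logit $g_\psi(x)$ approximates $\log(p_\theta(x)/q(x))$; flipping the labels would instead estimate $\log(q(x)/p_\theta(x))$.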
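Finally, the toy computation referenced in Section 1 (an illustrative aside, not from the paper): on a three-cell discrete example, the reverse KL of Eq. (1) tolerates collapsing onto a subset of $S_q$ but sharply penalizes placing mass where $q$ is near zero, whereas the forward KL does the opposite.

```python
# Toy illustration (not from the paper): the support argument behind Eq. (1).
import numpy as np

def kl(p, q):
    """Discrete KL divergence: sum_i p_i * log(p_i / q_i)."""
    return float(np.sum(p * np.log(p / q)))

eps = 1e-6
q = np.array([0.5 - eps / 2, 0.5 - eps / 2, eps])  # q's mass sits on the first two cells (S_q)

p_collapse = np.array([1.0 - 2 * eps, eps, eps])   # covers only one q-mode (mode collapse)
p_leak     = np.array([0.4, 0.4, 0.2])             # leaks 20% of its mass outside S_q

print(kl(p_collapse, q))  # reverse KL ~= 0.69: collapsing onto one mode is cheap
print(kl(p_leak, q))      # reverse KL ~= 2.26: mass on the near-zero cell is punished hard
print(kl(q, p_collapse))  # forward KL ~= 6.2: now the *missed* mode is what gets punished
```

This matches the support argument in Section 1: minimizing $\mathrm{KL}(p_\theta \| q)$ drives $S_{p_\theta} \subseteq S_q$, with only the entropy term $-h(p_\theta)$ in Eq. (1) pushing back against collapse.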