On the Use of Random Forest for Two-Sample Testing

COMPUTATIONAL STATISTICS & DATA ANALYSIS (2019)

Abstract
We follow the line of work that uses classifiers for two-sample testing and propose several tests based on the Random Forest classifier. The proposed tests are easy to use, require no tuning, and are applicable to any distribution on $\mathbb{R}^p$, even in high dimensions. We provide a comprehensive treatment of the use of classification for two-sample testing, derive the distribution of our tests under the null hypothesis, and provide a power analysis, both in theory and with simulations. To simplify the use of the method, we also provide the R package "hypoRF".
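The general idea of a classifier-based two-sample test can be sketched as follows: label the two samples, train a classifier on one part of the data, and check whether its held-out accuracy is better than chance. This is a minimal illustration only, not the paper's test statistics and not the "hypoRF" API. To stay dependency-free it swaps in a nearest-centroid classifier for the Random Forest, and all function names (`classifier_two_sample_test`, `binomial_upper_tail`, and so on) are hypothetical. It also assumes equal-sized held-out sets, so that the chance accuracy under the null hypothesis is 1/2 and the number of correct predictions can be compared against a Binomial(n, 1/2) reference.

```python
import math
import random


def binomial_upper_tail(k, n, p=0.5):
    """Return P(K >= k) for K ~ Binomial(n, p), clamped to at most 1."""
    tail = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
    return min(1.0, tail)


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(pt[j] for pt in points) / len(points) for j in range(dim)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def classifier_two_sample_test(sample_x, sample_y, train_fraction=0.5, seed=0):
    """Return (held-out accuracy, one-sided p-value) for H0: same distribution.

    Assumption: a nearest-centroid classifier stands in for a Random Forest.
    """
    rng = random.Random(seed)
    xs, ys = list(sample_x), list(sample_y)
    rng.shuffle(xs)
    rng.shuffle(ys)
    cut_x = int(len(xs) * train_fraction)
    cut_y = int(len(ys) * train_fraction)

    # "Train" the stand-in classifier: one centroid per sample.
    center_x = centroid(xs[:cut_x])
    center_y = centroid(ys[:cut_y])

    # Evaluate on the held-out points (label 0 = from X, label 1 = from Y).
    test = [(pt, 0) for pt in xs[cut_x:]] + [(pt, 1) for pt in ys[cut_y:]]
    correct = 0
    for pt, label in test:
        closer_to_x = squared_distance(pt, center_x) <= squared_distance(pt, center_y)
        predicted = 0 if closer_to_x else 1
        correct += predicted == label

    accuracy = correct / len(test)
    # Under H0 the classifier cannot beat chance, so a large count is evidence against H0.
    p_value = binomial_upper_tail(correct, len(test))
    return accuracy, p_value


if __name__ == "__main__":
    rng = random.Random(42)
    x = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
    y = [[rng.gauss(1, 1), rng.gauss(1, 1)] for _ in range(100)]
    print(classifier_two_sample_test(x, y))
```

A small p-value indicates that the classifier separates the two samples better than chance, which is evidence that the underlying distributions differ. A more flexible classifier such as a Random Forest would take the place of the centroid rule in the same scheme.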
Key words
Random forest, Distribution testing, Classification, Kernel two-sample test, MMD, Total variation distance, U-statistics