Machine learning as an ultra-fast alternative to Bayesian retrievals

Semantic Scholar (2021)

Abstract
  • 1Kapteyn Astronomical Institute, University of Groningen, Groningen, The Netherlands
  • 2SRON Netherlands Institute for Space Research, Utrecht, The Netherlands
  • 3Centre for Exoplanet Science, University of Edinburgh, Edinburgh, UK

Introduction: Inferring the physical and chemical properties of an exoplanet's atmosphere from its transmission spectrum is computationally expensive. A multitude of forward models, sampled from a high-dimensional parameter space, must be compared to the observation. The preferred sampling method is currently Nested Sampling [7], in particular the MultiNest implementation [2, 3], which typically requires tens to hundreds of thousands of forward models to converge. Simpler forward models are therefore usually favoured to keep computation times manageable.

A possible workaround is machine learning. An algorithm trained on a grid of forward models and their parameters can perform retrievals in seconds. This would make it possible to use complex models that take full advantage of future facilities such as JWST. Not only would retrievals of individual exoplanets become much faster, but statistical studies of populations of exoplanets would also become feasible. It would also be a valuable tool for retrievability analyses, for example to assess the sensitivity of retrievals to the choice of chemical network.

The main obstacle to overcome is predicting accurate posterior distributions and error estimates for the retrieved parameters. These need to be as close as possible to their Bayesian counterparts.

Methods: Expanding on the 5-parameter grid in [5], we used ARCiS (ARtful modelling Code for exoplanet Science) [6] to generate a grid of 200,000 forward models described by the following parameters: isothermal temperature (T), planetary radius (RP), planetary mass (MP), abundances of water (H2O), ammonia (NH3) and hydrogen cyanide (HCN), and cloud-top pressure (Pcloud). The models contain 13 wavelength bins, matching those of WASP-12b's observation with HST/WFC3 [4]. We added normally distributed random noise with σ = 50 ppm.
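As an illustration of this setup, the sketch below draws the seven parameters from assumed ranges (the paper's exact bounds are not reproduced here), stands in for the ARCiS forward model with a dummy function, and adds the 50 ppm Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter ranges (assumed, not the paper's exact bounds).
n_models, n_bins = 1_000, 13   # 200,000 models in the paper; fewer here for speed
grid = {
    "T":          rng.uniform(500.0, 2500.0, n_models),  # isothermal temperature [K]
    "R_p":        rng.uniform(0.5, 2.0, n_models),       # planetary radius [R_Jup]
    "M_p":        rng.uniform(0.1, 5.0, n_models),       # planetary mass [M_Jup]
    "log_H2O":    rng.uniform(-12.0, -1.0, n_models),    # log10 abundances
    "log_NH3":    rng.uniform(-12.0, -1.0, n_models),
    "log_HCN":    rng.uniform(-12.0, -1.0, n_models),
    "log_Pcloud": rng.uniform(-6.0, 2.0, n_models),      # cloud-top pressure [log10 bar]
}

# Stand-in for the ARCiS forward model: returns a flat 13-bin transit-depth
# spectrum; a real grid would call ARCiS with the i-th parameter vector here.
def forward_model(i):
    return np.full(n_bins, 0.0146)

spectra = np.stack([forward_model(i) for i in range(n_models)])

# Add normally distributed noise with sigma = 50 ppm, as described above.
sigma = 50e-6
noisy = spectra + rng.normal(0.0, sigma, spectra.shape)
```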

We trained a random forest following the details in [5] and a convolutional neural network (CNN). We divided the data into a training set of 190,000 spectra and a test set of 10,000. For the CNN we reserved 19,000 spectra (10%) from the training set for validation; the validation set is used to monitor the network's performance during training, not to update its weights.
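The split above can be sketched as index bookkeeping (`fit_idx` is a name introduced here for the 171,000 spectra actually used for weight updates):

```python
import numpy as np

rng = np.random.default_rng(1)
idx = rng.permutation(200_000)  # one index per spectrum in the grid

# 190,000 spectra for training, 10,000 held out as the test set.
train_idx, test_idx = idx[:190_000], idx[190_000:]

# Reserve 10% of the training set (19,000 spectra) for validation,
# used to monitor the loss during CNN training.
val_idx, fit_idx = train_idx[:19_000], train_idx[19_000:]
```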

The CNN was trained with the loss function introduced in [1] to output a probability distribution. To account for the observational noise, we combined the distributions predicted for multiple noisy copies of the spectrum.
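The exact loss of [1] is not reproduced here; one common way for a network to output a probability distribution is to predict a mean and variance per parameter and train with a Gaussian negative log-likelihood, and the noise marginalisation can be done by pooling predictions over noisy copies of the spectrum. A minimal sketch under those assumptions:

```python
import numpy as np

def gaussian_nll(y_true, mu, log_var):
    # Negative log-likelihood of y_true under N(mu, exp(log_var)),
    # averaged over the batch (constant terms dropped).
    return float(np.mean(0.5 * (log_var + (y_true - mu) ** 2 / np.exp(log_var))))

def pool_posteriors(predict, spectrum, sigma, n_copies=100, rng=None):
    # Retrieve many noisy realisations of the observed spectrum and pool
    # the predicted posterior samples, marginalising over the noise.
    rng = rng or np.random.default_rng()
    noisy = spectrum + rng.normal(0.0, sigma, (n_copies, spectrum.size))
    return np.concatenate([predict(s) for s in noisy])
```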

To evaluate the performance of the machine learning algorithms, we retrieved all the spectra in the test set and plotted our predictions against the true values for the parameters. We repeated the experiment with only 1,000 spectra for Nested Sampling, reflecting the increased computational overhead of each of these retrievals. We then used a transmission spectrum of WASP-12b observed with HST/WFC3 [4] as a real-world test case. 
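A minimal version of this bulk check, with synthetic stand-ins for the CNN's point estimates, computes one predicted-versus-true correlation per parameter; values near 1 mean the points cluster around the 1:1 line of the plots described above:

```python
import numpy as np

rng = np.random.default_rng(2)
n_test, n_params = 10_000, 7

# Synthetic stand-ins: true grid values and noisy "predictions" around them.
true = rng.uniform(0.0, 1.0, (n_test, n_params))
pred = true + rng.normal(0.0, 0.05, true.shape)

# One predicted-vs-true correlation coefficient per retrieved parameter.
r = [float(np.corrcoef(true[:, j], pred[:, j])[0, 1]) for j in range(n_params)]
```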

Results: Although the random forest trains faster, the CNN provided better results. Figures 1 and 2 show the predicted versus true parameters for the CNN and Nested Sampling bulk retrievals. Remarkably, we observe the same structures in both plots, showing that the CNN is able to learn the relationship between spectral features and parameters. We also found that both the CNN and Nested Sampling provide reliable error estimates, with ~60% of predictions within 1σ of the true value, ~98% within 2σ, and virtually all within 3σ, in close agreement with the fractions expected from Gaussian statistics.
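The quoted 1σ/2σ/3σ fractions are coverage statistics. The sketch below computes them on synthetic, perfectly calibrated Gaussian errors, where the expected values are ~68%, ~95% and ~99.7%:

```python
import numpy as np

def coverage(true, mu, sigma):
    # Fraction of true values lying within 1, 2 and 3 predicted
    # standard deviations of the predicted mean.
    z = np.abs(true - mu) / sigma
    return tuple(float(np.mean(z <= k)) for k in (1, 2, 3))

# Sanity check with perfectly calibrated unit-Gaussian errors.
rng = np.random.default_rng(3)
true = rng.normal(0.0, 1.0, 100_000)
cov = coverage(true, np.zeros_like(true), np.ones_like(true))
```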
