Optuna: A Next-generation Hyperparameter Optimization Framework

KDD 2019, pp. 2623–2631


Abstract

The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) easy-to-setup, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to light-weight experiment conducted via interactive interface. In order to prove our point, we will introduce Optuna, an optimization software which is a culmination of our effort in the development of a next generation optimization software. As an optimization software designed with define-by-run principle, Optuna is particularly the first of its kind. We will present the design-techniques that became necessary in the development of the software that meets the above criteria, and demonstrate the power of our new design through experimental results and real world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).

Introduction
  • Hyperparameter search is one of the most cumbersome tasks in machine learning projects.
  • The complexity of deep learning methods is growing with their popularity, and the demand for efficient automatic hyperparameter tuning is higher than ever.
  • The paper's opening example defines the search space dynamically inside the objective function:

    import optuna
    from sklearn.datasets import fetch_mldata  # paper-era scikit-learn API
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    def objective(trial):
        # The number of layers is itself a sampled hyperparameter, and it
        # determines how many layer-width parameters exist in this trial.
        n_layers = trial.suggest_int('n_layers', 1, 4)
        layers = []
        for i in range(n_layers):
            layers.append(trial.suggest_int('n_units_l{}'.format(i), 1, 128))
        clf = MLPClassifier(tuple(layers))
        mnist = fetch_mldata('MNIST original')
        x_train, x_test, y_train, y_test = train_test_split(mnist.data, mnist.target)
        clf.fit(x_train, y_train)
        return 1.0 - clf.score(x_test, y_test)  # error rate to be minimized

    study = optuna.create_study()
    study.optimize(objective, n_trials=100)
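  • Because n_layers is itself sampled, the number of n_units_l{i} parameters varies from trial to trial; the search space is defined by executing the program, mirroring the define-by-run style of deep learning frameworks such as Chainer and PyTorch.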
Highlights
  • Hyperparameter search is one of the most cumbersome tasks in machine learning projects
  • The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software
  • We evaluated the performance gain from the pruning procedure in the Optuna-implemented optimization of Alex Krizhevsky’s neural network (AlexNet) [16] on the Street View House Numbers (SVHN) dataset [24]
  • The efficacy of Optuna strongly supports our claim that our new design criteria for next-generation optimization frameworks are worth adopting in the development of future frameworks
  • The define-by-run principle enables the user to dynamically construct the search space in a way that has never been possible with previous hyperparameter tuning frameworks (see the sketch after this list)
  • It is our strong hope that the set of design techniques we developed for Optuna will serve as a basis for other next-generation optimization frameworks to be developed in the future
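As a concrete illustration of the define-by-run principle highlighted above, here is a minimal sketch of a conditional search space in the spirit of the paper's examples; it uses current Optuna API names (suggest_categorical, suggest_float with log=True), which postdate the paper-era suggest_uniform/suggest_loguniform calls:

    import optuna
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    def objective(trial):
        # Which hyperparameters exist depends on a branch taken at runtime:
        # the space is constructed by executing ordinary Python control flow.
        classifier = trial.suggest_categorical('classifier', ['SVC', 'RandomForest'])
        if classifier == 'SVC':
            c = trial.suggest_float('svc_c', 1e-4, 1e4, log=True)
            clf = SVC(C=c, gamma='auto')
        else:
            max_depth = trial.suggest_int('rf_max_depth', 2, 32)
            clf = RandomForestClassifier(max_depth=max_depth, n_estimators=100)
        x, y = load_digits(return_X_y=True)
        return cross_val_score(clf, x, y, cv=3).mean()

    study = optuna.create_study(direction='maximize')
    study.optimize(objective, n_trials=50)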
Methods
  • There are generally two types of sampling methods: relational sampling, which exploits correlations among the parameters, and independent sampling, which samples each parameter independently.
  • Independent sampling is not necessarily a naive option, because some sampling algorithms such as TPE [3] are known to perform well even without using parameter correlations, and the cost-effectiveness of relational versus independent sampling depends on the environment and the task.
  • The pruning algorithm exposes a simple interface: given a trial's intermediate results, it outputs true if the trial should be pruned and false otherwise (see the sketch after this list).
  • Some words of caution are in order for the implementation of relational sampling in a define-by-run framework.
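The division of labor between samplers and pruners is visible directly in user code. Below is a minimal sketch using current Optuna API names (TPESampler for independent sampling, SuccessiveHalvingPruner for ASHA-style pruning, and trial.report/trial.should_prune for the prune-or-continue query); the paper-era interface differs slightly:

    import optuna
    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import train_test_split

    def objective(trial):
        alpha = trial.suggest_float('alpha', 1e-5, 1e-1, log=True)
        x, y = load_digits(return_X_y=True)
        x_train, x_valid, y_train, y_valid = train_test_split(x, y)
        clf = SGDClassifier(alpha=alpha)
        for step in range(100):
            clf.partial_fit(x_train, y_train, classes=list(range(10)))
            # Report an intermediate value; the pruner answers the
            # "should this trial be pruned?" query described above.
            trial.report(clf.score(x_valid, y_valid), step)
            if trial.should_prune():
                raise optuna.TrialPruned()
        return clf.score(x_valid, y_valid)

    study = optuna.create_study(
        direction='maximize',
        sampler=optuna.samplers.TPESampler(),             # independent sampling
        pruner=optuna.pruners.SuccessiveHalvingPruner(),  # ASHA-style pruning
    )
    study.optimize(objective, n_trials=100)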
Results
  • The authors' implementation of ASHA significantly outperforms Median pruning, a pruning method featured in Vizier.
Conclusion
  • The efficacy of Optuna strongly supports the claim that the new design criteria for next-generation optimization frameworks are worth adopting in the development of future frameworks.
  • The define-by-run principle enables the user to dynamically construct the search space in a way that has never been possible with previous hyperparameter tuning frameworks.
  • It is the authors' strong hope that the set of design techniques they developed for Optuna will serve as a basis for other next-generation optimization frameworks to be developed in the future.
Tables
  • Table1: Software frameworks for deep learning and hyperparameter optimization, sorted by their API styles: define-and-run and define-by-run
  • Table2: Comparison of previous hyperparameter optimization frameworks and Optuna. A framework receives a checkmark for "lightweight" if it is easy to set up and can readily be used for lightweight purposes
References
  • Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-scale Machine Learning. In OSDI. 265–283.
  • Takuya Akiba, Tommi Kerola, Yusuke Niitani, Toru Ogawa, Shotaro Sano, and Shuji Suzuki. 2018. PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track. In ECCV Workshop on Open Images Challenge.
  • James Bergstra, Rémi Bardenet, Yoshua Bengio, and Balázs Kégl. 2011. Algorithms for Hyper-parameter Optimization. In NIPS. 2546–2554.
  • James Bergstra, Brent Komer, Chris Eliasmith, Dan Yamins, and David D Cox. 2015. Hyperopt: a Python library for model selection and hyperparameter optimization. Computational Science & Discovery 8, 1 (2015), 14008.
  • Ian Dewancker, Michael McCourt, Scott Clark, Patrick Hayes, Alexandra Johnson, and George Ke. 2016. A Strategy for Ranking Optimization Methods using Multiple Criteria. In ICML Workshop on AutoML. 11–20.
  • Tobias Domhan, Jost Tobias Springenberg, and Frank Hutter. 2015. Speeding Up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves. In IJCAI. 3460–3468.
  • Siying Dong, Mark Callaghan, Leonidas Galanis, Dhruba Borthakur, Tony Savor, and Michael Strum. 2017. Optimizing Space Amplification in RocksDB. In CIDR.
  • Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, and D Sculley. 2017. Google Vizier: A Service for Black-Box Optimization. In KDD. 1487–1495.
  • Nikolaus Hansen and Andreas Ostermeier. 2001. Completely Derandomized Self-Adaptation in Evolution Strategies. Evolutionary Computation 9, 2 (2001), 159–195.
  • Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. 2011. Sequential Model-based Optimization for General Algorithm Configuration. In LION. 507–523.
  • Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren (Eds.). 2018. Automatic Machine Learning: Methods, Systems, Challenges. Springer. In press, available at http://automl.org/book.
  • Kevin Jamieson and Ameet Talwalkar. 2016. Non-stochastic Best Arm Identification and Hyperparameter Optimization. In AISTATS. 240–248.
  • Aaron Klein, Stefan Falkner, Jost Tobias Springenberg, and Frank Hutter. 2017. Learning Curve Prediction with Bayesian Neural Networks. In ICLR.
  • Thomas Kluyver, Benjamin Ragan-Kelley, Fernando Pérez, Brian Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica Hamrick, Jason Grout, Sylvain Corlay, Paul Ivanov, Damián Avila, Safia Abdalla, and Carol Willing. 2016. Jupyter Notebooks – a publishing format for reproducible computational workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas, F. Loizides and B. Schmidt (Eds.). IOS Press, 87–90.
  • Patrick Koch, Oleg Golovidov, Steven Gardner, Brett Wujek, Joshua Griffin, and Yan Xu. 2018. Autotune: A Derivative-free Optimization Framework for Hyperparameter Tuning. In KDD. 443–452.
  • Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In NIPS. 1097–1105.
  • Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper R. R. Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Tom Duerig, and Vittorio Ferrari. 2018. The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale. CoRR abs/1811.00982 (2018). arXiv:1811.00982
  • Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. 2018. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. Journal of Machine Learning Research 18, 185 (2018), 1–52.
  • Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. 2018. Massively Parallel Hyperparameter Tuning. In NeurIPS Workshop on Machine Learning Systems.
  • Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, and Ion Stoica. 2018. Tune: A Research Platform for Distributed Model Selection and Training. In ICML Workshop on AutoML.
  • Michael McCourt. 2016. Benchmark suite of test functions suitable for evaluating black-box optimization strategies. https://github.com/sigopt/evalset.
  • Wes McKinney. 2011. Pandas: a Foundational Python Library for Data Analysis and Statistics. In SC Workshop on Python for High Performance and Scientific Computing.
  • Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, William Paul, Michael I. Jordan, and Ion Stoica. 2017. Ray: A Distributed Framework for Emerging AI Applications. CoRR abs/1712.05889 (2017). arXiv:1712.05889
  • Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. 2011. Reading Digits in Natural Images with Unsupervised Feature Learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning.
  • Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, and Pengcheng Yin. 2017. DyNet: The Dynamic Neural Network Toolkit. CoRR abs/1701.03980 (2017).
  • Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In NIPS Autodiff Workshop.
  • Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. 2016. Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 104, 1 (2016), 148–175.
  • Jasper Snoek, Hugo Larochelle, and Ryan P Adams. 2012. Practical Bayesian Optimization of Machine Learning Algorithms. In NIPS. 2951–2959.
  • Seiya Tokui, Kenta Oono, Shohei Hido, and Justin Clayton. 2015. Chainer: a Next-Generation Open Source Framework for Deep Learning. In NIPS Workshop on Machine Learning Systems.