Learning to Learn without Gradient Descent by Gradient Descent
| Field | Value |
|---|---|
| Title | Learning to Learn without Gradient Descent by Gradient Descent |
| Authors | Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matt Botvinick, Nando de Freitas |
| Publication Year | 2016 |
| Collection | Computer Science, Statistics |
| Subject Terms | Statistics - Machine Learning; Computer Science - Learning |
| Abstract | We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning. |
| Comment | Accepted by ICML 2017. A previous version, "Learning to Learn for Global Optimization of Black Box Functions", was published in the Deep Reinforcement Learning Workshop, NIPS 2016. |
| Document Type | Working Paper |
| Access URL | http://arxiv.org/abs/1611.03824 |
| Accession Number | edsarx.1611.03824 |
| Database | arXiv |
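The abstract describes recurrent neural network optimizers that, at each step, read the previous query point and its observed value and propose the next point to evaluate on a derivative-free black-box function. As a rough, illustrative sketch only (not the authors' implementation), the snippet below shows such a query loop in PyTorch; the class name `RNNOptimizer`, the hidden size, the training-loss comment, and the toy quadratic objective are all assumptions made for the example.

```python
# Illustrative sketch of a learned black-box optimizer: an LSTM reads the previous
# query point and its observed function value, then proposes the next query point.
import torch
import torch.nn as nn


class RNNOptimizer(nn.Module):
    def __init__(self, dim, hidden_size=32):
        super().__init__()
        self.hidden_size = hidden_size
        self.cell = nn.LSTMCell(dim + 1, hidden_size)  # input: [x_t, y_t]
        self.propose = nn.Linear(hidden_size, dim)     # output: next query x_{t+1}

    def forward(self, f, dim, horizon=20):
        """Roll out `horizon` queries against the black-box function `f`."""
        h = torch.zeros(1, self.hidden_size)
        c = torch.zeros(1, self.hidden_size)
        x = torch.zeros(1, dim)                        # arbitrary first query
        observed = []
        for _ in range(horizon):
            y = f(x).reshape(1, 1)                     # evaluate the black box
            h, c = self.cell(torch.cat([x, y], dim=1), (h, c))
            x = torch.sigmoid(self.propose(h))         # next query, kept in [0, 1]^dim
            observed.append(y)
        # Training (not shown) would minimize e.g. the sum or minimum of the observed
        # values over the horizon on synthetic training functions, by gradient descent
        # through the RNN parameters.
        return torch.cat(observed)


# Usage on a toy quadratic standing in for an expensive black-box objective.
opt = RNNOptimizer(dim=2)
objective = lambda x: ((x - 0.3) ** 2).sum(dim=1)
values = opt(objective, dim=2, horizon=10)
```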