Foo, Chuan-Sheng and Do, Chuong and Ng, Andrew
We present a general Bayesian framework for hyperparameter tuning in $L_2$-regularized supervised learning models. Paradoxically, our algorithm works by first analytically integrating out the hyperparameters from the model. We solve the resulting non-convex optimization problem efficiently using a majorization-minimization (MM) algorithm, in which the non-convex problem is reduced to a series of convex $L_2$-regularized parameter estimation tasks. The principal appeal of our method is its simplicity: the updates for choosing the $L_2$-regularized subproblems in each step are trivial to implement (or even perform by hand), and each subproblem can be efficiently solved by adapting existing solvers. Empirical results on a variety of supervised learning models show that our algorithm is competitive with both grid-search and gradient-based algorithms, but is more efficient and far easier to implement.
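To make the majorization-minimization idea concrete, the following is a minimal sketch for ridge regression, not the implementation evaluated in the paper. It assumes details the abstract does not specify: unit noise variance, a Gaussian prior $w \mid \lambda \sim \mathcal{N}(0, \lambda^{-1} I)$ on the $d$ weights, and a Gamma prior on $\lambda$ with shape $a$ and rate $b$. Under these assumptions, integrating out $\lambda$ leaves the non-convex penalty $(a + d/2)\log(b + \|w\|^2/2)$. Because the logarithm is concave in $\|w\|^2$, its tangent at the current iterate $w_t$ is an upper bound, and minimizing that bound is an ordinary $L_2$-regularized least-squares problem with weight $\lambda_t = (a + d/2)/(b + \|w_t\|^2/2)$. The function names `objective` and `mm_ridge` are hypothetical.

```python
import numpy as np


def objective(w, X, y, a, b):
    """Negative log posterior (up to a constant) after integrating out lambda.

    Assumes unit noise variance and a Gamma(shape=a, rate=b) prior on lambda.
    """
    d = w.size
    resid = y - X @ w
    return 0.5 * resid @ resid + (a + 0.5 * d) * np.log(b + 0.5 * w @ w)


def mm_ridge(X, y, a=1.0, b=1.0, n_iter=50, tol=1e-10):
    """MM loop: each step solves one convex ridge problem in closed form."""
    d = X.shape[1]
    w = np.zeros(d)
    XtX, Xty = X.T @ X, X.T @ y
    history = [objective(w, X, y, a, b)]
    lam = None
    for _ in range(n_iter):
        # Majorization: the tangent to the concave log penalty at the
        # current w yields a quadratic penalty (lam / 2) * ||w||^2.
        lam = (a + 0.5 * d) / (b + 0.5 * w @ w)
        # Minimization: standard ridge solve with the updated weight.
        w = np.linalg.solve(XtX + lam * np.eye(d), Xty)
        history.append(objective(w, X, y, a, b))
        # Stop once the objective no longer decreases appreciably.
        if history[-2] - history[-1] < tol:
            break
    return w, lam, history
```

The update for `lam` is a single division, which matches the abstract's remark that choosing each subproblem is trivial. In this sketch the subproblem has a closed form; for other models, such as logistic regression, the `np.linalg.solve` line would be replaced by a call to an existing $L_2$-regularized solver. The recorded `history` should be non-increasing, which is the usual MM guarantee.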