Accelerated Adaptive Cubic Regularized Quasi-Newton Methods


Feb 10, 2023
Dmitry Kamzolov, Klea Ziu, Artem Agafonov, Martin Takáč
Abstract
In this paper, we propose the first Quasi-Newton method with a global convergence rate of $O(k^{-1})$ for general convex functions. Quasi-Newton methods, such as BFGS and SR1, are well known for their impressive practical performance. However, they may be slower than gradient descent for general convex functions, with the best known theoretical rate of $O(k^{-1/3})$. This gap between impressive practical performance and poor theoretical guarantees has been an open question for a long time. In this paper, we make a significant step toward closing this gap. We improve upon the existing rate and propose the Cubic Regularized Quasi-Newton Method with a convergence rate of $O(k^{-1})$. The key to this improvement is to use the Cubic Regularized Newton Method, rather than the Damped Newton Method, as the outer method, with the Quasi-Newton update serving as an inexact Hessian approximation. Building on this approach, we propose the first Accelerated Quasi-Newton method with a global convergence rate of $O(k^{-2})$ for general convex functions. In special cases where the precision of the approximation can be improved, we achieve a global convergence rate of $O(k^{-3})$, which is faster than any first-order method. To make these methods practical, we introduce the Adaptive Inexact Cubic Regularized Newton Method and its accelerated version, which provide real-time control of the approximation error. We show that the proposed methods have impressive practical performance and outperform both first- and second-order methods.
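For intuition, the sketch below illustrates the basic building block described in the abstract: an outer cubic-regularized Newton step applied to an inexact quasi-Newton Hessian approximation (here a plain BFGS update). This is a minimal sketch only, not the paper's algorithm; the adaptive error control and acceleration are omitted, and all function and parameter names (`cubic_newton_step`, `M`, etc.) are our own illustrative choices.

```python
import numpy as np

def cubic_newton_step(g, B, M, r_max=1e6, tol=1e-10):
    """Solve min_h <g,h> + 0.5*h^T B h + (M/6)*||h||^3 by a 1D search over r = ||h||.
    Assumes B + (M*r/2)*I is positive definite at the trial radii."""
    n = g.shape[0]
    I = np.eye(n)

    def h_of(r):
        # Stationarity condition of the cubic model at radius r
        return np.linalg.solve(B + 0.5 * M * r * I, -g)

    lo, hi = 0.0, r_max
    for _ in range(100):
        r = 0.5 * (lo + hi)
        if np.linalg.norm(h_of(r)) > r:
            lo = r  # step longer than trial radius -> increase r
        else:
            hi = r  # step shorter than trial radius -> decrease r
        if hi - lo < tol:
            break
    return h_of(hi)

def bfgs_update(B, s, y):
    """Standard BFGS update of the Hessian approximation B from step s and gradient change y."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

def cubic_quasi_newton(f_grad, x0, M=1.0, iters=50):
    """Outer loop: cubic-regularized steps with a quasi-Newton Hessian approximation."""
    x = x0.copy()
    B = np.eye(x0.shape[0])
    g = f_grad(x)
    for _ in range(iters):
        h = cubic_newton_step(g, B, M)
        x_new = x + h
        g_new = f_grad(x_new)
        s, y = x_new - x, g_new - g
        if s @ y > 1e-12:  # curvature condition keeps B positive definite
            B = bfgs_update(B, s, y)
        x, g = x_new, g_new
    return x
```

In this sketch the cubic subproblem is reduced to a one-dimensional root-finding problem in the step norm, which is a standard way to solve it when the Hessian approximation is positive semidefinite; for example, `cubic_quasi_newton(lambda x: A @ x - b, np.zeros(n))` would minimize a convex quadratic with matrix `A` and right-hand side `b`.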
Type
Publication
Preprint, Under Review