ETNA - Electronic Transactions on Numerical Analysis
|
Verlag der Österreichischen Akademie der Wissenschaften Austrian Academy of Sciences Press
A-1011 Wien, Dr. Ignaz Seipel-Platz 2
Tel. +43-1-515 81/DW 3420, Fax +43-1-515 81/DW 3400 https://verlag.oeaw.ac.at, e-mail: verlag@oeaw.ac.at |
|
DATUM, UNTERSCHRIFT / DATE, SIGNATURE
BANK AUSTRIA CREDITANSTALT, WIEN (IBAN AT04 1100 0006 2280 0100, BIC BKAUATWW), DEUTSCHE BANK MÜNCHEN (IBAN DE16 7007 0024 0238 8270 00, BIC DEUTDEDBMUC)
|
ETNA - Electronic Transactions on Numerical Analysis ISBN 978-3-7001-8258-0 Online Edition Research Article
Stefan Steinerberger
S. 610 - 619 doi:10.1553/etna_vol54s610 Verlag der Österreichischen Akademie der Wissenschaften doi:10.1553/etna_vol54s610
Abstract: We study the behavior of the stochastic gradient descent methodapplied to $\|Ax -b \|_2^2 \rightarrow \min$ for invertible matrices $A \in \mathbb{R}^{n \times n}$. We show that there is an explicit constant $c_{A}$ depending (mildly) on $A$ such that$ \mathbb{E}\left\| Ax_{k+1}-b\right\|^2_{2} \leq \left(1 + \frac{c_{A}}{\|A\|_F^2}\right) \left\|A x_k -b \right\|^2_{2} - \frac{2}{\|A\|_F^2} \left\|A^T A (x_k - x)\right\|^2_{2}.$This is a curious inequality since the last term involves one additional matrix multiplication applied to the error $x_k - x$ compared to the remaining terms: if the projection of $x_k - x$ onto the subspace of singular vectors corresponding to large singular values is large, then the stochastic gradient descent method leads to a fast regularization. For symmetric matrices, this inequality has an extension to higher-order Sobolev spaces. This explains a (known) regularization phenomenon: an energy cascade from large singular values to small singular values acts as a regularizer. Keywords: stochastic gradient descent, Kaczmarz method, least-squares, regularization Published Online: 2021/10/05 08:03:53 Object Identifier: 0xc1aa5576 0x003cde5e Rights: . Electronic Transactions on Numerical Analysis (ETNA) is an electronic journal for the publication of significant new developments in numerical analysis and scientific computing. Papers of the highest quality that deal with the analysis of algorithms for the solution of continuous models and numerical linear algebra are appropriate for ETNA, as are papers of similar quality that discuss implementation and performance of such algorithms. New algorithms for current or new computer architectures are appropriate provided that they are numerically sound. However, the focus of the publication should be on the algorithm rather than on the architecture. The journal is published by the Kent State University Library in conjunction with the Institute of Computational Mathematics at Kent State University, and in cooperation with the Johann Radon Institute for Computational and Applied Mathematics of the Austrian Academy of Sciences (RICAM). Reviews of all ETNA papers appear in Mathematical Reviews and Zentralblatt für Mathematik. Reference information for ETNA papers also appears in the expanded Science Citation Index. ETNA is registered with the Library of Congress and has ISSN 1068-9613. …
|
Verlag der Österreichischen Akademie der Wissenschaften Austrian Academy of Sciences Press
A-1011 Wien, Dr. Ignaz Seipel-Platz 2
Tel. +43-1-515 81/DW 3420, Fax +43-1-515 81/DW 3400 https://verlag.oeaw.ac.at, e-mail: verlag@oeaw.ac.at |