In this paper, we propose a general framework for the algorithm New Q-Newton's method Backtracking, developed in the author's previous work. For a symmetric, square real matrix $A$, we define $minsp(A):=\min _{||e||=1} ||Ae||$. Given a $C^2$ cost function $f:\mathbb{R}^m\rightarrow \mathbb{R}$ and a real number $0<\tau $, as well as $m+1$ fixed real numbers $\delta _0,\ldots ,\delta _m$, we define for each $x\in \mathbb{R}^m$ with $\nabla f(x)\not= 0$ the following quantities: $\kappa :=\min _{i\not= j}|\delta _i-\delta _j|$; $A(x):=\nabla ^2f(x)+\delta ||\nabla f(x)||^{\tau}Id$, where $\delta$ is the first element in the sequence $\{\delta _0,\ldots ,\delta _m\}$ for which $minsp(A(x))\geq \kappa ||\nabla f(x)||^{\tau}$; $e_1(x),\ldots ,e_m(x)$ are an orthonormal basis of $\mathbb{R}^m$, chosen appropriately; $w(x)=$ the step direction, given by the formula: $$w(x)=\sum _{i=1}^m\frac{<\nabla f(x),e_i(x)>}{||A(x)e_i(x)||}e_i(x);$$ (we can also normalise by $w(x)/\max \{1,||w(x)||\}$ when needed) $\gamma (x)>0$ learning rate chosen by Backtracking line search so that Armijo's condition is satisfied: $$f(x-\gamma (x)w(x))-f(x)\leq -\frac{1}{3}\gamma (x)<\nabla f(x),w(x)>.$$ The update rule for our algorithm is $x\mapsto H(x)=x-\gamma (x)w(x)$. In New Q-Newton's method Backtracking, the choices are $\tau =1+\alpha >1$ and $e_1(x),\ldots ,e_m(x)$'s are eigenvectors of $\nabla ^2f(x)$. In this paper, we allow more flexibility and generality, for example $\tau$ can be chosen to be $<1$ or $e_1(x),\ldots ,e_m(x)$'s are not necessarily eigenvectors of $\nabla ^2f(x)$. New Q-Newton's method Backtracking (as well as Backtracking gradient descent) is a special case, and some versions have flavours of quasi-Newton's methods. Several versions allow good theoretical guarantees. An application to solving systems of polynomial equations is given.
翻译:在本文中, 我们提出一个新的 Q- Newton 方法返回跟踪的总框架 。 对于一个对称、 真实的平方基质$A$, 我们定义$sp( A) :\\ min \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 美元) 成本函数 $:\ mathb{ \ rthb} 美元, 实际数 $0\ tau 美元, 以及 $m+1 固定的真数 $\ delta > 0,\ eldta_ 美元= = 美元= 美元= 美元= 美元= 美元 : $ kappappa:\ min\\ \\ \ \ \\ = la a_ la_ la_ la_ la_ \ \ \ \ \ \ \ \ \ (x) a\ \ \ \ \ \ \ \ \ \ \\ \\\\\\\\\\ \\\\\\\\\\\\\\\ laxxxxxxx lexx lex lex lex lex lex lex lex lex lex lexxxxxxxxxxxxxx lecom lexxxx x x x x x x x xxxx x x x x x x xxxxxxxxxx xxxx xxxxxx x x x x x x xxxxx x xxxxxx xxxxxxxxxxx x x xxxxxxxxxx x x x x xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx x x