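- 1. The fixed-point theorem and checking its conditions
- 2. Order of convergence, convergence tests, and convergence acceleration
- 2.1 How to estimate the order of convergence of the fixed-point iteration $x_{k+1}=g(x_k)$
- 2.2 How to predict the number of fixed-point iterations needed for a given accuracy
- 2.3 How to speed up convergence
- 2.4 Stopping criterion for the fixed-point iteration
- 2.5 Two drawbacks of fixed-point iteration
- 3. Application: solving the nonlinear equation $f(x)=0$
- 3.1 Bisection method (Bisection Method of Bolzano)
- 3.2 False position method (False Position Method)
- 3.3 Newton-Raphson method
- 3.4 Secant method
- 3.5 Aitken acceleration
- 3.6 Muller's method
- 4. Other issues
- 4.1 How to find an initial value
- 4.2 Convergence criteria
- 4.3 Comparison of convergence rates
- 4.4 Choosing an algorithm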
1. The fixed-point theorem and checking its conditions
Definition of a fixed point: $P=g(P)$
Fixed-point iteration: $x_{k+1}=g(x_k)$
Theorem: If $g$ is continuous and the sequence $\{x_k\}$ converges, then $\{x_k\}$ converges to a solution of the equation $x=g(x)$:
$$x^{*}=g\left(x^{*}\right) \text{ and } x_{k}\to x^{*}$$
Theorem: Assume that
(1) $g(x),\ g'(x)\in C[a,b]$ (continuous)
(2) $K$ is a positive constant
(3) the starting point satisfies $p_0\in(a,b)$
(4) $g(x)\in[a,b],\ \forall x\in[a,b]$
Then
(a) If $\left|g'(x)\right|\le K<1$ for all $x\in[a,b]$, the iteration $x_{k+1}=g(x_k)$ converges.
(b) If $\left|g'(x)\right|>1$ for all $x\in[a,b]$, the iteration $x_{k+1}=g(x_k)$ does not converge (unless the starting point is itself the fixed point).
Geometrically, when the slope $k$ of the tangent to the curve $y=g(x)$ satisfies $k\in(-1,1)$, the iterates gradually converge.
When the slope satisfies $k\in(-\infty,-1)\cup(1,\infty)$, the iterates do not converge.
In summary, the two most important conditions for fixed-point iteration are:
$$\begin{aligned}&(1)\ \left|g'(x)\right|\le K<1,\ \forall x\in[a,b] &&\text{[bound on } g'(x)\text{]}\\ &(2)\ g(x)\in[a,b],\ \forall x\in[a,b],\ \text{i.e. } g([a,b])\subset[a,b] &&\text{[bound on } g(x)\text{]}\end{aligned}$$
The bounds are checked differently for monotone and non-monotone $g$: for a monotone $g$ the range of $g(x)$ can be read off from the endpoint values, while for a non-monotone $g$ the extreme points must be checked as well.
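The iteration and the two conditions above can be tried out with a minimal sketch. The example map $g(x)=\cos x$ on $[0,1]$, the function name `fixed_point`, and the tolerance values are assumptions chosen for illustration: $\cos$ maps $[0,1]$ into $[\cos 1, 1]\subset[0,1]$ and $|g'(x)|=|\sin x|\le\sin 1<1$ there.

```python
import math

def fixed_point(g, x0, eps=1e-10, max_iter=1000):
    """Iterate x_{k+1} = g(x_k) until two successive iterates differ by less than eps."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < eps:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Illustrative choice: g(x) = cos(x) on [0, 1] satisfies both conditions.
root, iterations = fixed_point(math.cos, 0.5)
print(root, iterations)
```

The returned value should satisfy $x\approx\cos x$; the iteration count depends on the tolerance and on how close $|g'|$ is to 1 near the fixed point.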
2. Order of convergence, convergence tests, and convergence acceleration
Definition: if
$$\left|x_{k+1}-x^{*}\right| \leq C\left|x_{k}-x^{*}\right|^{p},\quad k>M, \text{ for } C>0,\ p>0$$
or
$$\lim_{k \rightarrow \infty} \frac{\left|x_{k+1}-x^{*}\right|}{\left|x_{k}-x^{*}\right|^{p}}=C$$
then the sequence converges with order $p$.
In particular:
$p=1$: linear convergence (this requires $C<1$)
$1<p<2$: superlinear convergence
$p=2$: quadratic convergence
2.1 How to estimate the order of convergence of the fixed-point iteration $x_{k+1}=g(x_k)$
Theorem: Let $x^*$ be the fixed point. If $g'(x^*)=g''(x^*)=\ldots=g^{(p-1)}(x^*)=0$ and $g^{(p)}(x^*)\neq 0$, then $x_{k+1}=g(x_k)$ converges with order $p$.
Proof:
$$\begin{aligned} x_{k+1}&=g\left(x_{k}\right)=g\left(x^{*}\right)+g'\left(x^{*}\right)\left(x_{k}-x^{*}\right)+\ldots\\ &\quad+\frac{g^{(p-1)}\left(x^{*}\right)\left(x_{k}-x^{*}\right)^{p-1}}{(p-1)!}+\frac{g^{(p)}(\xi)\left(x_{k}-x^{*}\right)^{p}}{p!},\quad \xi \text{ between } x_k \text{ and } x^{*}\\ \Rightarrow\ & x_{k+1}=x^{*}+\frac{g^{(p)}(\xi)\left(x_{k}-x^{*}\right)^{p}}{p!}\\ \Rightarrow\ & \frac{x_{k+1}-x^{*}}{\left(x_{k}-x^{*}\right)^{p}}=\frac{g^{(p)}(\xi)}{p!}\rightarrow\frac{g^{(p)}\left(x^{*}\right)}{p!} \end{aligned}$$
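The theorem can be checked numerically with a small sketch. It assumes the fixed point is known in advance (used only as a reference value), and the helper name `estimate_order` and both example maps are illustrative choices: $g(x)=\cos x$ has $g'(x^*)\neq 0$ (so $p=1$), while $g(x)=(x+2/x)/2$ has $g'(\sqrt2)=0$ and $g''(\sqrt2)\neq 0$ (so $p=2$).

```python
import math

def estimate_order(xs, x_star):
    """Estimate p from the last three errors: p ~ ln(e2/e1) / ln(e1/e0)."""
    e0, e1, e2 = (abs(x - x_star) for x in xs[-3:])
    return math.log(e2 / e1) / math.log(e1 / e0)

# Linear case: g(x) = cos(x), reference fixed point assumed known.
xs1 = [0.5]
for _ in range(20):
    xs1.append(math.cos(xs1[-1]))
p1 = estimate_order(xs1, 0.7390851332151607)

# Quadratic case: g(x) = (x + 2/x)/2 with fixed point sqrt(2).
# Only three steps are taken so the errors stay above rounding level.
xs2 = [1.0]
for _ in range(3):
    xs2.append((xs2[-1] + 2.0 / xs2[-1]) / 2.0)
p2 = estimate_order(xs2, math.sqrt(2.0))

print(p1, p2)
```

The two estimates should come out close to 1 and 2 respectively; the estimate is only meaningful while the errors are well above machine precision.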
2.2 How to predict the number of fixed-point iterations needed for a given accuracy
Define $L=\max_{x \in[a, b]}\left\{\left|g'(x)\right|\right\}<1$.
For a target accuracy $\varepsilon$, the number of iterations $k$ must satisfy: $k \geq \ln \left(\varepsilon(1-L) /\left|x_{1}-x_{0}\right|\right) / \ln L$
Proof:
$$\begin{aligned}&x_{k+1}=g\left(x_{k}\right)=g\left(x^{*}\right)+g'(\xi)\left(x_{k}-x^{*}\right)=x^{*}+g'(\xi)\left(x_{k}-x^{*}\right)\\ \Rightarrow\ &\left|x_{k+1}-x^{*}\right| \leq\left|g'(\xi)\right|\left|x_{k}-x^{*}\right| \leq L\left|x_{k}-x^{*}\right| \leq L^{k}\left|x_{1}-x^{*}\right|\end{aligned}$$
Moreover,
$$\left|x_{k+1}-x_{k}\right|=\left|g\left(x_{k}\right)-g\left(x_{k-1}\right)\right| \leq L\left|x_{k}-x_{k-1}\right| \leq L^{k}\left|x_{1}-x_{0}\right|$$
Hence:
$$\begin{aligned} \left|x_{k+q}-x_{k}\right| &\leq\left|x_{k+q}-x_{k+q-1}\right|+\left|x_{k+q-1}-x_{k+q-2}\right|+\ldots+\left|x_{k+1}-x_{k}\right| \\ &\leq\left(L^{q-1}+L^{q-2}+\ldots+1\right)\left|x_{k+1}-x_{k}\right| \\ &\leq\left(1+L+L^{2}+\ldots+L^{q-1}+\ldots\right)\left|x_{k+1}-x_{k}\right| \\ &=\frac{1}{1-L}\left|x_{k+1}-x_{k}\right| \\ &\leq \frac{L^{k}}{1-L}\left|x_{1}-x_{0}\right| \end{aligned}$$
Letting $q \rightarrow \infty$ gives
$$\left|x^{*}-x_{k}\right| \leq \frac{1}{1-L}\left|x_{k+1}-x_{k}\right| \leq \frac{L^{k}}{1-L}\left|x_{1}-x_{0}\right|$$
Therefore (note that $\ln L<0$ reverses the inequality when dividing):
$$\frac{L^{k}}{1-L}\left|x_{1}-x_{0}\right| \leq \varepsilon \Rightarrow k \geq \ln \left(\varepsilon(1-L) /\left|x_{1}-x_{0}\right|\right) / \ln L$$
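As a hedged illustration of this a priori bound, the sketch below evaluates it for the assumed example $g(x)=\cos x$ on $[0,1]$, where $L=\max|\sin x|=\sin 1$. The function name `predicted_iterations` and the tolerance are illustrative, not part of the original text.

```python
import math

def predicted_iterations(L, x0, x1, eps):
    """Smallest integer k with L**k / (1 - L) * |x1 - x0| <= eps."""
    return math.ceil(math.log(eps * (1 - L) / abs(x1 - x0)) / math.log(L))

g = math.cos
x0 = 0.5
x1 = g(x0)
L = math.sin(1.0)          # max of |g'(x)| = |sin x| on [0, 1]
eps = 1e-6
k = predicted_iterations(L, x0, x1, eps)

# Run exactly k iterations and compare with the guaranteed accuracy.
x = x0
for _ in range(k):
    x = g(x)
print(k, abs(x - 0.7390851332151607))
```

Because $L$ is a worst-case bound over the whole interval, the predicted $k$ is typically pessimistic: the actual error after $k$ steps is usually well below $\varepsilon$.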
2.3 How to speed up convergence
$$\begin{aligned} &x_{k+1}-x^{*} \approx L\left(x_{k}-x^{*}\right) \\ &x_{k+2}-x^{*} \approx L\left(x_{k+1}-x^{*}\right) \\ &\frac{x_{k+1}-x^{*}}{x_{k+2}-x^{*}} \approx \frac{x_{k}-x^{*}}{x_{k+1}-x^{*}} \ \Rightarrow\ x^{*} \approx x_{k}-\frac{\left(x_{k+1}-x_{k}\right)^{2}}{x_{k+2}-2 x_{k+1}+x_{k}}=x^{\Delta} \end{aligned}$$
Following this idea, each step can be carried out as:
$$\begin{aligned}\text{Iterate:} &\quad \bar{x}_{k+1}=g\left(x_{k}\right) \\ \text{Iterate once more:} &\quad \hat{x}_{k+1}=g\left(\bar{x}_{k+1}\right) \\ \text{Accelerate:} &\quad x_{k+1}=x_{k}-\frac{\left(\bar{x}_{k+1}-x_{k}\right)^{2}}{\hat{x}_{k+1}-2 \bar{x}_{k+1}+x_{k}} \end{aligned}$$
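The three-step scheme above can be sketched as follows. The function name `accelerated_fixed_point`, the zero-denominator guard, and the example $g(x)=\cos x$ are assumptions added for illustration.

```python
import math

def accelerated_fixed_point(g, x0, eps=1e-10, max_iter=50):
    """Two plain fixed-point steps followed by one extrapolation step per iteration."""
    x = x0
    for k in range(1, max_iter + 1):
        x_bar = g(x)                      # iterate
        x_hat = g(x_bar)                  # iterate once more
        denom = x_hat - 2.0 * x_bar + x
        if denom == 0.0:
            # Differences vanished: already converged to working precision.
            return x_hat, k
        x_new = x - (x_bar - x) ** 2 / denom   # accelerate
        if abs(x_new - x) < eps:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

acc_root, acc_iterations = accelerated_fixed_point(math.cos, 0.5)
print(acc_root, acc_iterations)
```

For this example the accelerated scheme needs only a handful of iterations, whereas the plain iteration $x_{k+1}=\cos x_k$ needs dozens for the same tolerance; each accelerated iteration costs two evaluations of $g$.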
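2.4 Stopping criterion for the fixed-point iteration
When $L=\max_{x \in[a, b]}\left\{\left|g'(x)\right|\right\}<1$, the following criterion can be used:
$$\left|x_{k+1}-x_{k}\right|<\mathrm{eps}$$
By the bound $\left|x^{*}-x_{k}\right| \leq \frac{1}{1-L}\left|x_{k+1}-x_{k}\right|$ from Section 2.2, this guarantees $\left|x^{*}-x_{k}\right|<\mathrm{eps}/(1-L)$.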
2.5 Two drawbacks of fixed-point iteration
- It is hard to estimate $L=\max_{x \in[a, b]}\left\{\left|g'(x)\right|\right\}$.
- When $L\ge 1$, convergence is not guaranteed.
3. Application: solving the nonlinear equation $f(x)=0$
3.1 Bisection method (Bisection Method of Bolzano)
Steps of the algorithm:
- Bracket a root with an interval.
- Split the interval at its midpoint.
- Choose the subinterval that still contains the root as the new interval.
$$\begin{aligned} &a=x_{0}, \quad b=x_{0}+h \\ &c=\frac{a+b}{2}\\ &f(a) f(b)<0 \end{aligned}$$
This gives:
$$\begin{aligned} &[a, b]\rightarrow\left[a_{1}, b_{1}\right]\rightarrow \left[a_{2}, b_{2}\right]\rightarrow\ldots\rightarrow\left[a_{n}, b_{n}\right] \\ &a=a_{0} \leq a_{1} \leq \cdots \leq a_{n} \leq \cdots \leq r \leq \cdots \leq b_{n} \leq \cdots \leq b_{1} \leq b_{0}=b \end{aligned}$$
Here $r$ denotes the exact root. Then
$$\begin{aligned} &\left|r-c_{n}\right| \leq \frac{b-a}{2^{n+1}}, \text{ for } n=0,1,2, \ldots \\ &c_{n}=\frac{a_{n}+b_{n}}{2} \end{aligned}$$
Number of iterations $N$ for a tolerance $\delta$:
$$\begin{aligned} &\left|r-c_{n}\right| \leq \frac{b-a}{2^{n+1}}<\delta \\ &2^{n+1}>\frac{b-a}{\delta} \\ &(n+1) \ln 2>\ln (b-a)-\ln \delta \\ &n+1>\frac{\ln (b-a)-\ln \delta}{\ln 2} \\ &N=\operatorname{int}\left(\frac{\ln (b-a)-\ln \delta}{\ln 2}\right) \end{aligned}$$
A sign change of $f$ over an interval is a simple test for whether the interval contains a zero (for example, the maximum and minimum values of $f$ on the interval have opposite signs).
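A minimal sketch of the bisection steps described above. The function name `bisection`, the tolerance `delta`, and the test equation $x^3-x-2=0$ on $[1,2]$ are illustrative assumptions.

```python
def bisection(f, a, b, delta=1e-10, max_iter=200):
    """Halve [a, b] while keeping f(a) and f(b) of opposite sign."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2.0
        fc = f(c)
        # Stop when the midpoint is within delta of the root (or hits it exactly).
        if fc == 0.0 or (b - a) / 2.0 < delta:
            return c
        if fa * fc < 0:
            b, fb = c, fc     # root lies in [a, c]
        else:
            a, fa = c, fc     # root lies in [c, b]
    return (a + b) / 2.0

def cubic(x):
    return x ** 3 - x - 2.0

bis_root = bisection(cubic, 1.0, 2.0)
print(bis_root)
```

Each pass halves the bracket, so the number of passes grows only logarithmically with the requested accuracy, in line with the bound on $N$ above.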
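3.2 False position method (False Position Method)
Steps of the algorithm:
- Bracket a root with an interval.
- Split the interval at the point where the secant line crosses the x-axis. (Throughout, the endpoint values keep opposite signs, so the interval always contains the zero.)
- Choose the subinterval that contains the root as the new interval.
$$c=b-\frac{f(b)(b-a)}{f(b)-f(a)}$$
$$c_{1}\rightarrow c_{2}\rightarrow \ldots\rightarrow r$$
$$\left[a_{n}, b_{n}\right]\rightarrow \left[a_{n+1}, b_{n+1}\right]:=\left[a_{n}, c_{n}\right] \text{ or } \left[c_{n}, b_{n}\right]$$
Drawback: for a convex or concave function one endpoint of the bracket never moves, so the interval width $b_n-a_n$ does not shrink to zero and convergence can be slow.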
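The false position steps can be sketched as below. The function name `false_position`, the residual-based stopping test, and the example equation are assumptions for illustration; since the bracket width need not shrink to zero, the sketch stops on $|f(c)|$ rather than on the interval length.

```python
def false_position(f, a, b, eps=1e-10, max_iter=200):
    """Split the bracket at the x-intercept of the secant through (a, f(a)), (b, f(b))."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if abs(fc) < eps:
            return c
        if fa * fc < 0:
            b, fb = c, fc     # keep the sign change in [a, c]
        else:
            a, fa = c, fc     # keep the sign change in [c, b]
    return c

def cubic_fp(x):
    return x ** 3 - x - 2.0

fp_root = false_position(cubic_fp, 1.0, 2.0)
print(fp_root)
```

On $[1,2]$ this example function is convex, so the right endpoint stays at 2 throughout, which is exactly the one-sided behaviour described in the drawback above.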
3.3 Newton-Raphson method
We have studied fixed-point iteration; can it be used to solve a nonlinear equation?
Use the Taylor expansion, with $d=x_{k+1}-x_k$:
$$f(x_{k+1})=f\left(x_{k}\right)+f'\left(x_{k}\right)(x_{k+1}-x_k)+O\left(|d|^{2}\right)=0$$
Dropping the higher-order term, we require
$$f\left(x_{k}\right)+f'\left(x_{k}\right)\left(x_{k+1}-x_{k}\right)=0$$
so that:
$$x_{k+1}=x_{k}-f\left(x_{k}\right) / f'\left(x_{k}\right)=g(x_k)$$
In summary, the Newton-Raphson method is:
$$\begin{array}{l} f(x)=0 \\ x=g(x)=x-\frac{f(x)}{f'(x)} \\ x_{k+1}=g\left(x_{k}\right)=x_{k}-\frac{f\left(x_{k}\right)}{f'\left(x_{k}\right)} \end{array}$$
We can show that the Newton-Raphson method converges in a neighborhood of the solution.
Proof:
$$g(x)=x-f(x) / f'(x)$$
$$g'(x)=1-\frac{f'(x) f'(x)-f(x) f''(x)}{\left[f'(x)\right]^{2}}=\frac{f(x) f''(x)}{\left[f'(x)\right]^{2}}$$
The fixed-point condition is $\left|g'(x)\right|\le K<1$. Since $f(x^*)=0$ and $f'(x^*)\neq 0$ (simple root), we have $g'(x^*)=0$, so by continuity $\left|g'(x)\right|\le K<1$ on a sufficiently small neighborhood of $x^*$. On such a neighborhood the condition $g([a, b]) \subset[a, b]$ is satisfied as well.
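A minimal sketch of the Newton-Raphson iteration, assuming the derivative is supplied as a separate function. The names `newton_raphson`, `sq` and `dsq` and the example $f(x)=x^2-2$ are illustrative choices.

```python
def newton_raphson(f, df, x0, eps=1e-12, max_iter=50):
    """Iterate x_{k+1} = x_k - f(x_k) / f'(x_k) from the starting point x0."""
    x = x0
    for k in range(1, max_iter + 1):
        dfx = df(x)
        if dfx == 0.0:
            raise ZeroDivisionError("f'(x) vanished; the Newton step is undefined")
        x_new = x - f(x) / dfx
        if abs(x_new - x) < eps:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

def sq(x):
    return x * x - 2.0

def dsq(x):
    return 2.0 * x

nr_root, nr_iterations = newton_raphson(sq, dsq, 1.0)
print(nr_root, nr_iterations)
```

Starting from $x_0=1$ the iterates approach $\sqrt{2}$, with the number of correct digits roughly doubling per step once the iterate is close to the root.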
Derivation under different sign conditions (optional; for the interested reader)
- Case $f'\left(x^{*}\right)>0$ and $f''\left(x^{*}\right)<0$; goal: $g\left(\left[x^{*}-\delta, x^{*}+\delta\right]\right) \subset\left[x^{*}-\delta, x^{*}+\delta\right]$
$$\begin{aligned} &x^{*}-\delta<g\left(x^{*}-\delta\right)=\left(x^{*}-\delta\right)-\frac{f\left(x^{*}-\delta\right)}{f'\left(x^{*}-\delta\right)} \\ \Leftrightarrow\ & 0<-\frac{f\left(x^{*}-\delta\right)}{f'\left(x^{*}-\delta\right)} \\ \Leftrightarrow\ &\frac{f\left(x^{*}-\delta\right)}{f'\left(x^{*}-\delta\right)}<0 \\ \Leftrightarrow\ &f\left(x^{*}-\delta\right)<0 \\ \Leftrightarrow\ &f\left(x^{*}\right)-f'(\xi) \delta<0, \quad \xi\in[x^*-\delta,x^*]\\ \Leftrightarrow\ &-f'(\xi) \delta<0\\ \Leftarrow\ &\exists \delta_1>0:\ f'(\xi)>0 \text{ for } x^{*}-\xi<\delta_1 \end{aligned}$$
Moreover,
$$\begin{aligned} &f''\left(x^{*}\right)<0\\ \Rightarrow\ & \exists \delta_2>0:\ f''(x)<0 \quad \text{[sign preservation]}\\ \Rightarrow\ & g'(x)=\frac{f(x) f''(x)}{\left[f'(x)\right]^{2}}>0 \text{ for } 0<x^{*}-x<\delta_2\\ &\text{[for } \delta_2 \text{ small enough, sign preservation gives } f'(x)>0;\ \text{with } x<x^* \text{ and } f(x^*)=0 \text{ this gives } f(x)<0\text{]} \end{aligned}$$
When $\delta<\min\{\delta_1,\delta_2\}$:
$$x^{*}-\delta<g\left(x^{*}-\delta\right)<g(x), \text{ for } 0<x^{*}-x<\delta$$
- Case $f'\left(x^{*}\right)>0$ and $f''\left(x^{*}\right)>0$; goal: $g\left(\left[x^{*}-\delta, x^{*}+\delta\right]\right) \subset\left[x^{*}-\delta, x^{*}+\delta\right]$
The lower bound $x^{*}-\delta<g\left(x^{*}-\delta\right)$ follows exactly as in the previous case, since that argument only uses $f'\left(x^{*}\right)>0$.
Moreover,
$$\begin{aligned} &f''\left(x^{*}\right)>0\\ \Rightarrow\ & \exists \delta_2>0:\ f''(x)>0 \quad \text{[sign preservation]}\\ \Rightarrow\ & g'(x)=\frac{f(x) f''(x)}{\left[f'(x)\right]^{2}}<0 \text{ for } 0<x^{*}-x<\delta_2\\ &\text{[for } \delta_2 \text{ small enough, sign preservation gives } f'(x)>0;\ \text{with } x<x^* \text{ and } f(x^*)=0 \text{ this gives } f(x)<0\text{]} \end{aligned}$$
When $\delta<\min\{\delta_1,\delta_2\}$:
$$x^{*}-\delta<x^{*}=g\left(x^{*}\right)<g(x), \text{ for } 0<x^{*}-x<\delta$$
Note that for a simple root $p$ the Newton-Raphson method converges with order two (quadratic convergence):
$$\left|E_{n+1}\right| \approx \frac{\left|f''(p)\right|}{2\left|f'(p)\right|}\left|E_{n}\right|^{2}\quad (n\rightarrow \infty)$$
Proof:
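A sketch of the standard argument, under the assumptions that $f$ is twice continuously differentiable near $p$, that $f'(p)\neq 0$, and that the error is defined as $E_n=p-p_n$ (the source does not define $E_n$, so this is an assumed convention). Expanding $f$ about $p_n$ and evaluating at $p$:

$$\begin{aligned} 0=f(p)&=f(p_n)+f'(p_n)(p-p_n)+\frac{f''(\xi_n)}{2}(p-p_n)^2,\quad \xi_n \text{ between } p_n \text{ and } p\\ \Rightarrow\ 0&=\frac{f(p_n)}{f'(p_n)}+(p-p_n)+\frac{f''(\xi_n)}{2f'(p_n)}(p-p_n)^2\\ \Rightarrow\ p-p_{n+1}&=p-p_n+\frac{f(p_n)}{f'(p_n)}=-\frac{f''(\xi_n)}{2f'(p_n)}(p-p_n)^2\\ \Rightarrow\ \frac{|E_{n+1}|}{|E_n|^2}&=\frac{|f''(\xi_n)|}{2|f'(p_n)|}\rightarrow\frac{|f''(p)|}{2|f'(p)|}\quad(n\to\infty) \end{aligned}$$

The third line uses the Newton-Raphson step $p_{n+1}=p_n-f(p_n)/f'(p_n)$, and the limit uses $p_n\to p$ together with the continuity of $f'$ and $f''$.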
For a multiple root of multiplicity $M$, however, convergence is only linear (first order), so the method slows down:
$$\left|E_{n+1}\right| \approx \frac{M-1}{M}\left|E_{n}\right|\quad (n\rightarrow \infty)$$
Proof:
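A sketch of the usual argument, assuming the root $p$ has multiplicity $M$ in the sense that $f(x)=(x-p)^M h(x)$ with $h$ differentiable and $h(p)\neq 0$ (the factor $h$ is notation introduced here for illustration):

$$\begin{aligned} f'(x)&=M(x-p)^{M-1}h(x)+(x-p)^M h'(x)\\ g(x)&=x-\frac{f(x)}{f'(x)}=x-\frac{(x-p)\,h(x)}{M h(x)+(x-p)h'(x)}\\ g(x)-p&=(x-p)\left[1-\frac{h(x)}{M h(x)+(x-p)h'(x)}\right]\\ \Rightarrow\ \frac{p_{n+1}-p}{p_n-p}&=1-\frac{h(p_n)}{M h(p_n)+(p_n-p)h'(p_n)}\rightarrow 1-\frac{1}{M}=\frac{M-1}{M}\quad(n\to\infty) \end{aligned}$$

So the error is multiplied by roughly $(M-1)/M$ per step; for a double root ($M=2$) it is only halved each time.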
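If a multiple root $p^*$ occurs, then $f'(p^*)=0$, so the denominator of the Newton-Raphson step tends to zero. In general, however, the numerator $f(p_k)$ reaches zero before the denominator $f'(p_k)$ does, so the Newton-Raphson method can usually still be used.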
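Problems with the Newton-Raphson method:
1. The denominator may be zero, and division by zero is not allowed.
2. It may converge to a different root, or diverge.
3. It may produce a cyclic sequence.
4. It may produce a divergent oscillating sequence.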
Because convergence is only linear at a multiple root, the Newton-Raphson method can be accelerated for a root of multiplicity $M>1$:
$$p_{k}=p_{k-1}-\frac{M f\left(p_{k-1}\right)}{f'\left(p_{k-1}\right)}\quad M>1$$
Proof:
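A sketch of why the factor $M$ restores fast convergence, assuming $f(x)=(x-p)^M h(x)$ with $h$ differentiable and $h(p)\neq 0$ (the factor $h$ is notation introduced here for illustration). With $g(x)=x-M f(x)/f'(x)$:

$$\begin{aligned} g(x)-p&=(x-p)-\frac{M(x-p)\,h(x)}{M h(x)+(x-p)h'(x)}\\ &=(x-p)\,\frac{(x-p)h'(x)}{M h(x)+(x-p)h'(x)}\\ &=\frac{(x-p)^2\,h'(x)}{M h(x)+(x-p)h'(x)}\\ \Rightarrow\ \frac{p_k-p}{(p_{k-1}-p)^2}&\rightarrow\frac{h'(p)}{M h(p)}\quad(k\to\infty) \end{aligned}$$

The error is therefore squared at each step (up to a constant), so the modified iteration converges at least quadratically. As a quick check, for $f(x)=(x-1)^2$ and $M=2$ the step gives $p_k=p_{k-1}-2(p_{k-1}-1)^2/\bigl(2(p_{k-1}-1)\bigr)=1$, i.e. the root is reached in one step.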
3.4 Secant method
When the derivative in Newton-Raphson is hard to express explicitly, it can be approximated by the slope of the line through the two most recent points.
We have:
$$x_{k+2}=g\left(x_{k}, x_{k+1}\right)=x_{k+1}-\frac{f\left(x_{k+1}\right)\left(x_{k+1}-x_{k}\right)}{f\left(x_{k+1}\right)-f\left(x_{k}\right)}$$
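The secant update can be sketched as follows. The function name `secant`, the stopping test on the step size, and the example equation $x^3-x-2=0$ are illustrative assumptions.

```python
def secant(f, x0, x1, eps=1e-12, max_iter=100):
    """Newton-like iteration with f' replaced by the slope through the last two points."""
    f0, f1 = f(x0), f(x1)
    for k in range(1, max_iter + 1):
        if f1 == f0:
            raise ZeroDivisionError("secant line is horizontal; step undefined")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < eps:
            return x2, k
        # Shift the window: only one new function evaluation per step.
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    raise RuntimeError("no convergence within max_iter iterations")

def cubic_sec(x):
    return x ** 3 - x - 2.0

sec_root, sec_iterations = secant(cubic_sec, 1.0, 2.0)
print(sec_root, sec_iterations)
```

Unlike false position, the two points are not required to bracket the root, so no sign check is performed and convergence is not guaranteed for a poor pair of starting points.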
3.5 Aitken acceleration
Aitken acceleration, when applied to a fixed-point iteration, is also known as Steffensen's acceleration. Note that it is effective only for first-order (linearly convergent) methods.
$$\lim_{n \rightarrow \infty} \frac{p-p_{n+1}}{p-p_{n}}=A, \quad p \approx \frac{p_{n+2}\, p_{n}-p_{n+1}^{2}}{p_{n+2}-2 p_{n+1}+p_{n}}=q_{n}$$
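A small sketch applying the formula for $q_n$ to an already computed sequence. The function name `aitken` and the example sequence $p_{n+1}=\cos p_n$ are assumptions for illustration; the reference value of the limit is used only to compare errors.

```python
import math

def aitken(p):
    """Return q_n = (p_{n+2} p_n - p_{n+1}^2) / (p_{n+2} - 2 p_{n+1} + p_n) for each n."""
    q = []
    for n in range(len(p) - 2):
        denom = p[n + 2] - 2.0 * p[n + 1] + p[n]
        if denom == 0.0:
            break     # second difference vanished; nothing left to extrapolate
        q.append((p[n + 2] * p[n] - p[n + 1] ** 2) / denom)
    return q

# Linearly convergent example sequence: p_{n+1} = cos(p_n).
p = [0.5]
for _ in range(10):
    p.append(math.cos(p[-1]))
q = aitken(p)

p_star = 0.7390851332151607   # reference limit, for comparison only
print(abs(p[-1] - p_star), abs(q[-1] - p_star))
```

The last extrapolated value $q_n$ should be markedly closer to the limit than the last raw iterate, even though it is built from the same three values.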
3.6 Muller's method
Given three initial points $\left(p_{0}, f\left(p_{0}\right)\right),\left(p_{1}, f\left(p_{1}\right)\right),\left(p_{2},f\left(p_{2}\right)\right)$, write $f_i=f(p_i)$ and
let
$$\begin{aligned} &t=x-p_{2} \\ &h_{0}=p_{0}-p_{2},\quad h_{1}=p_{1}-p_{2} \end{aligned}$$
We use a quadratic function to compute the next point:
$$y=a t^{2}+b t+c$$
Then:
$$\begin{aligned} t=h_{0}:\ a h_{0}^{2}+b h_{0}+c=f_{0} &\Rightarrow a h_{0}^{2}+b h_{0}=f_{0}-c=e_{0} \\ t=h_{1}:\ a h_{1}^{2}+b h_{1}+c=f_{1} &\Rightarrow a h_{1}^{2}+b h_{1}=f_{1}-c=e_{1} \\ t=0:\ a\cdot 0^{2}+b\cdot 0+c=f_{2}& \Rightarrow c=f_{2} \end{aligned}$$
Solving gives:
$$a=\frac{e_{0} h_{1}-e_{1} h_{0}}{h_{1} h_{0}^{2}-h_{0} h_{1}^{2}}, \quad b=\frac{e_{1} h_{0}^{2}-e_{0} h_{1}^{2}}{h_{1} h_{0}^{2}-h_{0} h_{1}^{2}}$$
This yields:
$$\begin{aligned} &a t^{2}+b t+c=0: \quad t=z_{1}, z_{2} \Rightarrow z_{i}=\frac{-2 c}{b \pm \sqrt{b^{2}-4 a c}} \\ &z=\arg \min \left\{\left|z_{i}\right|\right\}\quad\text{[if a value is complex, only its real part is kept in the computation]} \end{aligned}$$
$$p_{3}=p_{2}+z$$
Continue with $\left(\bar{p}_{1}, \bar{p}_{2}, p_{3}\right)$, where $\bar{p}_{1}, \bar{p}_{2}$ are the two previous points closest to $p_3$.
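The construction above can be sketched as below, following the same symbols ($h_0,h_1,e_0,e_1,a,b,c,z$). The function name `muller`, the stopping test on $|z|$, the exact-zero guard, and the example equation are illustrative assumptions; the sketch does not guard against coincident points, which would make the denominator $h_1h_0^2-h_0h_1^2$ vanish.

```python
import cmath

def muller(f, p0, p1, p2, eps=1e-12, max_iter=50):
    """Fit a parabola through three points and step to its nearest root."""
    for k in range(1, max_iter + 1):
        f0, f1, f2 = f(p0), f(p1), f(p2)
        if f2 == 0.0:
            return p2, k
        h0, h1 = p0 - p2, p1 - p2
        c = f2
        e0, e1 = f0 - c, f1 - c
        det = h1 * h0 ** 2 - h0 * h1 ** 2
        a = (e0 * h1 - e1 * h0) / det
        b = (e1 * h0 ** 2 - e0 * h1 ** 2) / det
        disc = cmath.sqrt(b * b - 4.0 * a * c)
        # The denominator of larger magnitude gives the root z of smaller magnitude.
        denom = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        z = (-2.0 * c / denom).real    # keep only the real part, as in the text
        p3 = p2 + z
        if abs(z) < eps:
            return p3, k
        # Keep the two previous points closest to p3.
        p0, p1 = sorted((p0, p1, p2), key=lambda p: abs(p - p3))[:2]
        p2 = p3
    raise RuntimeError("no convergence within max_iter iterations")

def cubic_mu(x):
    return x ** 3 - x - 2.0

mu_root, mu_iterations = muller(cubic_mu, 1.0, 2.0, 1.5)
print(mu_root, mu_iterations)
```

Writing the root as $-2c/(b\pm\sqrt{b^2-4ac})$ and choosing the larger denominator avoids the cancellation that the textbook quadratic formula suffers when $c$ is small, which is exactly the situation near convergence.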
4. Other issues
4.1 How to find an initial value
For example, consider a function with roots $r_1$, $r_2$ and $r_3$, sampled at points $x_k$.
Two tests can be used:
- [for $r_1$ and $r_2$] a sign change between two neighboring sample points:
$$f\left(x_{k-1}\right) f\left(x_{k}\right)<0, \quad[a, b]=\left[x_{k-1}, x_{k}\right]$$
- [for $r_3$] a small function value together with a change of sign of the slope, which detects a root where the curve touches the axis without crossing it:
$$\left|f\left(x_{k}\right)\right|<\varepsilon \text{ and } \left(f\left(x_{k}\right)-f\left(x_{k-1}\right)\right) \left(f\left(x_{k+1}\right)-f\left(x_{k}\right)\right)<0, \quad [a, b]=\left[x_{k-1}, x_{k+1}\right]$$
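The two tests can be combined into a simple grid scan. Everything in this sketch is an illustrative assumption: the function name `find_brackets`, the uniform grid, the threshold `eps`, and the example polynomial, which has two simple roots ($-1$ and $0.5$) and one double root ($2$).

```python
def find_brackets(f, a, b, n=100, eps=1e-2):
    """Scan n equal subintervals of [a, b] and collect candidate root intervals."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    fs = [f(x) for x in xs]
    brackets = []
    for k in range(1, n + 1):
        if fs[k - 1] * fs[k] < 0:
            # Test 1: sign change between neighboring samples.
            brackets.append((xs[k - 1], xs[k]))
        elif (k < n and abs(fs[k]) < eps
              and (fs[k] - fs[k - 1]) * (fs[k + 1] - fs[k]) < 0):
            # Test 2: small |f| and the slope changes sign (tangent root).
            brackets.append((xs[k - 1], xs[k + 1]))
    return brackets

def poly(x):
    return (x + 1.0) * (x - 0.5) * (x - 2.0) ** 2

brackets = find_brackets(poly, -3.0, 3.0)
print(brackets)
```

The result depends on the grid spacing and on `eps`: a grid that is too coarse can step over two nearby roots, and a threshold that is too small can miss a tangent root that falls between sample points.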
4.2 Convergence criteria
Two stopping criteria can be used:
1. Based on the ordinate (function value)
$$\left|f\left(x_{k}\right)\right|<\varepsilon$$
The error is then: $\text{Error}_{x}=\left|x_{k}-r\right|$
2. Based on the abscissa
$$\left|x_{k}-x_{k-1}\right|<\delta$$
Since the root $r$ is unknown, this serves as a computable stand-in for the condition on the true error:
$$\left|x_{k}-r\right|<\delta \ \longrightarrow\ \left|x_{k}-x_{k-1}\right|<\delta$$
The error is then: $\text{Error}_{f}=\max \{|f(r-\delta)|,|f(r+\delta)|\}$
3. The two criteria can also be combined:
$$\left|f\left(x_{k}\right)\right|<\varepsilon \text{ and } \left|x_{k}-x_{k-1}\right|<\delta$$
- For the Newton-Raphson method, convergence additionally requires:
① $f'(r) \neq 0$
② $x_{0} \in[r-\delta, r+\delta]$, with $\delta$ sufficiently small.
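4.3 Comparison of convergence rates
4.4 Choosing an algorithm
Simple root:
Newton-Raphson method
Double root (the method fails when the denominator becomes 0):
Newton-Raphson method
Steffensen's method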