Talk:Rate of convergence

old and untitled[edit]

I think that for linear convergence we need $\mu =\lim \epsilon _{k+1}/\epsilon _{k}$ to be strictly smaller than one. Indeed, the sublinearly convergent seguence $\epsilon _{k}=\log k$ has $\mu =1$ . -- Jitse Niesen 18:00, 27 Dec 2004 (UTC)

--

In the book

Richard L. Burden, J. Douglas Faires (2000), "Numerical Analysis, (7th Ed)", Brooks/Cole. ISBN 0534382169

they just say

\mu >0

for linear convergence.

I agree with you, intuitively

\epsilon _{k}=\log k

converges sublinearly. On the other hand, the sequence

\epsilon _{k}=1/k

converges linearly, but with

\mu =1.

Same for

\epsilon _{k}=1/k^{p}

for

p>1

. There are other strange sequences. For example,

\epsilon _{k}={\frac {1}{2^{k^{2}}}}

does not converge with rate faster than linear (that is with

q>1

), however if we say it converges linearly, one gets

\mu =0.

So, the rate of convergence is not a perfect indicator of how fast a sequence converges. Probably ultimately it is a matter of convention. Do you have any references where

\mu <1

for linear convergence? --Oleg Alexandrov 19:12, 27 Dec 2004 (UTC).

I am abroad, so I cannot check any references at the moment (I'll probably be back in the second week of January), but I probably got it either from the Gautschi book mentioned in the article, or from Süli and Mayers' An Introduction to Numerical Analysis.

According to the definition currently in the article, the sequence eps_k = 1/k does NOT converge linearly and eps_k = 1/2^(k^2) converges sublinearly, but not with order > 1. This is the correct definition in the framework of root-finding algorithms (unless I'm wildly wrong): the error of the bisection method is 1/2^k, which converges linearly, and the error of Newton's method is something like 1/2^(2^k), which converges quadratically.

I think you meant eps_k = 1/2^(k^2) converges superlinearly. You are right about the errors of the bisection method and Newton's method. We need to get the defintion of rate of convergence right, but ultimately, we discuss technicalities, the point of all numerical textbooks is that quadratic convergence is what we want, everything else is not important. Oleg Alexandrov 20:37, 27 Dec 2004 (UTC)

I now realize that there is another definition in numerical ordinary differential equations: a method has order q if the error is like 1/N^q where N = number of variables. A method with error 1/2^N (like the spectral method) would be called exponentially convergent in this context. This can be confusing, and so it should be explained better in the article.

I hope introducing an alternate defintion will not make things more confusing. Oleg Alexandrov 20:37, 27 Dec 2004 (UTC)

By the way, how do you like the book by Burden and Faires? I have never read it, but I saw it references in several Wikipedia articles, so perhaps I should give it a try. However, I still think we need μ < 1 because every convergent sequence satisfies the condition with μ = 1. -- Jitse Niesen 20:05, 27 Dec 2004 (UTC)

This is my wife's favorite numerical analysis book (that should matter a lot! :), it is also the standard textbook at UCLA, where I teach. Recently I was contacted to review a textbook in numerical analysis, and the editors wanted to know how their book compares to Burden and Faires. I would say it is a very nice book, with lots of problems, and good explanations. I think I put that book as references in the several places you saw it. Oleg Alexandrov 20:37, 27 Dec 2004 (UTC)

PS Burden and Faires is also very expensive ($140.95 list price in the US), it is also 7th edition. This probably says that the book is in demand.

Burden and Faires definition again[edit]

That book, unfortunately not only forgets that one must have μ<1, they actually have exercises where they ask to prove x_n=1/n converges linearly . So I think that book is overall confused on that topic. Oleg Alexandrov 20:36, 16 Jan 2005 (UTC)

Cheers, I didn't notice that, so it is a deliberate choice of Burden and Faires. All other books require μ < 1 though, so I'm keeping that definition. It's a pity, since the Burden and Faires book seems pretty nice overall, and apparently also used as a textbook at my university. -- Jitse Niesen 11:55, 17 Jan 2005 (UTC)

I think the point of Burden and Faires is that they are really after quadratic convergence. Therefore, anything slower than that they think is all and the same damn slow linear convergence. :) Oleg Alexandrov 16:11, 17 Jan 2005 (UTC)

This book provides a formal definition on Rate of Convergence of functions that I think gives us a better understanding on this topic. Also, from this definition I find that using Taylor's theorem is the "safest" way to find the rate of convergence of a function. The definition is stated as follow:

Suppose that $\lim _{h\rightarrow 0}G(h)=0$ $\qquad$ and $\qquad$ $\qquad$ $\lim _{h\rightarrow 0}F(h)=L$ . If a positive constant K exists with $\left|F(h)-L\right|\leq K\left|G(h)\right|$ , for sufficiently small $h$ , then we write $F(h)=L+O(G(h))$ .

The functions we use for comparison generally have the form $G(h)={h}^{p}$ , where $p>0$ . (Definition 1.19, P37, Burden and Faires, 7th Edition)

By using Taylor series, we will want $p$ to be the largest value so that $F(h)=L+O({h}^{p})$ .Usually we can stop at the first non-zero term after the limit in the Taylor series. -- Hoai Tran 4:12, 28 Sep 2008 (OU)

Further tweaks to extended definition[edit]

In reference to my simultaneous edit to the article: The new formulation with "converges with at least order q" was taken from the Suli and Mayers book. The previous formulation ("the order of convergence is the largest q") has the disadvantage that it is not obvious that such a q exists. Feel free to revert as it is a bit nit-picking. (PS: Removing Burden and Faires from the references was the right solution) -- Jitse Niesen 12:21, 20 Jan 2005 (UTC)

I like your change. Cheers, Oleg Alexandrov | talk 05:41, 21 Jan 2005 (UTC)

Problem with the convergence examples[edit]

I'm fairly sure that sequence d doesn't converge at all... so can we use it as an example of sublinear convergence? -- EPAstor, 22:39 April 10, 2005 (UTC)

I don't understand. Are you refering to the sequence

1,{\frac {1}{2}},{\frac {1}{3}},{\frac {1}{4}},{\frac {1}{5}},\ldots ?

This sequence certainly converges to zero (for every positive ε, one can find an N such that |d_n| < ε for n ≥ N). Or perhaps you mean that the series

1+{\frac {1}{2}}+{\frac {1}{3}}+{\frac {1}{4}}+{\frac {1}{5}}+\cdots

does not converge? That is true, but it is not what the article says. -- Jitse Niesen 12:25, 11 Apr 2005 (UTC)

I thought the examples given here were a little sketchy and didn't include examples of finding the rate of convergence under the extended definition. I'm currently taking a Numerical Analysis class and using the Burden and Faires text. The examples given in that book for the extended definition involve using the Taylor series and reduce the problem to one of finding the first non-zero term in the Taylor series after the limit. For example, one exercise from Chapter 1 of the Burden and Faires book asked us to find the rate of convergence for

\lim _{h\rightarrow 0}{\frac {sin(h)}{h}}=1

Since we know that the Talor series for sin(h) about zero is:

sin(h)=h-{\frac {{h}^{3}}{3!}}+{\frac {{h}^{5}}{5!}}-...

dividing by h gives us

{\frac {sin(h)}{h}}=1-{\frac {{h}^{2}}{3!}}+{\frac {{h}^{4}}{5!}}-...

Since 1 is the limit and the next non-zero term is a constant times h², our rate of convergence is on the order of h². --Salley Hyatt 15:23 28 Sept 2008 (OU)

Question regarding extended definition[edit]

In the basic definition, it seems that what one is really after is not the lim, but the lim sup. I think that this would actually be equivalent to the extended definition. Does that make any sense? --128.100.151.121 22:58, 11 February 2006 (UTC)[reply]

Yes, using "lim sup" instead of "lim" would make the basic definition equivalent to the extended definition. But I find it simpler that way, without lim sum. No? Oleg Alexandrov (talk) 00:25, 12 February 2006 (UTC)[reply]

Swapping Pages[edit]

Hello everyone, first of all, I don't understand why the order of converging page was directing everyone to the rate of convergence page. They are clearly two different things. They both measure how fast a sequence converges but they are two different way of measuring convergence. Order of convergence is how fast the terms of a sequence get smaller compared to its previous terms. Rate of convergence measures how fast a sequence/function converges compared to a known sequence/function. In addition, this page previously held information about order of convergence as opposed to rate of convergence. So I created the Order of convergence page and swapped both pages with each other because the info on the page titled rate of convergence was actually about the order of convergence. It took me a while to figure out the confusion as well. It is not easy. I have also added a few more examples. In case you are wondering, I used the Burden & Faires 8th edition. I understand from previous discussions on the rate of convergence page that some of you are already familiar with this book. The two definitions are entirely different with the rate of convergence being defined on page 35 and order of convergence being defined on page 75.
A Real Kaiser 06:38, 28 October 2007 (UTC)[reply]

I am familiar with the Burden & Faires book. I think it is rather non-standard in how it defines things. I undid your edits and I made order of convergence redirect here. Oleg Alexandrov (talk) 15:29, 4 April 2008 (UTC)[reply]

Rate of Convergence "Big Oh" notation.[edit]

There is a page on the "Big Oh" notation in Wikipedia but the page on rates of convergence does not have the definition using the "Big Oh" notation. I propose we add a section to the rates of convergence page that gives a definition using "Big Oh" notation and then link to the "Big Oh" notation page for more information. The definition would be as follows:

Suppose {B_n}_n=1^infinity is a sequence known to converge to zero, and {A_n}_n=1^infinity converges to a number A. If a positive constant k exists with

abs(A_n-A) <= k*abs(B_n), for large n

then we say that {A_n}_n=1^infinity converges to A with rate of convergence O(B_n). It is indicated by A_n = A + O(B_n).

Reference: Numerical Analysis by Richard L. Burden and J. Douglas Faires. —Preceding unsigned comment added by 132.235.37.25 (talk) 18:35, 26 September 2008 (UTC)[reply]

Emphasize O( $1/n^{p}$ ) method[edit]

On top of talking about Big O notation, I would also like to propose a section that emphasizes O( $1/n^{p}$ ) convergence, which is concerned in finding largest number p such that for some sequence {a_n} | ( 1 <= n <= infinity ) which converges to a, a_n=a + O( $1/n^{p}$ ).

This is similar to above definition, but is more useful in some cases.

For example if a_n = $(n+1)/(n^{2})$ , b_n = $(n+1)/(n^{3})$ it can be demonstrated that although both {a_n} and {b_n} converge to zero as n -> infinity, {b_n} converges with rate b_n = 0 + O( $1/n^{2}$ ), while a_n = 0 + O( $1/n$ ). This means b_n goes to zero much faster than a_n.

Reference: Numerical Analysis by Richard L. Burden and J. Douglas Faires.

There are two different contexts in which you can talk about the speed of convergence. The first one is iterative methods where n denotes the iteration number, the second one is numerical integration where n denotes the number of grid points. In the first context, "linear convergence" means roughly O(C⁻ⁿ), and in the second context, "linear convergence" means O(1/n). The first context is what's treated in the article now, the second context is what you're talking about. If you can address all this without confusing the reader, please go ahead. -- Jitse Niesen (talk) 21:38, 29 September 2008 (UTC)[reply]

I wasn't very happy in how the two contexts were separated, so I rewrote bits with the hope that it's clearer now. -- Jitse Niesen (talk) 00:58, 30 November 2008 (UTC)[reply]

Discretization[edit]

I think the information in this article concerning convergence speed for discretization methods is incomplete and a little misleading. The article states (in the second paragraph) that the solution of the discretized problem converges to the solution of the continuous problem as the grid size goes to zero, which is true in a theoretical sense, but practically speaking, the rate of convergence is increased by improving the efficiency and accuracy of the algorithm. Decreasing grid size increases accuracy, but also the number of calculations required, which can very quickly become prohibitive. The section entitled, "Convergence Speed For Discretization Methods" is also somewhat misleading. It is not completely clear whether "n" refers to the number of grid points or the number of grid spaces, although I think the author is referring to the latter. There is no mention of LTE (local truncation error) and GTE (global truncation error), and its effect on convergence. I propose to add the following information: When discretizing a continuous problem, such as a differential equation with an initial value, it is often impractical to increase the rate of convergence by decreasing the step size; a better solution is to improve the efficiency of the algorithm. A better algorithm can give the same accuracy with fewer steps, or better accuracy with the same number of steps. Associated with any algorithm is a local truncation error (LTE), which is the error inherent in the method at each step. The global truncation error (GTE) is the total error, n×LTE, where n is the number of steps (time steps, in an initial value problem). The order of a method is based on its GTE. For instance, the Euler Method of solving IVPs (initial value problems) is of order O(h), meaning the total error is of the same magnitude as "h" (one time step); i.e., the GTE = C×h where C is some constant. The Euler Method uses the Forward Difference approximation (not the most accurate approximation of a derivative) and uses one value of the function at each step. In contrast, The Runge-Kutta 4 Method uses a weighted average of four values of the function at each step and achieves a GTE of order O(h^4). Since h is a fraction, h^4 is a much smaller number than h (h = (b-a)/n, where (b-a) is the total interval and n is the number of subintervals), meaning the total error for the same number of steps is much smaller. Alternatively, the Runge-Kutta Method achieves the same level of accuracy as the Euler Method with much fewer steps, which is what is normally desired in a computer algorithm. Gc953496 (talk) 00:04, 10 September 2012 (UTC)[reply]

Yes and no. Yes, that section is somewhat underformulated. It is only consistent for one-dimensional discretizations. And no, this article is not about comparing discretization or integration methods but strictly on the narrow topic of the definition of "rate of convergence" and probably also "order of convergence". And even in the extended sense your approach is wrong, since one would like to, as you do too in the end, speak about the rate or order of convergence of the same method under refinement of the discretization. Also your error globalization formula is not entirely correct.--LutzL (talk) 09:53, 10 September 2012 (UTC)[reply]

LTE and GTE are discussed in Truncation error (numerical integration) and belong there better than here. A sentence with that link might be good.

Mjmohio (talk) 00:55, 19 September 2012 (UTC)[reply]

Correction of Citation for Convergence of Secant Method;[edit]

The citation which references the convergence order of the secant method is incorrect. In particular, instead of providing a 1.681... decimal value (should be 1.618... at the least), it should note that the order is equal to the golden ratio. Thus, the last part of the citation would be more specific and correct if it were written as:

\alpha =

the golden ratio

={\frac {1+{\sqrt {5}}}{2}}\approx 1.618

. JasonYerardiMath (talk) 16:34, 12 September 2012 (UTC)[reply]

Corrected the numerical value. Please note that the letter phi is already a link to the golden ratio.--LutzL (talk) 15:14, 13 September 2012 (UTC)[reply]

regular root[edit]

“

q may be non-integer. For example, the secant method has, in the case of convergence to a regular root, convergence order φ=1.618....

”

What's a regular root? —Tamfang (talk) 03:58, 19 April 2014 (UTC)[reply]

A root where the function is regular. Since the function has to be at least 1,1-smooth to get this result, it means that the function value is zero and the derivative value different from zero.--LutzL (talk) 06:30, 19 April 2014 (UTC)[reply]

"Finally, the sequence {dk} converges sublinearly."[edit]

This sentence should be strengthened to "converges logarithmically" because (1/n -1/[n+1])/(1/[n+1]-1/[n+2])=(n+2)/n, which converges to 1. I will make the change now.144.35.45.98 (talk) 16:43, 6 November 2015 (UTC)[reply]

Definition of convergence with order $q$ .[edit]

In the definition the meaning of the constant $M$ is not explained. Joel Brennan (talk) 22:20, 23 April 2018 (UTC)[reply]

-- I'd like to add that Nocedal and Wright do not restrict M to be in (0,1). They explicitly say that M is "not necessarily less than 1". Check out page 619 of Nocedal and Wright, 2nd Edition. I'll make this change on the page. Jwd0023 (talk) 16:37, 26 February 2019 (UTC)[reply]

yes — Preceding unsigned comment added by 194.250.110.49 (talk) 09:04, 19 November 2023 (UTC)[reply]

Linear convergence is wrong[edit]

The definition for linear convergence should have a "limsup" not a "lim", because the lim does not necessarily exist. For example, you could have a sequence where the distance to the optimium decreases by a factor of 1/2 at odd iterations, but by 1/3 at even iterations. That would still be linearly convergent with rate 1/2, but the lim does not exist. 2001:16B8:60E3:6200:AC65:A419:7775:7CFF (talk) 11:42, 23 April 2020 (UTC)[reply]

2001 16B8 194.250.110.49 (talk) 08:59, 19 November 2023 (UTC)[reply]

Differening definitions of Convergence Order[edit]

There appears to be some disagreement among sources about whether or not $q=1$ is allowed for the order of convergence. In https://link.springer.com/content/pdf/10.1007/BF00939805.pdf they specify that q>1 (in their paper q is called $\tau$ ). But in http://fourier.eng.hmc.edu/e176/lectures/NM/node3.html it is given as $p\geq 1$ . Should we include a note that says both definitions are in use? The-erinaceous-one (talk) 05:20, 1 August 2020 (UTC)[reply]

I went ahead and included note in the article that clarifies the difference with a reference to https://link.springer.com/content/pdf/10.1007/BF00939805.pdf. The-erinaceous-one (talk) 07:04, 1 August 2020 (UTC)[reply]

limit or limit superior?[edit]

I know the definition of order of convergence with lim sup rather than lim, because in many cases the limit does not exist, and one is mainly interested in getting the inequality |x(n+1) - L| ≤ C |x(n) - L|^q, which BTW can also (and actually should) be used to define convergence of order q. In case the sequence might jump to the exact limit, which happens, e.g., when the bisection method is applied to find a root of the form r = m/2^n, one cannot divide by (x(n) - L). — MFH:Talk 14:02, 28 March 2022 (UTC)[reply]

do we assume that mew I. the first paragraph is any non-zero number?[edit]

what is mew?if it can include 0 and infinity then there can be any order of convergence? is this true? can someone clarify within this page. thanks Jarfuls of Tweed (talk) 11:48, 13 December 2022 (UTC)[reply]

nvm, it's clear, it is just explained farther down Jarfuls of Tweed (talk) 11:54, 13 December 2022 (UTC)[reply]

hijacked citation?[edit]

The terms Q-linear and R-linear are used in; The Big O definition when using Taylor series is used in Nocedal, Jorge; Wright, Stephen J. (2006), Numerical Optimization (2nd ed.), Berlin, New York: Springer-Verlag, pp. 619+620, ISBN 978-0-387-30303-1.

The bolded bit was inserted anonymously in October 2008, leaving the first "used in" with no citation to call its own. (The citation itself was added by Jitse Niesen half a year earlier.) I'm going to remove the addition; if you check the book and find that it also supports The Big O definition when using Taylor series, do add it back. —Tamfang (talk) 23:59, 10 November 2023 (UTC)[reply]