Talk:Exponentiation by squaring

From Wikipedia, the free encyclopedia

Untitled

Maybe you should add a link to the Peasant multiplication in Ancient_Egyptian_multiplication


--Preceding unsigned comment added by 85.250.163.119 (talk) 10:21, 23 August 2008 (UTC)

I guess we should present the iterative version of this algorithm:

power(x,n) is computed as long as n is not negative
    assign 1 to result
    as long as n is positive
       assign result*x to result if n is odd
       assign x*x to x
       assign n divided by 2, truncated to an integer, to n
    return result
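
For anyone who wants to run the loop above, here is a minimal Python sketch of it. It assumes n is a non-negative integer; the function name `power` simply mirrors the pseudocode, and Python is used only for illustration.

```python
def power(x, n):
    """Compute x**n for a non-negative integer n by repeated squaring."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    while n > 0:
        if n % 2 == 1:   # n is odd: fold the current power of x into the result
            result *= x
        x *= x           # square the base
        n //= 2          # halve the exponent, rounding down
    return result


print(power(3, 13))  # 1594323
```

The loop body runs once per binary digit of n, so the number of multiplications grows with log n rather than with n.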


Robert Dober 2003-07-05 MEDST

This is the 2^k-ary method with k=1, so it is covered. However, the current presentation hides this; when I have time, I will improve it. Fizban99 (talk) 11:07, 23 January 2010 (UTC)


Never mind, and there is even an error in my algorithm, really well done Robert, sorry for the noise :(

Robert Dober 2003-07-05 MEDST


Article incomplete

One aspect of this subject not mentioned is that this binary algorithm doesn't get to the exponent as fast as possible. For example:

1. x^62 => x^31*x^31
2. x^31 => x*x^30
3. x^30 => x^15*x^15
4. x^15 => x*x^14
5. x^14 => x^7*x^7
6. x^7 => x*x^6
7. x^6 => x^3*x^3
8. x^3 => x*x^2
9. x^2 => x*x

but assuming we remember all exponents we've previously created...

1. x*x => x^2
2. x^2*x^2 => x^4
3. x^4*x^4 => x^8
4. x^8*x^8 => x^16
5. x^4*x^16 => x^20
6. x^20*x^20 => x^40
7. x^20*x^40 => x^60
8. x^2*x^60 => x^62

This takes one step fewer. I have some more information on this if it would be interesting to have here. This isn't exactly exponentiation by squaring, but it is along the same lines.
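
The two step counts can be checked mechanically. The following Python sketch is only an illustration; the helper names `binary_method_multiplications` and `is_addition_chain` are hypothetical and were introduced here, not taken from the article.

```python
def binary_method_multiplications(n):
    """Count the multiplications used by the halving scheme in the first list.

    An even exponent costs one squaring; an odd exponent above 1 costs one
    multiplication by x.
    """
    count = 0
    while n > 1:
        if n % 2 == 0:
            n //= 2   # x^n = x^(n/2) * x^(n/2)
        else:
            n -= 1    # x^n = x * x^(n-1)
        count += 1
    return count


def is_addition_chain(chain):
    """Check that each element after the first is a sum of two earlier elements."""
    if not chain or chain[0] != 1:
        return False
    for i in range(1, len(chain)):
        earlier = chain[:i]
        if not any(chain[i] - a in earlier for a in earlier):
            return False
    return True


remembered_chain = [1, 2, 4, 8, 16, 20, 40, 60, 62]

print(binary_method_multiplications(62))  # 9
print(is_addition_chain(remembered_chain))  # True
print(len(remembered_chain) - 1)  # 8
```

This only verifies the two lists above; it does not search for a shortest chain.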

---Jay

Yes, this would be interesting! Especially if there is (as I would expect) a quick systematic way to get at the fastest method, then this should be mentioned in the article as a variation. (And even if there's not a fast alternative algorithm, then this fact can still be mentioned.) Is this the addition chain exponentiation that Henrygb linked? -- Toby Bartels 19:58, 7 Aug 2004 (UTC)

Yes, both of these are considered addition chains. It is considered hard to generate minimal addition chains (where one takes the absolute least number of steps required to perform the exponentiation). In other words, there's a ton of setup that goes into generating these minimal chains, but they execute faster than any other addition chain. That makes them useless when only a single exponentiation is being performed, as so much time is spent in the setup. Binary addition chains (which is what this article is about) require no setup and are close enough to minimal to make them useful. Note that in the above example, there was only a one step difference. Such a binary addition chain executes somewhat more slowly than a minimal-length chain, but the time saved in skipping initial setup causes the net speed to be in the binary chain's favor. -- Vesta 09:12, 16 Aug 2004 (UTC)
As an aside, by definition exponentiation by squaring is a variation of addition chain exponentiation, not the other way around. :) -- Vesta 09:14, 16 Aug 2004 (UTC)

Iterative version useful?

It doesn't use recursion, which increases the speed even further.

Is it useful at all, though? To take the example from the article, x^1000000 requires 41 multiplications (and even fewer recursive calls than that). If you calculate x^1000000 for any x bigger than 1, the relative time spent on calls will be completely negligible. Fredrik | talk 16:40, 17 Feb 2005 (UTC)

If the function is executed every millisecond or so, it's probably very useful. 190.60.93.218 (talk) 17:27, 9 September 2013 (UTC)
Not likely unless you're using an exceptionally slow computer. I ran a test on a rather slow (1.7 GHz) iMac and found the difference between iterating 41 times in a loop vs. recursing was about 76 nanoseconds. If that happened every millisecond, it would be a performance decrease of less than one hundredth of one percent. Mnudelman (talk) 18:09, 3 January 2016 (UTC)
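
Anyone who wants to repeat this kind of measurement could start from a sketch like the one below. It is not the test described above: the function names, the float base, and the run count are all assumptions made here, and timings in Python will differ from those of compiled code.

```python
import timeit


def power_recursive(x, n):
    """Recursive exponentiation by squaring for a non-negative integer n."""
    if n == 0:
        return 1
    half = power_recursive(x * x, n // 2)
    return x * half if n % 2 else half


def power_iterative(x, n):
    """Loop-based exponentiation by squaring for a non-negative integer n."""
    result = 1
    while n > 0:
        if n % 2:
            result *= x
        x *= x
        n //= 2
    return result


if __name__ == "__main__":
    # A float base close to 1 keeps every multiplication equally cheap
    # and keeps the result finite.
    x, n, runs = 1.000001, 1_000_000, 20_000
    for f in (power_recursive, power_iterative):
        seconds = timeit.timeit(lambda: f(x, n), number=runs)
        print(f"{f.__name__}: {seconds / runs * 1e9:.0f} ns per call")
```

The arithmetic in the comment above is easy to confirm by hand: 76 ns out of every 1 ms is 0.0076 percent.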

Those code snippets are "yowch"

The following Haskell code implements this algorithm in just 3 lines of code:

power b 0          = 1
power b e | even e =     power (b*b) (e `div` 2)
power b e | odd e  = b * power (b*b) (e `div` 2)

The following code is tail-recursive, so it uses less stack space:

power' r b 0          = r
power' r b e | even e = power'  r    (b*b) (e `div` 2)
power' r b e | odd e  = power' (r*b) (b*b) (e `div` 2)
power    b e          = power'  1     b     e

The above will work for any type of b that the * operator is defined on (any type in the Num class). It can be generalized to use any function in place of multiplication:

power f r b 0          = r
power f r b e | even e = power f  r      (f b b) (e `div` 2)
power f r b e | odd e  = power f (f r b) (f b b) (e `div` 2)

The original function is now power (*) 1, and, say, multiplication can be implemented as power (+) 0. --Ihope127 02:13, 12 April 2006 (UTC) (edited 14:43, 12 April 2006 (UTC))
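
For readers less familiar with Haskell, the same generalization can be sketched in Python. The name `power_with` and its argument order are assumptions made for this illustration, and the operation is assumed to be associative.

```python
import operator


def power_with(op, identity, b, e):
    """Combine e copies of b under an associative op, starting from identity."""
    r = identity
    while e > 0:
        if e % 2 == 1:
            r = op(r, b)   # odd exponent: fold the current b into the result
        b = op(b, b)       # "square" b under op
        e //= 2
    return r


print(power_with(operator.mul, 1, 3, 5))  # 243
print(power_with(operator.add, 0, 3, 5))  # 15
```

With `operator.mul` and identity 1 this is ordinary exponentiation; with `operator.add` and identity 0 it computes the product b times e by repeated doubling.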

Far too many examples

The article needs to be severely cut down. Most Wikipedia math articles could use more examples, but this page has far too many. The same ground is covered repeatedly. 165.189.91.148 20:54, 10 May 2006 (UTC)

Text Concatenation

This algorithm is unsuitable for text concatenation. Instead, the following algorithm is more efficient:

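function repeat ( s, n ) {
  if ( s == "" || n < 1 )
    return "";
  if ( n == 1 )
    return s;
  var res = s;
  // Double res until it holds at least half of the n copies.
  for ( var i = 2 ; i < n ; i *= 2 )
    res = res + res;
  // Append a prefix of res to supply the remaining copies.
  // substring is used instead of the deprecated substr.
  return res + res.substring( 0, s.length * n - res.length );
}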

Results of binary decomposition assume a slightly different algorithm

The case for n=1 is missing in the original definition of the algorithm. I realize it's probably obvious, but it should probably be included for completeness. In fact, why bother with restricting the odd case to n>2 at all (since it reduces to 'x' anyway for n=1) and just use what's below instead?

But I have a bigger concern, and that's the fact that the definition given above doesn't result in an expression with the same economy that the binary decomposition does. When I found that the binary decomposition of the expression results in one with fewer multiplications, I was confused. I asked myself why a better definition wasn't given instead, one which results in the same number of multiplications as the binary decomposition, but without the intermediate expression. It seems cumbersome to apply the above definition to get one expression, then go through the manual process of binary decomposition of that to get a more economical expression (which is the way it was done in the examples in the article). So why not use the following definition instead:

For instance, in the case of x^100, the expression it produces requires 8 multiplications, while the expression produced by the first definition requires 15 multiplications. (For either definition, an extra multiplication is saved by recognizing that Power(x,1)=x, and treating it as a fourth case.)

Does anyone have any objections to me updating the definition in the article with this updated one? I think it would reduce some confusion.

--Paul 06:53, 7 March 2007 (UTC)

Agreed. Also, I fail to see the need for multiple recursive definitions on the page. I merged the content. Fizban99 (talk) 10:54, 23 January 2010 (UTC)

Text application

I don't understand how the text application applies to the article at all. I recommend deletion.

R00723r0 08:58, 15 July 2007 (UTC)

Exponentiation can be viewed as multiplying exponents. Multiplying can be viewed as adding exponents (i.e., logarithms). The text application leverages this by viewing concatenation as a variant of multiplication. Then the same techniques that are used for exponentiation (and thus that are available to multiplying exponents where "multiplying exponents" is just another name for exponentiation) are now available to multiplying strings. Thus it is quite related and should stay as is. --optikos (talk) 03:48, 17 June 2008 (UTC)

No, R00723r0 is quite correct. This is original research that is not related to exponentiation. Please provide a citation of where it has been published before that shows the relation. Furthermore, it is a terrible way to do string repetition, as it assumes that the copy/append operations are unit-time, whereas in most implementations they would be linear-time in the length of the operands. As a result, in any run-time where the linear component dominates over the fixed start-up cost, or in any run-time over large enough operands (i.e., asymptotically), this method is *slower* than the naive approach of allocating a buffer and then looping over the repeated string to fill it. I have removed this section pending some sort of citation to show that it is not original research. Amoss (talk) 12:38, 9 April 2009 (UTC)
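
The cost argument can be made concrete by counting copied characters. The Python sketch below is a rough model, not a benchmark: it assumes every concatenation allocates a fresh string and copies both operands, and the function names are hypothetical ones introduced here.

```python
def copies_by_doubling(length, n):
    """Characters copied when n repetitions are built by repeated doubling.

    Assumes each concatenation allocates a new string and copies both operands.
    """
    if length == 0 or n < 1:
        return 0
    copied = 0
    current = length          # length of the partial result
    i = 2
    while i < n:
        current *= 2
        copied += current     # res + res copies 2 * len(res) characters
        i *= 2
    if n > 1:
        copied += length * n  # the final concatenation writes the full result
    return copied


def copies_by_buffer_fill(length, n):
    """Characters copied when a preallocated buffer is filled one repetition at a time."""
    return length * max(n, 0)


print(copies_by_doubling(10, 1000))     # 20220
print(copies_by_buffer_fill(10, 1000))  # 10000
```

Under this model the doubling approach needs only about log n concatenations but copies roughly twice as many characters here, so which approach is faster depends on whether per-operation overhead or per-character copying dominates.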

Programming

I think it would be a good idea to stick to one programming language throughout the article, or even just a pseudo-language. Furthermore, I don't think most people can benefit from JavaScript examples.

R00723r0 09:03, 15 July 2007 (UTC)

I agree, please just use pseudo-code. Binary exponentiation is simple enough that we hardly need to get into the syntax of any particular programming language du jour. 24.61.43.18 15:36, 14 August 2007 (UTC)
Ironically, your comment did benefit me: you mentioned that there was a JavaScript implementation out there, which led me to search for it and helped me understand the algorithm. 190.60.93.218 (talk) 17:32, 9 September 2013 (UTC)
More importantly, getting into the syntax of a particular language adds complexity to an algorithm's description; languages limit rather than expand the mathematical approach. In reality, the only time to give an algorithm description in a specific language is when you are developing the algorithm for that language; as Wikipedia is not StackOverflow, this should never happen. JSory (talk) 07:50, 16 March 2023 (UTC)

Simpler

Why unroll the recursion manually? This is simpler:

For "n=0" and "n is odd", this is standard recursive exponentiation. Only the "n is even" case is optimized, according to:

exe (talk) 18:08, 22 May 2008 (UTC)

I believe the reason is that the former emphasizes you get precisely lg n calls to the function, facilitating the analysis. Fizban99 (talk) 10:48, 23 January 2010 (UTC)

Underlying idea: I'd like to add one logical step

This is the current version:

But I feel that the third equation could be modified to say


That would make this equation a lot clearer, and the next step follows by the fourth equation, instead of having both in one step. It should be noted that the algorithm makes both steps at once for increased performance, though. --Preceding unsigned comment added by 134.96.57.229 (talk) 18:03, 6 February 2011 (UTC)

I didn't really see the post above, so sorry for that. I'm still all for the change though, especially now that we're talking about the "underlying idea" instead of the actual algorithm used later on. Also, for logarithmic time you don't need to divide the argument by 2 on every iteration; you merely have to do so every O(1) steps. --134.96.57.229 (talk) 18:09, 6 February 2011 (UTC)

Underlying idea: Negative Exponent

Seeing how there is an edit war between those who believe that

And those who know that

Just look at this:

Multiply both sides by and you get

Product with same base allows contraction of exponents, such that

Performing the addition yields

which is true by reflexivity.

The other one only works for or .

--134.96.57.229 (talk) 18:43, 6 February 2011 (UTC)

Basic method

Nothing explains better than an example, especially for those of us who aren't especially fast learners or who don't think recursively. So, I'm inserting an example at the beginning that shows what's really going on. It's based on the way I've implemented exponentiation by squaring in many languages (as well as computing terms in a Lucas sequence - same idea, but with subscripts instead of exponents). I hope this makes the basic method clearer. MathPerson (talk) 19:56, 3 April 2017 (UTC)

This first section of the article says: The method is based on the observation that, for a positive integer n, we have

No, it isn't. You don't compute by raising to the 5th power. You get by squaring . Also, the recursive methods shown only obscure the issue. MathPerson (talk) 23:12, 20 May 2019 (UTC)

Least significant bit is first

Despite the language that indicates that bits are examined left to right, the first bit tested in the exponent is the least significant bit, when the test is for whether n is odd or even. This should be fixed in the description, yes? --Quantling (talk | contribs) 22:39, 15 May 2022 (UTC)

The article is correct for the description of the computation (look at the example for a verification). However, in recursive programming, tests are done from right to left for building the stack, and the reverse order is used for computing. This may be confusing. So, I agree that this part of the article deserves to be rewritten. But this cannot be done by changing only a few words. D.Lazard (talk) 09:22, 16 May 2022 (UTC)
Thank you, good point. The confusion comes in part because many of these algorithms are written for non-ALGOL-like languages. We could replace them with something much simpler like this C/C++ snippet:
    double power(double x, int n) {
      if (n < 0) {
        n = -n;
        x = 1 / x;
      }
      double y = 1.0;
      while (n > 0) {
        if (n & 1) // if n is odd
          y *= x;
        x *= x;
        n /= 2; // rounds down
      }
      return y;
    }
... or perhaps the equivalent in something like Python. No recursion, no confusion about the bit order, straightforward. What do you think? --Quantling (talk | contribs) 22:40, 16 May 2022 (UTC)
I disagree with the use of a specific programming language (see MOS:MATH#Algorithms); pseudocode must be used as far as possible. In any case, the pseudocode style must be coherent throughout the article, and this is not even the case in § Basic method.
However, things are worse than a single coding-style issue: several distinct algorithms are presented as if they were the same. Some algorithms explicitly use the binary representation of the exponent, and therefore cannot easily be used in many programming languages; other algorithms simply use a parity test on the exponent, and are therefore easier to implement. Also, some algorithms, such as the non-tail-recursive one, use the formula
while other algorithms use
In one case, the last multiplication for an odd exponent uses the initial x, while a different multiplier is used in the iterative algorithm and the tail-recursive one without accumulators.
So, a major rewrite is needed. As I am not willing to do it now, I will tag the article with a link to this section. D.Lazard (talk) 10:34, 17 May 2022 (UTC)
I have rewritten section § Basic method. Nevertheless I have left the tag {{confusing}}, since much work is needed for clarifying the other sections. Also, I have removed the example, since I do not know how to give an example that is more informative than the given explanations. D.Lazard (talk) 16:27, 17 May 2022 (UTC)
Excellent, thank you! --Quantling (talk | contribs) 18:10, 17 May 2022 (UTC)
I tried to extend your excellent prose a little. Please edit as you see fit. --Quantling (talk | contribs) 18:48, 17 May 2022 (UTC)

Use of the binary representation of the exponent

Although it contains some interesting content, I'll revert the last edit [1] by Quantling. The reasons are as follows.

The main features of this edit are to introduce the binary representation of the exponent from the beginning, and to use the technical vocabulary of language theory ("prefix" and "suffix"). This is not useful for section § Basic method, and therefore could be confusing or too technical for most readers.

IMO, the content added by Quantling could be the starting point of a missing section that could be named after the heading of this thread. This section could be summarized as follows: In § Basic method, the bits of the exponent are used implicitly, starting with the least significant one. If one can access these bits in the opposite order (either with a precomputation or using a primitive of assembly language), one may obtain another algorithm that requires exactly the same number of arithmetic operations (this is the algorithm used in the example that I removed in my edit). Then the pseudocode of this algorithm should be presented. It is also possible to compare this pseudocode with a variant of the algorithm of § Basic method that uses bit access.
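
As a rough illustration of the two bit orders discussed here, the following Python sketch places a most-significant-bit-first variant next to a least-significant-bit-first one that uses bit access. Both function names are hypothetical, and this is not the pseudocode proposed for the article.

```python
def power_msb_first(x, n):
    """Compute x**n for a non-negative integer n, scanning bits from the most significant."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for bit in bin(n)[2:]:   # binary digits of n, most significant first
        result *= result     # square the accumulated power
        if bit == "1":
            result *= x      # the multiplier is always the original x
    return result


def power_lsb_first(x, n):
    """Compute x**n for a non-negative integer n, scanning bits from the least significant."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    while n:
        if n & 1:
            result *= x      # the multiplier is the current repeated square of x
        x *= x
        n >>= 1
    return result


print(power_msb_first(3, 13), power_lsb_first(3, 13))  # 1594323 1594323
```

The visible difference is the multiplier used for a 1 bit: the original x in the first function, and a repeatedly squared x in the second.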

Such a section could be also useful for making sections 3 and 4 easier to understand. D.Lazard (talk) 16:41, 18 May 2022 (UTC)

@D.Lazard: If you have the chance to follow through on this vision, it would be much appreciated! --Quantling (talk | contribs) 20:55, 18 May 2022 (UTC)

Magma

Switching in "magma" for "semigroup" doesn't appear to work. While the former is, in some sense, a generalization of the latter, it appears that the binary operator for a magma is not required to be associative. As such, an expression such as x^3 does not appear to be well defined because x • (x • x) and (x • x) • x are not necessarily equal. In particular, it appears that exponentiation by squaring is not going to be meaningful for a magma. --Quantling (talk | contribs) 12:50, 6 October 2023 (UTC)
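
A small concrete case may help. The Python sketch below uses subtraction on the integers as the magma operation; that choice is an assumption made purely for illustration.

```python
def op(a, b):
    """A binary operation on the integers that is not associative (subtraction)."""
    return a - b


x = 2
left = op(op(x, x), x)    # (x . x) . x = (2 - 2) - 2
right = op(x, op(x, x))   # x . (x . x) = 2 - (2 - 2)

print(left, right)  # -2 2
```

Since the two groupings disagree, "x cubed" has no single value under this operation, so there is nothing well defined for a squaring scheme to compute.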

The IP user that I reverted was aware of that, as they switched to "magmas where the operation has the property of power associativity". This was not the reason for my revert. The reason is that few people know this structure, and, AFAIK, there are no common examples of exponentiation by squaring in a structure that is not a semigroup. Exponentiation of matrices is an important example where the multiplication is not the operation of a group, and elliptic curves are examples that are not rings and where the method is used. So, there is no reason for such a generalization in the lead. However, I am not opposed to adding somewhere in the body the remark that "magmas where the operation has the property of power associativity" are the most general structures where the method works. D.Lazard (talk) 13:51, 6 October 2023 (UTC)
Thank you for clarifying that. I added a mention of magma and made some other edits to the article. As always, please feel free to edit further, undo, comment here, .... Thank you --Quantling (talk | contribs) 16:02, 6 October 2023 (UTC)