User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Pollard Rho loop analysis

Fri May 21, 2021 5:35 pm

In thread "gmp (Gnu multiple precision arithmetic lib) 5x faster on intel i7 than on Pi400 (two samples)"
viewtopic.php?f=33&t=311893
factorization algorithm based on Pollard Rho method with arbitrary precision arithmetic was presented.

In this thread I want to look into the loops that are built by function "x->(x^2+a) mod n", starting with n prime, and later n=p*q product of primes.

I said "to look", by that I mean using graphviz for vizualisation (sudo apt install graphviz).
This is basic small C-program for creating graphviz dot file:

Code: Select all

$ cat loop.c 
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  unsigned n=atoi(argv[1]);
  unsigned a=atoi(argv[2]);
  unsigned x;

  printf("digraph D {\n");
  printf("size=8 8\n");
  for(x=0; x<n; ++x)
    printf("%u->%u\n", x, (x*x+a)%n);
  printf("}\n");
}
$

It is used by bash script "mak" fot creating digraph for parameters n and a, and display the result:

Code: Select all

$ cat mak
#!/bin/bash
./loop $1 $2 > $1_$2.dot
neato -Tsvgz $1_$2.dot > $1_$2.svgz
eog $1_$2.svgz
$
For the kind of digraphs created, force-based "neato" is the best layout algorithm, and not the standard "dot" or any other of the graphviz layout program alternatives.

Taking prime 83 as example, this is n=83 a=0, showing big diameter cycles:
83_0.svgz.png
83_0.svgz.png
83_0.svgz.png (77.06 KiB) Viewed 1743 times

n=83 and a=1 leads to very small diameter circles, but some very long paths to the circles:
83_1.svgz.png
83_1.svgz.png
83_1.svgz.png (62.83 KiB) Viewed 1743 times

Finally n=83 and a=15 leads to mixture of long diameter circles and long paths:
83_15.svgz.png
83_15.svgz.png
83_15.svgz.png (70.47 KiB) Viewed 1743 times
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

ejolson
Posts: 7574
Joined: Tue Mar 18, 2014 11:47 am

Re: Pollard Rho loop analysis

Fri May 21, 2021 5:42 pm

HermannSW wrote:
Fri May 21, 2021 5:35 pm
Finally n=83 and a=15 leads to mixture of long diameter circles and long paths:
The graph is supposed to look like

ρ

right? I think I see that happening.

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Fri May 21, 2021 5:55 pm

ejolson wrote:
Fri May 21, 2021 5:42 pm
The graph is supposed to look like

ρ

right? I think I see that happening.
Yes, you are right:
https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm


I looked into bigger "n"s, and that was the reason I did choose ".svgz" format for the generated graphs:
SVG (scalable vector graphics) is scalable, and .svgz uses compression as well to keep size low.

This is graph for biggest prime n=9973 below 10000. As you can see there are many circles:
9973_0.svgz.png
9973_0.svgz.png
9973_0.svgz.png (71.85 KiB) Viewed 1717 times

Zooming in to up to 2000% with gnome image viewer "eog" shows the fine details, parts from two big circles nearly touching each other:
9973_0.svgz.2000%.png
9973_0.svgz.2000%.png
9973_0.svgz.2000%.png (122.3 KiB) Viewed 1717 times

I wrote a small program to analyze the circles (it allocates two arrays of size 4*n, so n cannot be too large):
https://gist.github.com/Hermann-SW/83de ... 641c26c590

Code: Select all

$ ./analyze 5 0
1 1 
0: 1 0 1
1: 2 0 1
2: 2 2 3
3: 2 2 3
4: 2 1 2
c 2
m 2 1.0
l 1 1.0
lm 3 2.0
lm2 3
$
The first line says that two circles of length 1 are found (self-loops from number to itself, here 0->0 and 1->1 for "x->(x^2+0) mod 5".
The rows below tell for each number in range 0..n-1 (before the colon) values l, m and l+m.
"c 2" mean two circles found.
Then for l (length of circle), m (length of path until number on cycle reached) and lm (lm[x]=l[x]+m[x]) are reported.
First number is overall maximum, 2nd number is average over all numbers 0..n-1.
Finally lm2 reports lm[2] for start number "2" which is default start value for Pollard Rho method (which has default "a=1").


Using the analysis for n=9973 and a=0 we get more information about the image shown:

Code: Select all

$ ./analyze 9973 0 | grep -v :
1 1 92 92 92 92 92 92 276 276 92 276 276 92 276 276 92 2 6 
c 19
m 2 1.2
l 276 213.9
lm 278 215.1
lm2 94
$
There are 19 circles in total, 6 of which have length 276 and 9 of which have length 92.
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Fri May 21, 2021 6:18 pm

Now lets look at N=997 and a=0:
997_0.svgz.png
997_0.svgz.png
997_0.svgz.png (120.63 KiB) Viewed 1702 times

As can be seen there are three big circles, and analysis says they are of length 82:

Code: Select all

$ ./analyze 997 0 | grep -v :
1 1 82 82 82 2 
c 6
m 2 1.2
l 82 81.0
lm 84 82.2
lm2 84
$
Now doing prime factorization for n-1=996 = 2^2 * 3 * 83, biggest prime factor 83 is just 1 bigger than longest circles found in analysis. This phenomenon was true for nearly all "n"s I tried with a=0.

For finding biggest circles, I looked at primes p, where (p-1)/2 is also prime (safe primes):
https://oeis.org/A005385

This is graph for safe prime n=107 (a=0):
107_0.svgz.png
107_0.svgz.png
107_0.svgz.png (80.89 KiB) Viewed 1702 times
Analysis shows a single big circle of length 52, and 107=2*53+1:

Code: Select all

$ ./analyze 107 0 | grep -v :
1 1 52 
c 3
m 1 0.5
l 52 50.6
lm 53 51.1
lm2 53
$

So likely "a=0" is not a good choice for Pollard Rho factorization algorithm.
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Fri May 21, 2021 6:38 pm

Now looking at RHO-100 1st RSA factoring challenge number:
https://en.wikipedia.org/wiki/RSA_numbers#RSA-100

The factors p and q are no safe primes, but both [p-1 and q-1] have a very long biggest prime factor (18/26 digits):
$ ./factorize 37975227936943673922808872755445627854565536638198
37975227936943673922808872755445627854565536638198: 2 3167 3613 587546788471 3263521422991 865417043661324529
$
$ ./factorize 40094690950920881030683735292761468389214899724060
40094690950920881030683735292761468389214899724060: 2 2 5 41 2119363 602799725049211 38273186726790856290328531
$
$ bc -q
n=1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139
p=37975227936943673922808872755445627854565536638199
q=40094690950920881030683735292761468389214899724061
n-p*q
0
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Sun May 23, 2021 8:30 pm

All Pollard Rho algorithms I have seen sofar use "x->(x^2+a) mod n" with different, but fixed values for a.
We have seen that a=0 is likely not a good choice.

Some "a"s are good for some inputs, but not that good for other inputs.

I found a method to use a mixture of a1, a2, ..., ak, use:
(x->(x^2+a1) mod n) + n
then
(x->(x^2+a2) mod n) + 2*n
then
...
finally
x->(x^2+ak) mod n
and then repeat with a1, a2, ...

For multiplication, difference building and gcd() computation, the multiple of n offsets don't care.
I need to rewrite analysis.c tool to report numbers for using a1, a2, ..., ak ...
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Thu Jun 10, 2021 10:15 pm

In other thread posting simple hare and tortoise Pollard Rho algorithm was started in parallel on 9 cores with different values for "a":
viewtopic.php?f=33&t=311893&p=1876289#p1876289

Instead of having to wait for 37 seconds to get 31digit number factored with default value "a=1", first program stopped with factorization after only 8 seconds:
multi-factorize.png
multi-factorize.png
multi-factorize.png (46.72 KiB) Viewed 1412 times

Question now is how to choose set of 24 "a" values (I have access to a machine with 24 physical cores) so that first factorization gets reported fast. It needs to be considered that algorithm gets recursively called with "a+1" in case "gcd(hare-tortoise,n)==n":
https://gist.github.com/Hermann-SW/bfa0 ... rize-c-L73

Because of that choosing successive numbers for "a" as in above example is likely not good ...
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Thu Jun 10, 2021 11:02 pm

Next experiment, factoring (2^61-1)^2 that took 1534 seconds on i7.
I started 24 programs in background, with values "1 and primes <89" for a.
First program finished in 178 seconds only!
Default a=1 took 1631 seconds.
Last finisher was a=53 in 2240 seconds:
# for i in 1 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83; do ./factorize 5316911983139663487003542222693990401 $i & done
# 5316911983139663487003542222693990401(37): 2305843009213693951 2305843009213693951
[239379195] 178081430292ns
5316911983139663487003542222693990401(61): 2305843009213693951 2305843009213693951
[381929703] 296462510871ns
5316911983139663487003542222693990401(79): 2305843009213693951 2305843009213693951
[564229691] 463837995445ns
5316911983139663487003542222693990401(2): 2305843009213693951 2305843009213693951
[916265383] 727448510261ns
5316911983139663487003542222693990401(29): 2305843009213693951 2305843009213693951
[966671113] 730703246286ns
5316911983139663487003542222693990401(11): 2305843009213693951 2305843009213693951
[1060573965] 791627178335ns
5316911983139663487003542222693990401(73): 2305843009213693951 2305843009213693951
[1158177900] 861668098355ns
5316911983139663487003542222693990401(71): 2305843009213693951 2305843009213693951
[1195387796] 929350141287ns
5316911983139663487003542222693990401(83): 2305843009213693951 2305843009213693951
[1246192557] 929880773927ns
5316911983139663487003542222693990401(23): 2305843009213693951 2305843009213693951
[1335540845] 1009334758484ns
5316911983139663487003542222693990401(17): 2305843009213693951 2305843009213693951
[1432697673] 1073120510105ns
5316911983139663487003542222693990401(59): 2305843009213693951 2305843009213693951
[1448906908] 1096388994755ns
5316911983139663487003542222693990401(13): 2305843009213693951 2305843009213693951
[1801257367] 1346426834616ns
5316911983139663487003542222693990401(7): 2305843009213693951 2305843009213693951
[1809569828] 1347825243786ns
5316911983139663487003542222693990401(31): 2305843009213693951 2305843009213693951
[1861347840] 1390923977909ns
5316911983139663487003542222693990401(47): 2305843009213693951 2305843009213693951
[1880798495] 1400540741266ns
5316911983139663487003542222693990401(67): 2305843009213693951 2305843009213693951
[1900420809] 1415451734862ns
5316911983139663487003542222693990401(41): 2305843009213693951 2305843009213693951
[2003126736] 1493349090289ns
5316911983139663487003542222693990401(43): 2305843009213693951 2305843009213693951
[2133065076] 1589138138134ns
5316911983139663487003542222693990401(1): 2305843009213693951 2305843009213693951
[2163790305] 1631049634692ns
5316911983139663487003542222693990401(19): 2305843009213693951 2305843009213693951
[2212260579] 1648327735182ns
5316911983139663487003542222693990401(5): 2305843009213693951 2305843009213693951
[2511875037] 1945819858899ns
5316911983139663487003542222693990401(3): 2305843009213693951 2305843009213693951
[2932193235] 2190418722150ns
5316911983139663487003542222693990401(53): 2305843009213693951 2305843009213693951
[3010950498] 2240799600863ns
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Fri Jun 11, 2021 6:06 am

I tried to factor 54 decimal digit square of Mersenne prime M89=(2^89-1).
There are 46 primes below 200, and I chose 1 and 65537 as values for "a" in addition (see very right of long lines).
I had to kill all processes now, after more than 6 CPU hours runtime on each, with none of the 48 programs having reported factorization:

Code: Select all

# ps -wl | grep "factorize [0-9]*6321" | head -3
R     0 116151 111188  6888   624 pts7  23:06 06:20:57 ./factorize 383123885216472214589586755549637256619304505646776321 1
R     0 116152 111188  6888   632 pts7  23:06 06:21:54 ./factorize 383123885216472214589586755549637256619304505646776321 2
R     0 116153 111188  6888   624 pts7  23:06 06:20:36 ./factorize 383123885216472214589586755549637256619304505646776321 3
# ps -wl | grep "factorize [0-9]*6321" | tail -3
R     0 116196 111188  6888   684 pts7  23:06 06:21:54 ./factorize 383123885216472214589586755549637256619304505646776321 197
R     0 116197 111188  6888   680 pts7  23:06 06:41:54 ./factorize 383123885216472214589586755549637256619304505646776321 199
R     0 116198 111188  6888   684 pts7  23:06 06:19:31 ./factorize 383123885216472214589586755549637256619304505646776321 65537
#

So better algorithm and/or better "set of 48 'a's" is needed to factor M89^2 ...
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Fri Jun 11, 2021 11:55 am

I did add original brent cycle finding algorithm as option to factorize.c:
https://gist.github.com/Hermann-SW/bfa0 ... /revisions

Now simple help is printed when run without arguments:

Code: Select all

# ./factorize 
factorize [n [a] [floyd|brent]]
  Determines 1st factor for composite n with timing;
  hangs indefinitely for prime n.
# 

"brent" option is typically more than double as fast than floyd.
I did repeat the 24 parallel programs run to factor (2 ^61-1)^2.
First solution was reported after 68 seconds for brent, compared to 178 seconds for floyd.
The order of completion for the different "a" values is identical for first three entries only (68/114/180 vs. 178/296/463 seconds):
# for i in 1 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83; do ./factorize 5316911983139663487003542222693990401 $i brent & done
# 5316911983139663487003542222693990401(37): 2305843009213693951 2305843009213693951
[239379200] 68965957453ns (brent)
5316911983139663487003542222693990401(61): 2305843009213693951 2305843009213693951
[381929728] 114078558342ns (brent)
5316911983139663487003542222693990401(79): 2305843009213693951 2305843009213693951
[564229696] 180483982185ns (brent)
5316911983139663487003542222693990401(71): 2305843009213693951 2305843009213693951
[597693920] 189778132962ns (brent)
5316911983139663487003542222693990401(11): 2305843009213693951 2305843009213693951
[707049312] 203749813136ns (brent)
5316911983139663487003542222693990401(2): 2305843009213693951 2305843009213693951
[916265408] 248934910426ns (brent)
...
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Sun Jun 13, 2021 2:56 pm

There are two reasons why "Brent cycle finding" is faster that "Floyd cycle finding" in factorize.c:
  1. for Floyd there are 3 f() computations per loop [f(x) and f(f(y))]
    https://gist.github.com/Hermann-SW/bfa0 ... rize-c-L56
    while for Brent there is one "f(x)" inside loop
    https://gist.github.com/Hermann-SW/bfa0 ... ize-c-L112
    and one matching outside the loop
    https://gist.github.com/Hermann-SW/bfa0 ... ize-c-L135
  2. the costly "gcd()" is computed for each loop for Floyd, but only every (when k becomes >=32) 32nd loop for Brent
    https://gist.github.com/Hermann-SW/bfa0 ... ize-c-L120
So ratio 3/2 for f() computation and 32/1 for gcd() computation per loop for Floyd compared to Brent, resulting in Brent being more than 2x faster than Floyd.


While Brent is faster than Floyd for Pollard Rho, it is no magic.
I did start 40 "Brent" factorize programs Friday night (on CPUs with 24 cores/48 threads), with values for "a" of 1..20, 101..110 and 1001..1010.
None of them has completed yet after 40 hours of runtime each, to factor 54 decimal digit number (2^89-1)^2 ...
2021-06-13-163323_800x480_scrot.png
2021-06-13-163323_800x480_scrot.png
2021-06-13-163323_800x480_scrot.png (111.82 KiB) Viewed 1278 times

Code: Select all

$ bc -q
(2^89-1)^2
383123885216472214589586755549637256619304505646776321

P.S:
For a prime p: 2^(p-1) mod p = 1
For product of two primes n=p*q: 2^((p-1)*(q-1)) mod n = 1
"bc -q" output:

Code: Select all

print n, "=", p, "*", q, "\n"
1081=23*47
(2^(p-1))%p
1
(2^(q-1))%q
1
(2^(n-1))%n
165
(2^((p-1)*(q-1)))%n
1
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Tue Jun 15, 2021 11:11 pm

I wanted to know more about the cycle lengths of "f(x)=(x^2+a) mod n", for simplicity with a=0. Instead of composite numbers n=p*q being product of two primes, for analysis looking at primes n. This is diagram for "f(x)=(x^2+0) mod 31":

Code: Select all

30: 2 3 5 | 1 1 4 4 2 4 
31_0.svgz.jpg
31_0.svgz.jpg (20.45 KiB) Viewed 1212 times

I used original "factorize.c" from gmplib to factor "p-1" for prime "p", and then used "analyze.c" shown earlier in this thread to determine the cycle lengths:
$ for p in 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97;\
> do for t in `./factorize.orig $(($p-1))`; do echo -n "$t "; done;\
> echo -n " | "; ./analyze $p 0 | grep -v : | head -1; done
2: 2 | 1 1
4: 2 2 | 1 1
6: 2 3 | 1 1 2
10: 2 5 | 1 1 4
12: 2 2 3 | 1 1 2
16: 2 2 2 2 | 1 1
18: 2 3 3 | 1 1 6 2
22: 2 11 | 1 1 10
28: 2 2 7 | 1 1 3 3
30: 2 3 5 | 1 1 4 4 2 4
36: 2 2 3 3 | 1 1 6 2
40: 2 2 2 5 | 1 1 4
42: 2 3 7 | 1 1 3 6 2 3 6
46: 2 23 | 1 1 11 11
52: 2 2 13 | 1 1 12
58: 2 29 | 1 1 28
60: 2 2 3 5 | 1 1 4 4 4 2
66: 2 3 11 | 1 1 10 10 10 2
70: 2 5 7 | 1 1 12 4 12 3 3
72: 2 2 2 3 3 | 1 1 6 2
78: 2 3 13 | 1 1 12 12 12 2
82: 2 41 | 1 1 20 20
88: 2 2 2 11 | 1 1 10
96: 2 2 2 2 2 3 | 1 1 2
$

Most times biggest prime factor of "p-1" (minus 1) is a cycle length of "p":

Code: Select all

52: 2 2 13  |  1 1 12

Sometimes biggest cycle length of "p" is smaller (11) than biggest prime factor of "p-1":

Code: Select all

46: 2 23  |  1 1 11 11 

And sometimes it is bigger (6):

Code: Select all

72: 2 2 2 3 3  |  1 1 6 2 

Not that easy to fully describe how the relation is ...


P.S:
Now nearly 40*100=4000 CPU hours completed since last Friday night, still Brent cycle finding Pollard Rho algorithm has not factorized 54 decimal digit number (2^89-1)^2 for any of the 40 "a" values:
(on workstation with two 2.6GHz Intel® Xeon® Gold 6126 processors with 24 cores (48 hardware threads) in total)
nearly_4000_CPU_hours.jpg
nearly_4000_CPU_hours.jpg
nearly_4000_CPU_hours.jpg (235.68 KiB) Viewed 1224 times
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Wed Jun 16, 2021 6:59 pm

I got more insight into the number of cycles and maximal cycle lengths!
And I used the Online Encyclopedia of Integer Sequences (oeis.org) for that.

#cycles can be determined by copy+paste of the numbers reported into oeis.org input field:

Code: Select all

$ for((i=3; i<=100; ++i)); do ./analyze $i 0 | grep -v : | head -2 | tail -1 | cut -f2 -d\ ; done
There is exactly one matching sequence A023153:
https://oeis.org/search?q=2+2+2+4+3+2+3 ... &go=Search

Maximal cycle length:

Code: Select all

$ for((i=1; i<=88; ++i)); do ./analyze $i 0 | grep -v : | head -1 | sed "s/ /,/g" | python -c "print(max(input()))"; done
Again exactly one matching sequence A279186:
https://oeis.org/search?q=1+1+1+1+1+1+2 ... &go=Search
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Tue Jun 22, 2021 11:02 pm

Factorization is still running, 40 processes with 263.5 CPU hours each is 10540 CPU hours in total.

I always thought that maximal cycle length in Pollard Rho is of order sqrt(n), but that is wrong in general.
I learned from sequence A248218 (period of x->(x^2+1) mod n with start value 0)
https://oeis.org/A248218
example a(357911) = 54670, that powers of 71 have long cycles.
I used analyze.c to compute a(71^i) with 2<=i<=6.
And the period length is 770*71^(i-2) for n=71^i.
770/71^2=0.15274747073993255306, so period for those composite numbers is linear.

I used analyze.c
https://gist.github.com/Hermann-SW/83de ... 641c26c590
to further determine all start values that end up on start value 0 cycle (that numbered 1).
A(n) being the number of thos start values, A(71^i) = 3711 * 71^(i-2) for i=2..5:

Code: Select all

$ ./analyze $((71*71)) 1 | grep ": 1 " | wc --lines
3711
$
Now 3711/71^2=0.73616345963102559016, so majority of start values end on start value 0 cycle.

I submitted new results to oeis.org, the draft is in review currently:
https://oeis.org/draft/A248218

Computing a(71^5) with analyze.c did use 26.9g ram during computation.
For computing a(71^6) I did not have access to a computer with 26.9*71=1910GB ram, so I used new period.c
https://gist.github.com/Hermann-SW/eeb0 ... 761ab38998
that only determines period for cycle starting at some given start value.
Took less than an hour (3528 seconds) to compute a(128,100,283,921) = a(71^6) = 19,566,994,370 on Intel i7:

Code: Select all

$ ./period $((71*71*71*71*71*71))
n=128100283921 a=1 s=0: 19566994370 [71536325699] 3528148502656ns
$ 

Let P be period computed. In case start value is on cycle already, then variable x moved P and variable z moved 2*P steps, 3*P steps in total (58.7 billion steps for previous example, those needing 1 CPU hour to compute and compare for equality (hare z and tortoise x)).


P.S
> I submitted new results to oeis.org, the draft is in review currently:
> https://oeis.org/draft/A248218
>
Submission was approved after review process:
  • results under Comments section
  • analyze.c and period.c in Links section
  • explanation in Prog section under "(C)"
https://oeis.org/A248218
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Tue Jun 29, 2021 6:11 pm

I thought that linear cycle length (15%) for n=71^i with 2 <= i <= 5 is bad for runtime of Pollard-Rho factorization algorithm. The opposite is true, every value on start value 0 cycle has gcd of 71 to the value after applying "x -> (x^2 + 1) mod n" exactly 11 times, so factorization completes quickly.
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Tue Jul 13, 2021 12:18 pm

HermannSW wrote:
Fri May 21, 2021 5:35 pm
...
This is basic small C-program for creating graphviz dot file:

Code: Select all

$ cat loop.c 
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  unsigned n=atoi(argv[1]);
  unsigned a=atoi(argv[2]);
  unsigned x;

  printf("digraph D {\n");
  printf("size=8 8\n");
  for(x=0; x<n; ++x)
    printf("%u->%u\n", x, (x*x+a)%n);
  printf("}\n");
}
$

It is used by bash script "mak" for creating digraph for parameters n and a, and display the result:

Code: Select all

$ cat mak
#!/bin/bash
./loop $1 $2 > $1_$2.dot
neato -Tsvgz $1_$2.dot > $1_$2.svgz
eog $1_$2.svgz
$
...
I played with this again, and sometimes after changes in loop.c, I forgot to compile before calling "mak n a" and wodered why things did not change.

One option would be to add Makefile, but then 3 files are needed (Makefile, loop.c and mak).
Instead I reduced everything into single file "loop2":
https://gist.github.com/Hermann-SW/004f ... 2a37eef09f

It is combined bash script and C file.
This part is the excutable bash part:

Code: Select all

#!/bin/bash

#if 0
if [ $0 -nt /tmp/$0 ]; then gcc -xc -Wall -Wextra <(tail -n+2 $0) -o /tmp/$0; fi
/tmp/$0 $1 $2 > $1_$2.dot
neato -Tsvg $1_$2.dot > $1_$2.svg
eog $1_$2.svg
exit
#endif
...
For executable name "loop2" it tests whether "/tmp/loop2" has not been compiled, or whether "loop2" timestamp is newer than that of "/tmp/loop2". If so, it extracts the C portion of "loop2" as file with "<(tail -n+2 $0)" and compiles it (as "C" with "-xc").
I changed output format from compressed SVG to just SVG, because browsers can deal with SVG.
For displaying the graph still gnome viewer is used, but opening the generated SVG in browsers allows to search for strings in the graph labels with CTRL-F.

For supporting browser search, all node labels get pre- and suffixed with non-break space Unicode U+00A0 character (that keeps labels "looking" the same). When calling "lab()" more than once in a print statement, separate buffers are needed:

Code: Select all

char nbsp[3] = { 0xC2, 0xA0, 0x00 };
char buf[10][100];
unsigned int buc=0;
char *lab(unsigned u)
{
  if (++buc==10)  buc=0;
  sprintf(buf[buc], "\"%s%u%s\"", nbsp, u, nbsp);
  return buf[buc];
}
Now searching for label "8" in browser press CTRL+F, and then <STRG+SHIFT>a0<ENTER>8<STRG+SHIFT>a0<ENTER><ENTER>.

This part of C code created the Graphviz graph, and creates different node box shape for labels "x", when difference of "x" and "n-x" has greatest common divisor with "n" greater than 1:

Code: Select all

  printf("digraph D {\n");
  printf("size=\"8 8\"\n");

  for(x=0; x<n; ++x)
    if (gcd(dif(x, n-x), n) > 1)
      printf("%s [shape=box]\n", lab(x));

  for(x=0; x<n; ++x)
    printf("%s->%s\n", lab(x), lab((x*x+a)%n));

  printf("}\n");

Executing this command ...

Code: Select all

pi@raspberrypi4B:~/rho $ ./loop2 91 1

... displays this graph:
2021-07-13-110347_800x480_scrot.png
2021-07-13-110347_800x480_scrot.png
2021-07-13-110347_800x480_scrot.png (55.15 KiB) Viewed 533 times

If you want to play with loop2, make sure that graphviz is installed ("sudo apt install graphviz").
For using the many features of Graphviz, here is guide:
http://graphviz.org/pdf/dotguide.pdf


Now its time to investigate something new with this tool ...
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Tue Jul 13, 2021 9:01 pm

HermannSW wrote:
Tue Jul 13, 2021 12:18 pm
For using the many features of Graphviz, here is guide:
http://graphviz.org/pdf/dotguide.pdf


Now its time to investigate something new with this tool ...
I played with modified tools and tried non-standard polynomial transformations.
Looking at n=11*19=209 the rectangular box shapes were hard to spot when not zooming in.
So this print statement tells Graphviz to fill the box with lightgrey color:

Code: Select all

printf("%s [shape=box,style=filled,bgcolor=lightgrey]\n", lab(x));

The first experiments went into one direction compared to the trees seen sofar in this thread:
2021-07-13-222154_800x480_scrot.png
2021-07-13-222154_800x480_scrot.png
2021-07-13-222154_800x480_scrot.png (89.85 KiB) Viewed 484 times

Finally I did a completely different approach, resulting in self-loop 1->1 as the only cycle (9 o'clock of center):
2021-07-13-222351_800x480_scrot.png
2021-07-13-222351_800x480_scrot.png
2021-07-13-222351_800x480_scrot.png (54.66 KiB) Viewed 484 times

Will see whether the approaches will provide advantages for factorization of n ...
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

User avatar
HermannSW
Posts: 4258
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: Pollard Rho loop analysis

Thu Jul 29, 2021 8:18 am

For prime p and a in 0..p-1:

Code: Select all

a^(p-1) = 1 mod p

Today I learned that arbitrary precision binary calculator "bc" is not capable of computing above equation for really big exponents. I took smaller prime factor of RHO-100:
https://en.wikipedia.org/wiki/RSA_numbers#RSA-100

And got error using "bc":

Code: Select all

$ bc -q
p=37975227936943673922808872755445627854565536638199
2^(p-1) % p
Runtime error (func=(main), adr=8): exponent too large in raise
$ 

After building gmplib
https://gmplib.org/#DOWNLOAD

building tool "pexpr" in demos ("make pexpr") allows to compute that in less than a second!

Code: Select all

$ export p=37975227936943673922808872755445627854565536638199
$ ./pexpr "2^($p-1) % $p"
1
$ ./pexpr "2^($p-2) % $p"
18987613968471836961404436377722813927282768319100
$ 

P.S:
Playing with Rho100 prime factors:

Code: Select all

$ a="(2*3*5*7*11*13*17*19*23*29*31*37*41*43*47*53*59*61*67*71*73*79*83*89*97)"
$ p=37975227936943673922808872755445627854565536638199
$ q=40094690950920881030683735292761468389214899724061
$ ./pexpr "$a^(($p-1)*($q-1)) % ($p*$q)"
1
$
https://stamm-wilbrandt.de/2wheel_balancing_robot
https://stamm-wilbrandt.de/en#raspcatbot
https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://github.com/Hermann-SW/raspiraw
https://stamm-wilbrandt.de/en/Raspberry_camera.html

Return to “C/C++”