jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Something the Pi is not good at - calculating Pi

Wed May 15, 2013 3:39 pm

Well, you wouldn't expect it to be very good at number crunching but, considering its name, I thought that I should at least have a go at calculating Pi.

I resurrected a very old C program of mine for the job and compiled it on the Pi. That went smoothly but it was very portable C that had previously been compiled on multiple very different architectures.

I set it to calculate 1,000,000 places. The program produces intermediate reports and completion estimates. The completion estimates are accurate. I didn't let the program run to completion but the estimates were about 154.5 hours, that's nearly a week of continuous calculation. This 2.5GHz lap top was estimating 3h13m, nearly 48 times faster.

The algorithm is not very good but it does get the answer right. It is one that I figured out myself in school many years ago and implemented in Fortran. I did not run to 1,000,000 places then, the cash register (*) would not have been up to it. I only ran to 500 places.

Many years later but still many years ago, I ported the program to C and ran to completion for 1,000,000 places on an IBM RS/6000. This took a month. So, the Pi beats that old mini-computer and it is a lot smaller, cheaper to buy, and cheaper to run.

(*) It was an NCR. 32k words of memory. Each word was 24 bits or 4 6 bit characters. This machine predated bytes and ASCII. It had no disc. We fed programs in on paper tape or punched cards and saved them to tape (which required messages to the operator to mount suitable tapes).

jamesh
Raspberry Pi Engineer & Forum Moderator
Raspberry Pi Engineer & Forum Moderator
Posts: 24129
Joined: Sat Jul 30, 2011 7:41 pm

Re: Something the Pi is not good at - calculating Pi

Wed May 15, 2013 3:53 pm

Sounds a bit slow - were you using the same compiler and options on both devices? I'd be interested to see the same sort of test done using C using the GCC compiler of both platforms with full optimisation on both. Is the algorithm dependant of having lots of memory?
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
“I think it’s wrong that only one company makes the game Monopoly.” – Steven Wright

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Wed May 15, 2013 4:11 pm

I didn't go to any trouble optimising it. Just compile and run. The memory requirements are modest, just a few MB. Only a tiny amount of float arithmetic but huge amounts of integer arithmetic. The lap top version was compiled with Microsoft Visual Studio.

You are right, the speed difference is too big. The Pi's processor is 700MHz and the lap top 2.GHz. The two very different architectures won't necessarily achieve the same work per clock cycle but I would not expect them to be so far apart either. I'll try to get the time to go back and investigate.

The program makes a lot of use of long long int (64 bit). Do they perform reasonably on the Pi? I can set the program to use long int but that will cause it to do a lot more calculations.

Ravenous
Posts: 1956
Joined: Fri Feb 24, 2012 1:01 pm
Location: UK

Re: Something the Pi is not good at - calculating Pi

Wed May 15, 2013 4:44 pm

That's a really intriguing difference. Does gcc (or whatever compiler you are using) have any profiling options at all, to identify how many calls are made to each routine? (I'm really not knowledgeable about the detail of gcc at all.)

Might be down to word length differences or something between the two platforms. You already said it uses little memory, so it's probably not related to addressing differences. I wonder does long long int really mean 64 bits on all platforms or are there compiler diferences... or does the compiler not optimise in the same way on different platforms... one is RISC and the other is CISC or something...

jamesh
Raspberry Pi Engineer & Forum Moderator
Raspberry Pi Engineer & Forum Moderator
Posts: 24129
Joined: Sat Jul 30, 2011 7:41 pm

Re: Something the Pi is not good at - calculating Pi

Wed May 15, 2013 6:06 pm

jwlawler wrote:I didn't go to any trouble optimising it. Just compile and run. The memory requirements are modest, just a few MB. Only a tiny amount of float arithmetic but huge amounts of integer arithmetic. The lap top version was compiled with Microsoft Visual Studio.

You are right, the speed difference is too big. The Pi's processor is 700MHz and the lap top 2.GHz. The two very different architectures won't necessarily achieve the same work per clock cycle but I would not expect them to be so far apart either. I'll try to get the time to go back and investigate.

The program makes a lot of use of long long int (64 bit). Do they perform reasonably on the Pi? I can set the program to use long int but that will cause it to do a lot more calculations.
So you just used the standard gcc compiler on the PI? Can you try it with -O3 (full optimisation), and on the PC do a release build and see if that makes a difference?
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
“I think it’s wrong that only one company makes the game Monopoly.” – Steven Wright

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Wed May 15, 2013 10:19 pm

Yes, I just used the gcc that came with Debian. It was compiled with -O2 but I just recompiled it with -O3. This made no detectable difference and it still was estimating 6.5 days for completion.

I can't easily recompile the PC version since I don't currently have any compiler loaded. I do have debug and release builds but they give pretty much the same estimates.

The long long int must be at least 64 bits or it would not produce the correct results. long int is not 64 bits, attempting to use it for the intermediate results caused overflows and incorrect results.

Another possibility is that I am not getting 64 bit arithmetic in hardware but software emulation. I can test for this but setting the program to work in a 32 bit mode. When 64 bit arithmetic is available in hardware, this would slow it down since it will do more than twice as many calculations. However, if the 64 bit arithmetic is not done in hardware then it may get quicker. This seems to be the case, the run time came down to 43.5 hours. Is the processor in the Pi not capable of 64 bit integer arithmetic or does gcc not utilise it?

Heater
Posts: 13866
Joined: Tue Jul 17, 2012 3:02 pm

Re: Something the Pi is not good at - calculating Pi

Wed May 15, 2013 10:45 pm

Any chance you could post the source code so that we can all have a play with it? Say on github.com with MIT license. Sorry if that is asking to much.

48 times slower...

We have at least a factor of 3 difference in speed just comparing clock speed.
Perhaps a factor 4 or more if the PC version is using 64 bit ints
That's 12 to 20 times or so accounted for.

The last factor of 3 or 4 might be down to cache differences, or the CISC vs RISC nature of the processors.

Who knows, I'm only hand waving.
Memory in C++ is a leaky abstraction .

W. H. Heydt
Posts: 11063
Joined: Fri Mar 09, 2012 7:36 pm
Location: Vallejo, CA (US)

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 4:16 am

Is your laptop a 64-bit native processor? You say you make use of 64-bit values and the Pi is really a 32-bit machine, so on the Pi you're doing double-precision and on the laptop that may not be the case.

Heater
Posts: 13866
Joined: Tue Jul 17, 2012 3:02 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 6:46 am

jwlawler says he is using "long long int" which will be a 64 bit variable no matter if it is compiled on a 32 or 64 bit machine.

On a 64 bit machine an addition of long long int will be a single instruction, On a 32 bit machine that will compile to at least two instructions. Multiply and divide will require even more.

That alone could account for two or four or so speed difference, especially if there is a lot of multiplying going on.

He also says he can build the code to use 32 bit ints on both machines. That would give a fairer speed comparison for more normal software.
Memory in C++ is a leaky abstraction .

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 7:28 am

Remember the 48 times difference was when I was running the 64 bit version on both machines. I recompiled it to 32 bit mode on the Pi and it got quicker. The difference was now only a factor of 12.5. This is not a particularly fair comparison since that is the PC using 64 bit arithmetic against the Pi using 32 bit. I need to get a compiler loading on my laptop and try it in the 32 bit mode there as well. Since the lap top can do 64 bit arithmetic in hardware, this will slow it down. I don't recall exactly how much it will be slowed. Certainly at least twice and maybe more. I either need to do it or think more carefully about my algorithm. I could probably use a friend's Intel / Linux machine which may be a more interesting comparison: different hardware but more similar O/S.

I don't think that I want to release the source to the general public but I would be happy to give it to a few of you who are generally interested in the topic. Is there any private message facility in this forum?

Before I do, please note that this program was first written nearly 40 years ago in Fortran in school. I ported it to C as my first program when I was learning that language. I am not sure when that was but well over 20 years ago. Note C not C++, C++ was not yet popular. The only thing that I have done to it since then is adapt it to make use of the larger ints on new machines. By adjusting one of the header files, you can set it to run on 16, 32, or 64 modes. So, comment on it and criticise in a constructive manner it but remember that it is a schoolboy's program from long ago. Even the maths was by me in school. It is far from the best Pi algorithm, much better ones exist.

I have held on to the program as a memento. Each time, I encounter an interestingly different computer, I try it as a personal benchmark. One previous benchmark was some PowerPC based mini-computers against PCs. I was disappointed that the PCs usually won. To be fair these were business orientated machines so they were not intended for number crunching but it was still disappointing. At one point, the minis managed to win but because I got 64 bit arithmetic on them first. When I got a PC with 64 bit arithmetic, it retook the lead. By PC, I mean a typical Intel / Windows machine.

Do you want to know the maths behind the program?

jamesh
Raspberry Pi Engineer & Forum Moderator
Raspberry Pi Engineer & Forum Moderator
Posts: 24129
Joined: Sat Jul 30, 2011 7:41 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 8:35 am

12.5 is more of a sensible number. So looks like forcing 64 bit calculations on a 32 bit ARM does slow it down dramatically (effectively a factor of 4 - it could be as simple as 64bit add needing 4 instructions instead of 1 for a 32 bit add). Interesting.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
“I think it’s wrong that only one company makes the game Monopoly.” – Steven Wright

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 8:54 am

I should remember the relative speeds of the 32 and 64 bit modes (when both are available in hardware) since I have run the program many times over the years. Unfortunately, I can't, it is too long since the last run. It will definitely be at least a factor of 2. As you may guess, the program stores the big numbers in arrays of ints. The bigger the int that I can use, the fewer the elements that I need. So, going from 32 to 64 bits means that the arrays are half the size and the big loops are half the length. What I don't remember is whether other more subtle factors apply and the difference is more than a factor of 2. I'll try to find out over the weekend.

Heater
Posts: 13866
Joined: Tue Jul 17, 2012 3:02 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 12:03 pm

jwlawler,

You should not be so shy about it. There is plenty of horrible code that has been published on github or elsewhere. Perhaps not because the author intends anyone to use it but just because it's a good place to share things when having a discussion like this. Also it's a convenient back up and revision control is a bonus. If you tell the story of the code, as you have told it here, in the README file it will end up as the first thing people read if they stumble across it, they will quite likely be impressed this was a school project and first C/Fortran program.

I wish you would reconsider.

As it is, it is better that you don't post me the code. Any such code on my machine will quite likely leak out one day as it will be thrown in with many other code examples and I don't want to have to keep track of whose secret is whose.

I would be very interested in the maths behind your code though. Perhaps it my inspire a little "calculate Pi on the Pi as fast as possible" competition.
Memory in C++ is a leaky abstraction .

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 12:28 pm

Heater,

You have me pretty much convinced. If I have the spare time over the weekend, I'll write a little preamble and publish the code.

Here's the outline of the maths.

Here's a well known formula for Pi.

pi / 4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 . . .

Nice and simple. The trouble is that it converges very, very slowly. A nice feature is that the partial sums will be alternating over and under estimates so the true value is between your last two partial sums. The bad bit is that gap closes very slowly. About 10 terms to get 1 digit, 100 to get 2, 1000 to get 3 etc. Even the world's most powerful computer won't get far with that formula. So, even in school, I didn't bother to program it.

Now that formula comes from the Taylor series for arctan(1). So, I played with some other functions and I calculated the series for arcsin(0.5). This was a rather more complex formula but had the huge advantage of giving a decimal place for every 1.66 terms of the series. So, the extra effort of the more complex formula would pay off very quickly.

pi / 6 = u1 -u2 + u3 - u4 . . .

Each term is calculated from the previous one with a few operations. It has the same alternating over / under estimate behaviour.

It sounds like a linear algorithm, i.e. twice as many calculations gives twice the accuracy and from a maths point of view it is. However, when it becomes a real computer program it becomes quadratic since for twice the number of decimal places you need to calculate the terms with twice as many decimal places. So you have twice as many terms which take twice as long to calculate so 4 times rather than 2 times slower.

So, anyone that wants to use my program to calculate Pi to 1,000,000,000 places, beware it will take 1,000,000 as long as the estimates in this thread and not just 1000 times as long. You won't break any records using this program.

For the high precision values, I used arrays of ints. In the decimal version (there is also a binary version), I store 2 digits per byte. I need to be able to add and subtract these high precision values but I don't need to multiple or divide one high precision value by another. I only need to be able to multiply and divide these high precision numbers by ordinary ints. Two high precision numbers need to be retained: the sum so far (s in the program) and the latest term in the series (u). With 2 digits per byte but two values needed, the storage required is as many bytes as the decimal places you want. So, 1MB for 1,000,000 places. Of course, there are a few other variables but not many and their size is fixed.

I forget the limits of the program but it will tell you itself. You set a header file with some type definitions
to suit your hardware and it figures out what it can do. If you ask too much of it, it will tell you.

Heater
Posts: 13866
Joined: Tue Jul 17, 2012 3:02 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 12:46 pm

I do recommend github.com.

git is a bit funky to use but they have good instructions and I'm guessing you don't want to start doing any development on this code.

Sounds like you have a good start on the README.md file that you will need in the posts here already.
Memory in C++ is a leaky abstraction .

User avatar
Jim Manley
Posts: 1600
Joined: Thu Feb 23, 2012 8:41 pm
Location: SillyCon Valley, California, and Powell, Wyoming, USA, plus The Universe
Contact: Website

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 1:49 pm

24 GFLOPS available in multiple, parallel pipelines in the GPU and people are still screwing around with ints (even very long ones) on a RISC processor for crying out loud. Get real, as in real numbers folks, or at least floating-point numbers, as the Pi SoC has both the GPU and a floating-point hardware unit. IEEE standard 754-2008 quadruple-precision floating-point provides 112 bits of mantissa (roughly 34 decimal digits) and you can get as many digits of accuracy as you want by just keeping track of the last couple of bits of the mantissa when doing extended-precision, cascaded calculations. I'll see your puny 64-bit, int-based calculation and raise you 48 bits at a whack ;)

This once again points up how most people just don't get how they should be using the Pi - it's the GPU, stoopid! Poor Eben, James, and the rest of the crew have been slaving over hot pipelines to rev up that wonderful VideoCore IV warp drive and you guys are still fiddling with a friggin' carburetor - and a single-barreled one at that! Better check your points - and the rest of the Lucas-built electrical componentry while you're under the hood, boys! :D

It reminds me of the joke about who was more intelligent, a mathematician or an engineer. They were placed at one end in a room, with a ravishing naked, um, "technician" in a reclined position on a couch at the other end of the room, and a clock on the wall over the "tech". They were instructed that they could advance half the distance to the "tech" as each minute passed on the clock. When the first minute expired, the engineer advanced halfway across the room, but the mathematician remained in his original place against the wall. Another minute went by and the engineer advanced to three-quarters of the way across the room, but the mathematician remained fixed in place. The organizers stopped the contest to ask the mathematician if he understood the rules, to which he smugly replied, "Well, anyone with a wit's worth of intelligence knows that you could advance toward the 'tech' for an infinite amount of time and still never actually get there." The organizers said to the engineer, "Well, it seems the mathematician has bested you, old boy.", to which the engineer quickly shot back, "Oh, yeah? Well give me ten more minutes and I'll be close enough for engineering approximations!"

Think outside the box, folks - it's a wonderful new world out here :lol:
The best things in life aren't things ... but, a Pi comes pretty darned close! :D
"Education is not the filling of a pail, but the lighting of a fire." -- W.B. Yeats
In theory, theory & practice are the same - in practice, they aren't!!!

Ravenous
Posts: 1956
Joined: Fri Feb 24, 2012 1:01 pm
Location: UK

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 1:59 pm

Well, I think in this case the appeal is that the OP has used what is almost the same algorithm on several decades' worth of machines. That's what makes this thread interesting to me. The relative efficiency of the algorithm isn't that important.


Heater
Posts: 13866
Joined: Tue Jul 17, 2012 3:02 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 2:53 pm

Jim,

We look forward to you providing an example.

Let's start with something simple, say adding together two one million digit numbers. That should be easy enough. Then we can move on to million digit multiply and divide etc.

If you can make it portable code that I can compile and run on any machine all the better. Afterall my PC has an NVIDIA GPU to for me to play on. Well, let's just say current architectures to make it easy.
Memory in C++ is a leaky abstraction .

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 3:01 pm

Jim, you make some good points but Ravenous has it right. If I was really trying to calculate Pi from scratch on a Pi then this is not how I would do it. I wouldn't even start with any coding at all. Devising a better algorithm would have a much greater benefit. Just as my ancient arcsin(0.5) algorithm was a major improvement on the well known arctan(1) one, there are even better algorithms still. In this game, a good algorithm and inefficient programming will probably beat a poor algorithm and super efficient programming.

I have noticed the surprisingly good GPU, the Pi does an impressive job of playing HD videos.

Back to the main topic. As Ravenous correctly guesses, this was a bit of fun with some ancient code. Any time, I do any programming on a new architecture, I give this program a spin and see how it does. Apart from being a bit of fun, it does indicate how far computers have come. The first time that I ran this program to completion for 1,000,000 places was about 20 years ago. It took a month of continuous running on a mini-computer (around the size of a fridge). I don't remember what it cost but I am quite sure that it was way more than the Pi. Even without optimisation, that such a tiny cheap computer can do the job in a few days is a huge advance.

If I had the time (I don't) it could be fun to try your ideas. I don't immediately see how to get million (or more) decimal places out of floating point variables. Doing it with ints is fairly easy, it is rather like doing arithmetic with long numbers on paper (carries, etc).

pygmy_giant
Posts: 1562
Joined: Sun Mar 04, 2012 12:49 am

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 3:05 pm

re: GPU - using OpenCL to access the GPU for arithmatical processing was explored in this thread: http://www.raspberrypi.org/phpBB3/viewt ... a&start=75 I think the conclusion was that porting it is easier said than done and that the effort required would provide small gains due to the videocore's architecture...

Heater
Posts: 13866
Joined: Tue Jul 17, 2012 3:02 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 3:12 pm

I think the idea is not to use the floating point handling of the GPU (or other FPUs) for actually handling floating point numbers.

Rather, make use of the fact that a quad sized floating point number has 112 bits of mantissa, as Jim said. One can use that as 112 bit integers. So bingo you have a whole bunch of parallel 112 bit integer processors in your GPU.

It's like what happens in JavaScript. That language treats all numbers as floats. Double precision. But you can quite happily do integer work as long as you stick within range of the mantissa.
Memory in C++ is a leaky abstraction .

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 3:16 pm

Ravenous wrote:Well, I think in this case the appeal is that the OP has used what is almost the same algorithm on several decades' worth of machines. That's what makes this thread interesting to me. The relative efficiency of the algorithm isn't that important.
Not just "almost the same algorithm", it is the same nearly 40 year old one. I rewrote the original Fortran as C once but the logic of the program didn't change. The obvious language changes and an adjustment from the 24 bit words of the original machine and the 16 bits of the PC that I used for the first C version. Since then, it has some minor portability tweaks and changes to exploit the larger ints available. The maths has not changed and the logic for the high precision arithmetic has not changed either.

One feature that scares me off floats for this very high precision arithmetic is the relative difficulty of tracking the errors. I wanted a program which didn't produce an answer that was probably approximately correct; I wanted one that I could sure was correct. This meant tracking the rounding errors carefully. For example, when you ask it for 1,000,000 places, it actually calculates 1,000,008 because it calculates that the errors will infect at most the last 4 digits (the other 4 are a safety margin). Many years later, with the internet, I could easily access other peoples's calculations and I was pleased to see that mine was right.

P.S. You can call me jwlawler if you wish but I am more commonly known as John.

jwlawler
Posts: 83
Joined: Sun May 12, 2013 12:15 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 3:20 pm

Heater wrote:I think the idea is not to use the floating point handling of the GPU (or other FPUs) for actually handling floating point numbers.

Rather, make use of the fact that a quad sized floating point number has 112 bits of mantissa, as Jim said. One can use that as 112 bit integers. So bingo you have a whole bunch of parallel 112 bit integer processors in your GPU.

It's like what happens in JavaScript. That language treats all numbers as floats. Double precision. But you can quite happily do integer work as long as you stick within range of the mantissa.
112 bit integers would be good. That would be 4 times the speed of the 32 bit version. I don't think that I have the time to try this but after I publish the code, I would be interested to see others try to adapt my code.

User avatar
jojopi
Posts: 3085
Joined: Tue Oct 11, 2011 8:38 pm

Re: Something the Pi is not good at - calculating Pi

Thu May 16, 2013 4:36 pm

Jim Manley wrote:This once again points up how most people just don't get how they should be using the Pi - it's the GPU, stoopid!
Are you saying that we should not calculate pi on a Pi, and we should just make pretty graphics instead? Because, regardless of theoretical FLOPS, calculating pi on the VC4 would be extremely challenging with only an OpenGL ES API.

If you are brave enough to attempt a GPU-based pi calculation without licensing proprietary tools, then I will provide an ARM-based comparison. Am I really "stoopid"?

By the way, 32bit ARM and 64bit AMD each have an integer multiply instruction that takes two full values and produces a double precision result, spread across two registers. You do not get that in FP.

Return to “General discussion”