- gordon@drogon.net
**Posts:**2020**Joined:**Tue Feb 07, 2012 2:14 pm**Location:**Devon, UK-
**Contact:**Website

At this point, I'm going to have to say mea culpa... I'd changed the code to use doubles and that was the version I was running and presenting numbers here rather than the original integer version - my reason for that was to try to get a better comparison with RTB which only supports doubles... Oh well, as you were, etc.

0.141 is what I get on the 100 iteration integer version.

-Gordon

0.141 is what I get on the 100 iteration integer version.

-Gordon

timrowledge,

If an engineer wants to build a bridge out of steel beams he might want to measure the tensile strength of those steel beams. That of course does not mean his bridge will stay up, that depends on his design.

Similarly when undertaking a software project we might want to know our language/compiler/VM can actually do what we want. Some measurements can help there.

I don't get what that statement means.

Oh, yeah, all off topic. It's been an interesting chat nonetheless.

By the way, where are the Smalltalk results for the simple benchmarks presented here?

Never heard that before. Whoever said that? What on earth does it mean anyway?...a reminder that "when a measure becomes used as a benchmark it ceases to have any meaning as a measure".

If an engineer wants to build a bridge out of steel beams he might want to measure the tensile strength of those steel beams. That of course does not mean his bridge will stay up, that depends on his design.

Similarly when undertaking a software project we might want to know our language/compiler/VM can actually do what we want. Some measurements can help there.

I don't get what that statement means.

Oh, yeah, all off topic. It's been an interesting chat nonetheless.

By the way, where are the Smalltalk results for the simple benchmarks presented here?

Memory in C++ is a leaky abstraction .

It does not fail, zero is the correct result when dividing these twoHeater wrote: What does; Python do when you write:

1000000000000000000000000000000 / 3000000000000000000000000000000

?

Edit: It fails:

>>> 1000000000000000000000000000000 / 3000000000000000000000000000000

0L

Your using Python 2 by the look of it, try Python 3

Code: Select all

```
Python 3.4.3 (default, Nov 17 2016, 01:08:31)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1000000000000000000000000000000 / 3000000000000000000000000000000
0.3333333333333333
>>>
```

Python 2 always did floor division (like C).

Since "factorial" appears on the Mini Squeak demonstration page as a built-in function, I thought it had been optimized. The productivity of a programming environment is significantly increased by the availability of high-quality subroutine libraries. Python includes many convenient wrappers for C libraries. Would it be possible for Smalltalk to use gmp for its bigint support?timrowledge wrote:Unsurprisingly a program using carefully crafted code optimised as part of a package of numerics magic is an order of magnitude faster than a simple recursion. Which is another illustration of how difficult it is to glean any meaning from simple benchmarks, and a reminder that "when a measure becomes used as a benchmark it ceases to have any meaning as a measure".

jahboater,
I could make the case that this is still a fail. It has thrown away a lot of the precision I thought I had.

Slippery things, numbers. Makes me think that C did the best thing by working with number types the underlying hardware can deal with easily and efficiently and forcing you to do something special with libraries or whatever if you want something else. As does JS.

OK. But who ever said they were integers?It does not fail, zero is the correct result when dividing these two integers.

It's worse than I thought. You mean they changed the semantics of the language in moving from Python 2 to Python 3?. Now I understand why nobody wants to use itIn Python 3 "/" does a floating point division by default, and "//" does the usual integer floor division. Python 2 always did floor division (like C).

Code: Select all

```
>>> 1000000000000000000000000000000 / 3000000000000000000000000000000
0.3333333333333333
```

Slippery things, numbers. Makes me think that C did the best thing by working with number types the underlying hardware can deal with easily and efficiently and forcing you to do something special with libraries or whatever if you want something else. As does JS.

Memory in C++ is a leaky abstraction .

You did: 1000000000000000000000000000000 is an integer, 1000000000000000000000000000000.0 is floating point. Same as 2 and 2.0, just because they are large doesn't change the type. Its true in most languages except the really sad ones.Heater wrote:OK. But who ever said they were integers?

Unfortunately so.It's worse than I thought. You mean they changed the semantics of the language in moving from Python 2 to Python 3?. Now I understand why nobody wants to use it

Well I just tried a few things and it looks like the arbitrary precision is only for integers, why it cant just use mpfr for real's I don't know. 1000.0 ** 1000.0 fails with an overflow error but 1000 ** 1000 works fine - odd.I could make the case that this is still a fail. It has thrown away a lot of the precision I thought I had.Code: Select all

`>>> 1000000000000000000000000000000 / 3000000000000000000000000000000 0.3333333333333333`

Definitely (for many reasons)!Heater wrote:Slippery things, numbers. Makes me think that C did the best thing by working with number types the underlying hardware can deal with easily and efficiently and forcing you to do something special with libraries or whatever if you want something else. As does JS.

You want something like (apt-get install python3-gmpy2):jahboater wrote:why it cant just use mpfr for real's I don't know.

Code: Select all

```
>>> from gmpy2 import mpfr
>>> mpfr(1000) ** 1000
mpfr('1e+3000')
```

How does 1/3 have any less precision than 100/300? They are both exact ratios of integers and exactly equal to each other.Heater wrote:I could make the case that this is still a fail. It has thrown away a lot of the precision I thought I had.

ejolson,

However when Python returns 0.3333333333333333 as a result those divisions that is only an approximation and not equal to either of them.

It's still a rational number. That is:

3333333333333333 / 10000000000000000

Which is clearly is not 1/3 or 100/300.

Perhaps I gave a bad example. Imagine two 1000 digit numbers made up of random looking digits. One of which is nearly three times bigger than the other. When I divide them in Python I get a floating point value approximately one third. It only has 16 digits after the decimal point, I have lost nearly a thousand digits of precision! Now my result equals 1/3 which is wrong.

Aside: One should never compare floating point values for equality.

Certainly 1/3 and 100/300 are exact ratios of integers and equal. What we have here is a "rational" number.How does 1/3 have any less precision than 100/300? They are both exact ratios of integers and exactly equal to each other.

However when Python returns 0.3333333333333333 as a result those divisions that is only an approximation and not equal to either of them.

It's still a rational number. That is:

3333333333333333 / 10000000000000000

Which is clearly is not 1/3 or 100/300.

Perhaps I gave a bad example. Imagine two 1000 digit numbers made up of random looking digits. One of which is nearly three times bigger than the other. When I divide them in Python I get a floating point value approximately one third. It only has 16 digits after the decimal point, I have lost nearly a thousand digits of precision! Now my result equals 1/3 which is wrong.

Aside: One should never compare floating point values for equality.

Memory in C++ is a leaky abstraction .

As 1/3 has a non-terminating dyadic expansion, rather than losing a thousand, it seems you have lost an infinite number of digits. However, this doesn't seem unique to Python.Heater wrote:.Perhaps I gave a bad example. Imagine two 1000 digit numbers made up of random looking digits. One of which is nearly three times bigger than the other. When I divide them in Python I get a floating point value approximately one third. It only has 16 digits after the decimal point, I have lost nearly a thousand digits of precision! Now my result equals 1/3 which is wrong.

Even more off topic, but how does JS deal with that? It stores integers in 64-bit floats, but I would be upset if I couldn't safely compare them for equality. My guess is it probably just works and nearbyint(x) == x if they are genuine integers.Heater wrote: Aside: One should never compare floating point values for equality.

ejolson,

In these cases one is putting thousands of digits in and getting out only a handful of digits out in a float. Most of the precision is lost. Even if the result could be written as a finite number of digits after the decimal point.

But yes, I don't know any languages that don't have problems with numbers that need to be taken into consideration.

As such the equality operator works correctly on integers in JS.

Otherwise numbers become floating point. At which point the usual issues with the equality operator apply. Same as any other language.

That is why I said that perhaps 1/3 was a bad example. And started talking about the two 1000 digit numbers of random digits. Which when divided give me something less than 1.As 1/3 has a non-terminating dyadic expansion, rather than losing a thousand, it seems you have lost an infinite number of digits.

In these cases one is putting thousands of digits in and getting out only a handful of digits out in a float. Most of the precision is lost. Even if the result could be written as a finite number of digits after the decimal point.

This particular issue is unique to Python (and perhaps a few other little known languages) because it supports big integers but not arbitrary precision reals.However, this doesn't seem unique to Python.

But yes, I don't know any languages that don't have problems with numbers that need to be taken into consideration.

In Javascript integers will be represented precisely as long as they can be represented in the 53 bits that fit into a the mantissa of a 64 bit float. (Or is it 52 bits, I forget)....how does JS deal with that?

As such the equality operator works correctly on integers in JS.

Otherwise numbers become floating point. At which point the usual issues with the equality operator apply. Same as any other language.

Memory in C++ is a leaky abstraction .

- gordon@drogon.net
**Posts:**2020**Joined:**Tue Feb 07, 2012 2:14 pm**Location:**Devon, UK-
**Contact:**Website

52 + sign bit from what I recall.Heater wrote: In Javascript integers will be represented precisely as long as they can be represented in the 53 bits that fit into a the mantissa of a 64 bit float. (Or is it 52 bits, I forget).

I did some work with floating point stuff in 8-bit micros waay back (starting with some old code present in the Apple II!) and things like rounding, equality, and cumulative errors were part of the course I did at uni... Wonder if they still teach stuff like that?

-Gordon

I studied numerical analysis back in the day. I don't recall any mention of those pesky cumulative errors.

The only error analysis we had beaten into us was in Physics.

For years in work we did not use floating point on radar systems and such. There was no hardware support. But our programming language, Coral 66, had fixed point number support. That kind of concentrates the mind on number ranges, precision, errors etc.

I will always remember when some new youngster joined the team and started complaining about the lack of float support. The project manager told him:

"If you think you need to use floating point then you don't understand the problem. If you really need to use floating point then you don't understand the new problem you then have."

He was kind of hard that way.

I guess generations of programmers have learned this kind of thing the hard way, by scratching their heads for a week wondering why their results are way off or things that should be equal are not!

The only error analysis we had beaten into us was in Physics.

For years in work we did not use floating point on radar systems and such. There was no hardware support. But our programming language, Coral 66, had fixed point number support. That kind of concentrates the mind on number ranges, precision, errors etc.

I will always remember when some new youngster joined the team and started complaining about the lack of float support. The project manager told him:

"If you think you need to use floating point then you don't understand the problem. If you really need to use floating point then you don't understand the new problem you then have."

He was kind of hard that way.

I guess generations of programmers have learned this kind of thing the hard way, by scratching their heads for a week wondering why their results are way off or things that should be equal are not!

Memory in C++ is a leaky abstraction .

So you can compare two floats for equality safely if they are both integers which is what I thought.Heater wrote: In Javascript integers will be represented precisely as long as they can be represented in the 53 bits that fit into a the mantissa of a 64 bit float. (Or is it 52 bits, I forget).

As such the equality operator works correctly on integers in JS.

There are 52 explicit significand (mantissa) bits plus an extra implicit one bit if the number is "normal" plus the sign bit. I think compared to integer storage you are really getting 54 bits. And you have support for -0 !

Intel x87 has 64 significand bits plus the sign bit meaning that any 64-bit integer, signed or unsigned, can be stored without loss of precision which is cool.

Last edited by jahboater on Wed Dec 14, 2016 11:11 am, edited 1 time in total.

They didn't when I was at uni. In fact I didn't learn about floating point rounding till after I retired. I always naively thought that the "round half away from zero" was the norm - it is not.gordon@drogon.net wrote: I did some work with floating point stuff in 8-bit micros waay back (starting with some old code present in the Apple II!) and things like rounding, equality, and cumulative errors were part of the course I did at uni... Wonder if they still teach stuff like that?

jahboater,

10 / 3 == 3

is true. But in JS the left hand side becomes 3.3333333333333335 and the comparison returns false. One has to be sure one is working with ints by using Math.round or Math.floor appropriately.

JS has another quirk in that logical operations will always truncate results to 32 bit integers. So x|0 will chop x down to 32 bits. I guess that is a result of being designed in the days of 32 bit machines. Turns out to be a very useful feature when transpiling C source in Javascript where it enables almost native execution speed of the resulting JS.

Intel x87 uses 80 bit float registers doesn't it? However JS sticks to IEEE 754 double-precision format which fits in 64 bits so only 53 bit ints are available in there. Same as C doubles I believe.

Yes. But it does require a little thought. For example if you are used to ints in C you might casually expect thatSo you can compare two floats for equality safely if they are both integers which is what I thought.

10 / 3 == 3

is true. But in JS the left hand side becomes 3.3333333333333335 and the comparison returns false. One has to be sure one is working with ints by using Math.round or Math.floor appropriately.

JS has another quirk in that logical operations will always truncate results to 32 bit integers. So x|0 will chop x down to 32 bits. I guess that is a result of being designed in the days of 32 bit machines. Turns out to be a very useful feature when transpiling C source in Javascript where it enables almost native execution speed of the resulting JS.

Intel x87 uses 80 bit float registers doesn't it? However JS sticks to IEEE 754 double-precision format which fits in 64 bits so only 53 bit ints are available in there. Same as C doubles I believe.

Memory in C++ is a leaky abstraction .

Yes, the IEEE 754 double-precision format is very widely available, so its good choice for JS. NEON and Intel SSE/AVX support it of course. The x87 80-bit format is regarded by Intel as "legacy" despite them not offering anything else with as much precision and range - shame. On 64-bit platforms only, _float128 is available in GCC with 33 decimal digits of precision! (in software).Heater wrote: Intel x87 uses 80 bit float registers doesn't it? However JS sticks to IEEE 754 double-precision format which fits in 64 bits so only 53 bit ints are available in there. Same as C doubles I believe.

Countries which measure their national debt in the 10's of trillions with cent accuracy need 15 significant digits to do so. Since the 53-bit mantissa of double precision hasjahboater wrote:The x87 80-bit format is regarded by Intel as "legacy" despite them not offering anything else with as much precision and range - shame.Heater wrote: Intel x87 uses 80 bit float registers doesn't it? However JS sticks to IEEE 754 double-precision format which fits in 64 bits so only 53 bit ints are available in there. Same as C doubles I believe.

53*log10(2) = 15.95458977

decimal digits of precision, then it would appear calculations involving the national debt could be performed using JavaScript or maybe even Microsoft Excel. Note, however, that 10 cents is not possible to represent accurately using using any floating point representation. In the particular case of national-debt-sized numbers one can expect rounding errors of size

0.5*10^13/2^53 = 0.0005

Assuming such errors occur with equal probability to have positive and negative signs, then cumulative errors are expected to grow as the square root of the number of arithmetic operations. In particular, any sum of 400 terms is expected to be wrong by

sqrt(400)*0.0005 = 0.01

or, in other words, a cent. Note that the most such a sum could be wrong by is 20 cents.

While it seems the solution should be to reduce the national debt, using COBOL to compute the sums is probably easier--even from a code readability and modifiability viewpoint.

But you'd be wrong.Heater wrote:Makes me think that C did the best thing by working with number types the underlying hardware can deal with easily and efficiently and forcing you to do something special with libraries or whatever if you want something else.

Code: Select all

```
Petite Chez Scheme Version 9.4
Copyright 1984-2016 Cisco Systems, Inc.
> (/ 1000000000000000000000000000000 3000000000000000000000000000000)
1/3
>
```

Here is a memoized version of the Collatz algorithm and a parallel memoized version. I have made some attempts at producing readable code. Note that Collatz requires 64-bit arithmetic which is very expensive when the ARMv8 cores of the Pi 3B are running in 32-bit compatibility mode. It would be interesting to know whether memoization and parallelization result in similar performance increases when the CPU is running in 64-bit mode. I also tested the parallel code on the FriendlyArm T3 in 32-bit mode which is an octo-core ARMv8 SBC. The results are as follows:The parallel memoized code is 28.8 times faster than the original on the Pi 3B and 47.9 times faster for the FA T3. Included are copies of all three programs for reference.

Code: Select all

```
Pi 3A FA T3
Original Code 14.4235 13.4910 sec
Memoized Code 1.5065 1.4932 sec
Parallel Code 0.5001 0.2816 sec
```

Code: Select all

```
/* Original C code for Collatz algorithm provided by jahboater with
modified with additional timing routines based on gettimeofday.
https://www.raspberrypi.org/forums/viewtopic.php?p=1080630#p1080630
*/
#include <stdio.h>
#include <inttypes.h>
#include <sys/time.h>
static double tic_time;
void tic() {
struct timeval tp;
gettimeofday(&tp,0);
tic_time = (double)tp.tv_sec + (double)tp.tv_usec * 1.e-6;
}
double toc() {
struct timeval tp;
gettimeofday(&tp,0);
return ((double)tp.tv_sec + (double)tp.tv_usec * 1.e-6) - tic_time;
}
int
main( void )
{
int maxsteps = 0;
int64_t maxval = 0;
tic();
for( int64_t x = 1; x < 10000000; ++x )
{
int steps = 0;
int64_t n = x;
while( n > 1 )
{
if( n & 1 )
n = 3 * n + 1;
else
n /= 2;
++steps;
}
if( steps > maxsteps )
{
maxsteps = steps;
maxval = x;
}
}
double t=toc();
printf( "max %lld steps %d\n", maxval, maxsteps );
printf( "elapsed %g sec\n", t );
}
```

Code: Select all

```
/* Memoized version of Collatz code.
The size of the cache can be tuned by changing the M below and
was determined experimentally.
*/
#include <stdio.h>
#include <sys/time.h>
static double tic_time;
void tic() {
struct timeval tp;
gettimeofday(&tp,0);
tic_time=(double)tp.tv_sec+(double)tp.tv_usec*1.e-6;
}
double toc() {
struct timeval tp;
gettimeofday(&tp,0);
return((double)tp.tv_sec+(double)tp.tv_usec*1.e-6)-tic_time;
}
#define N 10000000
#define M (4*1024*1024)
unsigned short memo[M];
unsigned collatz(unsigned n){
unsigned long long k=n;
unsigned c=0;
while(k>1){
if(k<M&&memo[k]){
c+=memo[k];
break;
}
if(k&1) k=3*k+1;
else k/=2;
c++;
}
return c;
}
void mkmemo(unsigned n,unsigned c){
unsigned long long k=n;
while(k>1){
if(k<M) {
if(memo[k]) return;
memo[k]=c;
}
if(k&1) k=3*k+1;
else k/=2;
c--;
}
}
unsigned maxc,nstar;
int main(){
tic();
for(unsigned n=1;n<N;n++){
unsigned c=collatz(n);
if(maxc<c){
maxc=c;
nstar=n;
}
if(n<M) mkmemo(n,c);
}
double t=toc();
printf("Maximum is collatz(%u) = %u steps.\n",
nstar,maxc);
printf("Memoized elapsed time %g seconds.\n",t);
return 0;
}
```

Code: Select all

```
/* Parallel memoized version of Collatz code.
The size of the cache can be tuned by changing the M below and
was determined experimentally. The cache uses a lockless design
based on the fact that writes to short variable types are atomic.
*/
#include <stdio.h>
#include <cilk/cilk.h>
#include <sys/time.h>
static double tic_time;
void tic() {
struct timeval tp;
gettimeofday(&tp,0);
tic_time=(double)tp.tv_sec+(double)tp.tv_usec*1.e-6;
}
double toc() {
struct timeval tp;
gettimeofday(&tp,0);
return((double)tp.tv_sec+(double)tp.tv_usec*1.e-6)-tic_time;
}
#define N 10000000
#define M (4*1024*1024)
#define NCPU 8
unsigned short memo[M];
unsigned collatz(unsigned n){
unsigned long long k=n;
unsigned c=0;
while(k>1){
if(k<M&&memo[k]){
c+=memo[k];
break;
}
if(k&1) k=3*k+1;
else k/=2;
c++;
}
return c;
}
void mkmemo(unsigned n,unsigned c){
unsigned long long k=n;
while(k>1){
if(k<M) {
if(memo[k]) return;
memo[k]=c;
}
if(k&1) k=3*k+1;
else k/=2;
c--;
}
}
unsigned maxc[NCPU],nstar[NCPU];
int main(){
tic();
cilk_for(unsigned cpu=0;cpu<NCPU;cpu++){
unsigned nstart=1+cpu*(N-1)/NCPU;
unsigned nend=1+(cpu+1)*(N-1)/NCPU;
for(unsigned n=nstart;n<nend;n++){
unsigned c=collatz(n);
if(maxc[cpu]<c){
maxc[cpu]=c;
nstar[cpu]=n;
}
if(n<M) mkmemo(n,c);
}
}
for(unsigned cpu=1;cpu<NCPU;cpu++){
if(maxc[0]<maxc[cpu]){
maxc[0]=maxc[cpu];
nstar[0]=nstar[cpu];
}
}
double t=toc();
printf("Maximum is collatz(%u) = %u steps.\n",
nstar[0],maxc[0]);
printf("Parallel elapsed time %g seconds.\n",t);
}
```

Last edited by ejolson on Fri Dec 30, 2016 4:22 am, edited 1 time in total.

Interesting ...

The Intel x87 (which has been a standard part of every Intel CPU for decades) has a true 64 bit mantissa, giving 19.2659 digits - no need for COBOL! It also has a maximum value of 1.189e+4932 enough for any ones national debt I should think, and probably enough to count all the atoms in the universe. The Pi3 when running in 64-bit mode will have available a 113-bit mantissa giving 34.016 digits.ejolson wrote: decimal digits of precision, then it would appear calculations involving the national debt could be performed using JavaScript or maybe even Microsoft Excel. Note, however, that 10 cents is not possible to represent accurately using using any floating point representation.

tufty,

Actually I'm with you the dangers of C's number types leading to bugs, security issues etc. On the other hand C is a refelction of what most computer hardware looks like and I love it for that.

Did you run that scheme example on a Pi? I did not think Petite Chez Scheme was available for ARM.

It's interesting that it produces a rational number result, other scheme interpreters I have used return a recurring float like most other languages.

Of course your example is the equivalent of:
It shows how one can side step the problem of number representation by not actually returning a number as a result!

That's kind of OK with me. We were never expected to turn our math problem results into real numbers, the simplest expression would do fine.

How does it deal with irrational numbers? "(/ 1 rootTwo)" perhaps?

A scheme version of the collatz problem above would be great. Hint, hint...

Actually I'm with you the dangers of C's number types leading to bugs, security issues etc. On the other hand C is a refelction of what most computer hardware looks like and I love it for that.

Did you run that scheme example on a Pi? I did not think Petite Chez Scheme was available for ARM.

It's interesting that it produces a rational number result, other scheme interpreters I have used return a recurring float like most other languages.

Of course your example is the equivalent of:

Code: Select all

```
> (/ 1 3)
1/3
```

That's kind of OK with me. We were never expected to turn our math problem results into real numbers, the simplest expression would do fine.

How does it deal with irrational numbers? "(/ 1 rootTwo)" perhaps?

A scheme version of the collatz problem above would be great. Hint, hint...

Memory in C++ is a leaky abstraction .

??? Where is this 128bit floating point support?jahboater wrote:The Pi3 when running in 64-bit mode will have available a 113-bit mantissa giving 34.016 digits.

Yes it has 128bit NEON registers but that ranges from a 16x 8bit vector to a 2x 64bit vector. AFAIK it isn't a single 128bit value for anything other than load/store.

She who travels light — forgot something.

jahboater,

Be careful there. When you national debt gets to 1.189e+4932 you still only have those 19 significant digits. You have lost 3 or 4 thousand least significant digits. The book keepers will not like that one bit.The Intel x87 ... has a true 64 bit mantissa, giving 19.2659 digits ... It also has a maximum value of 1.189e+4932 enough for any ones national debt I should think,

Currently that is estimated at 10e80. So if you actually wanted to count each one the wimpy 19 digits of x87 is no good....probably enough to count all the atoms in the universe.

Memory in C++ is a leaky abstraction .