jojopi
Posts: 3041
Joined: Tue Oct 11, 2011 8:38 pm

Re: Making readable modifyable code.

Mon Dec 12, 2016 10:47 pm

Heater wrote:Looked at this way I would say the program as presented is buggy. It does not check for overflow of anything. It presumes you know what will happen before you start!
I think my Python implementation avoided any such assumptions? viewtopic.php?p=1026459#p1026459
To be really wicked we should insist that this algorithm handle integers of arbitrary size, not just limited to 64 bits.
I would be happy to see you handle even 64 bits in JavaScript. (It uses double, getting 53 effective bits, but drops to 32 if you use any integer operators, even on 64 bit platforms.) Can you allocate more than 1GB per process yet?

The only program I ever wrote in JS turned out slower than in Python. If you can make it run at all, then can you explain why changing "if (n in dict)" to "if (dict[n])" would make the whole thing twice as fast? Surely the latter version must test whether the item exists before it can test whether it is zero, so it is doing more work?

Code: Select all

var dict = {1:0, 2:1};

function func(n) {
  if (n in dict) {
    return dict[n];
  } else if (n & 1) {
    return dict[n] = 1+func(3*n+1);
  } else {
    return dict[n] = 1+func(n/2);
  }
}

var maxsteps = 0, maxval = 0;

var x = 1;
while (x < 10000000) {
  steps = func(x);
  if (steps > maxsteps) {
    maxsteps = steps;
    maxval = x;
  }
  x += 1;
}

console.log("max", maxval, "steps", maxsteps);
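One thing to keep in mind when swapping "n in dict" for "dict[n]": the two tests are not equivalent when a stored value is zero, as it is for the seed entry 1:0. Below is a minimal Python sketch of the same memoized step count. It is iterative rather than recursive so that it stays within Python's recursion limit. The names collatz_steps and longest_below are made up for illustration, and the demo limit is far smaller than the 10000000 used above.

```python
def collatz_steps(n, cache):
    """Return the number of Collatz steps from n down to 1, filling cache."""
    path = []
    # Walk forward until we reach a value whose step count is already known.
    while n not in cache:
        path.append(n)
        n = 3 * n + 1 if n & 1 else n // 2
    steps = cache[n]
    # Unwind: each value on the path is one step further from 1 than its successor.
    for m in reversed(path):
        steps += 1
        cache[m] = steps
    return steps


def longest_below(limit):
    """Return (start, steps) for the longest chain with 1 <= start < limit."""
    cache = {1: 0, 2: 1}  # same seed entries as the JavaScript above
    best_start, best_steps = 1, 0
    for x in range(1, limit):
        steps = collatz_steps(x, cache)
        if steps > best_steps:
            best_start, best_steps = x, steps
    return best_start, best_steps


if __name__ == "__main__":
    seed = {1: 0, 2: 1}
    # Membership and truthiness disagree for the entry whose value is 0.
    print(1 in seed, bool(seed.get(1)))
    print(longest_below(10000))
```

The membership test answers "has this key been stored?", while the truthiness test answers "is the stored value non-zero?", so the second form would treat the entry for 1 as missing.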

timrowledge
Posts: 1139
Joined: Mon Oct 29, 2012 8:12 pm
Location: Vancouver Island
Contact: Website

Re: Making readable modifyable code.

Mon Dec 12, 2016 10:57 pm

As an aside, for tracking Squeak performance as we develop the VM and so on, we prefer a small suite of benchmarks that seem to have reasonable implementation spread -
n-body (which does some analysis of orbits)
binary tree
chameleon redux
threading (which is carefully designed to smash most systems' code caches about and give any other clever code tricks nightmares)

There are some versions of them on https://benchmarksgame.alioth.debian.org. It's interesting to note that for the n-body example my Pi3 running Squeak is running the test in ~36 seconds as opposed to javascript/node.js Oh, yes, that is JavaScript running on a quad-core Ubuntu x86 2.4GHz with 4GB of RAM. So not too shabby. And in fact the Pi/Squeak binary trees result is 25 against x86/JavaScript at ~56.

I'm quite sure you could come up with benchmarks that show language X crushing Smalltalk as well. That's how it is with benchmarks. Take with a (carefully benchmarked!) pinch of salt.
Making Smalltalk on ARM since 1986; making your Scratch better since 2012

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Mon Dec 12, 2016 11:42 pm

Heater wrote:No, memoization is nothing to do with dynamic programming. It is a technique that can be used in any programming system.
Thanks! I looked it up and found a Wikipedia entry, then deleted my original question before I saw your reply. I wonder if it would result in much of a speedup for this program.
Last edited by ejolson on Tue Dec 13, 2016 1:24 am, edited 2 times in total.

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Mon Dec 12, 2016 11:50 pm

timrowledge wrote:
jahboater wrote:Collatz is simple enough that it's trivial to implement in any sensible language, and yet not so simple that it all gets optimized out - and it takes a decent amount of time to execute. It's not constrained by memory.
And that makes it almost completely useless as a meaningful benchmark. Anything 'trivial' is, well, trivial.
While the algorithm is easy to implement, the result of running the algorithm is not trivial and remains an open problem. Whether a solution would lead to a Fields Medal probably depends on how much new mathematics was developed to solve the problem.

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Tue Dec 13, 2016 1:18 am

timrowledge wrote:If you want an interesting but still trivial test how about something that actually goes a little beyond 32-bit integers

Code: Select all

10000 factorial asString size
That is
work out the *exact* value of 10,000 factorial - no wimping out to floating point
convert to a string - one you could print out
find the size of said string
print that value.

That's 35660 digits long and takes 570 mS on my Pi3. Simply working out the exact value takes ~300 mS. I'd post the answer here but it is after all 34Kb long...
In C the code looks like

Code: Select all

    mpz_fac_ui(nfact,10000);
    unsigned int digits=strlen(mpz_get_str(0,10,nfact));
with a resulting execution time of about 20mS on a Pi 3. For reference a complete working program looks like

Code: Select all

#include <stdio.h>
#include <math.h>
#include <string.h>
#include <sys/time.h>
#include <gmp.h>

static double tic_time;
void tic() {
    struct timeval tp;
    gettimeofday(&tp,0);
    tic_time=(double)tp.tv_sec+(double)tp.tv_usec*1.e-6;
}
double toc() {
    struct timeval tp;
    gettimeofday(&tp,0);
    return((double)tp.tv_sec+(double)tp.tv_usec*1.e-6)-tic_time;
}

int main(){
    mpz_t nfact;
    mpz_init(nfact);

    tic();
    mpz_fac_ui(nfact,10000);
    unsigned int digits=strlen(mpz_get_str(0,10,nfact));
    double t=toc();

    printf("Actual number of digits is %u\n",digits);
    printf("Elapsed time %g mS.\n",t*1000);
    return 0;
}
and produces the output

Code: Select all

$ ./fact 
Actual number of digits is 35660
Elapsed time 19.5279 mS.
Sometimes the elapsed time is 40mS, apparently because the program executes so fast that the frequency governor doesn't have time to switch from 600 MHz to 1200 MHz. Note that these timings were performed with the default libraries in Raspbian, which are compiled to be compatible with the ARMv6 CPU in the original Raspberry Pi. Faster times are likely possible with ARMv7-optimized code.
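For comparison with the Smalltalk one-liner and the GMP calls, here is a rough Python sketch of the same measurement. It assumes Python 3. The helper name factorial_digit_count is made up, and the guard around set_int_max_str_digits is there because Python 3.11 and later refuse to convert integers longer than 4300 digits to a string by default. No timing figure is claimed, as nothing was measured for this sketch.

```python
import math
import sys
import time

# Python 3.11+ caps int-to-str conversion at 4300 digits by default;
# lift the cap where the setting exists (older versions have no cap).
if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(0)


def factorial_digit_count(n):
    """Return the number of decimal digits in the exact value of n!."""
    return len(str(math.factorial(n)))


if __name__ == "__main__":
    start = time.perf_counter()
    digits = factorial_digit_count(10000)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    print("Actual number of digits is", digits)
    print("Elapsed time %g ms." % elapsed_ms)
```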

jahboater
Posts: 3027
Joined: Wed Feb 04, 2015 6:38 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 8:19 am

Heater wrote:No, memoization is nothing to do with dynamic programming. It is a technique that can be used in any programming system.

Consider:

You write a function to calculate factorials. You know: 5! = 5 * 4 * 3 * 2 * 1, for example.

As your program is run perhaps that function is called to calculate the factorial of 5. So it does all of the above multiplications.

Later it might be called to calculate the factorial of 6.

What to do?

Do all that multiplication again: 6 * 5 * 4 * 3 * 2 * 1 ?

Or, having remembered the result of factorial 5 just calculate 6 * factorial5.

Boom, by remembering what you did before, what you are asked to do later is much faster.

As you see this is a huge speed boost in the Collatz problem as given.

We could call it "caching". It's a trade-off between speed and memory consumption.
For primes, the "Sieve of Eratosthenes" ?
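The factorial example in the quote can be sketched in a few lines of Python. This is only an illustration of the idea, not anyone's actual code from the thread. The class name MemoFactorial and the multiplication counter are invented here so that the saving is visible.

```python
class MemoFactorial:
    """Factorial that remembers every result it has computed so far."""

    def __init__(self):
        self.cache = {0: 1}       # 0! = 1 is the only value known up front
        self.multiplications = 0  # counts the work actually done

    def __call__(self, n):
        if n < 0:
            raise ValueError("factorial is undefined for negative integers")
        # Start from the largest cached argument that does not exceed n.
        k = n
        while k not in self.cache:
            k -= 1
        result = self.cache[k]
        for i in range(k + 1, n + 1):
            result *= i
            self.multiplications += 1
            self.cache[i] = result
        return result


if __name__ == "__main__":
    fact = MemoFactorial()
    print(fact(5), fact.multiplications)
    print(fact(6), fact.multiplications)
```

After factorial 5 has been computed, asking for factorial 6 costs one further multiplication, at the price of keeping the earlier results in memory.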

jahboater
Posts: 3027
Joined: Wed Feb 04, 2015 6:38 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 9:25 am

Yes, yes, I know a trivial benchmark doesn't mean much. But it should give a ballpark figure for the speed of the two VMs doing common day-to-day stuff: integers, simple loops and conditionals.
We have 3x C for JS and 30x C for BASIC, so the differences are large.
Also we should be comparing the VMs, not the run-time libraries, which are commonly not even written in the host language.

The idea of a trivial benchmark is that it should be quick and easy to implement in any programming language, so as not to take up too much of people's time. Clearly it should only use features likely to be in any language. I think the requirement for large integers in Collatz might make it a bit too difficult in some languages, so here is an even simpler benchmark using plain 32-bit integers that takes a significant amount of time. Any language that can't implement something like this is probably not viable for general use.
Find all the perfect numbers up to 10,000.

Code: Select all

#include <stdio.h>

int
main( void )
{
  int n = 2;

  do
  {
    int a = 0, d = n / 2;

    do
    {
      if( n % d == 0 )
        a += d;
    }
    while( --d );

    if( a == n )
      printf( "%d\n", n );

    n += 2;
  }
  while( n < 10000 );
}

Code: Select all

[email protected]:~ $ gcc -O3 pfn.c -o pfn
[email protected]:~ $ time ./pfn
6
28
496
8128

real	0m0.134s
user	0m0.140s
sys	 0m0.000s

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Tue Dec 13, 2016 12:02 pm

jahboater wrote:I think the requirement for large integers in collatz might make it a bit too difficult in some languages, so here is an even simpler benchmark using plain 32-bit integers that takes a significant amount of time.
From a code readability point of view, I would suggest using at least 4-space indents, standard for loops and at least one comment. Thus,

Code: Select all

// Find all the perfect numbers up to 10,000

#include <stdio.h>

int main()
{
    for (int n = 2; n < 10000; n += 2) {
        int a = 0;
        for (int d = n / 2; d > 0; d--) {
            if (n % d == 0)
                a += d;
        }
        if (a == n)
            printf("%d\n", n);
    }
}
Run time is the same as no algorithmic changes have been made. As this is an n^2 algorithm and the next perfect number is 33550336, computing the next number using the stated algorithm would take

(.1 sec)*(33550336/10000)^2 = 13 days

of Pi 3 compute time. Using a parallel code could reduce this to around 3.25 days; however, since all perfect numbers up to 2^7431333 are known to be of the form 2^(p-1)*(2^p-1), a small change to the code

Code: Select all

// Find all the perfect numbers up to 33,550,336

#include <stdio.h>

int main()
{
    for (int tpp = 2; tpp <= 8192; tpp*=2) {
        int n = (tpp/2)*(tpp-1);
        int a = 0;
        for (int d = n / 2; d > 0; d--) {
            if (n % d == 0)
                a += d;
        }
        if (a == n)
            printf("%d\n", n);
    }
}
allows the work of 13 days to be completed in about .2 seconds using a single thread. Other algorithmic optimizations could be made as well.
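The 13-day figure and the list of candidates that the modified loop visits can be checked with a short Python sketch. The function names are made up for illustration, and the quadratic scaling is simply the assumption stated above, not a measurement.

```python
def brute_force_estimate_days(base_seconds, base_limit, target):
    """Scale a measured run time quadratically from base_limit up to target."""
    seconds = base_seconds * (target / base_limit) ** 2
    return seconds / 86400.0  # 86400 seconds per day


def euclid_candidates(max_tpp):
    """Yield (tpp/2)*(tpp-1) for tpp = 2, 4, 8, ... up to max_tpp."""
    tpp = 2
    while tpp <= max_tpp:
        yield (tpp // 2) * (tpp - 1)
        tpp *= 2


if __name__ == "__main__":
    print("%.1f days" % brute_force_estimate_days(0.1, 10000, 33550336))
    print(list(euclid_candidates(8192)))
```

Only 13 candidates are generated, so the expensive divisor loop runs 13 times instead of once for every even number up to the limit.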

Paeryn
Posts: 2169
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Making readable modifyable code.

Tue Dec 13, 2016 1:21 pm

ejolson wrote:

Code: Select all

int main()
{
    for (int tpp = 2; tpp <= 8192; tpp*=2) {
        int n = (tpp/2)*(tpp-1);
        int a = 0;
        for (int d = n / 2; d > 0; d--) {
            if (n % d == 0)
                a += d;
        }
        if (a == n)
            printf("%d\n", n);
    }
}
allows the work of 13 days to be completed in about .2 seconds using a single thread. Other algorithmic optimizations could be made as well.
Definitely an easy optimisation missed, the inner for loop doesn't need to count anywhere near n/2 times, only sqrt(n) times.

Code: Select all

#include <math.h>
#include <stdio.h>

int main(void)
{
    for (int tpp = 2; tpp <= 8192; tpp *= 2) {
        int n = (tpp/2)*(tpp-1);
        int a = 1;
        int d = (int)sqrt(n);
        if (n % d == 0)
            a += d + (d * d == n ? 0 : n / d);
        d--;
        while (d > 1) {
            if (n % d == 0)
                a += d + (n / d);
            d--;
        }
        if (a == n)
            printf("%d\n", n);
    }
    return 0;
}

[email protected]:~/Programming/Haskell $ gcc -O3 perfect.c -std=c99 -lm -o perfect
[email protected]:~/Programming/Haskell $ time ./perfect
6
28
496
8128
33550336

real    0m0.008s
user    0m0.000s
sys     0m0.000s
[edited] I forgot to not count the sqrt twice when it is a factor, thanks Heater.
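The pairing idea, written out as a small Python sketch for anyone who wants to try it on arbitrary n: every divisor d up to sqrt(n) brings its partner n/d along, and an exact square root is counted once. This assumes Python 3.8 or later for math.isqrt. The function names are made up, and it is a sketch of the technique rather than a port of the C above.

```python
import math


def proper_divisor_sum(n):
    """Sum the divisors of n that are smaller than n, pairing d with n // d."""
    if n < 2:
        return 0  # 1 has no proper divisors
    total = 1  # 1 divides every n >= 2
    root = math.isqrt(n)
    for d in range(2, root + 1):
        if n % d == 0:
            total += d
            partner = n // d
            if partner != d:  # count an exact square root only once
                total += partner
    return total


def is_perfect(n):
    """A perfect number equals the sum of its proper divisors."""
    return n > 1 and proper_divisor_sum(n) == n


if __name__ == "__main__":
    print([n for n in range(1, 10000) if is_perfect(n)])
    print(is_perfect(33550336))
```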
Last edited by Paeryn on Tue Dec 13, 2016 1:47 pm, edited 1 time in total.
She who travels light — forgot something.

Heater
Posts: 9985
Joined: Tue Jul 17, 2012 3:02 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 1:23 pm

jahboater,

Not sure how the perfect numbers thing gets us away from the need for big integers but I'll go for it. Javascript on a Pi 3 can easily match your C code:

Code: Select all

'use strict';

function perfectNumbers() {
    var n = 2;
    var a;
    var d;

    do {
        a = 0;
        d = n / 2;
        do {
            if (n % d === 0) {
                a += d;
            }
        } while (--d);

        if (a === n) {
            console.log(n);
        }

        n += 2;
    } while (n < 10000);
}

while (1) {
    var startTime = new Date().getTime();
    perfectNumbers();
    var endTime = new Date().getTime();
    console.log('It took ' + (endTime - startTime) + ' ms.');
}

Code: Select all

[email protected]:~ $ node perfect-numbers-2.js 
6
28
496
8128
It took 210 ms.
6
28
496
8128
It took 145 ms.
6
28
496
8128
It took 146 ms.
Notice how this speeds up after the first execution. I guess that is the JIT kicking in. I think that is a fair way of timing as most things I run more than once during the life of a program.

I'm wondering how much of your timing result is the time it takes to load the executable and print the results.
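One way to separate the two effects is to time inside the process and report every run separately, so that program load time is excluded and a slow first run stands out. Here is a minimal Python sketch of that pattern. The helper time_runs and the small stand-in workload are invented for illustration, and no claim is made about what numbers it prints.

```python
import time


def time_runs(func, repeats=3):
    """Call func repeatedly and return each run's elapsed time in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        func()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def workload():
    """Stand-in workload: sum the proper divisors of each even n below 500."""
    total = 0
    for n in range(2, 500, 2):
        for d in range(n // 2, 0, -1):
            if n % d == 0:
                total += d
    return total


if __name__ == "__main__":
    for ms in time_runs(workload):
        print("It took %.1f ms." % ms)
```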

Heater
Posts: 9985
Joined: Tue Jul 17, 2012 3:02 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 1:27 pm

Applying Paeryn's optimization of only doing the inner loop sqrt(n) times I get the Javascript version down to 9ms.

Code: Select all

'use strict';

function isPerfect(n)
{
   var sum = 1;
   var i;
   var sqrt = Math.floor(Math.sqrt(n));
   for (i = sqrt-1; i > 1; i--) {
       if (n % i === 0) {
           sum += i + n/i;
       }
   }
   if(n % sqrt === 0) {
       sum += sqrt + (sqrt * sqrt === n ? 0 : n / sqrt);
   }
   return sum === n;
}

function perfectNumbers() {
    var n;
    for (n = 1; n < 10000; n++) {
        if (isPerfect(n)) {
            console.log(n);
        }
    }
}

while (1) {
    var startTime = new Date().getTime();
    perfectNumbers();
    var endTime = new Date().getTime();
    console.log('It took ' + (endTime - startTime) + ' ms.');
}
This is why I don't use C/C++ much nowadays. JS is easier and quicker to develop, easier to change, and the performance gains of using C/C++ are mostly not worth the hassle.

[email protected]
Posts: 1982
Joined: Tue Feb 07, 2012 2:14 pm
Location: Devon, UK
Contact: Website

Re: Making readable modifyable code.

Tue Dec 13, 2016 1:42 pm

ejolson wrote:Run time is the same as no algorithmic changes have been made. As this is an n^2 algorithm and the next perfect number is 33550336, computing the next number using the stated algorithm would take

(.1 sec)*(33550336/10000)^2 = 13 days

of Pi 3 compute time. Using a parallel code could reduce this to around 3.25 days; however, since all perfect numbers up to 2^7431333 are known to be of the form 2^(p-1)*(2^p-1), a small change to the code

Code: Select all

// Find all the perfect numbers up to 33,550,336

#include <stdio.h>

int main()
{
    for (int tpp = 2; tpp <= 8192; tpp*=2) {
        int n = (tpp/2)*(tpp-1);
        int a = 0;
        for (int d = n / 2; d > 0; d--) {
            if (n % d == 0)
                a += d;
        }
        if (a == n)
            printf("%d\n", n);
    }
}
allows the work of 13 days to be completed in about .2 seconds using a single thread. Other algorithmic optimizations could be made as well.
Again, just for the LOLs, I re-did this in RTB. The C version on a Pi 3 takes one second in my setup - not sure how you got 0.2 there:

Code: Select all

$ gcc -std=c99 -O3 -o perfect perfect.c
[email protected]:~ $ time ./perfect 
6
28
496
8128
33550336

real	0m0.987s
user	0m0.980s
sys	0m0.000s
[email protected]:~ $ time ./perfect 
6
28
496
8128
33550336

real	0m1.017s
user	0m1.010s
sys	0m0.000s
[email protected]:~ $ time ./perfect 
6
28
496
8128
33550336

real	0m1.000s
user	0m0.990s
sys	0m0.000s
RTB (all calcs. done in 64-bit floating point) takes 38 and a bit times more...

Code: Select all

// Find all the perfect numbers up to 33,550,336

start = time

tpp = 2
while tpp <= 8192 cycle
  n = (tpp / 2) * (tpp - 1)
  a = 0
  for d = n / 2 to 1 step -1 cycle
    if (n mod d) = 0 then
      a = a + d
    endif
  repeat

  if a = n then print n

  tpp = tpp * 2
repeat
et = time
print "Done in "; (et - start) / 1000
end
While I'd love to spend the time making RTB go faster, it's time I feel better spent elsewhere!

-Gordon
--
Gordons projects: https://projects.drogon.net/

jahboater
Posts: 3027
Joined: Wed Feb 04, 2015 6:38 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 2:45 pm

Heater wrote:I'm wondering how much of your timing result is the time it takes to load the executable and print the results.
That's a good point.
I added some internal timing with the newer POSIX clock_gettime( CLOCK_MONOTONIC, ... ) (which replaces gettimeofday) and it still took:

Code: Select all

[email protected]:~ $ ./pfn
6
28
496
8128
Time is 132.362 ms
So your JS times are as fast as C near enough!

jahboater
Posts: 3027
Joined: Wed Feb 04, 2015 6:38 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 2:58 pm

ejolson wrote:From a code readability point of view, I would suggest using at least 4-space indents, standard for loops and at least one comment.
I used 4-space indents in C for about 30 years, then some coding standard required 2-space indents - which, surprisingly, I eventually grew to like!

Code: Select all

#include <stdio.h>

int main( void )
{
    for (int tpp = 2; tpp <= 8192; tpp*=2) {
        int n = (tpp/2)*(tpp-1);
        int a = 0;
        for (int d = n / 2; d > 0; d--) {
            if (n % d == 0)
                a += d;
        }
        if (a == n)
            printf("%d\n", n);
    }
}
This shows how much of a waste of time hand coding assembler or other micro-optimizations are ...
Doing the research and fixing the algorithm wins hands down.

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Tue Dec 13, 2016 4:30 pm

[email protected] wrote:Again, just for the LOLs, I re-did this in RTB. The C version on a Pi 3 takes one second in my setup - not sure how you got 0.2 there:
The RTB version seems quite readable, but again I would prefer at least 4-space indenting. Using the -mcpu=cortex-a7 compiler flag makes a difference in the speed of the C code. In particular

Code: Select all

$ gcc --version
gcc (Raspbian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -O3 -std=gnu99 -Wall -mcpu=cortex-a7 -lm perfect.c -o perfect
$ time ./perfect
6
28
496
8128
33550336

real    0m0.177s
user    0m0.170s
sys 0m0.000s
$ time ./perfect
6
28
496
8128
33550336

real    0m0.144s
user    0m0.150s
sys 0m0.000s
where perfect.c should be the same as the code you ran earlier.

Code: Select all

// Find all the perfect numbers up to 33,550,336

#include <stdio.h>

int main()
{
    for (int tpp = 2; tpp <= 8192; tpp*=2) {
        int n = (tpp/2)*(tpp-1);
        int a = 0;
        for (int d = n / 2; d > 0; d--) {
            if (n % d == 0)
                a += d;
        }
        if (a == n)
            printf("%d\n", n);
    }
}

[email protected]
Posts: 1982
Joined: Tue Feb 07, 2012 2:14 pm
Location: Devon, UK
Contact: Website

Re: Making readable modifyable code.

Tue Dec 13, 2016 5:06 pm

ejolson wrote:
[email protected] wrote:Again, just for the LOLs, I re-did this in RTB. The C version on a Pi 3 takes one second in my setup - not sure how you got 0.2 there:
The RTB version seems quite readable, but again I would prefer at least 4-space indenting.
One of the things I love about working for myself is that most of the time I adhere to a set of coding standards that I prefer... The same style I've used for the past 30 years when working on my own code... It's hard to change other than when I have to - e.g. when working on a shared project and then the first thing I used to do was create a set of 'cb' rules to make any code I accidentally write in my style match theirs... I like lots of white space and high contrast - helps my dyslexia. I like curly brackets to align vertically - which then causes me much irritation as I could not get an efficient way for cycle..repeat to line up in RTB. Oh well - the odd sacrifice doesn't hurt :-)

When I wrote the editor for RTB, I made the TAB key indent by 2 spaces - except when the line above had more than 2 spaces then it indented to the start of the first non-space character. It's a bit weird but it's what I like.


Using the -mcpu=cortex-a7 compiler flag makes a difference in the speed of the C code. In particular

Code: Select all

$ gcc --version
gcc (Raspbian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -O3 -std=gnu99 -Wall -mcpu=cortex-a7 -lm perfect.c -o perfect
$ time ./perfect
6
28
496
8128
33550336

real    0m0.177s
user    0m0.170s
sys 0m0.000s

Right. So doing that gives me a best time of 0.295 over a few runs. This is a Pi v3, no overclock, Raspbian lite, same gcc 4.9.2. Up-to-date Raspbian jessie, but no systemd and running kernel 4.8.13 though. Putting the code in a loop 100 times gives 25.3s, or 0.253 per cycle...

Probably not anything to really worry about for now... But that's still half speed. This is your first optimisation of the code - no square root..

-Gordon
--
Gordons projects: https://projects.drogon.net/

Heater
Posts: 9985
Joined: Tue Jul 17, 2012 3:02 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 5:28 pm

jojopi,
I think my Python implementation avoided any such assumptions?...I would be happy to see you handle even 64 bits in JavaScript.
Ah, yes, the "thermo-nuclear" weapon of big integer arithmetic. As far as I can tell your Python code can handle thousands of digits easily. At least until memory runs out. I have no idea if there is any other limitation.

Such big numbers can be handled in C, JS, etc by means of a big number library. Admittedly it's not so convenient or pretty. I have no idea how they might compare speed wise.

If you are heavily into big integer arithmetic for sure think about using a language that supports it nicely.
It uses double, getting 53 effective bits, but drops to 32 if you use any integer operators, even on 64 bit platforms.
True enough. Except it will only drop to 32 bits if one does logical operations on the numbers. Regular integer maths can use all those 53 bits.
Can you allocate more than 1GB per process yet?
Sure. http://blog.caustik.com/2012/04/11/esca ... n-node-js/
The only program I ever wrote in JS turned out slower than in Python. If you can make it run at all, then can you explain why changing "if (n in dict)" to "if (dict[n])" would make the whole thing twice as fast?
In the same thread as you posted that slower than Python code I presented a JS version that was about 25 times faster than your Python. Both using memoization.

I have no idea why your JS is so slow, perhaps because you are using an object to do the memoization rather than an array. Perhaps because you have another function call overhead.
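The 53-bit and 32-bit limits discussed above can be illustrated without running any JavaScript. The Python sketch below leans on the fact that a Python float is the same IEEE 754 double that JS uses for its numbers. The helper to_int32 is a made-up stand-in for the signed 32-bit wrap that JS applies to operands of bitwise operators. It is written from the language rules as I understand them and has not been checked against a JS engine.

```python
def to_int32(x):
    """Mimic the signed 32-bit wrap JavaScript applies before bitwise operators."""
    x &= 0xFFFFFFFF  # keep the low 32 bits
    # Reinterpret the top bit as a sign bit.
    return x - 0x100000000 if x >= 0x80000000 else x


LIMIT = 2 ** 53  # a double has a 53-bit significand

if __name__ == "__main__":
    # Just below the limit, adding 1 is still exact.
    print(float(LIMIT - 1) + 1.0 == float(LIMIT))
    # At the limit, adding 1 is rounded away.
    print(float(LIMIT) + 1.0 == float(LIMIT))
    # What a bitwise operator would see for values wider than 32 bits.
    print(to_int32(2 ** 32 + 5), to_int32(2 ** 31))
```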

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Tue Dec 13, 2016 5:31 pm

[email protected] wrote:Right. So doing that give me a best time of 0.295 over a few runs.. This is a Pi v3, no overclock, Raspbian lite, same gcc 4.9.2. Up to date Raspbian jessie, but no systemd and running kernel 4.8.13 though. Putting the code in a loop 100 times, gives 25.3s, or 0.253 per cycle...
Failure of the frequency governor to switch to 1200 MHz is sometimes because of a marginal power supply. On the bright side, the Pi is less likely to overheat when running at 600 MHz and RTB is likely twice as fast as you reported.

[email protected]
Posts: 1982
Joined: Tue Feb 07, 2012 2:14 pm
Location: Devon, UK
Contact: Website

Re: Making readable modifyable code.

Tue Dec 13, 2016 6:01 pm

ejolson wrote:
[email protected] wrote:Right. So doing that give me a best time of 0.295 over a few runs.. This is a Pi v3, no overclock, Raspbian lite, same gcc 4.9.2. Up to date Raspbian jessie, but no systemd and running kernel 4.8.13 though. Putting the code in a loop 100 times, gives 25.3s, or 0.253 per cycle...
Failure of the frequency governor to switch to 1200 is sometimes because of a marginal power supply. On the bright side, the Pi is less likely to overheat when running at 600 MHz and RTB is likely twice as fast as you reported.
Official "black" 2.5A PSU. No undervoltage messages reported and it definitely runs at 1.2GHz when running the code. Temp goes from 45°C to nearly 50°C during the 100-loop run.

I run this:

Code: Select all

watch -n1  'uptime ; /opt/vc/bin/vcgencmd measure_temp ; /opt/vc/bin/vcgencmd measure_volts ; /opt/vc/bin/vcgencmd measure_clock arm ; /opt/vc/bin/vcgencmd measure_clock core'
Idle:

Code: Select all

temp=45.1'C
volt=1.2000V
frequency(45)=600000000
frequency(1)=250000000
running:

Code: Select all

temp=49.9'C
volt=1.3375V
frequency(45)=1200000000
frequency(1)=400000000
Hmm..
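If the idle and running readings need comparing over many runs, the key=value lines are easy to pull apart. Here is a small Python sketch that assumes the output always has the shape shown above. The function name parse_readings is made up, and the sample strings are copied from the two listings.

```python
def parse_readings(text):
    """Turn 'key=value' lines such as "temp=45.1'C" into a dict of floats."""
    readings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # skip lines without '=', e.g. the uptime line
        # Strip the unit suffix ('C or V) that follows the number.
        number = value.rstrip("'CV")
        try:
            readings[key] = float(number)
        except ValueError:
            pass  # not a numeric reading, skip it
    return readings


IDLE = "temp=45.1'C\nvolt=1.2000V\nfrequency(45)=600000000\nfrequency(1)=250000000"
RUNNING = "temp=49.9'C\nvolt=1.3375V\nfrequency(45)=1200000000\nfrequency(1)=400000000"

if __name__ == "__main__":
    idle = parse_readings(IDLE)
    running = parse_readings(RUNNING)
    print("ARM clock ratio:", running["frequency(45)"] / idle["frequency(45)"])
    print("Temperature rise:", round(running["temp"] - idle["temp"], 1))
```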

-Gordon
--
Gordons projects: https://projects.drogon.net/

jahboater
Posts: 3027
Joined: Wed Feb 04, 2015 6:38 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 6:21 pm

ejolson wrote:
[email protected] wrote:Again, just for the LOLs, I re-did this in RTB. The C version on a Pi 3 takes one second in my setup - not sure how you got 0.2 there:
The RTB version seems quite readable, but again I would prefer at least 4-space indenting. Using the -mcpu=cortex-a7 compiler flag makes a difference in the speed of the C code.
Why don't you use cortex-a53 for the Pi3? The Pi2 is a cortex-a7 (though the Pi2 V2 is a cortex-a53 as well).

Code: Select all

-mcpu=cortex-a53 -mfpu=neon-fp-armv8

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Tue Dec 13, 2016 6:23 pm

[email protected] wrote:Official "black" 2.5A PSU. No undervoltage messages reported and it definitely runs at 1.2GHz when running the code. Temp goes from 45°C to nearly 50°C during the 100-loop run.

I run this:

Code: Select all

watch -n1  'uptime ; /opt/vc/bin/vcgencmd measure_temp ; /opt/vc/bin/vcgencmd measure_volts ; /opt/vc/bin/vcgencmd measure_clock arm ; /opt/vc/bin/vcgencmd measure_clock core'
Idle:

Code: Select all

temp=45.1'C
volt=1.2000V
frequency(45)=600000000
frequency(1)=250000000
running:

Code: Select all

temp=49.9'C
volt=1.3375V
frequency(45)=1200000000
frequency(1)=400000000
Hmm..

-Gordon
That is very odd. Early versions of the Pi 3 Bluetooth driver flooded the CPU with interrupts and reduced performance. Is it possible your newer kernel has the older driver? I'm not sure when or if the changes made it upstream. I'm currently running the default kernel 4.4.21-v7+ #911. Do your results change with the kernel?
Last edited by ejolson on Tue Dec 13, 2016 6:27 pm, edited 1 time in total.

jahboater
Posts: 3027
Joined: Wed Feb 04, 2015 6:38 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 6:27 pm

Heater wrote: Ah, yes, the "thermo-nuclear" weapon of big integer arithmetic. As far as I can tell your Python code can handle thousands of digits easily. At least until memory runs out. I have no idea if there is any other limitation.
I believe Python uses normal integers up to their limit, then silently starts using arbitrary precision.
I don't think you need tell it anything.
Just go into python and type 1000 ** 1000.
Pretty clever.
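A quick way to see this, assuming Python 3, where there is a single int type and the switch to arbitrary precision is invisible:

```python
big = 1000 ** 1000  # exact: the same value as 10 ** 3000

if __name__ == "__main__":
    print(type(big).__name__)     # the ordinary integer type, nothing special requested
    print(big.bit_length() > 64)  # far wider than a machine word
    print(len(str(big)))          # number of decimal digits
```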

ejolson
Posts: 2015
Joined: Tue Mar 18, 2014 11:47 am

Re: Making readable modifyable code.

Tue Dec 13, 2016 6:37 pm

jahboater wrote:Why don't you use cortex-a53 for the Pi3? The Pi2 is a cortex-a7 (though the Pi2 V2 is a cortex-a53 as well).

Code: Select all

-mcpu=cortex-a53 -mfpu=neon-fp-armv8
For this program and version of gcc there is no difference between cortex-a53 and cortex-a7: the assembler obtained using -S is exactly the same.

timrowledge
Posts: 1139
Joined: Mon Oct 29, 2012 8:12 pm
Location: Vancouver Island
Contact: Website

Re: Making readable modifyable code.

Tue Dec 13, 2016 6:40 pm

ejolson wrote:
timrowledge wrote:If you want an interesting but still trivial test how about something that actually goes a little beyond 32-bit integers

Code: Select all

10000 factorial asString size
...
In C the code looks like...
Unsurprisingly a program using carefully crafted code optimised as part of a package of numerics magic is an order of magnitude faster than a simple recursion. Which is another illustration of how difficult it is to glean any meaning from simple benchmarks, and a reminder that "when a measure becomes used as a benchmark it ceases to have any meaning as a measure".

And in any case there seems to have been a bit of a drift (fascinating though it may be) from the nominal topic...
Making Smalltalk on ARM since 1986; making your Scratch better since 2012

Heater
Posts: 9985
Joined: Tue Jul 17, 2012 3:02 pm

Re: Making readable modifyable code.

Tue Dec 13, 2016 6:48 pm

That is right. Pretty clever.

JS on the other hand works with 64 bit floats. That means integer arithmetic is spot on until you exceed 53 bits (or is it 52?), then you are into floating point land.

What does Python do when you write:
1000000000000000000000000000000 / 3000000000000000000000000000000

?

Edit: It fails:

>>> 1000000000000000000000000000000 / 3000000000000000000000000000000
0L

Interestingly, if you put a ".0" on the end of those numbers, to indicate that you might want a real number result, Python seems to change to using floating point math and throw away a ton of precision.

Seems you have to be careful in Python as well.
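For reference, the three behaviours can be put side by side. This sketch assumes Python 3, where "/" on integers is true division and "//" is the floor division that produced the 0L above. The standard library's fractions module is one option when an exact quotient is wanted.

```python
from fractions import Fraction

a = 1000000000000000000000000000000
b = 3000000000000000000000000000000

floor_quotient = a // b          # integer floor division, as "/" behaved on ints in Python 2
true_quotient = a / b            # Python 3 true division: the nearest double
exact_quotient = Fraction(a, b)  # exact rational arithmetic, no rounding

if __name__ == "__main__":
    print(floor_quotient)
    print(true_quotient)
    print(exact_quotient)
```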
