jdonald
Posts: 446
Joined: Fri Nov 03, 2017 4:36 pm

Re: Why moving to 64bit?

Mon Sep 02, 2019 3:44 am

Paeryn wrote:
Mon Sep 02, 2019 2:12 am
The representation that CPython uses is PyLong which divides the integer into an array of 30-bit integers (yes 30, not 32) and if more than one word is needed then the absolute value is stored (the sign is encoded in the length of the integer).
Indeed. Looking closer at PyLong's constituent datatypes it does use some ambiguous unsigned long / long types in there:
https://github.com/python/cpython/blob/ ... r.h#L44-56 . But apparently that #elif clause is limited to PYLONG_BITS_IN_DIGIT == 15, which I've confirmed only gets set that way in 32-bit builds.

The unnecessary uses of long that raised my suspicions at first are the ones like so: https://github.com/python/cpython/searc ... g_fromlong
It'll even do return PyLong_FromLong(-1); because there is no PyLong_FromInt() function. I don't think it can inline these calls unless link-time code generation is now a thing on Linux.

But this would be a few extra registers in a limited number of places, and cannot explain the double-digit performance losses.

Having seen the codebase now, I'm realizing that every tiny object is allocated on the heap as a PyObject*. It makes sense that doubling the pointer width will result in more overhead in programs that on the surface appeared to be compute-bound. In fact it probably affects these programs even more so if they're doing many little char- or integer math operations.

ejolson
Posts: 6003
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sun Sep 08, 2019 4:29 am

jdonald wrote:
Sun Sep 01, 2019 3:33 pm
Heater, your Rust tests were ARMv6 baselined and thus invalid. Please see my posts above and rerun your tests.
How do you specify ARMv7 and with Cortex-A53 tuning using the Rust compiler?

I have rerun and updated the Python3 timings for the anagram programs here. The loss in performance when moving from 32-bit Raspbian to 64-bit Gentoo is now only 16 to 17 percent, which is not as significant as it used to be. I suspect Python on Raspbian is also compiled to be backward compatible with the ARMv6 instruction set. Therefore, optimal 32-bit performance numbers may be noticeably better than what was used in that comparison.

jdonald
Posts: 446
Joined: Fri Nov 03, 2017 4:36 pm

Re: Why moving to 64bit?

Sun Sep 08, 2019 5:41 am

ejolson wrote:
Sun Sep 08, 2019 4:29 am
How do you specify ARMv7 and with Cortex-A53 tuning using the Rust compiler?
It's tricky with rustc. I provided guidelines for adding such args a few pages back. Let us know if that gets you anywhere.

Heater
Posts: 16845
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why moving to 64bit?

Sun Sep 08, 2019 7:09 am

jerrm,
Maybe put the loop in the rust program itself to minimize the system overhead?
A very good idea for benchmarking; in fact I have had such timing code in the anagram and other challenge codes for a long time. There are a couple of problems with the idea:

1) It hides the load and start up times of the programs. That could be significant for the interpreted language solutions as they parse the source at start up. It could also hide the actual output time, again often very significant.

2) For things like JS with its JIT engine it will get faster and faster as things are looped. The JIT engine learns how to optimize it dynamically at run time if it is iterated.

All of this means that what we want is not a typical benchmark with many loops over some algorithm but rather the actual, user observed run time, of a single run, from command to result.

As an example, in the case of the Rust anagram finder I put a loop around the actual anagram finder function, thus timing the algorithm only and excluding the time taken to read the dictionary file and print the output. The result is dramatic:

Code: Select all

$ cargo run  --bin insane-british-anagram --release > /dev/null
   Compiling insane-british-anagram v0.1.3 (/mnt/c/Users/heater/conveqs/insane-british-anagram-rust)
    Finished release [optimized] target(s) in 3.62s
     Running `target/release/insane-british-anagram`
Execution time: 301ms
Execution time: 50ms
Execution time: 51ms
Execution time: 50ms
...
The second iteration is massively faster than the first. Given that the use case is just to run the program once, using a timing loop like this would badly bias the result.

Why does it get so much faster?

Not sure really. I suspect that the dictionary file, read once at start up, is not actually read from disk at that time. Rather, Linux lazily reads in pages as the anagram loop accesses it, thus slowing down the first iteration.

Also the memory allocator I am using is very good at not giving memory back to the OS prematurely.

I would be very interested to see if the C version of the anagram finder also speeds up like this when iterated many times in the same run.

Source here for anyone who wants to play: https://github.com/ZiCog/insane-british ... m-rust.git
Memory in C++ is a leaky abstraction .

pica200
Posts: 219
Joined: Tue Aug 06, 2019 10:27 am

Re: Why moving to 64bit?

Sun Sep 08, 2019 12:43 pm

There is another factor. Your code easily fits in the L2 cache and partly in the L1 cache, bypassing the overhead of loading the code again from the slow DRAM. It also caches a good chunk of the input data. For small programs working with small data sets this works fine, but the L2 cache is way too small to compensate for the halved DRAM bus. I bet the A72 would do significantly better without these limitations.

cyclic
Posts: 16
Joined: Thu May 30, 2013 4:47 pm

Re: Why moving to 64bit?

Sun Sep 08, 2019 2:30 pm

Is zfs a reason for needing 64bit?

Heater
Posts: 16845
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why moving to 64bit?

Sun Sep 08, 2019 3:02 pm

@pica200,

Certainly caching can have a huge effect on performance. Mostly we worry about making data cache friendly, but code layout can also be unfriendly to the instruction cache.

As you say, this is a small program and the data is not so big so the impact of cache misses is probably not so significant.

I think the impact of instruction cache misses is negligible here. The loop in that code is small and fits in cache; once it has been around the first time there will be no more instruction cache misses. It goes around many times, so the cache loading time is amortized to near zero.

I still feel there is something with the file access that makes that huge difference from the first run to the second. The file is not actually read from disk to memory in that read statement. Rather the blocks are fetched later when the algorithm accesses the memory where they should be. So the first run has all the overhead of doing the actual disk reads.

Anyway, I gave up trying to optimize that program further when I saw this timing result. If those numbers are true then there is not much more I can do.

@cyclic,

Quite possibly. ZFS sounds like a great idea.

pica200
Posts: 219
Joined: Tue Aug 06, 2019 10:27 am

Re: Why moving to 64bit?

Sun Sep 08, 2019 5:58 pm

fread() does a bit of caching internally as well, if I recall correctly. read(), which is basically just a syscall wrapper, only goes through the fs cache of the kernel. You could use read() and give the kernel hints that you want non-cached reads for repeatable, realistic results.

ejolson
Posts: 6003
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Tue Sep 17, 2019 12:09 am

jdonald wrote:
Tue Aug 13, 2019 4:54 am
Once I added -mcpu=cortex-a72, sysbench gets 10x faster in the 32-bit case to become on par with aarch64.
I have tried to reproduce this result here but was unable. My understanding is that the Cortex-A72 running in 32-bit mode does not have any 64-bit registers and so can't perform any 64-bit divisions. Did you change the source code so that the 64-bit integers given by "unsigned long long" appear as "unsigned long" 32-bit integers everywhere?

jdonald
Posts: 446
Joined: Fri Nov 03, 2017 4:36 pm

Re: Why moving to 64bit?

Tue Sep 17, 2019 1:14 am

Thanks for investigating. Today I have been trying to reproduce my earlier result in that same Debian armhf container and have been unable to. I'm positive I didn't modify any of the C source code when examining this last time.

At this point, my best guess as to what happened last month was that I misread numbers on my screen, then drew incorrect "aha" conclusions after seeing that objdump -d sysbench initially lacked udiv instructions then contained them if compiled for the newer CPU core. Back then I had grepped the assembly from the sysbench binary as a whole, not specifically cpu_execute_event().

So for now I think it's safe to go on the record saying that sysbench --test=cpu is still an order of magnitude faster when compiled for 64-bit, no matter how the 32-bit baseline is tuned. I'll update the other related threads with more details soon.

ejolson
Posts: 6003
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Thu May 28, 2020 7:55 am

jamesh wrote:
Sun Aug 18, 2019 10:38 pm
ejolson wrote:
Sun Aug 18, 2019 10:31 pm
jamesh wrote:
Sun Aug 18, 2019 9:26 pm
What 8GB Pi4? There isn't one.
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?
No 8GB Pi4 on my desk. Never seen one.
Sorry for resurrecting this thread, but do you have an 8GB Pi on your desk now?

rpdom
Posts: 17720
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: Why moving to 64bit?

Thu May 28, 2020 7:58 am

ejolson wrote:
Thu May 28, 2020 7:55 am
jamesh wrote:
Sun Aug 18, 2019 10:38 pm
ejolson wrote:
Sun Aug 18, 2019 10:31 pm
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?
No 8GB Pi4 on my desk. Never seen one.
Sorry for resurrecting this thread, but do you have an 8GB Pi on your desk now?
Nah, jamesh has the 16GB Pi, but he's not allowed to say yet. :lol:
Unreadable squiggle

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 27422
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why moving to 64bit?

Thu May 28, 2020 8:18 am

ejolson wrote:
Thu May 28, 2020 7:55 am
jamesh wrote:
Sun Aug 18, 2019 10:38 pm
ejolson wrote:
Sun Aug 18, 2019 10:31 pm
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?
No 8GB Pi4 on my desk. Never seen one.
Sorry for resurrecting this thread, but do you have an 8GB Pi on your desk now?
Nope, and I don't think I have yet seen one!
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed.
I've been saying "Mucho" to my Spanish friend a lot more lately. It means a lot to him.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 9905
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: Why moving to 64bit?

Thu May 28, 2020 8:19 am

ejolson wrote:
Thu May 28, 2020 7:55 am
jamesh wrote:
Sun Aug 18, 2019 10:38 pm
ejolson wrote:
Sun Aug 18, 2019 10:31 pm
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?
No 8GB Pi4 on my desk. Never seen one.
Sorry for resurrecting this thread, but do you have an 8GB Pi on your desk now?
They've been in moderately short supply around the office for initial testing, but James was not lying back in August that there wasn't an 8GB Pi around.
I think first modified boards were around January, and those were reworked 4GB models to test out the new RAM chip.
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

ag123
Posts: 111
Joined: Sun Dec 18, 2016 7:54 am

Re: Why moving to 64bit?

Thu May 28, 2020 8:33 am

increasingly 64bits vs 32bits is more a matter of software maintenance. this is a really bad result but it is happening.
some distributions and apps shipped only 64 bits and some shipped only 32 bits.
32 bits tend to be more efficient on memory (the size of int is 32 bits, while it is 64 bits on 64 bits) and runs faster in 32 bits partly as memory transfers could be as much as halved so more data fit in cache as well.

but the move towards either 'pure 64' or 'pure 32' apps and distributions is segregating the industry. e.g. i think even Microsoft dropped 32 bits OS in the more recent releases of Windows 10. it would be sad if the industry dropped 32 bits simply because it is easier to maintain only 64 bits regardless of efficiency or other reasons.

the choice of 64 bits vs 32 bits is not a matter of the different physical efficiencies between running certain apps on 32 bits or 64 bits otherwise.
it is more a 'corporate' maintenance decision to either stick with either 64 bits or 32 bits to reduce (human) maintenance workloads, i.e. some will just ship 64 bits os and 64 bits binaries, no 'compatible' or fallback 32 bit libraries even if your chip supports it. it is 64 bits take it or leave it.

many will just declare 'end-of-life' for 'older' software and will even refuse to provide support for your 'older' hardware / software
Last edited by ag123 on Thu May 28, 2020 9:03 am, edited 4 times in total.

dickon
Posts: 1808
Joined: Sun Dec 09, 2012 3:54 pm
Location: Home, just outside Reading

Re: Why moving to 64bit?

Thu May 28, 2020 8:58 am

Actually, the main difference is in pointer size, which is 64b; ints are still 32b. Most platforms -- Linux included -- have gone with the LP64 model, not the ILP64 model. To be different, Microsoft use LLP64.

jahboater
Posts: 6286
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: Why moving to 64bit?

Thu May 28, 2020 8:59 am

ag123 wrote:
Thu May 28, 2020 8:33 am
32 bits tend to be more efficient on memory (the size of int is 32 bits, while it is 64 bits on 64 bits) and runs faster in 32 bits partly as memory transfers could be as much as halved so more data fit in cache as well.
No, "int" remains the same size at 32 bits
Linux (Raspbian) uses the LP64 memory model which on AArch64 says Longs and Pointers are 64-bit, integers remain 32-bit.
(I guess on some other platforms like SPARC, integers could be 64-bits).
In 32-bit mode, the model is ILP32 meaning Integers, Longs and Pointers are all 32 bit.
Pi4 8GB running PIOS64 Lite

dickon
Posts: 1808
Joined: Sun Dec 09, 2012 3:54 pm
Location: Home, just outside Reading

Re: Why moving to 64bit?

Thu May 28, 2020 9:20 am

Around the early-'90s, the Unix world could see some important changes were on the horizon: shared libraries (with ELF) and 64b being the two obvious things, and in a rare display of unity, adopted both ELF and LP64. Both MIPS64 and SPARC64 have useful 32b operations, so it made sense.

The only real outlier is Windows. Microsoft allegedly had a lot of code around which assumed sizeof(long) == 4, so long long became a thing. There were also some slightly unusual architectures which did ILP64, but they weren't really mainstream, and IIRC were 64b clean, rather than patches to an already extant ISA. My memory is getting hazy now.

ag123
Posts: 111
Joined: Sun Dec 18, 2016 7:54 am

Re: Why moving to 64bit?

Thu May 28, 2020 9:46 am

jahboater wrote:
Thu May 28, 2020 8:59 am
ag123 wrote:
Thu May 28, 2020 8:33 am
32 bits tend to be more efficient on memory (the size of int is 32 bits, while it is 64 bits on 64 bits) and runs faster in 32 bits partly as memory transfers could be as much as halved so more data fit in cache as well.
No, "int" remains the same size at 32 bits
Linux (Raspbian) uses the LP64 memory model which on AArch64 says Longs and Pointers are 64-bit, integers remain 32-bit.
(I guess on some other platforms like SPARC, integers could be 64-bits).
In 32-bit mode, the model is ILP32 meaning Integers, Longs and Pointers are all 32 bit.
thanks indeed, i just did some tests

Code: Select all

#include <stdio.h>

int main(void) {
  /* sizeof yields a size_t, so print it with %zu rather than %d */
  printf("size of int %zu\n", sizeof(int));
  return 0;
}
i used a 64 bits cross compiler as that in Raspbian Buster is 32 bits
https://www.linaro.org/downloads/

Code: Select all

gcc-linaro-7.5.0-2019.12-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc --static -o inttest inttest.c
then run it on Pi4

Code: Select all

size of int 4
if i changed sizeof(int) to sizeof(long) that is 8 bytes, that's correct. int remains at 32 bits.

Heater
Posts: 16845
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why moving to 64bit?

Thu May 28, 2020 11:06 am

I did some tests as well just now.

I have some code that demonstrates cache friendly and cache unfriendly ways of adding the transpose of a huge 2D array to another huge array of the same size (https://github.com/ZiCog/loop_blocking). The arrays in my test are 8192 by 8192, for a total of about 67 million elements each.

I tweaked it to use 32 or 64 bit integer array elements and compiled and ran it on the Pi 4 in 32 bit mode and 64 bit mode under nspawn64.

Here are the results:

Code: Select all

loop_blocking execution times:

       |   32 bit ints   |    64 bits  
--------------------------------------       
Pi 32  |   844ms         |    1216ms
Pi 64  |   858ms         |    1198ms


loop_blocking code size:

       |   32 bit ints   |    64 bits  
--------------------------------------       
Pi 32  |   2715          |    3763
Pi 64  |   3442          |    3450    
Given the accuracy of my timing and the fact that I only ran each test once I conclude:

1) Execution time of 32 or 64 bit instruction sets is the same.

2) Obviously changing the data size impacts performance, the same in both cases mind.

3) The size of the executables in memory is the same as far as anyone might care.

I conclude that there is no downside in terms of performance or memory space with 64 bit over 32 bit and that anyone making such claims has not done any measurements and hence knows not of what they speak.

Of course modern operating systems and applications do seem to be suffering from massive code bloat and inefficiency but I think that can be put down to many other factors. Interpreted languages, Java, Unicode and internationalization support, graphical "bling", generally bigger images and video and other data, lazy software engineers, etc, etc.

Note: The timings above are for the cache friendly versions of the code. Performing the same operations in the naive cache unfriendly way is many times slower. Code is in the repo above if anyone wants to play with it.

PeterO
Posts: 5968
Joined: Sun Jul 22, 2012 4:14 pm

Re: Why moving to 64bit?

Thu May 28, 2020 11:25 am

ejolson wrote:
Thu May 28, 2020 7:55 am
Sorry for resurrecting this thread, but do you have an 8GB Pi on your desk now?
YES ! 8-) Downloading 64 bit OS now :-)

PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

plugwash
Forum Moderator
Posts: 3695
Joined: Wed Dec 28, 2011 11:45 pm

Re: Why moving to 64bit?

Thu May 28, 2020 2:06 pm

Heater wrote:
Thu May 28, 2020 11:06 am
I conclude that there is no downside in terms of performance or memory space with 64 bit over 32 bit and that anyone making such claims has not done any measurements and hence knows not of what they speak.
I conclude that you are taking one example and extrapolating too much. Your code deals with a lot of numbers and only a handful of pointers, so the memory and cache penalty from going 64-bit is negligible.

That does not mean that the memory and cache penalty of 64-bit pointers is always negligible.

Heater
Posts: 16845
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why moving to 64bit?

Thu May 28, 2020 4:13 pm

plugwash wrote:
Thu May 28, 2020 2:06 pm
Heater wrote:
Thu May 28, 2020 11:06 am
I conclude that there is no downside in terms of performance or memory space with 64 bit over 32 bit and that anyone making such claims has not done any measurements and hence knows not of what they speak.
I conclude that you are taking one example and extrapolating too much. Your code deals with a lot of numbers and only a handful of pointers, so the memory and cache penalty from going 64-bit is negligible.

That does not mean that the memory and cache penalty of 64-bit pointers is always negligible.
Yes of course. Such micro benchmarks should be taken with a pinch of salt. I look forward to seeing counter examples from those who suggest 64 bits hinders performance. What matters at the end of the day though is how one's actual application performs.

However I have to say that the whole point of that code is about cache pressure. Data cache at least. Use your cache unwisely and you soon have a far bigger performance hit than going 64 bit.

jahboater
Posts: 6286
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: Why moving to 64bit?

Sat May 30, 2020 12:16 pm

Heater wrote:
Thu May 28, 2020 4:13 pm
However I have to say that the whole point of that code is about cache pressure. Data cache at least. Use your cache unwisely and you soon have a far bigger performance hit than going 64 bit.
I thought the costs of paging stuff in and out all the time for large programs, or large total memory requirements, under LPAE was the worst problem for 32-bits ? All gone in 64-bit mode of course.

The 64-bit instruction set A64 has been re-designed and streamlined for modern hardware which, now we are using out-of-order CPUs (the A72), may be significant. The slowest instructions have been removed.
Pi4 8GB running PIOS64 Lite
