User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 8:39 pm

Heater wrote:
Thu Nov 15, 2018 8:09 pm
I could not find any way to get gcc to generate AIF files. If you know the AIF format then perhaps it's possible to use GCC's objdump utility to dump the executable format in various textual formats and write a converter from one of those to AIF.

What is that gcc option you have there "-mlibscl"? I cannot find any reference to that option in any GCC documentation.

I was curious as to what DDE is. I can't find any reference to that on the web either.
It is in the help files that come with gcc for RISC OS. The option, I think, is only available in the RISC OS version of gcc, so if you are trying to use it on a different platform it is unlikely to produce an AIF. For the moment see the PRMs for the AIF header (I do not have a quick reference on hand).
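
To illustrate (just a sketch from memory; the file names are made up and the exact behaviour is described in the GCC for RISC OS help files):

Code: Select all

# Native RISC OS GCC, linked against the SharedCLibrary; as I understand it
# this is what gives you an AIF binary.
gcc -mlibscl -O2 -o myprog main.c

# The default build links against UnixLib instead and produces an ELF that
# needs the UnixLib loader.
gcc -O2 -o myprog main.c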

The DDE is the Desktop Development Environment on RISC OS. It is the native C compiler, Linker, Make utility, Assembler, BASIC Compiler, Resource Editor, etc., all distributed together. It is a commercial product.
For info on DDE you can see these:
https://www.riscosopen.org/content/sales/dde
https://en.wikipedia.org/wiki/Acorn_C/C%2B%2B
There are many many more.
Aside:
So RISC V is behind the curve in that regard; it would take 35 years for RISC V to catch up, if RISC V becomes popular enough to ever catch up.
More like 35 months, and it's mostly done already. GCC and Clang now target RISC V. RISC V is officially supported by the Linux kernel. Almost all of Debian, Red Hat, and other Linux distributions have RISC V builds; that's tens of thousands of packages that we know and love.

There are some important things still missing, like a JVM and modern JavaScript engines for browsers.

There is a paradox here: whilst the instruction set is the most important interface in all of computer science, it is not expected that people start writing lots of code in RISC V assembler.
I am talking about catching up as in having as much software written for it specifically (not just ported) as the ARM does.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

Heater
Posts: 12149
Joined: Tue Jul 17, 2012 3:02 pm

Re: Divide By Depth??

Thu Nov 15, 2018 9:06 pm

DavidS.

Is that GCC for RISC OS some customized, non-mainline fork of GCC? If so I guess you will have to ask whoever its developers are.

If GCC supported AIF output I'd expect to be able to build a GCC cross-compiler that targeted RISC OS on my Intel PC. The documentation would be in the usual GCC documentation. I can't find it.
I am talking about catching up as in having as much software written for it specifically (not just ported) as the ARM does.
I see.

In that case RISC V will never catch up. Which is a good thing. Nobody writes code targeting specific instruction set architectures today. Except code generators in compilers and JITs of course, and code in the lowest levels of operating systems.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 10:00 pm

Heater wrote:
Thu Nov 15, 2018 9:06 pm
DavidS.

Is that GCC for RISC OS some customized, non-mainline fork of GCC? If so I guess you will have to ask whoever its developers are.

If GCC supported AIF output I'd expect to be able to build a GCC cross-compiler that targeted RISC OS on my Intel PC. The documentation would be in the usual GCC documentation. I can't find it.
Yes you can cross compile for RISC OS, using the RISC OS GCCSDK, see:
https://www.riscos.info/index.php/GCCSDK

And for the RISC OS version see:
https://www.riscos.info/index.php/GCC_for_RISC_OS
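
If memory serves, the GCCSDK cross tools use a riscos target triplet; roughly something like this from a Linux box (the install path and tool name here are illustrative, check the GCCSDK page for the real ones):

Code: Select all

# Cross-compiling for RISC OS from Linux using the GCCSDK tools
# (path and triplet are illustrative; see the GCCSDK documentation)
export PATH=$PATH:/path/to/gccsdk/cross/bin
arm-unknown-riscos-gcc -O2 -o myprog main.c
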
I am talking about catching up as in having as much software written for it specifically (not just ported) as the ARM does.
I see.

In that case RISC V will never catch up. Which is a good thing. Nobody writes code targeting specific instruction set architectures today. Except code generators in compilers and JITs of course, and code in the lowest levels of operating systems.
It is a good thing that that is not true (we would not have your ZiCog if it were). Even you do it.

It is the software developed for specific targets that truly sets them apart from the rest. The stuff that is portable to anything is just the same old same old.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

Re: Divide By Depth??

Thu Nov 15, 2018 10:35 pm

DavidS wrote:
Thu Nov 15, 2018 10:00 pm
It is the software developed for specific targets that truly sets them apart from the rest. The stuff that is portable to anything is just the same old same old.
Sorry David, I agree with much of what you say, but not this.

Writing a useful program and restricting its use to just one single version of one single architecture, on one single OS platform, is just crazy!
Apart from the fact that very few people can use your program as is, it almost immediately becomes obsolete as the hardware moves on.
(that's what happens when you write in assembler - it's far too costly to rewrite).

Write the same program in a portable language, and you immediately have a wide audience and a future.

UNIX was initially written in assembler. Early on, the clever engineers who wrote it realized the stupidity of that and re-wrote it from scratch in C.
(at the same time making it a full multi-user OS).

That enlightened rewrite ensured the success of UNIX.
I was around in the early days, and there were constant stories of UNIX ports being done in two weeks - unheard of in those days. So UNIX spread like wildfire and became successful.

The hardware for supercomputers tends to change often, because top performance is more important than backwards compatibility. Because of the frequent changes, they stopped using their proprietary OSes and changed to UNIX because it was easy to port!
Now all the top 500 supercomputers run Linux.

If a program is really good and useful, make sure that people can use it on all platforms, both now and in the future.

The magic of C is that (with a recent compiler) you can write large portable programs that will run faster than anything a human could realistically write in assembler.
Last edited by jahboater on Thu Nov 15, 2018 10:44 pm, edited 1 time in total.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 10:44 pm

jahboater wrote: Sorry David, I agree with much of what you say, but not this.
While I do like assembly, that is definitely not what I am saying in this case.

It is the code that sets things apart, code that uses a specific system's features, that makes any good system stand out. Whether it is written in C, BCPL, Pascal, BASIC, or even assembly language matters not. It is making use of what the target provides that other targets do not that sets the targets apart.

If this were not the case we would all still be running the early versions of MS-BASIC on our personal computers as the primary user interface, tunnelling down to a single-tasking, super-minimal OS.

So why do you choose the OS that you choose? Why do you choose the CPU that you use? Something tells me it is not because of what is the same as everything else.

And when such things are ported they become something different; they are not quite the same, in some way that makes using them different.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

Heater
Posts: 12149
Joined: Tue Jul 17, 2012 3:02 pm

Re: Divide By Depth??

Thu Nov 15, 2018 10:55 pm

DavidS.

So what is the problem with creating AIF files then? Why not just use that cross-compiler? What am I missing here?

That is an interesting point about my creating ZiCog. I don't think it invalidates my arguments though. ZiCog was never intended to be useful. It was just my fun hobby project. A PASM learning exercise. A challenge to get an 8080 (originally) emulator to fit into the 500-odd instruction space of a Propeller COG. No more significant than if I had spent my time on sudoku.

It has been suggested that such an emulator could be done on an ARM MCU. That is certainly true. And more easily as well. Had I wanted to do that I would have written it in C. That would have been easier, performed well enough and ensured it could be made to work on other MCUs. MIPS for example. More usable by more people. I would not bother doing that though, there is no challenge and there are already many such emulators in C out there.

Cross-platform software might be "same old same old". My claim is that being cross-platform makes it a lot more useful to a lot more people than if it were machine specific. It also ensures that it remains useful for a lot longer. In support of my claim I point to all the thousands of packages available for the Pi in Raspbian.

Ergo, striving to create portable software is a great thing. Anything else is a dead end.

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

Re: Divide By Depth??

Thu Nov 15, 2018 10:58 pm

DavidS wrote:
Thu Nov 15, 2018 10:44 pm
It is making use of what the target provides that other targets do not that sets the targets apart.
OK, I see what you mean by that. You can take advantage of specific nice features that a particular target has and make a better program.
Which is a good thing. The bad thing is that your program will only ever run on that one target, which restricts the number of users who can use it and, I posit, will cause its demise in the long run.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 11:01 pm

jahboater wrote:
Thu Nov 15, 2018 10:58 pm
DavidS wrote:
Thu Nov 15, 2018 10:44 pm
It is making use of what the target provides that other targets do not that sets the targets apart.
OK, I see what you mean by that. You can take advantage of specific nice features that a particular target has and make a better program.
Which is a good thing. The bad thing is that your program will only ever run on that one target, which restricts the number of users who can use it and, I posit, will cause its demise in the long run.
I will agree with that completely. There are some things that need a bit more portability, if their usefulness calls for it.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 11:09 pm

Heater wrote:
Thu Nov 15, 2018 10:55 pm
DavidS.

So what is the problem with creating AIF files then? Why not just use that cross-compiler? What am I missing here?
No problem at all. I was originally only pointing out that if you use gcc with the normal options on RISC OS it will produce an ELF, which requires the UnixLib loader to load it, and can only use the UnixLib dynamic ELF format libraries, which are not native to RISC OS, and thus you take a performance hit for doing so.
That is an interesting point about my creating ZiCog. I don't think it invalidates my arguments though. ZiCog was never intended to be useful. It was just my fun hobby project. A PASM learning exercise. A challenge to get an 8080 (originally) emulator to fit into the 500-odd instruction space of a Propeller COG. No more significant than if I had spent my time on sudoku.

It has been suggested that such an emulator could be done on an ARM MCU. That is certainly true. And more easily as well. Had I wanted to do that I would have written it in C. That would have been easier, performed well enough and ensured it could be made to work on other MCUs. MIPS for example. More usable by more people. I would not bother doing that though, there is no challenge and there are already many such emulators in C out there.
While you may have been simply toying around, what you made is quite useful. And many things are made that way that turn out to be useful.

I am currently toying around with a 3D rendering engine that I do not intend to be truly useful. It is just going to become part of a little toy STL slicer that produces G-code for FDM 3D printers, and also part of a simple script-based CAD, both of which are just toys not intended to be truly useful to anyone.

Despite my toys being of little use in my eyes, I am writing them ultimately in C (testing in BBC BASIC V), minimizing the amount of OS-specific code used, and avoiding HW-specific code completely. Even though I could do some neat extras if I did not worry about portability, these kinds of toys are something that someone may wish to port someday (kind of like an MCU-based 8-bit CPU emulator).
Cross-platform software might be "same old same old". My claim is that being cross-platform makes it a lot more useful to a lot more people than if it were machine specific. It also ensures that it remains useful for a lot longer. In support of my claim I point to all the thousands of packages available for the Pi in Raspbian.

Ergo, striving to create portable software is a great thing. Anything else is a dead end.
Though as you say there are cases where it is appropriate to target one platform. You gave the logic for it in your own argument.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 11:12 pm

DavidS wrote:
Thu Nov 15, 2018 11:01 pm
jahboater wrote:
Thu Nov 15, 2018 10:58 pm
DavidS wrote:
Thu Nov 15, 2018 10:44 pm
It is making use of what the target provides that other targets do not that sets the targets apart.
OK, I see what you mean by that. You can take advantage of specific nice features that a particular target has and make a better program.
Which is a good thing. The bad thing is that your program will only ever run on that one target, which restricts the number of users who can use it and, I posit, will cause its demise in the long run.
I will agree with that completely. There are some things that need a bit more portability, if their usefulness calls for it.

That said, if the usefulness does not call for portability, and you are taking a performance hit in making something portable, well then, why bother?

I am currently working on a few toys that it is unlikely anyone else will find interesting, though I am still making sure that they end up being in C, and that the amount of OS-specific code is kept down to a minimum (and well isolated) to improve potential portability. In doing so I am not able to do a couple of things that could take extra advantage of RISC OS and the Raspberry Pi HW that would be nice to have in the programs I am writing.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

Heater
Posts: 12149
Joined: Tue Jul 17, 2012 3:02 pm

Re: Divide By Depth??

Thu Nov 15, 2018 11:14 pm

DavidS.
If this were not the case we would all still be running the early versions of MS-BASIC on our personal computers as the primary user interface, tunnelling down to a single-tasking, super-minimal OS.
No. People concerned about cross-platform software would not be using MS-BASIC or Visual Basic etc. That was Windows-specific.
So why do you choose the OS that you choose? Why do you choose the CPU that you use? Something tells me it is not because of what is the same as everything else.
Because the OS we choose runs on Intel PCs, the Raspberry Pi (and other) ARM machines, MIPS machines, PowerPC embedded systems, now RISC V, and many others.
And when such things are ported they become something different; they are not quite the same, in some way that makes using them different.
Hmmm.... Recently I have been using a Win 10 PC a lot. On that PC I have a lot of the same applications I use on Linux: Chrome and Firefox browsers, IntelliJ IDEA, VS Code, Atom, Sublime, Quartus, Gimp, Inkscape, LibreOffice, Etcher, etc... Not to mention command line things like node.js, Python, Scala, and so on.

Guess what? They are so much the same that sometimes I forget this is not a Linux machine.

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

Re: Divide By Depth??

Thu Nov 15, 2018 11:19 pm

DavidS wrote:
Thu Nov 15, 2018 11:12 pm
That said, if the usefulness does not call for portability, and you are taking a performance hit in making something portable, well then, why bother?
Talking about programming languages, portability and performance:-

The C math library (-lm) has always been partially written in assembler.
In the "sysdeps" tree there are assembler routines for each of the supported platforms.

Recently the developers rewrote much of the assembler stuff in C - to improve the performance!

C compilers have come of age; the best human-written assembler cannot compete ... (*)

Obviously there were the additional benefits of easy maintenance and portability.

(*) obviously there are special cases (SIMD is one), mainly where the human coder has special knowledge of the data that the compiler cannot know. Even that's changing; the data-flow analysis modern compilers do is so clever.
Last edited by jahboater on Thu Nov 15, 2018 11:25 pm, edited 1 time in total.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 11:24 pm

Heater wrote:
Thu Nov 15, 2018 11:14 pm
DavidS.
If this were not the case we would all still be running the early versions of MS-BASIC on our personal computers as the primary user interface, tunnelling down to a single-tasking, super-minimal OS.
No. People concerned about cross-platform software would not be using MS-BASIC or Visual Basic etc. That was Windows-specific.
What????
Not VB, NO NO.

I said MS-BASIC. You know, the BASIC interpreter that was in the ROM of just about every early computer; it was the BASIC of the Apple II, many Z80 systems, the Commodore 8-bit machines, etc. It was the user interface of these machines.

What I said is that if people did not want to push the limits of the target systems we would still be using that, which is where platform-specific code comes from.
So why do you choose the OS that you choose? Why do you choose the CPU that you use? Something tells me it is not because of what is the same as everything else.
Because the OS we choose runs on Intel PCs, the Raspberry Pi (and other) ARM machines, MIPS machines, PowerPC embedded systems, now RISC V, and many others.
And when such things are ported they become something different; they are not quite the same, in some way that makes using them different.
Hmmm.... Recently I have been using a Win 10 PC a lot. On that PC I have a lot of the same applications I use on Linux: Chrome and Firefox browsers, IntelliJ IDEA, VS Code, Atom, Sublime, Quartus, Gimp, Inkscape, LibreOffice, Etcher, etc... Not to mention command line things like node.js, Python, Scala, and so on.

Guess what? They are so much the same that sometimes I forget this is not a Linux machine.
You are talking about two platforms that try very hard to copy each other in their UI. And as such none of what is being ported took advantage of features specific to either platform, which is the case I was talking about, and said I was talking about.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Thu Nov 15, 2018 11:32 pm

jahboater wrote:
Thu Nov 15, 2018 11:19 pm
DavidS wrote:
Thu Nov 15, 2018 11:12 pm
That said, if the usefulness does not call for portability, and you are taking a performance hit in making something portable, well then, why bother?
Talking about programming languages, portability and performance:-

The C math library (-lm) has always been partially written in assembler.
In the "sysdeps" tree there are assembler routines for each of the supported platforms.

Recently the developers rewrote much of the assembler stuff in C - to improve the performance!

C compilers have come of age; the best human-written assembler cannot compete ... (*)

Obviously there were the additional benefits of easy maintenance and portability.
Yes, that is a truth for sure. The ease of maintenance that comes from assembly sacrifices some portability. This is one of the unfortunate truths.
(*) obviously there are special cases (SIMD is one), mainly where the human coder has special knowledge of the data that the compiler cannot know. Even that's changing; the data-flow analysis modern compilers do is so clever.
Remember that there is a fairly big price we are paying for compilers to be capable of doing that level of code analysis, as it can get rather complex. The code analysis is eating RAM, CPU time, and disk space. This is why a simple compiler like TCC is so much faster, smaller, and more efficient in its own use of resources: it does not attempt to do any significant analysis.

The tradition used to be to let the compiler do the simple optimizations, and hand-optimize the speed-critical sections of the compiler's output. At least when dealing with portable code. This is still a better model to me, as it often means spending 20 minutes optimizing the critical sections by hand to save a couple of hours of compile time.

And the hand optimizations only need to be done once (unless the code of the critical section changes), not on every compile.
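
As a rough sketch of that workflow (the file names are made up; the point is that the hand-tuned assembler only gets regenerated when the hot code actually changes):

Code: Select all

# One-off: let the compiler generate assembler for the hot module
gcc -O2 -S hotloop.c -o hotloop.s

# ... hand-optimize the critical loop in hotloop.s ...

# Every later build just assembles the tuned file and links it in,
# so the expensive optimization pass is not repeated each compile.
gcc -c hotloop.s -o hotloop.o
gcc main.c other.c hotloop.o -o myprog
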
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

Heater
Posts: 12149
Joined: Tue Jul 17, 2012 3:02 pm

Re: Divide By Depth??

Fri Nov 16, 2018 3:06 am

DavidS.
And as such none of what is being ported took advantage of features specific to either platform...
That is a fair point.

I imagine that in order to be cross-platform they have to work at some lowest-common-denominator level in terms of GUI features.

Personally I am very happy that that lowest common denominator is high enough that all those applications behave exactly the same on all the platforms they run on. That means I don't have any weirdness, or even differences, to get used to when moving from place to place. Everything is the same, familiar and comfortable.
The ease of maintenance that comes from Assembly...
Sometimes I start to believe that you are joking with us. Nobody could possibly claim, with a straight face, that writing code in assembler increases the ease of maintenance. It does not. Having worked for too many years in assembler, on various machines, I know this to be true.
...sacrifices some portability.
Similarly no one can claim that assembler is more portable. It is not.
The tradition used to be to let the compiler do the simple optimizations, and hand-optimize the speed-critical sections of the compiler's output.
Perhaps it was.

Three decades ago I had such notions on a project, for its performance bottleneck. Not even in C, this project was in PL/M-86. I found I could get a huge speed boost by rearranging the algorithm's PL/M source a bit and unrolling a loop. After that I looked into the assembler output to see what further gains could be had. Turned out to be only a few percent. We left it as PL/M source, as we had achieved the performance target and having readable, maintainable source was a better idea.

Today it's very hard to outdo the compiler even with those source tweaks. It knows many more tricks to optimize code than you do. Also, optimizations that work well on one machine may deoptimize things when moved to another machine. This is true even when moving code between different generations of the same architecture. New processor revisions with different caches, pipelines, branch predictors, execution units, and instructions may turn your old optimizations into deoptimizations.

At the end of the day the last thing you want in a large code base is different source code for every architecture you want to run on and every variant of processor within each architecture. That would be a huge waste of man power and a maintenance nightmare.

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

Re: Divide By Depth??

Fri Nov 16, 2018 8:22 am

Heater wrote:
Fri Nov 16, 2018 3:06 am
Today it's very hard to outdo the compiler even with those source tweaks. It knows many more tricks to optimize code than you do. Also, optimizations that work well on one machine may deoptimize things when moved to another machine. This is true even when moving code between different generations of the same architecture. New processor revisions with different caches, pipelines, branch predictors, execution units, and instructions may turn your old optimizations into deoptimizations.
Exactly.

Take ARM CPUs for example: a clever assembler programmer will schedule the instruction sequence to avoid delays from dependencies.
It's very hard and time-consuming to get it right.
Then comes the next model where the hardware does all that for you (out-of-order execution).
Then the next model where they are starting to fuse instructions together (so placing unrelated instructions in between is a disaster).
And so on.

Example:
You might think that "movt" depends on the prior "movw" and carefully put another instruction in between so the "movw" has time to complete.
The latest ARM CPUs will fuse adjacent movw/movt instructions into a single operation (so it loads a full 32-bit immediate with one instruction).
The instruction you placed in between foils that. So you have to rewrite ...

The compiler understands all this and will do the right thing for each ARM CPU version (nowadays without even having to tell it the CPU model (-mcpu=native)).

On my Pi3+ the compiler emits about 3000 lines of assembler per second (with optimization) - much much faster than the human assembler programmer.
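
If anyone wants to see that movw/movt pair for themselves, here is a quick illustrative test on a 32-bit ARM compiler (the file name and CPU choice are just examples):

Code: Select all

# Tiny test case: return a full 32-bit constant
cat > imm.c << 'EOF'
unsigned int get_magic(void)
{
    return 0xDEADBEEF;   /* too wide for a single mov immediate */
}
EOF

# Ask for a core that has movw/movt (ARMv7 and later); older targets
# would use a literal-pool load instead.
gcc -O2 -mcpu=cortex-a7 -S imm.c -o imm.s
grep -E 'movw|movt' imm.s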

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

Re: Divide By Depth??

Fri Nov 16, 2018 8:35 am

DavidS wrote:
Thu Nov 15, 2018 11:32 pm
The tradition used to be to let the compiler do the simple optimizations, and hand-optimize the speed-critical sections of the compiler's output. At least when dealing with portable code. This is still a better model to me, as it often means spending 20 minutes optimizing the critical sections by hand to save a couple of hours of compile time.
No longer needed.
Look at the assembler output from a recent version of GCC - if you find a single unnecessary or misplaced instruction, it's very surprising. If you do find an extra instruction, look harder (and over a wider area) - it almost always means you have not understood what the compiler is doing. It may well have deduced your algorithm and improved it. If you are convinced that the instruction is not needed - raise a bug.

Only worth doing of course if you have the latest compiler version (GCC 8.2 at the time of writing). Another advantage of recent versions, if you are looking at the assembler produced, is that (from GCC 7) it can intersperse the source code, so you can easily find things.
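
For example, two common ways to get the source mixed in with the assembler (which of these interleaves source for you depends on your GCC and binutils versions):

Code: Select all

# Assembler output with source lines and notes interleaved as comments
gcc -O3 -g -S -fverbose-asm myfile.c -o myfile.s

# Or build with debug info and let objdump interleave the source
gcc -O3 -g -c myfile.c -o myfile.o
objdump -dS myfile.o | less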

If you have Raspbian, this script will download, build, and install the latest version of GCC.

Code: Select all

#!/bin/bash

#
#  This is the new GCC version to install.
#
VERSION=8.2.0

#
#  For the Pi or any computer with less than 2GB of memory.
#
if [ -f /etc/dphys-swapfile ]; then
  sudo sed -i 's/^CONF_SWAPSIZE=[0-9]*$/CONF_SWAPSIZE=1024/' /etc/dphys-swapfile
  sudo /etc/init.d/dphys-swapfile restart
fi

if [ -d gcc-$VERSION ]; then
  cd gcc-$VERSION
  rm -rf obj
else
  wget ftp://ftp.fu-berlin.de/unix/languages/gcc/releases/gcc-$VERSION/gcc-$VERSION.tar.xz
  tar xf gcc-$VERSION.tar.xz
  cd gcc-$VERSION
  contrib/download_prerequisites
fi
mkdir -p obj
cd obj

#
#  Now run the ./configure which must be checked/edited beforehand.
#  Uncomment the sections below depending on your platform.  You may build
#  on a Pi3 for a target Pi Zero by uncommenting the Pi Zero section.
#  To alter the target directory set --prefix=<dir>
#

# Pi3+, Pi3, and new Pi2
../configure --enable-languages=c,c++ --with-cpu=cortex-a53 \
  --with-fpu=neon-fp-armv8 --with-float=hard --build=arm-linux-gnueabihf \
  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

# Pi Zero's
#../configure --enable-languages=c,c++ --with-cpu=arm1176jzf-s \
#  --with-fpu=vfp --with-float=hard --build=arm-linux-gnueabihf \
#  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

# x86_64
#../configure --disable-multilib --enable-languages=c,c++ --enable-checking=no

# Odroid-C2 AArch64
#../configure --enable-languages=c,c++ --with-cpu=cortex-a53 --enable-checking=no

# Old Pi2
#../configure --enable-languages=c,c++ --with-cpu=cortex-a7 \
#  --with-fpu=neon-vfpv4 --with-float=hard --build=arm-linux-gnueabihf \
#  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

#
#  Now build GCC which will take a long time.  This could range from
#  4.5 hours on a Pi3B+ up to about 50 hours on a Pi Zero.  It can be
#  left to complete overnight (or over the weekend for a Pi Zero :-)
#  The most likely causes of failure are lack of disk space, lack of
#  swap space or memory, or the wrong configure section uncommented.
#  The new compiler is placed in /usr/local/bin, the existing compiler remains
#  in /usr/bin and may be used by giving its version gcc-6 (say).
#
if make -j `nproc`
then
  echo
  read -p "Do you wish to install the new GCC (y/n)? " yn
  case $yn in
   [Yy]* ) sudo make install ;;
     * ) exit ;;
  esac
fi

#
# An alternative way of adding swap
#
#sudo dd if=/dev/zero of=/swapfile1GB bs=1G count=1
#sudo chmod 0600 /swapfile1GB
#sudo mkswap /swapfile1GB
#sudo swapon /swapfile1GB



User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Fri Nov 16, 2018 10:11 am

Heater wrote:
Fri Nov 16, 2018 3:06 am
DavidS.
And as such none of what is being ported took advantage of features specific to either platform...
That is a fair point.

I imagine that in order to be cross-platform they have to work at some lowest-common-denominator level in terms of GUI features.

Personally I am very happy that that lowest common denominator is high enough that all those applications behave exactly the same on all the platforms they run on. That means I don't have any weirdness, or even differences, to get used to when moving from place to place. Everything is the same, familiar and comfortable.
The ease of maintenance that comes from Assembly...
Sometimes I start to believe that you are joking with us. Nobody could possibly claim, with a straight face, that writing code in assembler increases the ease of maintenance. It does not. Having worked for too many years in assembler, on various machines, I know this to be true.
Kind of a joke of sorts, I guess. I was pointing out how messy a lot of high-level language code is nowadays. I see unintelligible C presented as good examples every day, and worse C code in production-level products. Sometimes it looks like the author of sections of many projects is bidding to win the Obfuscated C Contest, and is very good at the obfuscation part.

There are multiple sides to that coin. Yes, well-written code in a high-level language will be more maintainable, though code written by enough people in C can become much more of a headache to maintain than the same written in well-thought-out assembly.

Though as long as the code is clean and has a minimal number of contributors, yes, high-level languages are much easier to maintain. Or if the assembly language author is not very good about keeping their code modular, high-level languages are much easier to maintain.
...sacrifices some portability.
Similarly no one can claim that assembler is more portable. It is not.
Which is why I said that using assembly sacrifices some portability. At best it ties you to a single CPU.
The tradition used to be to let the compiler do the simple optimizations, and hand-optimize the speed-critical sections of the compiler's output.
Perhaps it was.

Three decades ago I had such notions on a project, for its performance bottleneck. Not even in C, this project was in PL/M-86. I found I could get a huge speed boost by rearranging the algorithm's PL/M source a bit and unrolling a loop. After that I looked into the assembler output to see what further gains could be had. Turned out to be only a few percent. We left it as PL/M source, as we had achieved the performance target and having readable, maintainable source was a better idea.
Cool. Though a few percent can make a huge difference in code that is run hundreds of thousands of times per second; a fraction of a percent can make a huge difference at that.
Today it's very hard to outdo the compiler even with those source tweaks. It knows many more tricks to optimize code than you do. Also, optimizations that work well on one machine may deoptimize things when moved to another machine. This is true even when moving code between different generations of the same architecture. New processor revisions with different caches, pipelines, branch predictors, execution units, and instructions may turn your old optimizations into deoptimizations.
There is no question that modern compilers are capable of extreme optimizations, though at a cost which is often way too high. It makes sense to allow the compiler to do the optimizations if the project is small enough that it does not take very long to compile with the optimizations turned on.

On the other hand, when a project gets to the point where it takes more than half an hour to compile with the needed level of optimization (which, as you know, does not need to be very big), then it no longer makes sense to have the compiler do so much optimization (and it is time to fall back to tcc).
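
The kind of split I mean, roughly (assuming tcc is installed; the file names are only an example):

Code: Select all

# Day-to-day edit/compile/test cycle: tcc barely optimizes but compiles
# almost instantly
tcc -o myprog-dev main.c other.c

# Occasional optimized build, when it is worth waiting for the compiler
gcc -O3 -o myprog main.c other.c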

Remember that the time-critical sections are usually only a few small sections of the code, and if you know exactly where these are it is fairly easy to optimize them by hand in a fairly small period of time (under half an hour in 99% of cases), as it is often less than 1% of the code that needs optimizing. And once you have optimized the output from the compiler by hand it need not be redone until there is a change in the code (unless there was a bug introduced by the hand optimization, which is rare though it does happen).

So it is still worth the time to hand-optimize for a given target, as it saves many hours of wasted time waiting on a compiler.
At the end of the day the last thing you want in a large code base is different source code for every architecture you want to run on and every variant of processor within each architecture. That would be a huge waste of man power and a maintenance nightmare.
I agree with that completely. To the limit of it being worth the time. Wasting time is not worth it either.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

jamesh
Raspberry Pi Engineer & Forum Moderator
Raspberry Pi Engineer & Forum Moderator
Posts: 22064
Joined: Sat Jul 30, 2011 7:41 pm

Re: Divide By Depth??

Fri Nov 16, 2018 10:18 am

DavidS wrote:
Thu Nov 15, 2018 11:32 pm
Remember that there is a fairly big price we are paying for compilers to be capable of doing that level of code analysis, as it can get rather complex. The code analysis is eating RAM, CPU time, and disk space. This is why a simple compiler like TCC is so much faster, smaller, and more efficient in its own use of resources: it does not attempt to do any significant analysis.
Which is why people have build machines and build servers. You are slowing down the compile process in order to improve the final result, which is the important bit. You only need to buy a big build rig once, but the improvements to the final output are seen by ALL the machines running the resulting code.

Almost anything you can push into the compile process from the development and execution processes is worth doing.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
"My grief counseller just died, luckily, he was so good, I didn't care."

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Fri Nov 16, 2018 10:26 am

jahboater wrote:
Fri Nov 16, 2018 8:35 am
DavidS wrote:
Thu Nov 15, 2018 11:32 pm
The tradition used to be to let the compiler do the simple optimizations, and hand-optimize the speed-critical sections of the compiler's output. At least when dealing with portable code. This is still a better model to me, as it often means spending 20 minutes optimizing the critical sections by hand to save a couple of hours of compile time.
No longer needed.
Look at the assembler output from a recent version of GCC - if you find a single unnecessary or misplaced instruction, it's very surprising. If you do find an extra instruction, look harder (and over a wider area) - it almost always means you have not understood what the compiler is doing. It may well have deduced your algorithm and improved it. If you are convinced that the instruction is not needed - raise a bug.
I have looked at the output. Reaching that level of optimization often requires a long compile time, long enough that it is a waste of time to do it for every recompile (even if you do not clean).
Only worth doing of course if you have the latest compiler version (GCC 8.2 at the time of writing). Another advantage of recent versions, if you are looking at the assembler produced, is that (from GCC 7) it can intersperse the source code, so you can easily find things.

If you have Raspbian, this script will download, build, and install the latest version of GCC.

Code: Select all

#!/bin/bash

#
#  This is the new GCC version to install.
#
VERSION=8.2.0

#
#  For the Pi or any computer with less than 2GB of memory.
#
if [ -f /etc/dphys-swapfile ]; then
  sudo sed -i 's/^CONF_SWAPSIZE=[0-9]*$/CONF_SWAPSIZE=1024/' /etc/dphys-swapfile
  sudo /etc/init.d/dphys-swapfile restart
fi

if [ -d gcc-$VERSION ]; then
  cd gcc-$VERSION
  rm -rf obj
else
  wget ftp://ftp.fu-berlin.de/unix/languages/gcc/releases/gcc-$VERSION/gcc-$VERSION.tar.xz
  tar xf gcc-$VERSION.tar.xz
  cd gcc-$VERSION
  contrib/download_prerequisites
fi
mkdir -p obj
cd obj

#
#  Now run the ./configure which must be checked/edited beforehand.
#  Uncomment the sections below depending on your platform.  You may build
#  on a Pi3 for a target Pi Zero by uncommenting the Pi Zero section.
#  To alter the target directory set --prefix=<dir>
#

# Pi3+, Pi3, and new Pi2
../configure --enable-languages=c,c++ --with-cpu=cortex-a53 \
  --with-fpu=neon-fp-armv8 --with-float=hard --build=arm-linux-gnueabihf \
  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

# Pi Zero's
#../configure --enable-languages=c,c++ --with-cpu=arm1176jzf-s \
#  --with-fpu=vfp --with-float=hard --build=arm-linux-gnueabihf \
#  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

# x86_64
#../configure --disable-multilib --enable-languages=c,c++ --enable-checking=no

# Odroid-C2 AArch64
#../configure --enable-languages=c,c++ --with-cpu=cortex-a53 --enable-checking=no

# Old Pi2
#../configure --enable-languages=c,c++ --with-cpu=cortex-a7 \
#  --with-fpu=neon-vfpv4 --with-float=hard --build=arm-linux-gnueabihf \
#  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

#
#  Now build GCC which will take a long time.  This could range from
#  4.5 hours on a Pi3B+ up to about 50 hours on a Pi Zero.  It can be
#  left to complete overnight (or over the weekend for a Pi Zero :-)
#  The most likely causes of failure are lack of disk space, lack of
#  swap space or memory, or the wrong configure section uncommented.
#  The new compiler is placed in /usr/local/bin, the existing compiler remains
#  in /usr/bin and may be used by giving its version gcc-6 (say).
#
if make -j `nproc`
then
  echo
  read -p "Do you wish to install the new GCC (y/n)? " yn
  case $yn in
   [Yy]* ) sudo make install ;;
     * ) exit ;;
  esac
fi

#
# An alternative way of adding swap
#
#sudo dd if=/dev/zero of=/swapfile1GB bs=1G count=1
#sudo chmod 0600 /swapfile1GB
#sudo mkswap /swapfile1GB
#sudo swapon /swapfile1GB


And there is another issue with that level of optimization, the need of a 1GB swapfile just to compile the compiler? Something is very wrong with that picture.

Look at the old newsgroup topics about compiler optimization from the 1990s. You will find that there was always debate about how much optimization is worth having the compiler do, and when it is wasteful for the compiler to implement a given optimization. There was a reasonable agreement reached on this topic, that everyone followed (the debate being whether a compiler optimization could be within what had been agreed as reasonable).

The agreement was that it is worth having the compiler do the optimization so long as it takes less time to compile a large project three times from clean than it takes to hand-optimize two loops of the form used in a line-drawing algorithm that was common at that time, and the compiler never used more than 20% of the available RAM of the system on which it was to run, when the system is in its minimal configuration. It turns out that even someone who just tinkers in assembly could hand-optimize the example loop within half an hour, which means that taking more than 10 minutes to compile the large example project is unacceptable.

As I recall, the example project was a neat open-source game that (as memory serves) had around 200 source files and something like 50,000 (fifty thousand) lines of code, not including comments, blank lines, lines containing only a preprocessor directive, or lines containing only a brace and possibly a comment. The game was feature-rich, as it was written for testing compiler optimization, so it had to have a lot to it.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Fri Nov 16, 2018 10:28 am

jamesh wrote:
Fri Nov 16, 2018 10:18 am
DavidS wrote:
Thu Nov 15, 2018 11:32 pm
Remember that there is a fairly big price we are paying for compilers to be capable of doing that level of code analysis, as it can get rather complex. The code analysis is eating RAM, CPU time, and disk space. This is why a simple compiler like TCC is so much faster, smaller, and more efficient in its own use of resources: it does not attempt to do any significant analysis.
Which is why people have build machines and build servers. You are slowing down the compile process in order to improve the final result, which is the important bit. You only need to buy a big build rig once, but the improvements to the final output are seen by ALL the machines running the resulting code.

Almost anything you can push into the compile process from the development and execution processes is worth doing.
No, you are speeding up the compile process. If you compile with a less-optimizing compiler in situations where the compile time with the better-optimizing compiler takes too long, and you are only hand-optimizing one or two loops (which is usually all that is needed), then you are taking a 2+ hour build and turning it into a one-hour-or-less build.

And it is well known that always needing a faster system to compile is a very poor excuse.

For more see my other replies.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Fri Nov 16, 2018 10:35 am

While I have enjoyed this quite off-topic conversation, and have learned a lot from it that may be of use, I think it is about time to slow down on this thread; it is taking around 10 minutes per day away from coding :) , slowing down my development process :) .
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

Re: Divide By Depth??

Fri Nov 16, 2018 10:49 am

DavidS wrote:
Fri Nov 16, 2018 10:26 am
And there is another issue with that level of optimization, the need of a 1GB swapfile just to compile the compiler? Something is very wrong with that picture.
What do you mean "just to compile the compiler"!
GCC is 17 million lines of code and it builds it all three times.
When you do "make -j4" it compiles four C files at the same time and obviously that uses lots of memory - but it reduces the build time by a factor of four! Its only an issue on the Pi with its small memory.
If you use "make -j2" to compile only two things at once, you don't need the extra swap - but then you have two CPU cores idle!!!!

User avatar
DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Divide By Depth??

Fri Nov 16, 2018 10:55 am

jahboater wrote:
Fri Nov 16, 2018 10:49 am
DavidS wrote:
Fri Nov 16, 2018 10:26 am
And there is another issue with that level of optimization, the need of a 1GB swapfile just to compile the compiler? Something is very wrong with that picture.
What do you mean "just to compile the compiler"!
GCC is 17 million lines of code and it builds it all three times.
When you do "make -j4" it compiles four C files at the same time and obviously that uses lots of memory - but it reduces the build time by a factor of four! Its only an issue on the Pi with its small memory.
If you use "make -j2" to compile only two things at once, you don't need the extra swap - but then you have two CPU cores idle!!!!
Fair enough, if the RPi were that limited on memory. We have a full GB to work with. As the system the compiler runs on in this case is the Raspberry Pi, that means each compiler instance takes more than 20% of the available RAM to compile. And while I have not built gcc in a long time, I am willing to bet it takes quite a bit of time to compile with the default level of optimization when compiled using gcc.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4 kWh per day.
500W Solar System, produces 2.8 kWh per day average.
