BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Register allocation (assembly language)

Wed Aug 02, 2017 3:57 pm

Does anyone have any tips on when and which registers to allocate in a routine?
In my 68HC11 compiler it was really easy: there are so few registers that I had to load and store variables like crazy! (Though an optimisation pass got rid of some of the more stupid sequences of instructions, e.g. loading a certain variable into the accumulator when it must already be there.)
It's more complicated on ARM since
a) Routines are allowed to stomp on R0 to R3, and
b) Routines are NOT allowed to stomp on R4-R11.
Every time I think up suitable code I find that instruction A depends on instruction B, which depends on instruction C, which... and I soon lose track of what order decisions have to be made in.
E.G. Suppose I want to code this C in assembler: void A(int B, int C) { D(B); E(C); }
I might produce:
STMFD SP!, {R4, LR}
MOV R4, R1
BL D
MOV R0, R4
BL E
LDMFD SP!, {R4, PC}
The 1st instruction is there because we need to save R4 because it's used later (instruction 2) and we need to save LR because we use a BL (instruction 3 (and 5)).
The 2nd instruction is there because the BL is allowed to stomp on R0-R3 but C, which is in R1, is used later (instruction 4). (R4 could be any of R4 to R11.)
The 4th instruction is there to put E's parameter in the right place (C currently being in R4).
The 6th instruction complements the 1st.
It's all doing my head in!
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

jahboater
Posts: 1891
Joined: Wed Feb 04, 2015 6:38 pm

Re: Register allocation (assembly language)

Wed Aug 02, 2017 4:43 pm

Wait till you use the Pi in 64-bit mode, there are 31 general purpose registers, plus the handy zero register!

What you describe is just normal assembler programming, if you don't like it, use C instead and the C compiler's register allocator and scheduler will do a great job, probably better than us (IMO).

Other than that, I just constantly think of ways of reducing the number of registers in use.

User avatar
bitbank
Posts: 203
Joined: Sat Nov 07, 2015 8:01 am
Location: Sarasota, Florida
Contact: Website

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:02 pm

Part of the 'art' of writing assembly language is making those choices. I choose a register for each variable in my algorithm starting with r0-r3. It's just the way you have to work. If you need more variables than can fit in registers, then you need to manage working with the stack or storing them somewhere else in memory. If this is unpleasant for you, write the code in C. It's rare that you can't do the same work in C that can be done in assembly language. I've spent the last 38 years writing assembly language for various processors, so it's as natural as breathing. Your mileage may vary :)
The fastest code is none at all :)

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:04 pm

jahboater wrote:
Wed Aug 02, 2017 4:43 pm
Wait till you use the Pi in 64-bit mode, there are 31 general purpose registers, plus the handy zero register!

What you describe is just normal assembler programming, if you don't like it, use C instead and the C compiler's register allocator and scheduler will do a great job, probably better than us (IMO).

Other than that, I just constantly think of ways of reducing the number of registers in use.
I'm only interested in producing code that works on all models of Pi. (If they bring out a "Pi Subzero" which doesn't even support ARMv6, I shall sulk!)

I could output C code, but where's the fun in that :D

I did think of this algorithm:

Code: Select all

Generate crappy code;
do
   Fix and/or improve the code
until Can't think of anything else to do.
but I'm sure there's a better algorithm!
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:08 pm

bitbank wrote:
Wed Aug 02, 2017 5:02 pm
Part of the 'art' of writing assembly language is making those choices. I choose a register for each variable in my algorithm starting with r0-r3.
Hmm, but the first BL to, say, a libc function can potentionally stomp on all of R0-R3, in which case wouldn't it have been better to allocate any variables which are still live at that point somewhere in R4-R11?
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

User avatar
bitbank
Posts: 203
Joined: Sat Nov 07, 2015 8:01 am
Location: Sarasota, Florida
Contact: Website

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:11 pm

BMardle wrote:
Wed Aug 02, 2017 5:08 pm
bitbank wrote:
Wed Aug 02, 2017 5:02 pm
Part of the 'art' of writing assembly language is making those choices. I choose a register for each variable in my algorithm starting with r0-r3.
Hmm, but the first BL to, say, a libc function can potentionally stomp on all of R0-R3, in which case wouldn't it have been better to allocate any variables which are still live at that point somewhere in R4-R11?
You need to understand which variables are volatile and which are not. You didn't provide any context, so my answer doesn't make sense to you. If you're working with nested calls, then register allocation becomes more complicated. Are you calling C functions from asm code? Are you calling someone else's code or your own. If it's all within your control, then you can manage how the registers are used.
The fastest code is none at all :)

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:24 pm

bitbank wrote:
Wed Aug 02, 2017 5:11 pm
You need to understand which variables are volatile and which are not. You didn't provide any context, so my answer doesn't make sense to you. If you're working with nested calls, then register allocation becomes more complicated. Are you calling C functions from asm code? Are you calling someone else's code or your own. If it's all within your control, then you can manage how the registers are used.
Yes and yes. Let's assume that the only thing I can assume about nested calls is that they obey the ARM Procedure Call Standard.
I'm not sure what you mean by 'volatile'. In the C sense of the word, let's assume that nothing is volatile.
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

User avatar
Paeryn
Posts: 1673
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:25 pm

It gets easier as you get used to it, you can always sprinkle comments in your code to remind you what you moved into which register (and what offsets from the stack you use for local variables when you run out of free registers).
She who travels light — forgot something.

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:45 pm

Paeryn wrote:
Wed Aug 02, 2017 5:25 pm
It gets easier as you get used to it, you can always sprinkle comments in your code to remind you what you moved into which register (and what offsets from the stack you use for local variables when you run out of free registers).
Well, I'm writing a C program that generates the assembler, so I'm not sure it'll have a reason to output comments, but the C code will have to keep track of what's in what register, yes.
I'm going to start by assuming that all parameters are passed in R0-R3 and I never have to spill local variables or temporaries onto the stack. It'll be interesting to see how often that bombs!
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Wed Aug 02, 2017 5:52 pm

PS: Thanks for not trying to talk me out of it, Paeryn!
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

jahboater
Posts: 1891
Joined: Wed Feb 04, 2015 6:38 pm

Re: Register allocation (assembly language)

Thu Aug 03, 2017 7:24 am

Are you interested in performance?

If so you also need to worry about avoiding dependencies and allowing multiple issue and so on.

Some ARM cpu's with out-of-order execution can deal with dependencies themselves to a degree, but the Pi cpu's are low power, or old designs, so cannot.

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Thu Aug 03, 2017 7:32 am

jahboater wrote:
Thu Aug 03, 2017 7:24 am
Are you interested in performance?

If so you also need to worry about avoiding dependencies and allowing multiple issue and so on.

Some ARM cpu's with out-of-order execution can deal with dependencies to a degree, but the Pi cpu's are low power, or old designs, so cannot.
Moderately intersted! As long as the code is valid and not totally dire I'll be happy.
I don't know if you read my thread about multiplication, but I came to the conclusion that in
MUL R0,R1,R2
<do something with R0>
I can squeeze 2 simple instructions in the middle without slowing it down. Off the top of my head I can't remember if that's on the Pi Zero and 2 or not.
Actually, I think optimisations like that would be way down the bottom of the (imaginary) to-do list for my compiler.
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

timrowledge
Posts: 1073
Joined: Mon Oct 29, 2012 8:12 pm
Location: Vancouver Island
Contact: Website

Re: Register allocation (assembly language)

Thu Aug 03, 2017 11:47 pm

Name your registers.

First, just name them something actually meaningful; almost anything is better than "MOV R0, R4, R3" when it comes to somebody else trying to work out what on earth your code is trying to do. How important points here
a) 'somebody else' includes you, six months later, unless you have perfect memory
b) 'what your code is trying to do is rarely perfectly mapped to 'what your code actually does'. See also a) above.

Then as you get nearer trying to assemble it you can start working out which register names don't overlap in scope. You can start to adjust the names to add the register number you think they might map to. It's easy to bulk change 'pixelBase5' to pixelBase6' and a lot less dangerous than having used R5 for the base address of a bitmap and then needing to change to use R6 just in that area of code where R5 is(was) meant to have the address. Once you have things apparently clean, you can formally declare all those names to be suitable numbers.

You might be thinking, "hmm, I could write some clever sed rules to do all that for me" but honestly, I think you might find that it's been done and they call it something like 'cc'.
Making Smalltalk on ARM since 1986; making your Scratch better since 2012

jahboater
Posts: 1891
Joined: Wed Feb 04, 2015 6:38 pm

Re: Register allocation (assembly language)

Fri Aug 04, 2017 7:10 am

timrowledge wrote:
Thu Aug 03, 2017 11:47 pm
You might be thinking, "hmm, I could write some clever sed rules to do all that for me" but honestly, I think you might find that it's been done and they call it something like 'cc'.
I think the OP is writing a compiler.

It is hard - the current register allocator in gcc was some 20 years in development before it was released on a few platforms a couple of years ago!

Perhaps do some research - google "compiler register allocation" it comes up with lots of interesting stuff.

https://en.wikipedia.org/wiki/Register_allocation

jahboater
Posts: 1891
Joined: Wed Feb 04, 2015 6:38 pm

Re: Register allocation (assembly language)

Fri Aug 04, 2017 11:12 am

BMardle wrote:
Wed Aug 02, 2017 5:04 pm
I did think of this algorithm:

Code: Select all

Generate crappy code;
do
   Fix and/or improve the code
until Can't think of anything else to do.
but I'm sure there's a better algorithm!
Thats not a bad idea if you are writing the compiler just for fun.

Perhaps just allocate the registers one by one until you run out, then fail the compilation.
Quite a few small test programs or "hello world" programs should be OK, especially on a RISC machine with a large uniform register file. It would get you started. (Of course in Aarch64 mode with 31 registers, a lot more programs will succeed).

Next stage is to try and detect when a register is no longer in use (no longer "live") and re-allocate it (much harder).

An optimization to avoid lots of register moves, might be to predict at an early stage if certain values need to be in certain registers (to conform to an ABI perhaps).

Sometimes it can be proven that two variables have the same value for a period, in which case there is no need to allocate a new register. (const int a = b; generally doesn't use a new register until later when, and if, "b" is altered).

And so on.

The point is, as you say, improve it in stages - but start with something that actually works for really simple programs.

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Fri Aug 04, 2017 2:32 pm

Thanks for the replies, timrowledge and jahboater.

I can well believe it took 20 years to write the register allocator for GCC. I suspect it'd take me 20 years to understand it too!

I've got 4 books on compiler constructions on my shelves, but I don't think any of them is less than 20 or 30 years old! I've been reading the relevant chapter in 'The Dragon Book' (aka 'Aho, Sethi and Ullman') but I'm scared of reading their sample allocator because their model processor bears so little resemblance to ARM. (I remember not understanding much of the book on writing 'advanced' compilers. I still have nightmares about dominance frontiers :) )

I thought I had an algorithm in mind that might work... but it's just occurred to me that it doesn't take multiple basic blocks (BBs) into account. Phooey! Maybe I'd better start with code that allocates room on the stack for all parameters and local variables then, later, code that tries to optimise starting from that. (I wrote some code that allocates regs for a BB but it doesn't take the R0-3 vs. R4-11 thing into account. Should 'BL' end the current BB? Strictly speaking, as a branch, it should, but semantically it's like a non-branching instruction that potentially does a lot.)

I don't know why, but it never occurred to me to try Wikipedia! Maybe I should do a lot more reading before I start coding!

"Hello, World" is too complicated :) I'm going to start with a program that just returns a given exit code!
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

jahboater
Posts: 1891
Joined: Wed Feb 04, 2015 6:38 pm

Re: Register allocation (assembly language)

Fri Aug 04, 2017 3:27 pm

BMardle wrote:
Fri Aug 04, 2017 2:32 pm
(I wrote some code that allocates regs for a BB but it doesn't take the R0-3 vs. R4-11 thing into account. Should 'BL' end the current BB? Strictly speaking, as a branch, it should, but semantically it's like a non-branching instruction that potentially does a lot.)
I'm not sure what you mean by BB in this context? end of the HLL block {}, end of the live range for a variable? Given that functions cannot alter r4-r11, BL probably should not end things.
BMardle wrote:
Fri Aug 04, 2017 2:32 pm
"Hello, World" is too complicated :) I'm going to start with a program that just returns a given exit code!
Sounds a great idea!

From experience I suggest not getting slowed down by optimizations and complicated stuff too early.

Here is an example of how I would do it.

Take (in C)

int array[50];
x = array[n]

Now you could start coding array access by multiplying the offset (n) by the size of the array elements (in this case 4) before adding it to the array start address. That would work for all types of array.

Of course then you have a pointless multiply by one for character arrays. You can rely on your peephole optimizer to remove "multiply by ones", or perhaps make a special case of it.

Next you spot that most basic items are 2, 4 or 8 bytes, so the multiply can be replaced by a shift.

Next you see that ARM (and x86) cpu's can deal with this as part of the addressing, so no separate multiply or shift is needed at all, and the addition is covered too.

ldr r6, [r3, r7, lsl #2]

All the the time you code works, and you can stop at any time.

You haven't said what language your compiler is for (or have you created a new one!).

Have fun!

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Register allocation (assembly language)

Fri Aug 04, 2017 8:31 pm

A basic block is a set of instructions such that if 1 is executed then all are. Pretty easy to generate:
The first instruction is the start of a basic block.
Any labelled instruction is the start of a basic block (preferably a label that's actually used!)
A branch is the end of a basic block.

Thanks for the suggestions about coding.
I was thinking about situations where I could combine instructions. At first I thought that I could encode an integer constant in my source with
LDR Rd,=number
which would be OK in a sequence like
LDR R0,=42000000
ADD R1,R2,R0
but you wouldn't want to output
LDR R0,=42
ADD R1,R2,R0
because the assembler would treat that as
MOV R0, 42
ADD R1,R2,R0
which could be combined to
ADD R1,R2,42
... so at some point I'll probably have to write a routine to identify numbers which can be encoded in 12 bits.

It is my own language I'm compiling. It's changed a few times over the years but I call it 'Bungol' :)
Much as I like C, it has some malfeatures and I figure I don't want to have to spend time implementing features I don't like!
For instance, C has no 'endif' (not counting the preprocessor!). Neither does C++. Neither does Java... though it does add a rather bizarre rule about what an 'if' statements are legal.
Bungol's mostly simplified C combined with very simplified Algol68.
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Shortcomings of C

Fri Aug 04, 2017 9:05 pm

<ramble>
I was browsing through the gcc documentation recently and found that gcc had an extension that addresses another thing I regard as a shortcoming of C: the lack of compound expressions containing declarations. E.G., in gcc you can write
while (({int i; scanf("%d", &i) && i!=42;}));
which will loop till you enter something that's not a number or you enter '42'.
In Bungol (at the moment!) this is
while s32 i:=scans32; i<>42 do skip od
</ramble>
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

jahboater
Posts: 1891
Joined: Wed Feb 04, 2015 6:38 pm

Re: Shortcomings of C

Fri Aug 04, 2017 9:43 pm

BMardle wrote:
Fri Aug 04, 2017 9:05 pm
the lack of compound expressions containing declarations. E.G., in gcc you can write
while (({int i; scanf("%d", &i) && i!=42;}));
BCPL had these. I don't know why they were dropped in B and C. Very useful.
There are quite a few handy gcc extensions.

case '0' ... '9': // saves a whole lot of case values

a = n ? : 42; // sets a to n, or 42 if n is zero.

nested functions

Most of them seem to be supported by some other compilers such as clang/llvm (apart from nested functions). My guess is some will make it into future standards.

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Re: Shortcomings of C

Fri Aug 04, 2017 10:35 pm

An old friend of mine used to code in BCPL. He said he wrote a program which called `random` a lot but, having programmed a lot in Algol68, he'd forgotten that you have to put '()' after the name of a parameterless procedure to call it. Figuring he'd consistently got it wrong he did a global replace of "random" with "random()". Apparently it took him a long time to figure out why the program behaved so strangely: one place had "random()()"!

I don't know much about BCPL except that its compiler was apparently particularly easy to port.

Ooh! That 'case' extension does sound useful!
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

Heater
Posts: 7925
Joined: Tue Jul 17, 2012 3:02 pm

Re: Register allocation (assembly language)

Fri Aug 04, 2017 10:40 pm

BMardle,

while (({int i; scanf("%d", &i) && i!=42;}));

What you say may be useful. But most places I have worked would fire you if you insisted on writing such inscrutable code. It's hard for humans to parse and it's unnecessary.

Could someone please tell me what possible use nested functions are in C?

Apart from limiting their scope of visibility I don't see the point.

jahboater
Posts: 1891
Joined: Wed Feb 04, 2015 6:38 pm

Re: Register allocation (assembly language)

Sat Aug 05, 2017 12:16 am

Heater wrote:
Fri Aug 04, 2017 10:40 pm
while (({int i; scanf("%d", &i) && i!=42;}));

What you say may be useful. But most places I have worked would fire you if you insisted on writing such inscrutable code. It's hard for humans to parse and it's unnecessary.
It is just a compound statement that can return a value (like a serial clause in Algol 68 if you ever used that).

https://gcc.gnu.org/onlinedocs/gcc-4.9. ... ment-Exprs

You can make them readable, or not, as you wish - just like any language feature.
Here is another example, a correct implementation of the popular min() and max() macro's:-

Code: Select all

/*
 *  These evaluate their operands once.
 *  Return type is that of the first argument.
 */
#define min(a,b) ({ const typeof(a) t1 = (a), t2 = (b); (typeof(a))(t1 < t2 ? t1 : t2); })
#define max(a,b) ({ const typeof(a) t1 = (a), t2 = (b); (typeof(a))(t1 > t2 ? t1 : t2); })
Heater wrote:
Fri Aug 04, 2017 10:40 pm
Could someone please tell me what possible use nested functions are in C?

Apart from limiting their scope of visibility I don't see the point.
Exactly that.
The usual scope rules apply - the nested function can use its parent's local variables directly.
This may sometimes lead to cleaner or slightly faster code.
Other languages like Algol and Pascal have always had nested functions, and C++ has something similar (I think they are called "named lamda's").
https://gcc.gnu.org/onlinedocs/gcc-4.9. ... -Functions

If a large function has one or two small support functions that only it uses, it just seems cleaner to nest them.

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Language features

Sat Aug 05, 2017 9:00 am

Heater wrote:
Fri Aug 04, 2017 10:40 pm
BMardle,

while (({int i; scanf("%d", &i) && i!=42;}));

What you say may be useful. But most places I have worked would fire you if you insisted on writing such inscrutable code. It's hard for humans to parse and it's unnecessary.
I think it's largely a matter of what you're used to. Algol 68 doesn't have a "do ... while' (that I recall; neither does Bungol), so that's (analogous to) the only way to write it.
I'd never put an empty loop body on the same line as a while in real C coding; that's a classic source of bug. In Algol 68 (and Bungol) the empty statement is 'skip', not empty text.
Heater wrote:
Fri Aug 04, 2017 10:40 pm
Could someone please tell me what possible use nested functions are in C?

Apart from limiting their scope of visibility I don't see the point.
Bungol has nested functions (though I might not implement them in my ARM compiler) but that's mostly out of consistency: the overall program is regarded as a procedure body. If I couldn't declare procs in a proc body then I couldn't declare them anywhere!
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

BMardle
Posts: 92
Joined: Wed Feb 13, 2013 4:00 pm
Location: Isle of Wight

Language features

Sat Aug 05, 2017 1:17 pm

Heater wrote:
Fri Aug 04, 2017 10:40 pm
BMardle,

while (({int i; scanf("%d", &i) && i!=42;}));
:
Could someone please tell me what possible use nested functions are in C?
:
I was just pondering this further. Strangely enough, compound expressions containing declarations (and/or void statements) and local functions would both be ways of solving a 'problem' I often have: in standard C I'd write the above as something like:
int funcOnlyUsedOnce(void) {int i; return scanf("%d", &i) && i!=42;}
.. someOtherFunc(..) {
:
while (funcOnlyUsedOnce())
;
:
}

(Actually, both bits of code above have a bug: scanf returns -1 if it can't read anything, so they should have "1==scanf(...)".)

I see gcc also allows C code to have declarations after statements (as in C++). Hmm. Now to decide whether to use all these gcc features or keep my code slightly more portable. (Though the last time I tried compiling something under Microsoft Visual Studio, it failed due to the absence of lstat.)
Bruce Mardle. "You know I yearn for a simpler time of barn dances and buggy rides before life was cheapened by heartless machines."

Return to “Other languages”

Who is online

Users browsing this forum: No registered users and 5 guests