Can someone give an explanation of 'hard float' in simple terms?

From what I understand arm based processors function more effectively using hard float...

But how does this work?

## What is hard float?

A CPU like the ARM can do calculations. In most programs, most of these calculations use "whole numbers" (integers), because integer arithmetic is very simple to do electronically.

Some programs also do "floating point" calculations, and these programs are also expected to work on CPUs that do not have the hardware to do floating point calculations. In such cases these calculations are automatically routed, by the operating system, to a library of routines that carry them out using only integer arithmetic. For example, a simple division like 2/3 is done with hundreds of integer operations. This is called "software floating point", or "soft float" for short.

But the ARM chip has a CPU that can also do floating point calculations directly in hardware!

This is very much faster, as a floating point calculation in hardware is almost as fast as an integer calculation. Hardware floating point is shortened to "hard float".

In the past the R-Pi's operating systems did not "know" that its CPU could do floating point in hardware, so all floating point calculations were done using the software library. The latest OSes are "aware" of the hard float capability of the Pi and use it, which means a very big speed increase for programs that do a lot of floating point calculations.

Your simple processors: PICs and AVRs, the 6502 and Z80 from the BBC Micro and ZX81, and even the 8088 and 80286 from the original IBM PC and PC/AT dealt only with whole numbers. There were 8 bits to a byte and 16 bits to a word. With a byte you could count from 0 to 255, or if you looked at them another way, -128 to +127. With a word you could count 0 to 65535 or -32768 to +32767. If you put two words together you could count up to 4 thousand million. But you could not count fractions very easily. You could decide to move the decimal (or binary) point a bit, making a byte cover maybe 0.0 to 25.5, but that cuts your range down a lot.

So people invented floating point. It has a "mantissa", which holds the digits of the number, and an "exponent", which says where to put the (binary) point. Using 4 bytes for one floating point number you can count up to over 100,000,000,000,000,000,000,000,000,000,000,000,000 and down to 0.00...1 (with the same number of zeros).

The problem is that to do that you need to write a lot of software. To add two floating point numbers you need to shift the mantissas and adjust the exponents until the exponents are the same as each other and only then can you add the mantissas. Multiplication is easier and you only have to multiply the mantissas together and add the exponents together. All that software takes time to run; it is ten or a hundred times slower than adding or multiplying whole numbers, which the processors already know how to do in hardware.

So a processor that knows how to do floating point itself is obviously going to be faster. The PC got that at about the time of the 80386. The ARM got it in the v7 variant (IIUC), but there is an option to have it in the v6 variant. The RaspPi uses an ARMv6 that has hardware floating point support. However the existing Linux distributions all assume that an ARMv6 has not got hardware floating point support and therefore they don't use it. All of the floating point work is done in software (soft floating point). It was necessary for the Raspbian team to build a Linux variant that did the floating point work using the hardware (hard floating point).

That's basically it; there's another few wrinkles in there to do with exactly why a complete new distribution is needed, but that gives you the idea.

@rurwin

@mahjongg

Thank you for your input.

I think I get the basic idea now.

In order to be thorough someone should at least mention the calling convention differences. Quake 3 on rpi demonstrates that hardware fp can be easily used with "soft fp", and without a whole new distro just to leverage it.

I'm not sure how to explain a calling convention in an easy way though

It's possible a program like Quake is optimized for speed, and just probes the CPU itself to see if it supports hardware floating point, then bypasses the OS when doing such calculations, which gives a speed advantage.

For more info about using the floating point hardware of the Pi's ARM CPU, read the "bare metal" section, especially this:

http://www.raspberrypi.org/phpBB3/viewtopic.php?f=72&t=11183

Here's the more advanced version.

If you are compiling your own program, Quake 3 for example, you can do whatever you like, use soft float or hard float, it doesn't matter, so long as your program is self-contained. Quake 3 is probably very self-contained, since all the floating point stuff is highly optimised and it was originally written for almost bare-metal MS-DOS.

However if you are writing a program to run under an advanced OS such as Linux, then you expect to be able to use advanced floating point functions provided by the OS to do such things as take the tangent of angles or calculate square-roots, or input and output floating point numbers. Those functions are provided by a module within the OS called a library. There is only one version of that library (or those libraries) in the OS, and that library will be built for either hard float or soft float. There are three alternatives:

You can build your program in soft float mode, and call a soft float library. That's the slowest.

You can build your program in hard float mode, and call a hard float library. That's the fastest.

You can build your program in hard float mode, and call a soft float library. In effect you are pretending to the library that your program is soft float. That's intermediate speed.

But the one thing you can't do is to build a program in soft float mode and call a hard float library. That's because calling a hard float library depends on the hardware floating point support that you have told the compiler does not exist. So if a library is compiled for hard float, then every program that uses it must be compiled for hard float. And remember there is only one library, so if it is hard float, then every program on the distribution must be compiled for hard float. And that's not just every program on the initial SD card; it's every program that you can install with apt-get -- tens of thousands of them. But of course the bonus is that they all run faster.

Quake 3 is running on top of an armel ("soft float") OS port, but it can make its own calls into the ARM CPU's floating-point unit (FPU), which is optional in the ARMv6 (ARM11) CPU that the Pi uses. However, Quake 3 is almost certainly calling OpenGL ES 2.0 routines that run in the graphics processing unit (GPU), a separate co-processor closely linked to the ARM CPU via a shared 32-bit bus. That's how you get very fast texture-mapping and 3-D perspective changes as your character runs around through various areas, along with movements of a number of opposing characters, animated elements, etc.

The best things in life aren't things ... but, a Pi comes pretty darned close!

"Education is not the filling of a pail, but the lighting of a fire." -- W.B. Yeats

In theory, theory & practice are the same - in practice, they aren't!!!

"Education is not the filling of a pail, but the lighting of a fire." -- W.B. Yeats

In theory, theory & practice are the same - in practice, they aren't!!!