Please explain softfloat vs softfp vs hardfp

RichardUK
Posts: 224
Joined: Fri Jun 01, 2012 5:12 pm

Please explain softfloat vs softfp vs hardfp

I am a little confused what is going on.

softfloat - All float done in software.
softfp - Float in hardware values passed on the stack / int registers. ABI compatible with softfloat.
hardfp - Float done in hardware values passed on fpu registers. ABI incompatible with other two.

Is this correct?

Narishma
Posts: 151
Joined: Wed Nov 23, 2011 1:29 pm

Re: Please explain softfloat vs softfp vs hardfp

Yes.

jecxjo
Posts: 158
Joined: Sat May 19, 2012 5:22 pm
Location: Minneapolis, MN (USA)

Re: Please explain softfloat vs softfp vs hardfp

As a very simplistic explanation:

Performing mathematical operations on floating point numbers (decimal numbers, not whole numbers) requires a little more overhead when working with binary values. Everyone knows that data is stored in computers as 1's and 0's, each position in the number being a power of 2 greater than the previous. So doing math with a whole number is quite simple.

Software Based Math: Do the math via pen and paper
When it comes to floating point numbers, the steps to perform even a simple two-value addition become more complicated. This process was originally performed in software, requiring multiple instructions to get the needed result. When you compile with the softfloat option this is what you are doing.

Let's view this as doing some math via pen and paper. Works but kinda slow.
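To make "multiple instructions" concrete, here is a rough sketch of a float addition done with integer operations only. It is deliberately simplified and is not how any real library routine is written: it assumes both inputs are positive, normal IEEE 754 single-precision values, it truncates instead of rounding, and it ignores overflow, zero, subnormals, infinities and NaN. The names soft_fadd_simple and float_to_bits are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Illustration-only helper: view a float as its raw 32-bit pattern. */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Simplified software add. Assumes positive, normal inputs; truncates
   instead of rounding; no overflow/zero/subnormal/NaN handling. */
uint32_t soft_fadd_simple(uint32_t a, uint32_t b)
{
    uint32_t exp_a, exp_b, sig_a, sig_b, shift, sig, exp;

    /* Make a the larger operand. For positive normal floats the bit
       patterns order the same way as the values. */
    if (a < b) { uint32_t t = a; a = b; b = t; }

    exp_a = (a >> 23) & 0xFF;
    exp_b = (b >> 23) & 0xFF;

    /* Restore the implicit leading 1 of each 24-bit significand. */
    sig_a = (a & 0x7FFFFF) | 0x800000;
    sig_b = (b & 0x7FFFFF) | 0x800000;

    /* Align the smaller operand; bits shifted out are simply dropped. */
    shift = exp_a - exp_b;
    sig_b = (shift < 24) ? (sig_b >> shift) : 0;

    sig = sig_a + sig_b;           /* at most 25 bits wide */
    exp = exp_a;
    if (sig & 0x1000000) {         /* carry out of bit 23: renormalise */
        sig >>= 1;
        exp += 1;
    }

    /* Repack: sign is 0, then exponent, then the 23 stored fraction bits. */
    return (exp << 23) | (sig & 0x7FFFFF);
}
```

Even this cut-down version needs a compare, several shifts and masks, an add and a conditional renormalise, where an FPU does the whole thing in one add instruction.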

Hardware Based Math: Do the math via your friend's calculator
To speed up floating point math, some smart engineers came up with a Floating Point Unit (FPU) which is a piece of hardware that can take in floating point values and an operator and return a value. This hardware is optimized to just do floating point math so it performs much better than doing the operations in software.

To get the values and operator into the FPU the compiler must add some code to copy this info from your program to the hardware. Typically this is done through a function call, which requires some overhead to start the call (copy values from your code into the FPU interface code) and to complete the call (clear up the memory used to do the copy). So in the softfp situation we use the FPU hardware but use the typical function calling method to move the data around. (Note: I'll explain some benefits at the end.)

In this situation your buddy has a calculator. You write down the problem and give it to him to run on his calculator and he writes down the answer and gives it to you.

Hardware Optimized Math: Do the math on your calculator
Next we want to speed things up even more by trying to remove all that overhead of copying data from our code to the FPU interface code. One way to do that is to do the task the FPU interface code does. So if we set the hardfp option when we do an arithmetic operation we now copy the values and the operation directly into the FPU hardware registers. Now we are super fast.

In this situation you have no paper, just a calculator. But think how fast that is since you can just type the values in yourself. No need to write it down, just type it in yourself.

So why so many options?
So why not always write your code with hardfp? Sometimes systems don't have FPUs...even in today's new computers. Remember before how with the softfp option we talked to the FPU but still had the code that copied data to the hardware? What if we could swap out where the copy destination was in cases where we don't have an FPU? Now we could say "Use the FPU if it exists, otherwise copy the data into our software-based calculations (softfloat)." So when you see that softfp is compatible with softfloat it means that the system will decide if it can (and some cases should) use a hardware FPU. If we compile with the hardfp option we have no choice but to use the FPU because the compiler optimizes our system to do no math, just read and write to FPU registers.

In our calculator example, if you sit down and have no paper and only a calculator the only option you have is to use the calculator. If you have a piece of paper you can either give it to your buddy or you can do it by hand. One gives you options and can be slow but practical. The other can be fast but only if the hardware exists.

So how's that for a long-winded but hopefully easy to understand explanation?
xmpp: [email protected]
Blog: http://jecxjo.motd.org/code

Burngate
Posts: 5313
Joined: Thu Sep 29, 2011 4:34 pm
Location: Berkshire UK

Re: Please explain softfloat vs softfp vs hardfp

Makes sense!
Now how about (what I believe the Pi has) vector floating point hardware? Is it the same or different?
And since all Pis have one, why bother with softfloat at all?

jecxjo
Posts: 158
Joined: Sat May 19, 2012 5:22 pm
Location: Minneapolis, MN (USA)

Re: Please explain softfloat vs softfp vs hardfp

Vector Processors allow you to access data in an array format (i.e. pointer to the head of an array and an index into the array) vs a Scalar Processor that just uses direct addressing. Not really a big difference, nothing really to worry about. VFP, FPU, etc all the same thing generally.

Since you know the architecture for all your client systems there's no reason not to just use hardfp, but in case you wanted the option to compile for a different system, softfp might be a better choice. In my description above I generalized a little on what all takes place when compiling with this option. If you know your system has no FPU it's slightly more efficient to select softfloat because checking for the existence of an FPU takes some overhead, etc. Why ask "Do I have an FPU and is it available?" if you already know the answer?

Burngate
Posts: 5313
Joined: Thu Sep 29, 2011 4:34 pm
Location: Berkshire UK

Re: Please explain softfloat vs softfp vs hardfp

So now I'm going to prove that I don't know enough to be allowed into the Power-Users forum.
Once long ago I knew something about ARM on my RiscPC. If memory serves, you could put in FPU instructions, which without a FPU would call the instruction exception vector and thence into the Floating-Point Emulator. (The strongarm never had a FPU)
So if the relevant libraries include a FPE, calling it only involves a couple of instruction cycles even if our target hardware doesn't have a FPU
Or should I be ejected from class in shame?

obarthelemy
Posts: 1399
Joined: Tue Aug 09, 2011 10:53 pm

Re: Please explain softfloat vs softfp vs hardfp

I'd venture a guess: *calling* an FP emulator may only be a couple of cycles. *Actually doing* soft FP takes a whole bunch of cycles more than doing hard FP?

Plus I think the soft/hard variants are not only about actual maths, but also about how parameters are passed when calling functions, with the FP hardware providing a bunch of handy, fast registers which the no-FP CPUs don't have.

AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK

Re: Please explain softfloat vs softfp vs hardfp

Burngate wrote:And since all Pis have one, why bother with softfloat at all?
Because the Debian stable version (squeeze) only supports ARM using the ARMv4+ softfloat ABI (armel), and this is the distro that the 'official' Debian image available from http://www.raspberrypi.org/downloads uses.
The Debian unstable version (wheezy) adds an ARMv7+ hardfloat ABI port (armhf), but this won't run on the ARMv6 CPU used by the RaspberryPi. So to "fill in the gap" the Raspbian project http://raspbian.com/RaspbianFAQ is currently in the process of recompiling/porting every Debian package in wheezy to use the ARMv6 hardfloat available on the RaspberryPi

At least, that's the way I understand it

jecxjo
Posts: 158
Joined: Sat May 19, 2012 5:22 pm
Location: Minneapolis, MN (USA)

Re: Please explain softfloat vs softfp vs hardfp

AndrewS wrote:
Burngate wrote:And since all Pis have one, why bother with softfloat at all?
Because the Debian stable version (squeeze) only supports ARM using the ARMv4+ softfloat ABI (armel), and this is the distro that the 'official' Debian image available from http://www.raspberrypi.org/downloads uses.
The Debian unstable version (wheezy) adds an ARMv7+ hardfloat ABI port (armhf), but this won't run on the ARMv6 CPU used by the RaspberryPi. So to "fill in the gap" the Raspbian project http://raspbian.com/RaspbianFAQ is currently in the process of recompiling/porting every Debian package in wheezy to use the ARMv6 hardfloat available on the RaspberryPi

At least, that's the way I understand it
This is actually one of the "good" reasons to have softfp and softfloat support. If hardware changes and no one has ported in the support you can at least run a slower version using software. Also if you are building a general purpose system you can sacrifice some speed for broader support.

Same goes for UI, you get added speed and flashiness if you build against the video card/opengl/etc but you are only able to run on those systems. Compiling with software based graphics is slower but everyone can support them.

timr
Posts: 22
Joined: Wed May 30, 2012 10:11 am

Re: Please explain softfloat vs softfp vs hardfp

Just a quick note to say that I am using the raspbian development image from http://www.raspbian.org/HexxehImages

It's just what I needed.

I'm using chuck from http://chuck.cs.princeton.edu/ for music synthesis http://www.raspberrypi.org/phpBB3/viewt ... =29&t=7503. With the soft-float image, the system doesn't have enough CPU and the audio is not usable. With hard-float, the audio is ok, at least for the simple examples I've tried, and CPU load is acceptable, though still high at 30% - 60%

plugwash
Forum Moderator
Posts: 3246
Joined: Wed Dec 28, 2011 11:45 pm

Re: Please explain softfloat vs softfp vs hardfp

As a very simplistic explanation:
Personally I think your analogies are confusing and I also believe your post contains some misconceptions.

I guess I should clear up the confusion and explain things properly.

Floating point on ARM has historically been a mess, with a number of incompatible floating point units out there. However, things have stabilised and nowadays most "applications processors" use some version of a floating point unit known as vfp*; specifically, the Raspberry Pi uses VFPv2. However, lower end ARM parts still often have either no FPU or a vendor specific FPU.

For any given FPU type selection gcc offers three ways of handling floating point. These are controlled by the -mfloat-abi option.

-mfloat-abi=soft
The code uses integer instructions and/or calls to library routines (depending on the complexity of the operation) to perform floating point maths. No FPU is needed but floating point is slow. The library routines in question are in libgcc which is a static library (so afaict you can't just replace it at runtime). Floating point parameters to functions are passed in integer registers (or on the stack when integer registers run out).

-mfloat-abi=softfp
The code uses floating point instructions so the FPU is needed, but the parameters are still passed in integer registers (or on the stack when integer registers run out). This means the code is compatible with code built with -mfloat-abi=soft and is much faster than doing the floating point in software, but it still incurs an overhead moving stuff between CPU and FPU.

-mfloat-abi=hard
The code uses floating point instructions and passes floating point values in floating point registers. This avoids the overhead of moving data around between integer and floating point registers but also renders the code incompatible with code built with other -mfloat-abi settings (if the parameters to a function call aren't where the function expects them things break horribly).
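A small sketch of how a source file can tell which of the three settings it was built with. It relies on macros that GCC predefines for 32-bit ARM targets: __SOFTFP__ (set for -mfloat-abi=soft) and __ARM_PCS_VFP (set when floating point arguments are passed in VFP registers, i.e. -mfloat-abi=hard). The function name float_abi_name is made up for this illustration, and on anything other than 32-bit ARM it just says so.

```c
/* Hypothetical helper: names the -mfloat-abi setting this file was
   compiled with, based on GCC's predefined ARM macros. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not 32-bit ARM";   /* the option does not apply here */
#elif defined(__ARM_PCS_VFP)
    return "hard";             /* FP arguments in FPU registers */
#elif defined(__SOFTFP__)
    return "soft";             /* no FPU instructions emitted */
#else
    return "softfp";           /* FPU instructions, integer-register ABI */
#endif
}
```

Printing the result from a test program built three times, once with each -mfloat-abi value (plus a suitable -mfpu option such as -mfpu=vfp for the two FPU variants), should show the three names in turn.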
Once long ago I knew something about ARM on my RiscPC. If memory serves, you could put in FPU instructions, which without a FPU would call the instruction exception vector and thence into the Floating-Point Emulator. (The strongarm never had a FPU)
So if the relevant libraries include a FPE, calling it only involves a couple of instruction cycles even if our target hardware doesn't have a FPU
That was what the old debian arm port (not armel or armhf) did. Unfortunately there were two problems

1: the FPU the instructions were for was an old one (known as FPA) which pretty much no chips had anymore
2: it turns out that trapping into the kernel on an illegal instruction, doing the floating point in the kernel and then returning the results to userspace is a LOT slower than just doing the floating point calculations in software in userspace.

The result was that floating point performance on most ARM hardware at the time was horrifically bad, and with the mess of floating point units around at the time, going for software floating point seemed like the best option for a new ARM port.

* Note that while vfp stands for "vector floating point" it actually has relatively little in the way of vector functionality. Decent vector support was added with the NEON extensions (which are not supported on the Pi).

gregsmith_to
Posts: 15
Joined: Mon May 14, 2012 1:56 am

Re: Please explain softfloat vs softfp vs hardfp

Indeed much of what's in jecxjo's post is incorrect. In particular:
So when you see that softfp is compatible with softfloat it means that the system will decide if it can (and some cases should) use a hardware FPU
Not true: code compiled with 'softfp' uses floating-point instructions and can only be run on a system with an fpu. But it can be linked with code (or libraries, be they static or dynamic) that was compiled with 'soft'. So softfp supplies the best case of interoperability for library linking. If you actually have the fpu, you want as much of your code as possible compiled with 'softfp', but if some of the libraries are generic and are compiled with 'soft' it will still work. There is relatively little speed difference between 'hard' and 'softfp', unless you are calling a lot of very small functions which are passing floating point values as parameters or returning them. In most cases the extra overhead of 'softfp' will be minimal compared to the time actually doing computations.

There is actually some potential for 'run time detection' by the system, but it applies to the 'soft' case: with -mfloat-abi=soft, the compiler assumes no fpu is present and does all float math with function calls. For instance, a double+double add is done with a call to __aeabi_dadd(), which accepts input in (r0,r1) and (r2,r3) and returns in (r0,r1) -- according to the 'soft' abi. This function is located in a shared object library, along with many other such, and it likely takes about 30 instructions to do the add operation.

However, it's possible to construct a 'softfp' version of that library, which has the same __aeabi_dadd() function in it, but only takes 5 instructions to do the add using the fpu [why 5? two to move the inputs to fpu regs, one to do the add, one to move the result back to (r0,r1), and a return].
The system with an fpu can use this much shorter, faster version of the shared library to speed up all code compiled with 'soft'. It's still a lot slower than 'softfp' - each fpu operation is burdened with the overhead of the function call and all the register moves - but you're getting the benefit of the fpu and so it's running a lot faster than the same code would on a system without an fpu.
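A rough C model of what such a replacement routine does, for anyone who wants to see the shape of it. The name dadd_via_fpu is made up (it is not the real __aeabi_dadd), the two helpers exist only for trying it out, and the 64-bit integers stand in for the (r0,r1) and (r2,r3) register pairs. It only behaves as described when it is compiled with FPU instructions enabled; the memcpy calls are the portable C way to express the register moves.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a soft-ABI add helper: operands arrive and
   the result leaves as raw 64-bit patterns, as they would in integer
   register pairs under the 'soft' calling convention. */
uint64_t dadd_via_fpu(uint64_t a_bits, uint64_t b_bits)
{
    double a, b, sum;
    uint64_t out;

    memcpy(&a, &a_bits, sizeof a);    /* integer regs -> fpu reg */
    memcpy(&b, &b_bits, sizeof b);    /* integer regs -> fpu reg */
    sum = a + b;                      /* the one hardware add */
    memcpy(&out, &sum, sizeof out);   /* fpu reg -> integer regs */
    return out;
}

/* Illustration-only helpers for converting test values. */
uint64_t double_to_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

double bits_to_double(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}
```

Code built with 'soft' would keep calling the same entry point with the same register layout; only the body behind it changes from the long integer-only routine to the short fpu one.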
