Debian 'armhf' (hard-float ABI) for Raspberry Pi?


 
by mpthompson » Mon Mar 05, 2012 9:07 am
Looks like progress is being made on the new hard-float ABI ARM port for Debian called 'armhf': http://wiki.debian.org/ArmHardFloatPort

Unfortunately, it looks like armhf is targeting ARMv7 as the minimum supported hardware and will be leaving ARMv6 behind, even though a system such as the Raspberry Pi could benefit nicely from a hard-float ABI.  At least that is the way it looks from my interpretation of the page linked above.  Could someone knowledgeable about Debian plans with regards to armhf confirm that indeed the Raspberry Pi CPU will be unsuitable for the armhf port of Debian?
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA
by Xark » Mon Mar 05, 2012 10:43 am
Hello,

I don't know anything about Debian's armhf plans except what they posted.  Here is my understanding of the situation in the hopes it may be helpful (please correct me if I am mistaken).

I will also mention I am kind of hoping the Fedora RPi Remix will be an ARMv6+hardfloat build, as this link seems to indicate: http://www.cnx-software.com/20.....-remix-14/  If it is not, then it won't quite be fully "Raspberry Pi optimized" (yet). The performance difference depends on the application's use of floating point and how often floats are passed as parameters.  The performance improvement can be worthwhile for many apps (see some other forum threads for examples).

However, from my understanding of ARMv6 vs ARMv7, I believe a fair amount of the Debian hardfloat work will be useful for the Raspberry Pi, but it will need some changes for these builds to work on RPi (i.e., recompile with different options at the minimum).

It looks like Debian-armhf is targeting ARMv7 + VFPv3-D16 + Thumb-2 as its baseline.  This will not be compatible with the RPi.

The differences between the ARMv6 and ARMv7 architectures are not "huge" (but there certainly are some instruction set differences).

The biggest feature that the RPi CPU is lacking is Thumb-2 support.  The other thing is no NEON SIMD support but it looks like Debian-armhf isn't going to require NEON by default, so that is probably fine.

RPi does have thumb mode (16-bit instructions for smaller code, but not the much improved thumb2 armhf uses), but this mode will not work with hardfloat (to my understanding - since it can't directly access the FPU registers).  Since the RPi has 32-bit memory (I believe) there is no huge reason to not always use 32-bit arm code instead of thumb (except for the most memory space constrained applications perhaps).  I am not sure, but I believe hardfloat will either preclude using thumb mode (16-bit instructions) or maybe require parallel libraries for those applications.  On the RPi where speed is likely to be a larger concern than code-size I don't see thumb making too much sense anyways (but as always it probably "depends").

The RPi FPU is VFPv2 and has 32 single-precision registers, which double as 16 double-precision registers (matching the 16 minimum assumed by Debian-armhf). So this should be fine except for a few missing instructions from VFPv3 (probably only an issue for assembly language code). See http://infocenter.arm.com/help.....dejjh.html

For most C/C++ applications they probably just need to be recompiled with different options for the RPi (for armv6-vfpv2).  This can be a substantial task though for all the packages in a distribution (and fixing the inevitable build "hiccups").
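As a rough illustration of what "different options" could mean, the sketch below contrasts a GCC flag set matching the Debian-armhf baseline described above with one that would suit the RPi's ARMv6 core and VFPv2 unit. The individual flags are standard GCC ARM options, but the groupings and helper names are assumptions made for illustration, not Debian's actual compiler defaults.

```python
# Hypothetical helper: contrasts two GCC flag sets for hard-float ARM builds.
# The flag values are standard GCC ARM options; the groupings are assumptions
# for illustration, not Debian's actual compiler defaults.

ARMHF_BASELINE = {
    "march": "armv7-a",      # Debian-armhf targets ARMv7
    "mfpu": "vfpv3-d16",     # VFPv3 with 16 double-precision registers
    "mfloat-abi": "hard",    # pass float arguments in FPU registers
    "thumb": True,           # Thumb-2 code generation
}

RPI_ARMV6_HARDFLOAT = {
    "march": "armv6",        # the RPi CPU is ARMv6
    "mfpu": "vfp",           # commonly used GCC spelling for VFPv2
    "mfloat-abi": "hard",    # same calling convention as armhf
    "thumb": False,          # no Thumb-2 on ARMv6, so stay in ARM mode
}


def gcc_flags(target):
    """Turn a target description into a list of GCC command-line flags."""
    flags = [
        "-march=" + target["march"],
        "-mfpu=" + target["mfpu"],
        "-mfloat-abi=" + target["mfloat-abi"],
    ]
    flags.append("-mthumb" if target["thumb"] else "-marm")
    return flags


def differing_keys(a, b):
    """List the settings that differ between two targets, sorted by name."""
    return sorted(k for k in a if a[k] != b[k])


if __name__ == "__main__":
    print(" ".join(gcc_flags(ARMHF_BASELINE)))
    print(" ".join(gcc_flags(RPI_ARMV6_HARDFLOAT)))
    print(differing_keys(ARMHF_BASELINE, RPI_ARMV6_HARDFLOAT))
```

In this sketch the calling convention is the one setting the two targets share, which is why only the architecture, FPU and instruction-set settings show up as differences.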

Any ARM assembly language files that have already been converted to use the newer UAL assembly syntax should be able to be reassembled using arm (32-bit) instructions (but often some tweaks are needed).   Any assembly code using non-ARMv6 features will need some attention.

I could also see there being issues if the Broadcom user-mode libraries can't be recompiled with matching options (a "shim" or wrapper could perhaps be used in the interim, but that is not ideal).  Hopefully Broadcom could be coaxed to help produce compatible libraries. :)

Another concern: unless I am mistaken, this may make this a "custom" ARM dialect for just the RPi, and we may not be able to run "normal" armv5tel binaries (or maybe there is Linux "magic" for compatibility with multiple ABIs).  This may not be a concern (and it is not like there won't be other RPi Linux versions to run these apps).  Also, I'm not positive, but I think RPi armv6-hardfloat applications will probably be compatible with the normal armhf (because ARMv7 is backwards compatible with ARMv6).  Obviously, the RPi won't be compatible with the normal Debian-armhf apps either (or we wouldn't be having this conversation).

Since some of this work could be done on emulators (or other ARM systems), this may be a good thing to work on while we all await the "ding" from the oven (or doorbell) and our fresh Pi is ready. :)

Perhaps it is already underway somewhere?
Posts: 5
Joined: Mon Mar 05, 2012 9:20 am
by plugwash » Mon Mar 05, 2012 3:33 pm
Basically the situation for Debian armhf is the same as the situation for Ubuntu. You would have to change the compiler defaults (probably not too hard), then rebuild EVERYTHING, then do lots of testing to see if any ARMv7 code accidentally slipped through the net.

Debian is always natively compiled, so to rebuild everything you will need to buy a number of reasonably fast ARM boards with as much RAM as possible*. The current recommendation seems to be the Freescale i.MX53 Quick Start board. The Panda has a faster CPU but lacks native SATA and native Ethernet, which apparently causes stability problems under the heavy storage/network load that a buildd encounters. A group of 6 i.MX53 boards (the number used to bootstrap the Debian armhf port, though more have been added since) plus hard drives and mounting hardware will probably cost you around £1K.

The distro you produce will still be armhf in the sense that it will use the same calling convention, but obviously official Debian armhf binaries won't run on the Pi. It will not be directly compatible with Debian armel, though multiarch may allow both to be mixed on the same system.

Once you have bought the hardware you will then need to learn how to set up the various debian archive management and autobuilding tools on it.

I'm not aware of anyone doing this. If you intend to do so then PLEASE post your intentions to debian-arm@lists.debian.org to avoid duplicated effort.

* Which right now basically seems to mean 1 gigabyte.
Moderator
Posts: 1964
Joined: Wed Dec 28, 2011 11:45 pm
by mpthompson » Mon Mar 05, 2012 7:20 pm
Thank you for the detailed responses to my question.  They both gave me a much better understanding of the situation regarding Debian armhf and the Raspberry Pi.

Supporting a one-off version of Debian and all associated packages on armhf sounds like quite a project.

I do have another question.  With Raspberry Pi using Debian armel, it is my understanding the hardware FPU will still automatically be used for floating point operations, but that floating point parameters will be passed via integer registers on the CPU rather than the registers in the FPU.  This means that the Raspberry Pi will automatically gain a speed increase with floating point operations being performed in hardware, it's just not as optimal as armhf where floating point parameters can be passed in the FPU registers as well.  Is this perception correct on my part?  Or, does the Raspberry Pi under Debian armel continue to use software floating point operations until something special is done?
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA
by dom » Mon Mar 05, 2012 8:22 pm
mpthompson said:


Or, does the Raspberry Pi under Debian armel continue to use software floating point operations until something special is done?


Unfortunately debian uses software emulated floating point for its packages.

There is a big speed advantage for the first distribution to be built with hardware floating point enabled.
Moderator
Posts: 3858
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by Xark » Mon Mar 05, 2012 8:41 pm
Yes, thanks plugwash for the info.  That was very informative.

Mpthompson, I believe all the RPi Linux distributions will use the hardware floating point unit for the actual float operations.  It is just the extra overhead of moving the parameters to and from the integer registers for each function call that a hard-float port addresses.

I believe it was done this way so that ABI is compatible regardless of the presence of a FPU (on systems without a FPU the library code can do "soft-float" if a FPU is not available).  Lots of small ARM systems have no FPU and this way the binaries are compatible.

Non hard-float isn't the end of the world, but is a bit irksome to "waste" performance just because your ABI doesn't match your hardware properly.

I am interested in the RPi as an open "homebrew" video game console (without the "grey area" on commercial consoles) and I could see hardfloat making a difference for some games.

I wonder if there are enough "optimization enthusiasts" to make an RPi hardfloat distribution a reality.  As plugwash says, it is a big undertaking, but maybe there is enough demand on Raspberry Pi for it to make sense.  I know I would be willing to donate time and money to help if other people are also interested.
Posts: 5
Joined: Mon Mar 05, 2012 9:20 am
by dom » Mon Mar 05, 2012 9:16 pm
Xark said:

Mpthompson, I believe all the RPi Linux distributions will use the hardware floating point unit for the actual float operations.  It is just the extra overhead of moving the parameters to and from the integer registers for each function call that a hard-float port addresses.


No.

You can compile code to use hardware floating point instructions (-mfpu=vfp). Otherwise it will emulate them with integer instructions.

You can also allow arguments to be passed in floating point registers (-mfloat-abi=hard). Otherwise it will have to pass them in integer registers.

The first of these options has a huge effect on floating point performance. The second a smaller effect.

Debian and the other distributions (Fedora and ArchLinux) use neither of these.

There is a Gentoo distribution that does enable both these options, and can in theory be much faster. However packages get built from source on your own machine (which is very slow) and you are more likely to have build issues than on a prebuilt, pretested distribution.

-mfpu=vfp -mfloat-abi=hard
Moderator
Posts: 3858
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by Xark » Mon Mar 05, 2012 9:32 pm
Wow, that is really surprising.  So you mean the current RPi distributions use all software floating point emulation for all the applications?

That is much worse than just not using hardfloat ABI and will seriously kill any floating point performance (as you mention). :(

However, like you said, at least for individual apps you can always compile them "properly" for the RPi (to use vfp) and get most of the speed back (other than using integer registers for parameter passing).  I would kind of hope the RPi distros would at least do this for the large apps that use floating point (or libraries).

Thanks for the heads-up!
Posts: 5
Joined: Mon Mar 05, 2012 9:20 am
by shirro » Mon Mar 05, 2012 9:54 pm
Xark said:


Wow, that is really surprising.  So you mean the current RPi distributions use all software floating point emulation for all the applications?


I wouldn't mind seeing a minimal RPi specific distro put together just aimed as a base for demos or standalone OpenGL apps (e.g. games) that just had a kernel, blob, OpenGL, busybox, libc, 128/128 split and not much else.
Posts: 248
Joined: Tue Jan 24, 2012 4:54 am
by plugwash » Mon Mar 05, 2012 10:04 pm
mpthompson said:


Supporting a one-off version of Debian and all associated packages on armhf sounds like quite a project.


It's not a trivial project but I think it's within reach of a small group of sufficiently motivated and knowledgeable people. Many of the official Debian ports seem to be run by pretty small groups of people.
Moderator
Posts: 1964
Joined: Wed Dec 28, 2011 11:45 pm
by Xark » Mon Mar 05, 2012 10:33 pm
shirro said:


Xark said:


Wow, that is really surprising.  So you mean the current RPi distributions use all software floating point emulation for all the applications?


I wouldn't mind seeing a minimal RPi specific distro put together just aimed as a base for demos or standalone OpenGL apps (e.g. games) that just had a kernel, blob, OpenGL, busybox, libc, 128/128 split and not much else.


Yes, this is very similar to what I was envisioning for a "game console" run-time also.  However, I would still think a matching "development distribution" with game related "SDK" (SDL, compiler & tools etc.) to be used to develop and test the games would be desirable.  Ideally this would be native but it would be nice to allow a Linux cross-dev setup too (use RPi as "target" with remote debugging).
Posts: 5
Joined: Mon Mar 05, 2012 9:20 am
by dom » Mon Mar 05, 2012 10:41 pm
shirro said:


Xark said:

I wouldn't mind seeing a minimal RPi specific distro put together just aimed as a base for demos or standalone OpenGL apps (e.g. games) that just had a kernel, blob, OpenGL, busybox, libc, 128/128 split and not much else.

I think openELEC looks a lot like that. I believe it is built with hardware fp (but not the ABI), is busybox, supports openGL, 128/128 split and has the libs needed by xbmc (quite a lot..)
Moderator
Posts: 3858
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by mpthompson » Tue Mar 06, 2012 3:01 am
plugwash said:

It's not a trivial project but I think it's within reach of a small group of sufficiently motivated and knowledgeable people. Many of the official Debian ports seem to be run by pretty small groups of people.

[Ugh!!! Connection errors in these forums make replies difficult.  Hopefully this gets through.]

Plugwash, thanks again for all the useful information.  You seem to know more than a bit about managing a Debian port.  Do you have a link to a document that describes what it takes to initiate such a project? I guess directing my questions to debian-arm@lists.debian.org would also yield answers in this regard.  There is a chance I would have an interest in funding and spearheading such a project.

Looks like an i.MX53 Quick Start board and a 2 terabyte SATA hard drive can be had for a little under $300 together (hard disk prices are terrible right now).  Add mounting hardware and other stuff and a cluster of six could be built for less than $2500, perhaps under $2000.  Of course, I imagine the bandwidth required to support such a cluster building all the Debian packages would be a significant expense and would very likely require something more than just a home cable Internet connection.
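The arithmetic behind that estimate can be checked with a quick sketch. The node count and per-node price come from the figures above; treating $300 as an upper bound for each board-plus-drive pair is an assumption, and the function names are made up for illustration.

```python
# Back-of-envelope check of the cluster estimate above.
# The per-node price and node count come from the post; treating $300 as an
# upper bound per board-plus-drive pair is an assumption.

NODES = 6
PRICE_PER_NODE_USD = 300      # board + 2 TB SATA drive, "a little under $300"


def cluster_cost(nodes, price_per_node, extras=0):
    """Total cost of the nodes plus a lump sum for mounting hardware etc."""
    return nodes * price_per_node + extras


def extras_budget(total_budget, nodes, price_per_node):
    """Money left for mounting hardware and other extras under a budget."""
    return total_budget - cluster_cost(nodes, price_per_node)


if __name__ == "__main__":
    print(cluster_cost(NODES, PRICE_PER_NODE_USD))          # 1800
    print(extras_budget(2000, NODES, PRICE_PER_NODE_USD))   # 200
    print(extras_budget(2500, NODES, PRICE_PER_NODE_USD))   # 700
```

So six nodes come to at most $1800, leaving somewhere between $200 and $700 for mounting hardware and everything else, depending on which ceiling is used.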

My preference would be for a Debian distro tuned to the RPi, as that is what I'm most familiar with.  I know there are other fine Linux distributions, but the Debian community hasn't let me down yet.  If Debian has things set up so that an RPi tuned dist could be made in essentially a turnkey fashion, that would further my preference.
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA
by plugwash » Tue Mar 06, 2012 10:57 am
The hard drives don't need to be as big as 2TB, the official armhf buildds are running on 160-250GB laptop drives.

http://blog.einval.com/debian/.....s.comments

I don't know exactly how high the package churn in the Debian repositories is, but I'm pretty sure it's low enough that you could keep up with it using a normal DSL/cable connection.

The first step would be to set up a package archive and import the Debian source packages (but not the binary packages). This is the part I'm not sure on, as I've never done it personally and I didn't get involved with the armhf port until it was already in the official archive.

Then you would have to import sufficient binaries to your repo to make it possible to bootstrap a chroot and install build-essential. You want to keep this set as small as possible because every binary in it is a possible contamination risk for your new archive.

Then you would have to upload modified compiler packages with new defaults to your repo.

Then you would have to set up wanna-build and buildd to first "binnmu"* the binary packages you imported and then start building the rest of the archive.

You would also probably want to set up some system to import changes to your repo from the official repos, but ONLY for packages you have not manually modified. I know this is possible because Ubuntu do it, but I don't know the details of how.
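The "only unmodified packages" rule can be sketched as a simple filter. This is purely a hypothetical illustration: the function, package names and version strings are made up, it is not how Ubuntu's tooling works, and real code would compare Debian versions properly (for example with `dpkg --compare-versions`) rather than by string inequality.

```python
# Hypothetical sketch of the "import only unmodified packages" rule.
# Package names and version strings are made up; real tooling would compare
# Debian versions properly instead of testing for plain inequality.

def packages_to_import(upstream, local, locally_modified):
    """Return the names of packages whose upstream version differs from ours
    and which we have not modified by hand."""
    return sorted(
        name
        for name, version in upstream.items()
        if name not in locally_modified and local.get(name) != version
    )


if __name__ == "__main__":
    upstream = {"gcc-4.6": "4.6.3-1", "bash": "4.2-2", "zlib": "1.2.6-1"}
    local = {"gcc-4.6": "4.6.2-1rpi1", "bash": "4.2-1", "zlib": "1.2.6-1"}
    modified = {"gcc-4.6"}   # compiler carries our changed defaults
    print(packages_to_import(upstream, local, modified))
```

Here the modified compiler is held back even though upstream has moved on, so a person has to merge it by hand.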

You also have to decide which debian distribution to target, wheezy or sid. Wheezy obviously has lower churn and is what will become the next stable release but it puts you further from the bleeding edge.

If you are serious about this please do join the debian-arm mailing list and make contact there. I know roughly how this stuff works but I wouldn't class myself as an expert on it; there are people FAR more knowledgeable about it than me on that mailing list.

* A "binnmu" is a rebuild with the same source code and a +b<something> suffix on the binary package version to identify that it has been rebuilt for some reason.
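To make the suffix concrete: in Debian practice the +b<number> marker goes on the end of the binary package's version string and counts up with each rebuild. The helper below is a minimal sketch of that numbering, with a made-up function name; it is not taken from wanna-build or any other Debian tool.

```python
import re

# Hypothetical helper showing how binNMU version suffixes count up.
# It only manipulates the version string; it is not part of any Debian tool.


def next_binnmu_version(version):
    """Return the version string for the next binNMU of a package."""
    match = re.search(r"\+b(\d+)$", version)
    if match:
        # Already a binNMU: bump the trailing counter.
        count = int(match.group(1)) + 1
        return version[: match.start()] + "+b" + str(count)
    # First rebuild of this source version.
    return version + "+b1"


if __name__ == "__main__":
    print(next_binnmu_version("1.2-3"))      # 1.2-3+b1
    print(next_binnmu_version("1.2-3+b1"))   # 1.2-3+b2
```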
Moderator
Posts: 1964
Joined: Wed Dec 28, 2011 11:45 pm
by mpthompson » Tue Mar 06, 2012 6:41 pm
plugwash said:

You also have to decide which debian distribution to target, wheezy or sid. Wheezy obviously has lower churn and is what will become the next stable release but it puts you further from the bleeding edge.

If you are serious about this please do join the debian-arm mailing list and make contact there. I know roughly how this stuff works but I wouldn't class myself as an expert on it; there are people FAR more knowledgeable about it than me on that mailing list.


Thanks again for the detailed information.  I joined the debian-arm mailing list last night and will start to ask some questions there.  It looks like it will still take another month to two months to get my hands on actual RPi hardware so it gives me time to study up on what might be involved and think more about this.
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA
by Steve-o » Tue Mar 06, 2012 8:27 pm
Sorry if this is a stupid question, but what is wrong with cross-compiling packages on a standard PC? I have been using OpenWRT for my MIPSel router a while and cross-compiling has never caused any problems for me.
Posts: 15
Joined: Tue Mar 06, 2012 8:24 pm
by plugwash » Tue Mar 06, 2012 8:41 pm
Cross building requires support from the package's build system. The build system must maintain a careful distinction between "host architecture" and "build architecture", and if anything has to be run during the build it will have to be built twice.

Debian has always built their packages natively, while there has been some attempt to add cross building support there is no requirement for packages in the official archive to support it so if you want to cross-build debian you are going to have to do a lot of buildsystem patching.

Finally, cross building means you can't run build-time test suites.
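A small sketch of the build/host distinction may help. In Debian terminology the "build" architecture is the machine doing the compiling and the "host" architecture is where the resulting binaries run. The GNU triplets below are the standard ones for these Debian architectures, but the helper functions are hypothetical, and the test-suite rule deliberately ignores emulation.

```python
# Hypothetical sketch of the build/host distinction in Debian terminology:
# "build" is the machine doing the compiling, "host" is where the result runs.

GNU_TRIPLETS = {
    "amd64": "x86_64-linux-gnu",
    "armel": "arm-linux-gnueabi",
    "armhf": "arm-linux-gnueabihf",
}


def is_cross_build(build_arch, host_arch):
    """A build is a cross build when it targets a different architecture."""
    return build_arch != host_arch


def compiler_command(build_arch, host_arch):
    """Pick a plain or triplet-prefixed compiler name for the host."""
    if is_cross_build(build_arch, host_arch):
        return GNU_TRIPLETS[host_arch] + "-gcc"
    return "gcc"


def can_run_test_suite(build_arch, host_arch):
    """Freshly built host binaries only run directly on a native build
    (assumption: no emulator such as QEMU is involved)."""
    return not is_cross_build(build_arch, host_arch)


if __name__ == "__main__":
    print(compiler_command("amd64", "armhf"))     # arm-linux-gnueabihf-gcc
    print(compiler_command("armhf", "armhf"))     # gcc
    print(can_run_test_suite("amd64", "armhf"))   # False
```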
Moderator
Posts: 1964
Joined: Wed Dec 28, 2011 11:45 pm
by mpthompson » Wed Mar 07, 2012 1:32 am
plugwash said:

Debian has always built their packages natively, while there has been some attempt to add cross building support there is no requirement for packages in the official archive to support it so if you want to cross-build debian you are going to have to do a lot of buildsystem patching.

Does QEMU ARM count as cross-compiling?  I believe QEMU supports VFP and the other necessary CPU features to run armhf code, and getting a straight armel version of Debian is pretty easy under QEMU.  If so, QEMU could potentially be useful to compile at least a preliminary kernel and the packages that would be required to bootstrap the build servers, to minimize the risk of ARMv7 stuff making its way into the RPi ARMv6 tuned armhf packages.

Of course, QEMU runs like a dog even on my quad-core desktop system, so I certainly wouldn't want to use it for anything other than building what might be needed at a minimum to bootstrap things.
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA
by PuFFiN » Fri Mar 09, 2012 9:51 am
Is it possible to just add support at compile time for hardware floats, but individually for apps?

(dom suggested code just needs certain options to support fp)

Quote "No. You can compile code to use hardware floating point instructions (-mfpu=vfp). Otherwise it will emulate them with integer instructions."

Is there a huge benefit to be gained for a whole O.S. to have floats? Perhaps rather focus efforts to enable hardware float on apps that need it? just a thought.

- Gavin
Posts: 3
Joined: Sun Mar 04, 2012 4:59 am
by plugwash » Fri Mar 09, 2012 10:15 am
There are two separate but interrelated issues: use of the floating point unit by the code generator (which does not in itself affect the ABI) and use of FPU registers to pass floating point parameters (which obviously changes the ABI).

You can enable the former without the latter (-mfpu=vfp -mfloat-abi=softfp) but it means that the compiler must move floating point arguments to integer registers or the stack (depending on parameter position) to pass them to functions. AIUI moving floating point values to integer registers is a relatively expensive operation (still faster than using softfloat though).

OTOH if you enable both (-mfpu=vfp -mfloat-abi=softfp) then the overhead of moving floating point values to integer registers is avoided but because you have changed the ABI the whole distro needs to be rebuilt.
Moderator
Posts: 1964
Joined: Wed Dec 28, 2011 11:45 pm
by mpthompson » Fri Mar 09, 2012 5:06 pm
plugwash said:

OTOH if you enable both (-mfpu=vfp -mfloat-abi=softfp) then the overhead of moving floating point values to integer registers is avoided but because you have changed the ABI the whole distro needs to be rebuilt.

Don't you mean (-mfpu=vfp -mfloat-abi=hard) will change the ABI, requiring the whole distro to be rebuilt?  I believe armel already has -mfloat-abi=softfp set, which allows individual packages to be recompiled with floating point instructions while still keeping backwards compatibility via the ABI with non-optimized packages.

As an aside, do we know what combination of optimization flags the Fedora Remix is using?  Is it (-mfpu=vfp -mfloat-abi=hard) for maximum code efficiency?  I haven't found a firm answer to this.

BTW, I'm spending my time these days learning about what it takes to build Debian packages and such.  I still have a lot to learn to even begin to form an opinion myself about what the best way for me to proceed might be.  I'm at the point now where I was successful in using pbuilder to backport a needed package so I'm making some progress in understanding the various tools (not that pbuilder would necessarily be used).

I'll admit I'm a bit discouraged as the whole process looks overwhelming from the perspective of someone new to the intricacies of what it takes to create a Linux distribution and package/build management.  Given all the work already done on the Fedora Remix and their commitment to push it forward into mainline Fedora, I am definitely looking into going that route to support the projects I accomplish with the RPi.  It's just a shame that I'm considering abandoning my favorite Linux distribution to do something with the RPi.

-mfpu=vfp -mfloat-abi=hard
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA
by plugwash » Sat Mar 10, 2012 1:59 am
mpthompson said:


plugwash said:


OTOH if you enable both (-mfpu=vfp -mfloat-abi=softfp) then the overhead of moving floating point values to integer registers is avoided but because you have changed the ABI the whole distro needs to be rebuilt.


Don't you mean (-mfpu=vfp -mfloat-abi=hard) will change the ABI requiring the whole distro to be rebuilt?


Yes I do, sorry for the error.

To make things absolutely clear:

-mfloat-abi=soft uses software floating point (regardless of the value of -mfpu)

-mfloat-abi=softfp uses hardware floating point but keeps the softfloat calling convention

-mfloat-abi=hard uses hardware floating point and a calling convention that keeps values in floating point registers.
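The three settings can also be written down as a small lookup table. This is only a sketch of the summary above: the dictionary layout and the compatibility helper are hypothetical, and "compatible" here just means that two pieces of code agree on where floating point arguments are passed.

```python
# Sketch of the three -mfloat-abi settings summarised above.
# The table layout and helper name are made up for illustration.

FLOAT_ABI = {
    "soft":   {"fpu_instructions": False, "fpu_argument_registers": False},
    "softfp": {"fpu_instructions": True,  "fpu_argument_registers": False},
    "hard":   {"fpu_instructions": True,  "fpu_argument_registers": True},
}


def calling_convention_matches(abi_a, abi_b):
    """Two objects can call each other's float functions only if they agree
    on whether arguments travel in integer or FPU registers."""
    a = FLOAT_ABI[abi_a]["fpu_argument_registers"]
    b = FLOAT_ABI[abi_b]["fpu_argument_registers"]
    return a == b


if __name__ == "__main__":
    print(calling_convention_matches("soft", "softfp"))   # True
    print(calling_convention_matches("softfp", "hard"))   # False
```

This is why a single package can be rebuilt with softfp on an otherwise soft-float system, while moving to hard means rebuilding everything.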
Moderator
Posts: 1964
Joined: Wed Dec 28, 2011 11:45 pm
by mpthompson » Sat Mar 10, 2012 4:49 am
This weekend I'm probably going to go ahead and order a single Freescale i.MX53 Quick Start Board and a hard drive so I can begin playing with different compiler options and such.

So far, to play around with Debian ARM, I've been using an HP MV2120 Media Vault that I recently purchased for $20 on eBay.  The nice thing about these is that they have a Marvell 88F5182-A2 Orion ARM processor that does a decent job of running the armel version of Debian.  The MV2120 isn't the fastest box, but it's a definite upgrade from my lowly Slug.  However, I really need something on which I can work with the different floating point options.

With the i.MX53 Quick Start on order, I'll then need to start looking to get Debian armhf installed and running on it.  Then I can begin starting to turn theory into practice with the various approaches to creating an RPi optimized version of Debian.

Question: with the previously released version of Debian for the RPi, there were a number of non-free libraries for accessing the VideoCore GPU.  Is the source available for these libraries so they can be recompiled?  I certainly hope so.
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA
by dom » Sat Mar 10, 2012 9:55 am
No, the source is not available, but GitHub has prebuilt versions with soft and hard ABIs, so it's quite possible to create a hard ABI distribution.
Moderator
Posts: 3858
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by mpthompson » Sun Mar 11, 2012 3:52 am
dom said:


No, the source is not available, but GitHub has prebuilt versions with soft and hard ABIs, so it's quite possible to create a hard ABI distribution.


Terrific.  I'll cross my fingers that if/when I get to need them they'll work with the hard ABI without problem.
Moderator
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA