Consequences of softfloat


21 posts
by Chromatix » Sun Jun 24, 2012 9:35 am
I've had an R-Pi running well enough to be able to try several distros confidently, and currently it is shamefully clear that Debian is *far* too slow in comparison to most.

A direct comparison between Gentoo (running with hardfloat) and Debian (softfloat) can be made using Abiword. The Windows version of Abiword lists a 486 as a minimum requirement, so a 700MHz ARM should be entirely sufficient. I tried it on an old ThinkPad with a 700MHz Pentium-3, which gave very promising results.

And indeed under Gentoo, everything seems to be fine. I can type things and they appear immediately. I haven't tried making any kind of complex document yet, but I have a ready-made project to try later.

But under Debian, I type and it takes a while to catch up with me. Every keystroke is registered, but it takes 2-3 times as long as my typing speed for it all to appear on screen. It is essentially unusable.

My theory here is that Abiword is using a lot of floating-point to perform the text layout and display. Because this is done without the aid of the FPU under Debian, it takes hundreds of times longer than it should - much more than the difference between a 700MHz ARM and a basic 486.

Try it and see. You will be horrified.
The key to knowledge is not to rely on people to teach you it.
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by mdewey » Sun Jun 24, 2012 9:42 am
Dear Chromatix

What happens if you try Raspbian which does use hardware floating point?

Michael
Posts: 37
Joined: Wed Dec 07, 2011 10:47 am
Location: UK
by mrlinux2u » Sun Jun 24, 2012 5:27 pm
@OP

Just tried Abiword in a fully updated Raspbian (Pisces edition), and it too is painfully slow (too slow to use properly).

It looks like a Debian issue in general, so I'm going to try it in Arch and see how that fares.

cheers

mrlinux2u
Posts: 174
Joined: Sat Sep 24, 2011 8:38 pm
by jamesh » Sun Jun 24, 2012 5:34 pm
Have you got Abiword set up the same in both (for example, does one have auto spell checking turned on)?
Soon to be employed engineer - Hurrah! Volunteer at the Raspberry Pi Foundation, helper at PiAcademy September 2014.
Raspberry Pi Engineer & Forum Moderator
Posts: 11930
Joined: Sat Jul 30, 2011 7:41 pm
by Chromatix » Sun Jun 24, 2012 5:40 pm
I didn't get around to trying Raspbian yet, but I was able to type and edit a thousand-word essay on my Gentoo version, after rebuilding it with spellchecking (etc.) support. It's still slow and clunky in places, but it worked sufficiently well to not be totally embarrassing, and in particular it still kept up with my typing. The Debian version has spellchecking support, but I hadn't turned it on.

Mind you, I had to borrow a keyboard from another computer to do it. The first one I tried doesn't work at all under X11, the second is reliable but too small for serious typing, and this third one is one of my extra-nice ones for serious work (a Cherry G83). This also means that the typing speed it was keeping up with was higher than I was using for Debian.
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by dom » Sun Jun 24, 2012 11:16 pm
Chromatix wrote: I didn't get around to trying Raspbian yet, but I was able to type and edit a thousand-word essay on my Gentoo version,


I'd be interested in the results of this. I'd expect performance to be on a par between Gentoo and Raspbian, but if Gentoo is noticeably quicker, then that would be worth investigating.

I did try a very brief test of Abiword and Gnumeric under Raspbian and both seemed usable (albeit slow compared to a powerful PC).
But maybe it gets slower after typing a thousand words, maybe I'm more patient, or maybe I'd overclocked substantially...
Raspberry Pi Engineer & Forum Moderator
Posts: 4042
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by Chromatix » Mon Jun 25, 2012 9:47 pm
Okay, after spending quite a lot of time getting Raspbian set up, a preliminary test suggests the situation is little improved over plain Debian. Specifically, it does not keep up with my typing when given a trivial couple of sentences to process.

There are another couple of nasty bugs visible, but those are probably due to my haste in setting up Raspbian. I might take a second run at it to confirm more reliably, but not tonight.

Spellcheck is turned on here, but so it is in Gentoo. It should not make this much of a difference anyway.
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by jamesh » Mon Jun 25, 2012 9:57 pm
Have you tried any other apps as a comparison, to see if this slow down on Debian is limited to Abiword, or is more general?
Raspberry Pi Engineer & Forum Moderator
Posts: 11930
Joined: Sat Jul 30, 2011 7:41 pm
by Chromatix » Mon Jun 25, 2012 10:15 pm
Well, the usual maths-heavy suspects (e.g. XaoS) are obviously slow on Debian, as a direct consequence of not using the FPU. XaoS copes relatively well in autopilot mode, but it clearly has less horsepower to work with than on a properly optimised system. Abiword just surprised me a lot when I tried it, because it is specifically supposed to be lightweight, and people are actively recommending it as a more appropriate alternative to Open/LibreOffice.

I haven't specifically compared very many things with Gentoo yet, partly because building things into Gentoo tends to take a while. For example, if I start building Firefox now, it probably won't be finished by the time I get back from the office tomorrow. That would however make a very good point of comparison, since it's easy to run a few page loads and a JavaScript benchmark. A difference in performance on the same order as with Abiword would be very easy to spot.

Other worthwhile things to try, I think, are Gnumeric and Inkscape to round out the typical office-type applications. Inkscape comes with a few example pictures which are complex enough to make a good workout. These two would probably even take less time to build in Gentoo than Firefox.
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by dom » Mon Jun 25, 2012 10:33 pm
@Chromatix

What's your theory on why Gentoo runs better than Raspbian (for Abiword)?

It's clear why Gentoo is faster than standard Debian, but comparing to Raspbian is more interesting.

I would imagine that compiler options will be similar for Gentoo and Raspbian.
I assume Gentoo runs a newer source tree (more like Debian Sid)? That may be significant (Wheezy is significantly faster than Squeeze running Midori due to improvements in the source tree).
Anything else different? Is Gentoo using LXDE? Is there more free RAM on Gentoo? I assume everything else is the same, e.g. memory split, config.txt options?
Raspberry Pi Engineer & Forum Moderator
Posts: 4042
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by Chromatix » Tue Jun 26, 2012 2:20 pm
The most explainable difference might be that I'm using LXDE on both Debian and Raspbian, but Blackbox on Gentoo. Does LXDE use a compositor?

If that's not it, I'll have to poke at it with a profiler.
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by Chromatix » Thu Jun 28, 2012 7:55 am
I retested using Blackbox on both distros and videoed the results:

http://youtu.be/3Sfk2rT2KzI
http://youtu.be/1UbnsKUVA5E

Note that Raspbian is showing some graphical artefacts as well as being much slower. I still need to investigate using a profiler.
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by Chromatix » Sat Jun 30, 2012 3:31 pm
Well, both oprofile and valgrind are missing in action. I found time to do a different kind of analysis though - looking at the selection of instructions used. My hunch turns out to be correct.

objdump -d /usr/lib/arm-linux-gnueabihf/libabiword-2.9.so | cut -s -f3 | sort -u | less


This command lists all the instructions used in the given binary. It's a *long* list, but near the bottom are most of the FPU instructions - and VCVT jumped out at me. That's a VFPv3 instruction, for Cortex series CPUs, not available on the VFPv2 that we have here.

So we've got code compiled for the wrong CPU architecture. It still works thanks to emulation of the bad instructions, but emulation is inefficient which is why it's slow in some cases.

It's not just Abiword that is affected: libm.so also has VCVT, and even VMOV.U8, which is a NEON instruction.
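
For anyone repeating this, a variant of the command that counts how often each floating-point or SIMD mnemonic appears may be easier to scan than the full list. This is only a sketch: the function name `fp_mnemonic_counts` is made up for illustration, and it assumes objdump prints tab-separated columns and uses the unified `v...` mnemonics for VFP/NEON instructions.

```shell
# Count FP/SIMD mnemonics in `objdump -d` output read from stdin.
# Assumes tab-separated columns (address, encoding, mnemonic, operands)
# and unified-syntax mnemonics that start with "v" (vcvt, vmov, ...).
fp_mnemonic_counts() {
    awk -F'\t' '
        NF >= 3 {
            m = $3
            sub(/ +$/, "", m)          # drop any trailing padding
            if (m ~ /^v/) n[m]++       # tally only v-prefixed mnemonics
        }
        END { for (m in n) print n[m], m }
    ' | sort -rn                        # most frequent first
}

# Example (library path as in the command above):
# objdump -d /usr/lib/arm-linux-gnueabihf/libabiword-2.9.so | fp_mnemonic_counts
```

The counts only say which mnemonics occur, not which operand forms, so a high count is a hint about where to look rather than proof of a problem.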
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by dom » Sat Jun 30, 2012 3:53 pm
Chromatix wrote: It still works thanks to emulation of the bad instructions, but emulation is inefficient which is why it's slow in some cases.


Are you sure about this? armv7 code typically aborts with an illegal instruction exception. I wasn't aware of any emulation of unsupported instructions.

However this is certainly interesting information, and mpthompson/plugwash should double check their compile flags.
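
One quick way to check the flags without rebuilding anything might be to read the ARM build attributes that the toolchain records in each binary. The sketch below filters `readelf -A` output down to the CPU and FPU tags. The helper name `fp_build_attrs` is hypothetical, and the exact tag spellings and values vary between binutils versions, so treat the pattern as an assumption.

```shell
# Show the ARM build attributes that describe the CPU/FPU a binary targets.
# Reads `readelf -A` output from stdin. Older binutils print Tag_VFP_arch
# where newer ones print Tag_FP_arch, so both spellings are matched.
fp_build_attrs() {
    grep -E 'Tag_(CPU_arch|FP_arch|VFP_arch|Advanced_SIMD_arch|ABI_VFP_args):'
}

# Example (library path as in the objdump command earlier in the thread):
# readelf -A /usr/lib/arm-linux-gnueabihf/libabiword-2.9.so | fp_build_attrs
```

For a build aimed at the Pi, one would hope to see something like an ARMv6 CPU architecture and VFPv2, with no Advanced SIMD tag. The attributes reflect what the compiler and assembler were told to target, so hand-written assembly could still contain other instructions.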
Raspberry Pi Engineer & Forum Moderator
Posts: 4042
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by Chromatix » Sat Jun 30, 2012 4:01 pm
I'm getting more detailed information now which says that *some* forms of VCVT are already in VFPv2. This is vexing, as the differences between the forms that are allowed and the ones that aren't are not revealed by the above command.

I'm still looking at this, but double-checking the flags is a good idea anyway. If they can get oprofile built, that would be even better.
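
A possible way to tell the forms apart is to keep the operand column as well as the mnemonic. The sketch below rests on an assumption worth double-checking against the ARM documentation: that the VCVT forms missing from VFPv2 are the fixed-point conversions, which take an immediate fraction-bits operand (`#N`), while the plain integer/float and single/double conversions are fine. Half-precision conversions are not covered. The function name `vcvt_fixed_point` is made up for illustration.

```shell
# From `objdump -d` output on stdin, print vcvt instructions whose operands
# include an immediate ("#..."), i.e. the fixed-point conversion forms.
# Assumption: those forms are the ones VFPv2 lacks; plain int/float and
# single/double conversions are taken to be available on VFPv2.
vcvt_fixed_point() {
    awk -F'\t' '
        NF >= 4 && $3 ~ /^vcvt/ && $4 ~ /#/ {
            addr = $1
            gsub(/[ :]/, "", addr)     # tidy the address column
            print addr, $3, $4
        }
    '
}

# Example (library path as in the objdump command above):
# objdump -d /usr/lib/arm-linux-gnueabihf/libabiword-2.9.so | vcvt_fixed_point
```

If this prints nothing for a given library, the VCVT instructions in it are probably not the problem, at least under the assumption stated above.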
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by Chromatix » Sat Jun 30, 2012 7:18 pm
Okay, now I *really* want oprofile.

I just happened to look at top when typing, and nearly all the CPU time was being used by Xorg, not by Abiword itself. In my experience this usually has to do with pixman, but it will be nearly impossible to determine exactly what's going on without a decent profile.

However I did manage to get an X11 protocol trace. There are some humongous Trapezoids calls being made there, though that's not a watertight case against them being the culprit. Generally, trapezoids are not very fast, and most likely an ARMv6-relevant optimisation I once made to that routine isn't there.
Posts: 430
Joined: Mon Jan 02, 2012 7:00 pm
Location: Helsinki
by dom » Sat Jun 30, 2012 11:27 pm
Chromatix wrote: Okay, now I *really* want oprofile.


I just downloaded the oprofile source from http://oprofile.sourceforge.net/download, and:
sudo apt-get install libpopt-dev binutils-dev
./configure --with-kernel-support
make
sudo make install


and no problems. Took a couple of hours.
Raspberry Pi Engineer & Forum Moderator
Posts: 4042
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by Max » Sun Jul 01, 2012 3:31 am
http://packages.debian.org/wheezy/abiword
Package: abiword (2.9.2+svn20120603-1)
2012-06-03 sounds like last month's experimental development tree?

http://packages.gentoo.org/package/app-office/abiword
2.8.6-r2
2.8.6 is stable version from 2010?


Suggest you compile the exact same Abiword version on both distributions first, and see if there are still significant differences in performance before anything more time consuming.
by cleverca22 » Sat Aug 25, 2012 5:51 am
did you ever get oprofile to work? I've got it compiled, but it seems to not record any samples
Posts: 168
Joined: Sat Aug 18, 2012 2:33 pm
by dom » Sat Aug 25, 2012 11:14 am
cleverca22 wrote: did you ever get oprofile to work? I've got it compiled, but it seems to not record any samples

I've had it working. Did you try the timer=1 option?
http://oprofile.sourceforge.net/doc/det ... eters.html
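
For reference, a timer-mode session with the legacy opcontrol tools might look roughly like the sketch below. It assumes an oprofile build that still ships `opcontrol` and `opreport`, that it is run as root, and that the kernel provides the oprofile module. The function names and the `DRY_RUN` switch are invented for this example.

```shell
# Sketch: profile in timer-interrupt mode with the legacy opcontrol tools.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

oprofile_timer_session() {
    seconds="${1:-30}"                 # sampling time; 30 is an arbitrary default
    run opcontrol --deinit             # unload the module if it is already loaded
    run modprobe oprofile timer=1      # reload it, forcing timer-interrupt mode
    run opcontrol --init
    run opcontrol --no-vmlinux         # skip kernel symbol resolution
    run opcontrol --start
    run sleep "$seconds"               # exercise the slow application meanwhile
    run opcontrol --shutdown           # stop sampling and flush the data
    run opreport --symbols             # per-symbol breakdown of the samples
}

# Example: preview the commands without touching the system.
# DRY_RUN=1 oprofile_timer_session 60
```

Timer mode only samples on the timer tick, so the result is a plain time profile; event-based profiling with the hardware counters needs the module loaded without `timer=1`.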
Raspberry Pi Engineer & Forum Moderator
Posts: 4042
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge
by cleverca22 » Sat Aug 25, 2012 5:11 pm
yep, forcing the timer interrupt does work, but i kinda wanted to use the dozens of performance counters the cpu had
i can work with this for now :)
Posts: 168
Joined: Sat Aug 18, 2012 2:33 pm