hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Experimental enhanced X driver (rpifb)

Fri May 31, 2013 8:59 pm

Hello,

I've made available the initial version of an experimental, slightly enhanced X graphics driver for the Raspberry Pi, based on the standard fbdev driver used by Raspbian and the sunxifb driver for the Allwinner platform.

This driver doesn't try anything fancy like using DMA; the improvements are purely due to a few CPU optimizations. Because the speed of X applications on the RPi is generally CPU-bound, without even considering graphics operations, don't expect miracles. However, with the shadow framebuffer disabled this driver should show smoother performance when dragging windows. The merits of disabling the shadow framebuffer are complicated; there exists a kernel patch to optionally enable write-through caching for the framebuffer, which would improve performance when the shadow framebuffer is not used.

The driver is available at https://github.com/hglm/xf86-video-rpifb.git

The framework is in place to add accelerated blit or fill functions using DMA, but this probably requires an enhanced kernel DMA driver to be stable. This driver could serve as a basis for experimentation in this area, being less ambitious and less complicated than previous efforts for an accelerated X driver (see the Accelerated X driver thread).

Refer to the README for installation instructions.

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Tue Jun 04, 2013 3:23 pm

Hi hglm,

Having a better optimized Xorg driver for Raspberry Pi would definitely be nice. Unfortunately, after running a few benchmarks, it looks like any solution involving reading the uncached framebuffer with the CPU (for the window dragging and scrolling workload) is going to be between 3 and 4 times slower than on Cortex-A class processors (TI OMAP, Allwinner A10, Samsung Exynos, ...). And this is not enough for a really smooth user experience.

BTW, if anyone wants to try the write-through cached framebuffer with shadowfb disabled, here is a patch for the current rpi-3.6.y branch: https://github.com/ssvb/linux-rpi/commi ... c3ecd3922b
Still IMHO the best solution for Raspberry Pi would be a non-busylooping, IRQ-aware dmaer kernel module rewrite. So that scrolling and dragging the windows is not pegging the CPU. Just start where @teh_orph left it, implement only what is really beneficial and optimize the hell out of it :)

And if you are up to trying it, proper X11 EGL support would definitely be the most useful feature. Right now the developers are forced to use non-portable hacks to make use of GLESv2 acceleration in their applications. This results in some wasted effort, a bit of a vendor lock-in effect, and fragmentation, because the 3D accelerated applications developed for Raspberry Pi can't be compiled for other devices without modifications, and the other way around.

hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Re: Experimental enhanced X driver (rpifb)

Tue Jun 04, 2013 10:54 pm

Reading from the framebuffer may be relatively slow, but I am nevertheless seeing a reasonable improvement over the default fbdev driver, which is not optimized in this regard. A byte transfer rate of 100-150MB/s can be attained when copying windows within the framebuffer. This should help the user experience a little when dragging windows. Also, by default the fbdev driver seems to use a slow path for rightwards overlapping screen blits, which are sped up respectably using the two-pass approach from the sunxifb driver.
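To illustrate the overlapping-blit problem, here is a minimal sketch, assuming the two-pass approach means staging each chunk in a small scratch buffer before writing it out. This is not the sunxifb code; the function name `two_pass_copy` and the `chunk` size are hypothetical. For a rightward move the chunks are visited back to front, so no unread source bytes are overwritten, while each chunk itself is still read and written in the fast forward direction.

```python
def two_pass_copy(fb: bytearray, src: int, dst: int, size: int, chunk: int = 64) -> None:
    """Move `size` bytes inside `fb` from offset `src` to offset `dst`.

    Safe for overlapping regions. Each chunk is read forward into a
    scratch buffer (pass 1) and then written forward to its destination
    (pass 2). Hypothetical illustration, not the real driver code.
    """
    if size <= 0 or src == dst:
        return
    if dst < src:
        # Leftward move: visit chunks front to back.
        starts = range(0, size, chunk)
    else:
        # Rightward move: visit chunks back to front so that source
        # bytes that have not been read yet are never overwritten.
        last = ((size - 1) // chunk) * chunk
        starts = range(last, -1, -chunk)
    for off in starts:
        n = min(chunk, size - off)
        scratch = bytes(fb[src + off:src + off + n])  # pass 1: read
        fb[dst + off:dst + off + n] = scratch         # pass 2: write


if __name__ == "__main__":
    row = bytearray(b"0123456789abcdef")
    two_pass_copy(row, 0, 4, 10, chunk=4)  # shift ten bytes right by four
    print(row)
```

On real hardware the benefit would come from keeping the reads sequential; in Python the sketch only demonstrates that the ordering is correct.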

Thanks for posting the link for the write-through cached framebuffer patch, I might experiment with it. As for DMA, it seems to be a fairly difficult challenge. Having looked at @teh_orph's code, I am not even sure robust IRQ-driven DMA has been implemented anywhere in the RPi kernel. Developing would also be hard with the risk of SD card corruption etc. But once a robust kernel driver is available, it would be relatively easy to fill in the stubs for accelerated fills and blits that are already present in my current driver.

As for EGL, it would be nice to have X11 support although the number of applications that use it is relatively limited. Integrating EGL support into an X driver is not really my area of expertise though. That said, if it can be done on the sunxi platform then why not for RPi.

I have updated the driver which adds another small improvement in framebuffer blit speed.

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Wed Jun 05, 2013 1:36 am

hglm wrote:Reading from the framebuffer may be relatively slow, but I am nevertheless seeing a reasonable improvement over the default fbdev driver, which is not optimized in this regard. A byte transfer rate of 100-150MB/s can be attained when copying windows within the framebuffer.
I can see the following performance results on different ARM devices: https://github.com/ssvb/xf86-video-sunx ... e512bf5c58
Raspberry Pi can only get ~114MB/s in a 700MHz non-overclocked setup. Which translates to just ~30FPS when scrolling 1280x720 screen with 32bpp color depth. For FullHD resolution the results are going to be even worse. The other ARM hardware is penalized significantly less when doing blits which rely on uncached reads by the CPU.
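The frame-rate estimate above can be checked with a little arithmetic. This is only a sketch: it assumes "MB" means 10^6 bytes and that a full-screen scroll copies every pixel exactly once. The function name `blit_fps` is made up for the example.

```python
def blit_fps(copy_rate_bytes_per_s: float, width: int, height: int,
             bytes_per_pixel: int) -> float:
    """Full-screen copies per second achievable at a given copy rate."""
    frame_bytes = width * height * bytes_per_pixel
    return copy_rate_bytes_per_s / frame_bytes


if __name__ == "__main__":
    rpi_copy_rate = 114e6  # ~114MB/s, the figure quoted in the post
    for name, (w, h) in {"1280x720": (1280, 720), "1920x1080": (1920, 1080)}.items():
        print(f"{name} @ 32bpp: {blit_fps(rpi_copy_rate, w, h, 4):.1f} FPS")
```

Under these assumptions the 720p case comes out at roughly 31 FPS and FullHD at roughly 14 FPS.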
hglm wrote:As for DMA, it seems to be a fairly difficult challenge, having looked at @teh_orph's code, I am not even sure robust IRQ-driven DMA has been implemented anywhere in the RPi kernel. Developing would also be hard with risk of SD card corruption etc. But once a robust kernel driver is available, it would be relatively easy to fill in the stubs for accelerated fills and blits that are already present in my current driver.
For DMA in general, it's better to refer to the documentation, then to the https://github.com/raspberrypi/linux/bl ... 2708/dma.c and https://github.com/raspberrypi/linux/bl ... -bcm2708.c sources. The SDHCI code is probably the best (and apparently the only!) example of the use of DMA IRQ.

I don't know if dmaer avoids using IRQ for the DMA transfer completion notification because the DMA hardware is not robust, or just because it intentionally uses a different design (prefers chaining multiple DMA requests?). Maybe it's better to ask the author.

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Wed Jun 05, 2013 1:42 am

hglm wrote:The merits of disabling the shadow framebuffer are complicated
I'm probably going to write a detailed blog post about the shadow framebuffer performance implications and summarize possible solutions.

hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Re: Experimental enhanced X driver (rpifb)

Thu Jun 06, 2013 9:18 pm

I've noticed the default LXDE/openbox window manager configuration in Raspbian will show lag when dragging windows regardless of low-level graphics operations speed improvements. It seems to do a lot of shadowing or simply continuously issues new redraw requests when dragging windows. Does anyone know whether there are configuration options that can be tried to improve the "feel" of the default environment?

In any case, it is possible to run openbox in a standard configuration by creating the file .xinitrc in your home directory containing the single line:

openbox

In this configuration, things like dragging windows directly use graphics operations such as blits and more accurately reflect the graphics driver speed (dragging windows is much smoother).

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Thu Jun 13, 2013 9:27 pm

hglm wrote:I've noticed the default LXDE/openbox window manager configuration in Raspbian will show lag when dragging windows regardless of low-level graphics operations speed improvements. It seems to do a lot of shadowing or simply continuously issues new redraw requests when dragging windows. Does anyone know whether there are configuration options that can be tried to improve the "feel" of the default environment?
Yes, it's related to the generation of expose events, which ask clients to redraw the lost content: http://www.df.unipi.it/~moruzzi/xlib-pr ... nts_expose
X11 has been in use since ancient times and is designed to work while consuming very little RAM. The applications may still provide backing-store and save-under hints to tell the X server that caching some content may be beneficial for performance at the expense of some memory waste.

Right now the save-under attribute is not supported anymore, but backing-store is still emulated via window redirection in automatic mode by the composite extension. Actually, in a composited desktop, every window is redirected to an offscreen pixmap. The applications render content on these offscreen pixmaps and it's the job of a compositing window manager to present them on screen, optionally adding some fancy effects such as transparency. But we don't want to have the full compositing manager overhead, at least not when performance matters.

I have tried to implement the backing-store allocation heuristics in the ddx driver itself: https://github.com/ssvb/xf86-video-sunx ... 225aa2900f
There may be other strategies for deciding whether to use backing-store or not. Maybe enable backing store only for the windows which generate too many expose events? Maybe disable backing store for all top level non-obscured windows and not just the one having input focus? The heuristics probably still can be tweaked.
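One of the alternative strategies mentioned above, enabling backing store only for windows that generate too many expose events, could look roughly like the following. This is a hypothetical sketch of the decision logic only, not code from the linked driver: the class name, the threshold and the time window are all invented for illustration, and a real DDX driver would hook this into the X server's window callbacks.

```python
from collections import defaultdict, deque


class ExposeRateHeuristic:
    """Suggest backing store for windows with a high recent expose rate.

    Hypothetical illustration: `max_events` and `period` are made-up
    tuning knobs, not values taken from any real driver.
    """

    def __init__(self, max_events: int = 20, period: float = 1.0):
        self.max_events = max_events
        self.period = period
        self._events = defaultdict(deque)  # window id -> expose timestamps

    def _trim(self, timestamps: deque, now: float) -> None:
        # Drop expose events that fall outside the sliding time window.
        while timestamps and now - timestamps[0] > self.period:
            timestamps.popleft()

    def record_expose(self, window_id: int, now: float) -> None:
        timestamps = self._events[window_id]
        timestamps.append(now)
        self._trim(timestamps, now)

    def wants_backing_store(self, window_id: int, now: float) -> bool:
        timestamps = self._events[window_id]
        self._trim(timestamps, now)
        return len(timestamps) > self.max_events
```

A sliding window keeps the decision reversible: once a window stops being exposed repeatedly, it falls back below the threshold and its backing store could be released again.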

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Thu Jun 13, 2013 9:39 pm

Now DMA is really important for perfecting window dragging and scrolling. Even without waiting for IRQ completion notification, DMA should be a big improvement in terms of memory bandwidth in spite of the CPU pointlessly spinning while DMA is active. Unfortunately, it looks like http://www.raspberrypi.org/phpBB3/viewt ... 63&t=28294 is already dead, so no assistance can probably be expected there.

hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Re: Experimental enhanced X driver (rpifb)

Thu Jun 13, 2013 10:04 pm

ssvb wrote:I have tried to implement the backing-store allocation heuristics in the ddx driver itself.
There may be other strategies for deciding whether to use backing-store or not. Maybe enable backing store only for the windows which generate too many expose events? Maybe disable backing store for all top level non-obscured windows and not just the one having input focus? The heuristics probably still can be tweaked.
Yes, there could be room for further optimization there, to avoid the overhead of backing store for more windows especially when hardware acceleration is available. In any case, your implementation already provides a very nice improvement in the "feel" of an X environment like LXDE, and that includes the default Raspbian set-up.

To anyone reading this and considering trying the rpifb driver mentioned in this thread: the rpifb driver has now more or less been obsoleted by new features in the xf86-video-sunxifb driver (https://github.com/ssvb/xf86-video-sunxifb.git) on which it was based. The sunxifb driver compiles and installs out-of-the-box on the RPi using the fairly straightforward installation instructions from the rpifb driver README, while providing a readily felt speed improvement for Raspbian's LXDE desktop.

IMHO this proves that the "try wayland because X is bad" talk is a bit of a red herring, because it is now proven that optimizations within the X driver can make X on the Pi more palatable.

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Fri Jun 14, 2013 9:00 am

hglm wrote:IMHO this proves
Well, it's too early to claim that it is already "proven" to the rest of the forum readers ;) Many people would prefer an easily installable package for raspbian or maybe even a whole SD card image with everything preinstalled. A youtube video with a demonstration would be also nice.

But first we need to come up with a better driver name. It's indeed controversial to use the driver named "sunxifb" (which implies Allwinner) on Raspberry Pi. Maybe something like xf86-video-armfbdev or xf86-video-fbturbo would work better for the unified optimized driver? Any other suggestions are welcome.
hglm wrote:that the "try wayland because X is bad" talk is a bit of a red herring, because it is now proven that optimizations within the X driver can make X on the Pi more palatable.
Dragging windows is only a part of the driver functionality. But a fair re-match with Wayland in the windows dragging contest would be very interesting. Which is still better to be done only after DMA support is added.

asb
Forum Moderator
Posts: 853
Joined: Fri Sep 16, 2011 7:16 pm

Re: Experimental enhanced X driver (rpifb)

Fri Jun 14, 2013 6:33 pm

ssvb wrote:
hglm wrote:IMHO this proves
But first we need to come up with a better driver name. It's indeed controversial to use the driver named "sunxifb" (which implies Allwinner) on Raspberry Pi. Maybe something like xf86-video-armfbdev or xf86-video-fbturbo would work better for the unified optimized driver? Any other suggestions are welcome.
fbturbo makes sense to me.

Jim Manley
Posts: 1600
Joined: Thu Feb 23, 2012 8:41 pm
Location: SillyCon Valley, California, and Powell, Wyoming, USA, plus The Universe

Re: Experimental enhanced X driver (rpifb)

Sat Jun 15, 2013 8:02 am

hglm wrote:IMHO this proves that the "try wayland because X is bad" talk is a bit of a red herring, because it is now proven that optimizations within the X driver can make X on the Pi more palatable.
This completely misses the point that X cannot be accelerated via the GPU due to its fundamental pixel-oriented nature that goes back three decades for historical reasons that are no longer valid today in an age of hardware-accelerated graphics that go well beyond pixels. The Wayland Weston implementation specifically uses frame acceleration features that the GPU supports and this eliminates the need for the ARM CPU to have to manipulate those elements. That frees up the CPU to shovel more data at higher speeds between the Ethernet/USB bus and the rest of the SoC, which is its primary purpose in life, despite the inexplicably romantic notions of some people.

No amount of optimization of X code running on the CPU can possibly outperform that functionality being shifted to the GPU. This isn't about anyone disliking or bashing X, it's about moving as many graphics manipulations as possible to where they belong, on the GPU, which is not named a graphics processing unit just for fun. It's beyond me why anyone would want to keep wasting execution cycles on the CPU for features that could be better performed on the GPU, especially when it enables those CPU cycles to be used to execute other functionality at higher levels of performance.
The best things in life aren't things ... but, a Pi comes pretty darned close! :D
"Education is not the filling of a pail, but the lighting of a fire." -- W.B. Yeats
In theory, theory & practice are the same - in practice, they aren't!!!

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Sat Jun 15, 2013 12:25 pm

Jim Manley wrote:This completely misses the point that X cannot be accelerated via the GPU due to its fundamental pixel-oriented nature that goes back three decades for historical reasons that are no longer valid today in an age of hardware-accelerated graphics that go well beyond pixels.
That's not quite true, but I'm not going to argue with you here.
Jim Manley wrote:The Wayland Weston implementation specifically uses frame acceleration features that the GPU supports and this eliminates the need for the ARM CPU to have to manipulate those elements.
Except that the ARM CPU does not really need to manipulate these elements in the first place, unless you enable compositing for getting window translucency effects. Compositing is a mandatory part of Wayland/Weston architecture, so you get these effects as a "free" payload. But if you just want accelerated dragging and scrolling for the boring rectangular opaque windows, the good old X11 is also fine.

I believe that somebody could even implement some sort of a DispManX based compositing window manager for X11 (with some hacks all over the place and maybe a new X11 extension to make it really work). But I'm not sure that all this eye candy is worth the effort. In any case, you are better off starting with proper X11 EGL support, which would allow running some existing GLES accelerated compositing window managers.
Jim Manley wrote:No amount of optimization of X code running on the CPU can possibly outperform that functionality being shifted to the GPU. This isn't about anyone disliking or bashing X, it's about moving as many graphics manipulations as possible to where they belong, on the GPU, which is not named a graphics processing unit just for fun. It's beyond me why anyone would want to keep wasting execution cycles on the CPU for features that could be better performed on the GPU, especially when it enables those CPU cycles to be used to execute other functionality at higher levels of performance.
The point is that there are a bunch of X11 applications around: browsers, text editors, image editors, document viewers, mail clients, terminal emulators, ...

You basically have the following choices:
1. Try to make X11 applications faster today by adding some optimizations and fixing really obvious performance bottlenecks.
2. Alternatively you may want to wait for the whole new Wayland + GLES software ecosystem to emerge and replace all this legacy software.

On the GLES front, the Qt5 demos are quite promising. And the browser efforts to migrate to GLES are also interesting, but they are not fully there yet (I mean full GLES graphics pipeline for everything, not just WebGL and pinch-zoom layers compositing). But the progress is not fast enough to make me really optimistic about it. IMHO the current Raspberry Pi hardware may become obsolete long before this transition is complete.

hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Re: Experimental enhanced X driver (rpifb)

Sat Jun 15, 2013 11:02 pm

I do realize that the GPU is the most powerful feature on the RPi and mostly wasted at the moment, and that X is not the way forward with regard to fully utilizing it. My comment was aimed at the general notion that the slowness of X on the Pi is completely unaddressable and that Wayland would be the only hope for improvement. I am just glad that some improvements in the feel of the X environment do appear to be possible, and can be gained by simply replacing a driver, instead of having to switch to a new developing ecosystem with its current drawbacks in practical usability.

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Mon Jun 17, 2013 4:23 pm

After thinking a bit about the best way to integrate DMA acceleration, the easiest way seems to be to just add DMA to fbcon copyarea function and introduce a new ioctl for accessing it from the userspace: https://github.com/ssvb/linux-rpi/commi ... a-20130617
It is partially based on the old code from dom: http://www.raspberrypi.org/phpBB3/viewt ... 425#p62425

And it can be used in the Xorg driver in the following way: https://github.com/ssvb/xf86-video-sunx ... 14f40a9d58
This should provide a 2-3 times speedup for moving windows (as estimated by running benchmarks).
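For readers unfamiliar with how a userspace driver would reach such a kernel copyarea function, here is a rough sketch. It relies on `struct fb_copyarea` from `<linux/fb.h>` (six 32-bit fields: dx, dy, width, height, sx, sy) and the generic `_IOW()` request encoding used on ARM. The request code itself is an assumption: the type character and number below are placeholders, and the real values must be taken from the linked kernel patch. The helper names are invented for the example.

```python
import fcntl
import struct

# struct fb_copyarea from <linux/fb.h>: dx, dy, width, height, sx, sy (all __u32)
FB_COPYAREA_FMT = "=6I"


def iow(type_char: str, nr: int, size: int) -> int:
    """Encode an _IOW() request number (asm-generic layout, as on ARM)."""
    ioc_write = 1
    return (ioc_write << 30) | (size << 16) | (ord(type_char) << 8) | nr


# ASSUMPTION: placeholder request code, for illustration only. Use the
# definition from the kernel patch that actually adds the ioctl.
HYPOTHETICAL_FBIOCOPYAREA = iow("z", 0x21, struct.calcsize(FB_COPYAREA_FMT))


def pack_copyarea(sx: int, sy: int, dx: int, dy: int,
                  width: int, height: int) -> bytes:
    """Build the fb_copyarea argument in the field order the kernel expects."""
    return struct.pack(FB_COPYAREA_FMT, dx, dy, width, height, sx, sy)


def copy_area(fb_fd: int, sx: int, sy: int, dx: int, dy: int,
              width: int, height: int) -> None:
    """Ask the kernel to copy a rectangle within the framebuffer.

    Only works on a kernel that implements the (hypothetical) request code.
    """
    fcntl.ioctl(fb_fd, HYPOTHETICAL_FBIOCOPYAREA,
                pack_copyarea(sx, sy, dx, dy, width, height))
```

A driver would open `/dev/fb0` once, call something like `copy_area()` for each window move, and fall back to a CPU copy if the ioctl fails on an unpatched kernel.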

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Mon Jun 17, 2013 4:30 pm

ssvb wrote:After thinking a bit about the best way to integrate DMA acceleration, the easiest way seems to be to just add DMA to fbcon copyarea function and introduce a new ioctl for accessing it from the userspace: https://github.com/ssvb/linux-rpi/commi ... a-20130617
It is partially based on the old code from dom: http://www.raspberrypi.org/phpBB3/viewt ... 425#p62425
As an additional bonus, the framebuffer console itself may also benefit from it. However it looks like there are some boot logo related bugs which cripple scrolling acceleration (the 'logo_shown' variable misbehaves): http://article.gmane.org/gmane.linux.fbdev.user/677

I tried to debug it a bit, but the logic behind the 'logo_shown' variable is too convoluted for me. In any case, seems like it is enough to just start and stop X server to reset the state of framebuffer console. And guess what? It starts using copyarea for scrolling (and mostly sits on 'bcm_dma_wait_idle', waiting for DMA transfer completion) :)


After reboot run "perf record cat bigtext.txt > /dev/tty1"

    69.03%      cat  [kernel.kallsyms]  [k] cfb_imageblit                  
     9.33%      cat  [kernel.kallsyms]  [k] fbcon_redraw.isra.10           
     8.33%      cat  [kernel.kallsyms]  [k] bit_putcs                      
     2.96%      cat  [kernel.kallsyms]  [k] _cond_resched                  
     2.38%      cat  [kernel.kallsyms]  [k] bitfill_aligned                
     2.18%      cat  [kernel.kallsyms]  [k] console_conditional_schedule   
     1.03%      cat  [kernel.kallsyms]  [k] fbcon_putcs                    

Then start/stop Xorg and try again:

    78.28%      cat  [kernel.kallsyms]  [k] bcm_dma_wait_idle              
     7.10%      cat  [kernel.kallsyms]  [k] fbcon_redraw_blit.isra.8       
     2.99%      cat  [kernel.kallsyms]  [k] _cond_resched                  
     2.61%      cat  [kernel.kallsyms]  [k] bitfill_aligned                
     2.25%      cat  [kernel.kallsyms]  [k] console_conditional_schedule   
     1.66%      cat  [kernel.kallsyms]  [k] bcm2708_fb_copyarea            
     1.52%      cat  [kernel.kallsyms]  [k] cfb_imageblit                  
     0.64%      cat  [kernel.kallsyms]  [k] do_con_write.part.19           
     0.46%      cat  [kernel.kallsyms]  [k] bcm_dma_start                  

hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Re: Experimental enhanced X driver (rpifb)

Mon Jun 17, 2013 5:10 pm

That's impressive. It looks like a nice step forward for the RPi user experience. The bonus of accelerated console scrolling is very welcome too. The only drawback is that it requires a kernel patch, but I guess it should be possible to get it integrated into the standard Raspbian kernel within reasonable time (also because the patch seems to be fairly clean, since it only changes the fbconsole related stuff).

I think I am correct when I assume that despite spinning on bcm_dma_wait_idle much of the time, the actual CPU utilization for console framebuffer scrolling goes way down with the patch because the actual scrolling operation completes much faster, so it's a big win despite the lack of IRQ-based DMA completion. This also applies to the X driver of course.

I will have to compile the kernel and try it for myself. :)

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Mon Jun 17, 2013 10:04 pm

hglm wrote:I think I am correct when I assume that despite spinning on bcm_dma_wait_idle much of the time, the actual CPU utilization for console framebuffer scrolling goes way down with the patch because the actual scrolling operation completes much faster, so it's a big win despite the lack of IRQ-based DMA completion.
There is only a minimal performance improvement for framebuffer console scrolling (just something like ~20% faster for 1280x720 at 32bpp and only marginally faster for 16bpp). My understanding is that fbcon tries to avoid reading back from the framebuffer by default and instead draws text at the new location (fbcon_putcs -> ... -> cfb_imageblit). But if there is a hardware accelerated copyarea function, then it is taken into use. None of these alternative methods has a really decisive advantage. Though if we try to enable IRQ for DMA later, then the reduced CPU load might be a very nice thing to have.

In the case of the X server it is an entirely different story, because we are replacing the slow CPU reads from the framebuffer with a much faster DMA. This helps a lot.

hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Re: Experimental enhanced X driver (rpifb)

Mon Jun 17, 2013 10:29 pm

ssvb wrote: There is only a minimal performance improvement for framebuffer console scrolling (just something like ~20% faster for 1280x720 at 32bpp and only marginally faster for 16bpp). My understanding is that fbcon tries to avoid reading back from the framebuffer by default and instead draws text at the new location (fbcon_putcs -> ... -> cfb_imageblit). But if there is a hardware accelerated copyarea function, then it is taken into use. None of these alternative methods has a really decisive advantage. Though if we try to enable IRQ for DMA later, then the reduced CPU load might be a very nice thing to have.
OK, in that case the situation is different. I should have realized that the framebuffer uses imageblit (just writing the text from font data, no copying), which is fairly efficient if done right (especially since the fill/memory write bandwidth of the RPi is pretty good). And actually, I can see a disadvantage of using DMA copies for framebuffer console scrolling: it uses a lot of the total system memory bandwidth (reading + writing). Even with IRQ enabled, concurrently running programs may still be starved for memory bandwidth because the DMA copying takes it away, especially considering the small CPU cache size on the RPi. In that regard, the CPU imageblit implementation is less wasteful.
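The memory bandwidth argument can be made concrete with a back-of-the-envelope calculation. This is a deliberately simplified sketch: it assumes a DMA scroll reads and writes roughly the whole visible frame, while a CPU redraw via imageblit only writes it, and it ignores font data reads and the line that scrolls off screen. The function name is hypothetical.

```python
def scroll_traffic(width: int, height: int, bytes_per_pixel: int) -> dict:
    """Approximate bytes moved over the memory bus for one console scroll."""
    frame_bytes = width * height * bytes_per_pixel
    return {
        "dma_copy": 2 * frame_bytes,  # read the old location + write the new one
        "cpu_redraw": frame_bytes,    # write only (text redrawn from font data)
    }


if __name__ == "__main__":
    for method, nbytes in scroll_traffic(1280, 720, 4).items():
        print(f"{method}: {nbytes / 1e6:.1f} MB per scroll")
```

Under these assumptions the DMA copy moves about twice as many bytes per scroll as the redraw path, which is the source of the concern about starving other programs of memory bandwidth.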

However, for the X driver it is indeed a pure win.

ssvb
Posts: 112
Joined: Sat May 19, 2012 6:15 pm

Re: Experimental enhanced X driver (rpifb)

Mon Jun 17, 2013 11:11 pm

hglm wrote:And actually, I can see a disadvantage of using DMA copies for framebuffer console scrolling: it uses a lot of the total system memory bandwidth (reading + writing). Even with IRQ enabled, concurrently running programs may still be starved for memory bandwidth because the DMA copying takes it away, especially considering the small CPU cache size on the RPi. In that regard, the CPU imageblit implementation is less wasteful.
That's a good point, but I don't think it directly applies here. The CPU imageblit implementation is unsurprisingly fully occupying the CPU while it is doing its stuff, so the concurrently running programs are totally out of luck and are just waiting to be scheduled. Even if less memory bandwidth is used, it is still going to be wasted. And with IRQ enabled DMA, the concurrently running applications at least are going to still have a chance to do something useful at the same time.

dom
Raspberry Pi Engineer & Forum Moderator
Posts: 5340
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge

Re: Experimental enhanced X driver (rpifb)

Mon Jun 17, 2013 11:38 pm

ssvb wrote:
hglm wrote:And actually, I can see a disadvantage of using DMA copies for framebuffer console scrolling: it uses a lot of the total system memory bandwidth (reading + writing). Even with IRQ enabled, concurrently running programs may still be starved for memory bandwidth because the DMA copying takes it away, especially considering the small CPU cache size on the RPi. In that regard, the CPU imageblit implementation is less wasteful.
That's a good point, but I don't think it directly applies here. The CPU imageblit implementation is unsurprisingly fully occupying the CPU while it is doing its stuff, so the concurrently running programs are totally out of luck and are just waiting to be scheduled. Even if less memory bandwidth is used, it is still going to be wasted. And with IRQ enabled DMA, the concurrently running applications at least are going to still have a chance to do something useful at the same time.
The DMA controller has a WAITS field, which can be used to reduce the memory bandwidth (by taking longer), if this were a problem (but I doubt it will be - the DMA is quick and will not saturate the memory bus for long).

hglm
Posts: 30
Joined: Fri May 31, 2013 8:24 pm

Re: Experimental enhanced X driver (rpifb)

Tue Jun 18, 2013 12:47 am

ssvb wrote: That's a good point, but I don't think it directly applies here. The CPU imageblit implementation is unsurprisingly fully occupying the CPU while it is doing its stuff, so the concurrently running programs are totally out of luck and are just waiting to be scheduled. Even if less memory bandwidth is used, it is still going to be wasted. And with IRQ enabled DMA, the concurrently running applications at least are going to still have a chance to do something useful at the same time.
I see, I was wrongly assuming that the kernel CPU imageblit function would be so optimized that it would almost be a pure fill (in which case it would take a shorter time than scrolling via DMA). This would be the case on a much faster CPU and if the imageblit function was more highly optimized (looking at the kernel code there is room for optimization there). Also, if there was more than one CPU core, the memory bandwidth argument would be more applicable.

Then there's also the point that imageblit causes more CPU cache thrashing than the DMA operating code, which in principle has a very low cache footprint.

Jim Manley
Posts: 1600
Joined: Thu Feb 23, 2012 8:41 pm
Location: SillyCon Valley, California, and Powell, Wyoming, USA, plus The Universe

Re: Experimental enhanced X driver (rpifb)

Tue Jun 18, 2013 6:22 am

ssvb wrote:But if you just want accelerated dragging and scrolling for the boring rectangular opaque windows, the good old X11 is also fine.
As has already been demonstrated, the Weston implementation already handles non-rectangular as well as boring old rectangular windows, as well as providing alpha channel support. Forget that with any existing or prospective implementation of X on the Pi - "I'm from Missouri, show me." (for the uninitiated, this is an American Midwestern motto used by plain-speaking curmudgeons who are not easily taken for fools).
ssvb wrote:I believe that somebody could even implement some sort of a DispManX based compositing window manager for X11 (with some hacks all over the place and maybe a new X11 extension to make it really work). But I'm not sure that all this eye candy is worth the efforts. In any case, you are better off starting with a proper X11 EGL support, which would allow to run some existing GLES accelerated compositing window managers.
Your belief system seems to have no problem forecasting that such a possibility (that no one has seen any reason to take up) requires less effort than the alternative that's actually in progress.
ssvb wrote:The point is that there are a bunch of X11 applications around: browsers, text editors, image editors, document viewers, mail clients, terminal emulators, ...
For which very few in the Pi's target demographic (secondary school students, teachers, and parents) have any need. The Pi is not aimed at the general-purpose marketplace dominated by uninformed business drones or the Nerdocracy. Have you ever wondered why X-based technology and applications have never accrued more than a single-digit percentage of market share on desktops and laptops, and virtually nothing in the mobile device arena? How many of the 30,000+ Raspbian packages have or will ever be loaded by the intended Pi users? Earth to Nerdocracy: "The Pi was not developed for you. If you are unhappy with it, please move on, there's nothing more for you to see here."
ssvb wrote:On the GLES front, the Qt5 demos are quite promising. And the browser efforts to migrate to GLES are also interesting, but they are not fully there yet (I mean full GLES graphics pipeline for everything, not just WebGL and pinch-zoom layers compositing). But the progress is not fast enough to make me really optimistic about it. IMHO the current Raspberry Pi hardware may become obsolete long before this transition is complete.
You (among many others, so don't feel bad) still don't understand the purpose of the Pi and why that purpose has nothing to do with the endless, traditional, for-profit race-to-the-bottom where last year's models are tossed on the trash heap so the executives and marketing department can justify rolling out this year's shiny new toys, along with collecting the accompanying shiny new bonus checks and stock earnings. It was only 18 months ago that many were vociferously doubting that the Pi would ever exist, then there were the jeers about production delays, followed by groans about its ARMv6 CPU, and now the continued inappropriate attempts to port things to the Pi that are a waste of time. The longevity of the Pi has been severely underestimated because clock speed isn't the only criterion on which a system should be judged. As I've posted elsewhere, many people forget that the lowly Apple ][ was on the market for 16 (no, that's not a typo, SIXTEEN) years, including the venerable //e model with just a one MHz 6502 CPU and a maximum of 48KB of RAM.

The reason for the longevity of the Apple ][ is that schools (and homes where students could use the same software) buying Apple ][s had no need for the latest-and-greatest-priced shiny baubles (initially S-100 boat anchor systems, then early PCs and clones). A consistent, predictable platform is much more important to severely budget-constrained school districts that can't afford every year to toss out the hardware, software, and most importantly, teacher experience, just to keep some computing industry fat-cats in the lifestyles to which they have become accustomed. The Pi was not developed for the Nerdocracy just so they could load too much junk on it and then complain it lacks performance - that's why the shiny stuff is sold that pays for executives' Ferraris and Citations.

What we can hope is that the Nerdocracy moves on to the Next Next Thing as soon as possible, as millions-selling lightning like the Pi only strikes infrequently, but that won't keep the teeming masses from trying to anoint something else as officially being Cool. Once they do move on, then we can remain bore-sighted on the business of completing the tools and content really needed by students and educators, and that's not where you and others are focused. Educational institutions haven't even started buying the Pi in bulk yet, in large part because the Nerdocracy sucked up so much of the supply, making consistent supply to institutions completely unreliable. There is plenty of time to deliberately design whatever may become the next generation of the Pi, with backward compatibility being the top priority, along with maintaining, and perhaps even reducing, the prices. The concept has been proven valid beyond the Foundation's wildest dreams, so the only risk going forward would be developing something that makes continued use of existing Pii problematic; fragmenting the Pi market would be death for the platform. Sometimes, "better" really is the enemy of good enough.
The best things in life aren't things ... but, a Pi comes pretty darned close! :D
"Education is not the filling of a pail, but the lighting of a fire." -- W.B. Yeats
In theory, theory & practice are the same - in practice, they aren't!!!

asb
Forum Moderator
Posts: 853
Joined: Fri Sep 16, 2011 7:16 pm

Re: Experimental enhanced X driver (rpifb)

Tue Jun 18, 2013 9:31 am

Jim, maybe your post gives the wrong impression accidentally (or I am reading it wrong), but I really don't understand why you appear to be discouraging developments such as this. Apologies if I misinterpreted your point.
Jim Manley wrote: Earth to Nerdocracy: "The Pi was not developed for you. If you are unhappy with it, please move on, there's nothing more for you to see here."
The Pi was developed for learners, educators, hackers, *doers*. This sort of comment is appropriate for people moaning about perceived Raspberry Pi limitations, but I feel it's totally misplaced when directed towards those who have a problem and take steps to fix it. This is of course a perfect example of the hacker ethos to "scratch your own itch", and without such hackers the Pi would never have existed, nor would we have the continued success and growing community.

I think we can all agree X has some fundamental limitations and difficulties that simply wouldn't be worth fixing. I believe targeted optimisations to a derivative of the xorg fbdev DDX is pragmatic, sensible, and valuable to many users regardless of whether in the future we move to a primarily Weston/Wayland world.

Jim, if you want to continue to discuss matters separate to the technical details of rpifb, perhaps you could create a new topic to do so?

Jim Manley
Posts: 1600
Joined: Thu Feb 23, 2012 8:41 pm
Location: SillyCon Valley, California, and Powell, Wyoming, USA, plus The Universe

Re: Experimental enhanced X driver (rpifb)

Tue Jun 18, 2013 7:09 pm

asb wrote:I think we can all agree X has some fundamental limitations and difficulties that simply wouldn't be worth fixing. I believe targeted optimisations to a derivative of the xorg fbdev DDX is pragmatic, sensible, and valuable to many users regardless of whether in the future we move to a primarily Weston/Wayland world.
Jim, if you want to continue to discuss matters separate to the technical details of rpifb, perhaps you could create a new topic to do so?
I would strongly suggest folks look up the concept of "opportunity cost" - it's what can't be accomplished because resources are being expended on other things that aren't along the path of greatest overall benefit. People keep antique vehicles running for the love of it and that's fine, but you would be very ill-advised to drive an original Ford Model T out in the high-speed lane of a superhighway. As some may know, I help maintain, operate, and present the Babbage Difference Engine in Silicon Valley, but I would be a complete fool to think that anything practical could be accomplished on it beyond the educational value of demonstrating the embodiment of the most basic computing fundamentals in ways that are impossible to see in modern electronic products.

X was developed to solve a problem that no longer exists - network distribution of bit-mapped screen elements is at its very core - and it is a very complicated mess because there have been so many chefs involved in making the special sauce, each with their own agenda (whether conscious or not). It's way past the point in time to expend all available effort on developing software that takes full advantage of GPUs since hardware-accelerated graphics have been available for going on 30 years (ironically, nearly as long as X has been around, albeit exorbitantly expensive for the first decade, or so). If people are genuinely interested in learning computing graphics technology other than for historical edification, they should be working with vectors, not bit-maps, and preferably in 3-D because, for those who may not have noticed, that's the kind of world in which we actually live. We only started using 2-D metaphors in GUIs because that's all the hardware would support with reasonable responsiveness decades ago. Virtually every computer with integrated video output manufactured within the last 10 years (and most made within the last 15 ~ 20 years) contains the hardware necessary to support advanced user interface capabilities.

This is analogous to continuing to scan printed word processor documents into fax machines connected to wired plain old telephone system (POTS) lines in a world dominated by fully electronic documents with encrypted signatures, fiber optics, and packet-routing networks. Do we want to have fax machines in museums hooked up to demonstrate what life was like in the Goode Olde Dayes? Sure, but do we need to continue figuring out ways to make fax machines communicate faster? I'll leave you to answer that question and then reflect on whether improving three decades old technology by an incremental fraction is a worthwhile expenditure of time, or whether it's time to move on to take advantage of hardware acceleration that's been around for many years and routinely going to waste. I completely understand the desire to work with the familiar rather than the unknown, but sometimes you just need to reinvent your world. Otherwise, we'd all still be sitting around in cold, dank caves shared with all manner of nasty critters while trembling at the lightning outside and not taking advantage of something called fire that occasionally starts as a result of those scary electrical discharges.

In any case, enjoy and I won't rain on your parade any more. Goodness knows there are much more questionable expenditures of effort going on that there aren't enough hours in the day to become aware of, much less call out for rational consideration. This project just happens to be in an area where I've been doing work for over 40 years and it bothers me to see mistakes being made in a strategic sense while the participants believe the tactical gains are meaningful. It gets to my challenge made in other threads for everyone to think so far outside the box that you need the Hubble Space Telescope to look back and see the box.
