rpiMike
Posts: 966
Joined: Fri Aug 10, 2012 12:38 pm
Location: Cumbria, UK

Re: Thread from Pi4 discussion

Mon Jul 01, 2019 4:13 pm

pik33 wrote:
Mon Jul 01, 2019 3:58 pm
rpiMike wrote:
Mon Jul 01, 2019 3:51 pm
With the FKMS GL Driver enabled, glxgears synchronises to the vertical refresh rate. With the legacy driver glxgears does not.
And this is the only difference: 40 fps hardware accelerated instead of 45 fps non-accelerated, due to vblank synchronization. No traces of acceleration.
Look at your CPU utilisation with GL Driver enabled (0%) vs legacy (80%).
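One way to put a number on that comparison is to sample /proc/stat before and after a glxgears run. The following is a minimal Python sketch, assuming a Linux system; the function names are made up for illustration and it is not the tool that produced the figures quoted above.

```python
def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # user, nice, system, idle, iowait, irq, softirq, steal
    # (guest time is already counted inside user, so it is left out)
    values = [int(v) for v in fields[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
    total = sum(values)
    return total - idle, total


def read_cpu_times(path="/proc/stat"):
    """Read the first line of /proc/stat and parse it."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def utilisation(before, after):
    """Percentage of non-idle time between two (busy, total) samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return 100.0 * busy / total if total > 0 else 0.0
```

To use it, call read_cpu_times() once, let glxgears run for a few seconds, call it again and pass both samples to utilisation(). The aggregate line covers all cores, so one fully busy core on a four-core Pi shows up as roughly 25%.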

dom
Raspberry Pi Engineer & Forum Moderator
Posts: 5349
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge

Re: Thread from Pi4 discussion

Mon Jul 01, 2019 4:39 pm

PeterO wrote:
Mon Jul 01, 2019 4:12 pm
Is this fix available ? That page mentions "bump to 4.19.56 #1" but that doesn't seem to be available yet,
PeterO
Where are you looking? It's currently only in rpi-update firmware (this update)
It will appear in apt firmware soon.

PeterO
Posts: 5086
Joined: Sun Jul 22, 2012 4:14 pm

Re: Thread from Pi4 discussion

Mon Jul 01, 2019 4:43 pm

dom wrote:
Mon Jul 01, 2019 4:39 pm
PeterO wrote:
Mon Jul 01, 2019 4:12 pm
Is this fix available ? That page mentions "bump to 4.19.56 #1" but that doesn't seem to be available yet,
PeterO
Where are you looking? It's currently only in rpi-update firmware (this update)
It will appear in apt firmware soon.
How "safe" is rpi-update at the moment ? (I've not had cause to use it for many years).

Kernel bump is mentioned on https://github.com/raspberrypi/firmware/issues/1154
PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

pik33
Posts: 183
Joined: Thu Sep 10, 2015 4:26 pm

Re: Thread from Pi4 discussion

Mon Jul 01, 2019 4:52 pm

rpiMike wrote:
Mon Jul 01, 2019 4:13 pm

Look at your CPU utilisation with GL Driver enabled (0%) vs legacy (80%).
Yes, it is!!! This means the VC6 works and there is a bottleneck somewhere else. I also tested SDL2 with similar results: 37 seconds for 600 1920x1200 empty frames regardless of the driver with SW driver being slightly faster.
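To make the SDL2 figure easier to compare with the glxgears numbers, it can be converted to frames per second and an approximate fill bandwidth. This is only a back-of-envelope sketch: the 4 bytes per pixel is an assumption (the post does not state the pixel format) and frame_stats is a made-up helper name.

```python
def frame_stats(frames, seconds, width, height, bytes_per_pixel=4):
    """Return (frames per second, megabytes per second written)."""
    fps = frames / seconds
    bytes_per_frame = width * height * bytes_per_pixel  # assumed 32-bit pixels
    return fps, fps * bytes_per_frame / 1e6


fps, mb_per_s = frame_stats(600, 37, 1920, 1200)
print(f"{fps:.1f} fps, about {mb_per_s:.0f} MB/s")
```

That works out to roughly 16 fps, well below the 40 to 45 fps reported for glxgears, which fits the suggestion that the bottleneck is somewhere other than the 3D hardware.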

jahboater
Posts: 4785
Joined: Wed Feb 04, 2015 6:38 pm

Re: Thread from Pi4 discussion

Mon Jul 01, 2019 4:54 pm

PeterO wrote:
Mon Jul 01, 2019 4:43 pm
How "safe" is rpi-update at the moment ? (I've not had cause to use it for many years).
I did it recently with no issues. YMMV
It cured the "over_voltage=-1" problem.

PeterO
Posts: 5086
Joined: Sun Jul 22, 2012 4:14 pm

Re: Thread from Pi4 discussion

Mon Jul 01, 2019 5:26 pm

glxgears still only managing 40 fps in a full screen window.
But
Tearing on vertical lines in my demo code is gone :-)
PeterO

Brian Beuken
Posts: 177
Joined: Fri Jan 29, 2016 12:51 pm

Re: Thread from Pi4 discussion

Tue Jul 02, 2019 7:36 am

dom wrote:
Mon Jul 01, 2019 1:32 pm
Brian Beuken wrote:
Sun Jun 30, 2019 6:23 pm
The bad news is, what ran at 60fps on a Pi3 runs at 5fps on a Pi4 ....ermmm does that mean the Mesa libs are emulating everything?
You will get software 3d rendering if you have the legacy firmware driver enabled.
You should have the fkms driver enabled (dtoverlay=vc4-fkms-v3d in config.txt)
It is enabled by default on Pi4, and can be enabled with raspi-config on Pi2/3.
This little nugget passed me by before, and indeed was the cause of most of my issues. When doing OGLES2.0 on older Pis you can't have the OpenGL driver running; it causes a "failed to add service" error. I routinely ensured I had disabled the OpenGL drivers on my older Pis.

But the Pi4 and its new Mesa drivers depend on the OpenGL fake KMS driver being in place... So stay away from raspi-config, install the Mesa libs, use the /usr/lib versions of EGL and GLES2, and you can build OpenGLES projects again. This is in line with every other SBC's standard Linux build, which is a shame as I liked the fact the Raspberry was a little different.
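Since everything here hinges on the fake KMS overlay being active, a quick check of config.txt can save some head-scratching. Below is a minimal Python sketch, assuming the overlay is named vc4-fkms-v3d and the file lives at /boot/config.txt as on Raspbian Buster; the helper names are hypothetical, and conditional sections such as [pi4] are ignored for simplicity.

```python
def active_dtoverlays(config_text):
    """List overlay names from uncommented dtoverlay= lines."""
    overlays = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if line.startswith("dtoverlay="):
            # the overlay name ends at the first comma; parameters follow it
            name = line[len("dtoverlay="):].split(",", 1)[0].strip()
            if name:
                overlays.append(name)
    return overlays


def uses_fkms(config_text, overlay="vc4-fkms-v3d"):
    """True if the fake KMS overlay appears on an active line."""
    return overlay in active_dtoverlays(config_text)
```

On a Pi this could be called as uses_fkms(open("/boot/config.txt").read()). It only inspects the file, so it says nothing about whether the driver actually loaded at boot.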

One small thing though, on dispmanx it was possible to set the src rect to a fraction of the screen size (typically 1/2) that reduced the amount of work the GPU had to do and let the display stretch to the full size. I can't do that with the X window...can I?
Last edited by Brian Beuken on Tue Jul 02, 2019 8:33 am, edited 2 times in total.
Very old computer game programmer, now teaching very young computer game programmers, some very bad habits.
Wrote some book about coding Pi's and SBC's, it's out now...go get it!
http://www.scratchpadgames.net/

techyian
Posts: 63
Joined: Mon Jan 22, 2018 11:40 am

Re: Thread from Pi4 discussion

Tue Jul 02, 2019 8:02 am

I can see that people are having success with GLES 2.0 which is great, however has anyone managed to get GLES 1.1 working with SDL2 yet? What's confusing me a bit is which headers to use and also what configuration SDL2 requires. GLES 1.1 firmware headers can still be found in /opt/vc/include, and I can see that GLESv1_CM (I think this is Mesa?) is available in /usr/lib/, but Debian Buster no longer supports the "libgles1-mesa-dev" package to get GLES 1.1 mesa headers.

Using the GLES 2.0 headers isn't correct here as the fixed function pipeline stuff isn't supported in 2.0. Also linking to the "libbrcmGLESv2" library is no longer correct either as Videocore 6 requires the mesa libs.

When configuring SDL2, the commonly documented way of setting up GLES is as follows:

Code:

--host=armv7l-raspberry-linux-gnueabihf --disable-pulseaudio --disable-esd --disable-video-mir --disable-video-wayland --disable-video-x11 --disable-video-opengl
This will force SDL2 to use its "RPI" video driver, which I assume configures it to use EGL -> dispmanx. As dispmanx is not the correct display manager to use for the Pi 4, does this mean SDL2 is going to need some work? Which video driver should SDL2 use?

I will carry on testing with various configurations and see if I make any progress.
MMALSharp - C# API for the Raspberry Pi camera module

https://github.com/techyian/MMALSharp

dom
Raspberry Pi Engineer & Forum Moderator
Posts: 5349
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge

Re: Thread from Pi4 discussion

Tue Jul 02, 2019 1:37 pm

PeterO wrote:
Mon Jul 01, 2019 4:43 pm
How "safe" is rpi-update at the moment ? (I've not had cause to use it for many years).
It's always advisable to have a backup of sdcard just in case, or run it on a non-critical Pi.
We're not currently aware of any issues in rpi-update firmware/kernel that aren't in apt firmware/kernel.
Regressions are very rare, but can happen in any testing software.

techyian
Posts: 63
Joined: Mon Jan 22, 2018 11:40 am

Re: Thread from Pi4 discussion

Tue Jul 02, 2019 7:18 pm

Hi all,

I've now got OpenGLES 1.1 working with SDL2. The setup is slightly different compared to previous Pi models.

1. As "libbrcmGLESv2" should not be used anymore, you should now link to "GLESv1_CM" found in "/usr/lib/".
2. The GLES 1.1 headers found in "/opt/vc/include" should still be used.
3. SDL should be configured without any additional flags, simply "./configure".

Using the above setup, Wolfenstein Enemy Territory is running very well within X11.

Hope that helps someone.
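Before building, it may be worth confirming which GLES libraries the system can actually find. The following is a small Python sketch using ctypes.util.find_library; the library names are the ones mentioned in this thread, locate_gles_libraries is a made-up helper, and a missing entry only means the lookup found nothing, not that linking will necessarily fail.

```python
import ctypes.util


def locate_gles_libraries(names=("GLESv1_CM", "GLESv2", "EGL", "brcmGLESv2")):
    """Map each library name to what find_library reports (None if not found)."""
    return {name: ctypes.util.find_library(name) for name in names}


def report(found):
    """Format the lookup result as one line per library."""
    return "\n".join(
        f"{name}: {path if path else 'not found'}" for name, path in found.items()
    )
```

Running print(report(locate_gles_libraries())) on the Pi lists what was found for each name, which gives a quick sanity check that GLESv1_CM is present before pointing the linker at it.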

Paeryn
Posts: 2709
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Thread from Pi4 discussion

Tue Jul 02, 2019 8:47 pm

techyian wrote:
Tue Jul 02, 2019 7:18 pm

2. The GLES 1.1 headers found in "/opt/vc/include" should still be used.
Are you sure about this step? I've not got an RPi4 yet, nor have I used the Mesa drivers, but I'm pretty sure that if you are using Mesa then you don't want to be including the legacy driver's headers; you should be including the Mesa driver's headers.

The headers will largely be the same but they will differ in regards of what extensions they say they support (more so in glext.h than gl.h)
She who travels light — forgot something.

techyian
Posts: 63
Joined: Mon Jan 22, 2018 11:40 am

Re: Thread from Pi4 discussion

Tue Jul 02, 2019 8:51 pm

The issue as I've mentioned before is that "libgles1-mesa-dev" can no longer be installed in Raspbian Buster so the 1.1 Mesa headers can't be used. You can't use the GLES 2.0 headers because they don't contain the fixed function pipeline bits and therefore fail to compile a GLES 1.1 program. This alternative seems to work well even though it's not technically the correct way of doing things.

jcgamestoy
Posts: 2
Joined: Wed Mar 05, 2014 7:30 pm

Re: Thread from Pi4 discussion

Wed Jul 03, 2019 7:11 pm

PeterO wrote:
Mon Jul 01, 2019 5:26 pm
glxgears still only managing 40 fps in a full screen window.
But
Tearing on vertical lines in my demo code is gone :-)
PeterO
Disable the compositor with the A8 option of raspi-config...

Here 60fps with glxgears running in 2 screens at 1920x1080 :o

PeterO
Posts: 5086
Joined: Sun Jul 22, 2012 4:14 pm

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 6:20 am

jcgamestoy wrote:
Wed Jul 03, 2019 7:11 pm
PeterO wrote:
Mon Jul 01, 2019 5:26 pm
glxgears still only managing 40 fps in a full screen window.
But
Tearing on vertical lines in my demo code is gone :-)
PeterO
Disable the compositor with the A8 option of raspi-config...

Here 60fps with glxgears running in 2 screens at 1920x1080 :o
:shock: So what are the consequences of disabling this, given that using it seems to be a serious performance hit? :o
PeterO

pik33
Posts: 183
Joined: Thu Sep 10, 2015 4:26 pm

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 8:36 am

Yes, it is... the compositor AND 4k60p both have to be disabled to get glxgears working at 60 fps fullscreen, but what do these two options actually mean, and what do they change?

And then SUCCESS: I got the Atari800 emulator working at 100% speed after switching composition off. The first retro emulator working on my RPi4! I had to recompile it for "normal Linux" instead of "RPi" (which uses the old OpenGL ES that doesn't work on RPi4), but before the composition trick it ran at 77% of Atari speed (= unusable).

rpiMike
Posts: 966
Joined: Fri Aug 10, 2012 12:38 pm
Location: Cumbria, UK

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 9:49 am

So what does the 'xcompmgr composition manager' do?

PeterO
Posts: 5086
Joined: Sun Jul 22, 2012 4:14 pm

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 10:03 am

rpiMike wrote:
Thu Jul 04, 2019 9:49 am
So what does the 'xcompmgr composition manager' do?
I've been googling that very question, and the most informative thing I've found so far is this:
https://github.com/freedesktop/xcompmgr wrote: xcompmgr is a sample compositing manager for X servers supporting the
XFIXES, DAMAGE, RENDER, and COMPOSITE extensions. It enables basic
eye-candy effects.
My emphasis... So nothing useful or important, so I'm not sure why it is enabled by default if it causes such a performance hit?
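When comparing frame rates with and without composition, it helps to confirm whether xcompmgr is really running at the time of the test. Here is a minimal Python sketch that scans /proc for a process by name, assuming a Linux /proc layout; the function names and the proc_root parameter are made up for illustration.

```python
import os


def running_process_names(proc_root="/proc"):
    """Collect the short command names of all processes listed under proc_root."""
    names = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                names.add(f.read().strip())
        except OSError:
            continue  # the process exited while we were looking
    return names


def compositor_running(proc_root="/proc", name="xcompmgr"):
    """True if a process with the given name is currently running."""
    return name in running_process_names(proc_root)
```

Calling compositor_running() before and after toggling the raspi-config option shows whether the setting has taken effect in the current session.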

PeterO

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7457
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 10:09 am

rpiMike wrote:
Thu Jul 04, 2019 9:49 am
So what does the 'xcompmgr composition manager' do?
https://en.wikipedia.org/wiki/Compositi ... ow_manager

Without it every single X update has to complete before the next can be run because the next update may rely on the previous output (eg when dragging a window).
AIUI xcompmgr renders each element (window) as individual textures, therefore it is generally faster to update as all the source textures are still available.
It can also shift a load of the processing to the X server rather than in the X client context. This allows a few optimisations within the render process.

Generally we'd seen significant improvements with xcompmgr. There were a few discussions on scheduling, as the window composition stage is also using the 3D hardware, therefore your GL app has slightly less resource to do what it wants, but I hadn't previously noted a significant difference.
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

PeterO
Posts: 5086
Joined: Sun Jul 22, 2012 4:14 pm

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 10:17 am

6by9 wrote:
Thu Jul 04, 2019 10:09 am
Generally we'd seen significant improvements with xcompmgr. There were a few discussions on scheduling, as the window composition stage is also using the 3D hardware, therefore your GL app has slightly less resource to do what it wants, but I hadn't previously noted a significant difference.
So far I've only seen a significant improvement by turning it off! If it's going to adversely affect OpenGL(ES) performance (and there seems to be an increasing interest in that) then maybe the decision needs rethinking? Not being able to run glxgears full screen at 60Hz is pretty disappointing performance.

Is there a way to tell it not to get in the way of updates to hardware rendered windows ?

PeterO

rpiMike
Posts: 966
Joined: Fri Aug 10, 2012 12:38 pm
Location: Cumbria, UK

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 10:26 am

Turning it off certainly seems to improve Minecraft performance.

Brian Beuken
Posts: 177
Joined: Fri Jan 29, 2016 12:51 pm

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 11:46 am

rpiMike wrote:
Thu Jul 04, 2019 10:26 am
Turning it off certainly seems to improve Minecraft performance.
Also slightly improves my GLES2.0 high poly demo, but only by a few FPS.

jcgamestoy
Posts: 2
Joined: Wed Mar 05, 2014 7:30 pm

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 12:13 pm

PeterO wrote:
Thu Jul 04, 2019 10:17 am
6by9 wrote:
Thu Jul 04, 2019 10:09 am
Generally we'd seen significant improvements with xcompmgr. There were a few discussions on scheduling, as the window composition stage is also using the 3D hardware, therefore your GL app has slightly less resource to do what it wants, but I hadn't previously noted a significant difference.
So far I've only seen a significant improvement by turning it off! If it's going to adversely affect OpenGL(ES) performance (and there seems to be an increasing interest in that) then maybe the decision needs rethinking? Not being able to run glxgears full screen at 60Hz is pretty disappointing performance.

Is there a way to tell it not to get in the way of updates to hardware rendered windows ?

PeterO
I am the author of the emulator Retro Virtual Machine (www.retrovirtualmachine.org), and when I ported it to Linux, one of the things that most affected performance was the compositor. Some are better and some worse; the Xfce one, for example, has a very high cost. Compositing is a good idea for 2D applications that render in software, but it is bad for 3D applications because it greatly increases the load on the GPU.

If you switch it on and off you will see that with the compositor enabled, scrolling in Chromium is much smoother and there is no tearing.

Anyway, yesterday I was testing with it disabled and, for example, with Pico-8 the performance of the RPi4 is excellent. Congratulations to the Foundation. ;)

pik33
Posts: 183
Joined: Thu Sep 10, 2015 4:26 pm

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 6:12 pm

Some time ago I started to write a bare metal GUI (still a work in progress - https://github.com/pik33/ultibo_retro_gui ) and I had the idea of writing an OpenGL ES compositor for it (using an RPi3). This meant I had to update textures every frame, which was way too slow to work at 60 fps.

This is another memory copy, and the texture format in RPi3/VC4 is weird. I reverse engineered it so I was able to write a putpixel function which puts a pixel directly into the texture memory. It looks like this (Pascal/asm):

// remark: I decided to use the 2048x2048 32-bit texture as an 8192x2048 8-bit texture, but this means I only had to add some lines to compute the position of the byte in the 32-bit word. The problem was that the texture organization is not linear and looks rather like a fractal. X and Y bits have to be shifted, xored, etc. to compute the address in texture memory which represents the pixel. Why?

Code:

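procedure TTexturebitmap.putpixel(x,y:cardinal;color:byte);  // test procedure

// for a 2048x2048 32-bit texture addressed as 8192x2048 bytes

var  aa:cardinal;
 //    debug1:cardinal;

begin
aa:=address;                             // base address of the texture memory

       asm
       ldr r0,x
       ldr r1,y
       and r2,r0,#0b00001111             // x bits 0-3 -> offset bits 0-3
       and r3,r0,#0b00110000
       orr r2,r2,r3,lsl #2               // x bits 4-5 -> offset bits 6-7
       and r3,r1,#0b00000011
       orr r2,r2,r3,lsl #4               // y bits 0-1 -> offset bits 4-5

       and r3,r1,#0b00001100
       orr r2,r2,r3,lsl #6               // y bits 2-3 -> offset bits 8-9; 10 bits in r2

       and r3,r0,#0b01000000
       and r4,r1,#0b00010000
       eor r4,r4,r3,lsr #2               // y bit 4 xor x bit 6
       orr r2,r2,r4,lsl #6               // -> offset bit 10 (the 11th bit)

       mov r3,r0,lsl #5                  // x bits 6-12 -> offset bits 11-17
       tst r1,#0b00100000
       mvnne r3,r3                       // inverted when y bit 5 is set
       and r3,#0b111111100000000000
       orr r2,r3

       and r3,r1,#0b11111100000
       orr r2,r2,r3,lsl #13              // y bits 5-10 -> offset bits 18-23
       ldr r3,aa
       add r2,r3                         // add the texture base address
       ldrb r3,color
       strb r3,[r2]                      // store the colour byte

       end  ['R0','R1','R2','R3','R4']    ;

end;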
If things didn't change in the new VC6, the copy-rectangle-to-texture function has to be slow, and the compositing video manager has to do this every frame.
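For anyone who does not read ARM assembly, the address calculation in the putpixel routine above can be transcribed step by step. The following is a minimal Python sketch of the same bit shuffling, assuming the 8192x2048 byte view described in the remark. The name tformat_offset is made up for illustration, and the code mirrors the posted routine rather than any official description of the VC4 layout.

```python
def tformat_offset(x, y):
    """Byte offset for byte-column x (0..8191) and row y (0..2047)."""
    off = x & 0b1111                          # x bits 0-3 -> offset bits 0-3
    off |= (x & 0b110000) << 2                # x bits 4-5 -> offset bits 6-7
    off |= (y & 0b11) << 4                    # y bits 0-1 -> offset bits 4-5
    off |= (y & 0b1100) << 6                  # y bits 2-3 -> offset bits 8-9
    # offset bit 10 is y bit 4 XOR x bit 6
    off |= ((y & 0b10000) ^ ((x & 0b1000000) >> 2)) << 6
    # offset bits 11-17 are x bits 6-12, inverted when y bit 5 is set
    high = x << 5
    if y & 0b100000:
        high = ~high
    off |= high & 0b111111100000000000
    off |= (y & 0b11111100000) << 13          # y bits 5-10 -> offset bits 18-23
    return off


def put_byte(texture, x, y, value):
    """Write one byte into a bytearray standing in for the texture memory."""
    texture[tformat_offset(x, y)] = value
```

Every (x, y) pair maps to a distinct offset below 16 MiB, so a bytearray(2048 * 2048 * 4) is large enough to stand in for the texture when experimenting off-device.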

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7457
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: Thread from Pi4 discussion

Thu Jul 04, 2019 7:41 pm

VC6 has an even weirder texture format (UIF); however, it also has a hardware block called the texture formatter unit (TFU) which converts various formats into UIF.

When X is generating the frame, if it's generated by GL (e.g. neverball) then it creates a UIF frame so composition can just swallow it. Anything from a planar source will need converting.

Paeryn
Posts: 2709
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Thread from Pi4 discussion

Fri Jul 05, 2019 1:54 am

pik33 wrote:
Thu Jul 04, 2019 6:12 pm
This is another memory copy, and the texture format in RPi3/VC4 is weird. I reverse engineered it so I was able to write a putpixel function which puts a pixel directly into the texture memory. It looks like this (Pascal/asm):

// remark: I decided to use the 2048x2048 32-bit texture as an 8192x2048 8-bit texture, but this means I only had to add some lines to compute the position of the byte in the 32-bit word. The problem was that the texture organization is not linear and looks rather like a fractal. X and Y bits have to be shifted, xored, etc. to compute the address in texture memory which represents the pixel. Why?

If things didn't change in the new VC6, the copy-rectangle-to-texture function has to be slow, and the compositing video manager has to do this every frame.
I wouldn't say the texture format is weird, it's just not linear. There is a very good reason that linear formats aren't (generally) used for textures: the 3D hardware needs to read the texture at all sorts of angles and usually in small clusters. The format used will have been chosen so that nearby texels will (ideally) be close together in the cache, according to how the rasteriser is expected to require them in the majority of cases.

I thought the VC4 did support linear textures... (Quick read of docs)... Ah, not totally linear, just that you can have the micro-tiles in linear order rather than snaking.
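The locality argument can be illustrated with a deliberately simplified layout. The Python sketch below compares how far apart in memory the texels of a 4x4 block end up in a linear layout versus one built from 4x4 micro-tiles stored in plain row order. It assumes a 2048-wide, 32-bit texture and is not the real VC4 or VC6 format; both function names are made up for the example.

```python
def linear_offset(x, y, width, bpp=4):
    """Byte offset of texel (x, y) in a plain row-major layout."""
    return (y * width + x) * bpp


def microtiled_offset(x, y, width, bpp=4, tile=4):
    """Byte offset when texels are grouped into tile x tile blocks."""
    tiles_per_row = width // tile
    tile_index = (y // tile) * tiles_per_row + (x // tile)
    within = (y % tile) * tile + (x % tile)  # position inside the block
    return (tile_index * tile * tile + within) * bpp


def span(offsets):
    """Distance in bytes between the lowest and highest offset."""
    return max(offsets) - min(offsets)


block = [(x, y) for y in range(4) for x in range(4)]
linear_span = span([linear_offset(x, y, 2048) for x, y in block])
tiled_span = span([microtiled_offset(x, y, 2048) for x, y in block])
```

In the linear case the 16 texels are spread over about 24 KB, while in the tiled case they sit within a single 64-byte run, which is the kind of cache-friendly clustering described above.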
