okenido
Posts: 19
Joined: Thu Aug 02, 2018 11:47 am

Bare metal graphics : hardware acceleration ?

Wed Oct 10, 2018 2:18 pm

Hello

I write my graphics directly into the RPi's framebuffer using my own functions (drawRect, drawLine...). It works pretty well, but since I have very little CPU time to spend drawing the screen, I'm trying to find better ways of doing this.

- Does the GPU provide hardware rectangle blitting? I spend a lot of time in nested X/Y loops filling rectangles with pixels, which is quite inefficient. I looked at the mailboxes for talking to the GPU, but they don't appear to offer such functions.

- Is it possible to use bare-metal OpenGL to draw into the framebuffer (quads plus a basic shader to draw 2D rectangles) while still being able to do software rendering on top of it?

- I get flickering and tearing when redrawing the screen, since I can't control when the framebuffer data is sent to the display. Is there a way to do that with something like "begindraw() ... drawing stuff ... enddraw()"? Or some way of doing hardware double buffering?
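
On the double-buffering question, one commonly described approach is to allocate a virtual framebuffer twice the height of the screen and pan between the two halves with the mailbox property tag "set virtual offset" (0x00048009). The sketch below only builds the request message; `build_flip_message` is a hypothetical helper name, and sending the message through the mailbox (property channel 8, 16-byte-aligned buffer) is assumed to be handled by your own code. Whether the switch lands in vertical blanking is not something this sketch verifies.

```c
#include <stdint.h>

/* Mailbox property tag: set virtual offset (x, y in pixels). */
#define TAG_SET_VIRTUAL_OFFSET 0x00048009u

/* Hypothetical helper: fill an 8-word property message that pans the
 * display to buffer 0 (y = 0) or buffer 1 (y = screen_height).
 * Assumes the virtual height was already set to 2 * screen_height.
 * The array must be 16-byte aligned when it is actually sent. */
static void build_flip_message(uint32_t msg[8], uint32_t screen_height,
                               unsigned buffer_index)
{
    msg[0] = 8 * sizeof(uint32_t);   /* total message size in bytes */
    msg[1] = 0;                      /* request code */
    msg[2] = TAG_SET_VIRTUAL_OFFSET; /* tag identifier */
    msg[3] = 8;                      /* value buffer size in bytes */
    msg[4] = 0;                      /* tag request code */
    msg[5] = 0;                      /* x offset */
    msg[6] = buffer_index ? screen_height : 0; /* y offset */
    msg[7] = 0;                      /* end tag */
}
```

The idea would be to draw into the half that is not being displayed, then send this message to swap the halves.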

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 20510
Joined: Sat Jul 30, 2011 7:41 pm

Re: Bare metal graphics : hardware acceleration ?

Wed Oct 10, 2018 3:08 pm

Nothing easy to use IIRC.

Have you written the blitting functions in NEON? That will give a huge improvement in speed. There are probably quite a few examples already out there of NEON based blitting functions, so you might get away with copy and paste.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Please direct all questions to the forum, I do not do support via PM.

okenido
Posts: 19
Joined: Thu Aug 02, 2018 11:47 am

Re: Bare metal graphics : hardware acceleration ?

Wed Oct 10, 2018 3:48 pm

No, I'm doing it the naive way. However, this page suggests the NEON improvements aren't significant at all: http://infocenter.arm.com/help/index.js ... 13544.html

Word by Word memory copy 100%
Load-Multiple memory copy 111%
NEON memory copy 100%
Word by Word memory copy with PLD 76%
Load-Multiple memory copy with PLD 98%
NEON memory copy with PLD 149%
Mixed ARM and NEON memory copy 112%


Except for the 149% (+49%), which is quite good, but I thought it would make an even bigger difference.

It's for the ARM Cortex-A8, so maybe it's not relevant for the RPi?

I found some code for NEON blitting: https://github.com/tranthamp/neon_test


Since I'm using a 16-bit framebuffer, I was thinking about casting the pointers to uint32 and then doing the copy, so that it would write two pixels at a time.
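
A minimal sketch of that two-pixels-per-store idea, for filling one row with a solid colour. The name `fill_row16` is hypothetical, and the 32-bit store goes through `memcpy` (which compilers normally turn into a single store) rather than a raw pointer cast, to stay clear of strict-aliasing problems. It assumes `dst` is at least 2-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `count` 16-bit pixels starting at `dst` with `color`,
 * writing two pixels per 32-bit store once `dst` is 4-byte aligned. */
static void fill_row16(uint16_t *dst, size_t count, uint16_t color)
{
    /* Write one leading pixel if dst is not on a 4-byte boundary. */
    if (count > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = color;
        count--;
    }

    /* Pack the same colour into both halves of a 32-bit word. */
    uint32_t pair = ((uint32_t)color << 16) | color;

    while (count >= 2) {
        memcpy(dst, &pair, sizeof pair); /* one 32-bit store, two pixels */
        dst += 2;
        count -= 2;
    }

    /* Write the trailing pixel if the count was odd. */
    if (count > 0)
        *dst = color;
}
```

A rectangle fill would call this once per row, advancing the destination by the framebuffer pitch each time.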

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 20510
Joined: Sat Jul 30, 2011 7:41 pm

Re: Bare metal graphics : hardware acceleration ?

Wed Oct 10, 2018 7:38 pm

okenido wrote:
Wed Oct 10, 2018 3:48 pm
No i'm using the naive way of doing it. However, looking at this page shows the NEON improvements aren't significant at all : http://infocenter.arm.com/help/index.js ... 13544.html

Word by Word memory copy 100%
Load-Multiple memory copy 111%
NEON memory copy 100%
Word by Word memory copy with PLD 76%
Load-Multiple memory copy with PLD 98%
NEON memory copy with PLD 149%
Mixed ARM and NEON memory copy 112%


Except the 149% (+49%) which is quite good but i thought it would make an even bigger difference.

It's for ARM A8 so maybe it's not relevant for the RPI ?

Found a code for NEON-blitting : https://github.com/tranthamp/neon_test


Since I'm using a 16-bit framebuffer, i was thinking about casting the pointers to uint32 then do the copy, so it would copy two pixels at the same time.
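NEON is up to 16-way SIMD (sixteen 8-bit lanes in a 128-bit register, or eight lanes for 16-bit pixels), so you get up to 16 operations for the price of one normal operation. So up to 16x faster when the work is compute-bound. Approximately.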

LizardLad_1
Posts: 126
Joined: Sat Jan 13, 2018 12:29 am

Re: Bare metal graphics : hardware acceleration ?

Wed Oct 10, 2018 8:58 pm

If you are interested in OpenGL ES 2, LdB is doing some really cool stuff; check out this link: https://www.raspberrypi.org/forums/view ... 2&t=192440

If you aren't ready for that, the NEON approach should be faster.

okenido
Posts: 19
Joined: Thu Aug 02, 2018 11:47 am

Re: Bare metal graphics : hardware acceleration ?

Thu Oct 11, 2018 5:11 pm

Very nice, I'll take a look at it if I need even more performance.

I wrote this small piece of code and it works very well:

Code:

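asm volatile
			(
				"   vdup.16 q8, %1\n\t"             /* copy the 16-bit colour into all 8 lanes of q8 */
				"   vst1.16  {d16-d17}, [%0]!\n\t"  /* store 8 pixels, then advance pDest by 16 bytes */

				: "+r"(pDest)       /* read and written (post-increment) */
				: "r"(color)        /* input only */
				: "q8", "memory"    /* registers and memory the asm modifies */
			);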

okenido
Posts: 19
Joined: Thu Aug 02, 2018 11:47 am

Re: Bare metal graphics : hardware acceleration ?

Fri Oct 12, 2018 6:56 pm

It doesn't work that well after all. I get random colour corruption. If I move the vdup call before the x/y loop and use only vst1 to fill the screen, I get even more colour corruption. What am I doing wrong?
Is NEON core/thread safe? I was thinking about the interrupts in my program, which could use NEON registers (in code generated automatically by GCC) and write unwanted values to the register I use for drawing.

Paeryn
Posts: 2146
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Bare metal graphics : hardware acceleration ?

Fri Oct 12, 2018 11:21 pm

okenido wrote:
Fri Oct 12, 2018 6:56 pm
Is neon core/thread safe ? I was thinking about interrupts in my program, that could make use of neon registers (automatically generated by GCC) writing unwanted things to the register I use for my drawing purposes.
That all depends on whether your task switching / interrupt handling correctly saves and restores the VFP/NEON state, just as it has to for the vanilla ARM registers. If it doesn't, another thread using VFP/NEON will trash the registers when it runs on the same core. Each core has its own VFP/NEON unit, so using NEON on one core won't affect the NEON registers on another.
She who travels light — forgot something.
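
Separately from the interrupt question, one way to avoid managing q8 by hand is to use the NEON intrinsics from `arm_neon.h`, so the compiler allocates the vector register itself and keeps the splatted colour live across the loop. The sketch below is only an illustration: `fill_rect16` and its parameters are hypothetical names, `stride` is assumed to be the framebuffer row length in pixels, and a plain scalar loop is used when NEON is not available at compile time.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Fill a w x h rectangle of 16-bit pixels.
 * fb points at the rectangle's top-left pixel; stride is the number
 * of pixels per framebuffer row. */
static void fill_rect16(uint16_t *fb, size_t stride,
                        size_t w, size_t h, uint16_t color)
{
    for (size_t y = 0; y < h; y++) {
        uint16_t *p = fb + y * stride;
        size_t x = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
        /* Splat the colour into 8 lanes and store 8 pixels at a time. */
        uint16x8_t v = vdupq_n_u16(color);
        for (; x + 8 <= w; x += 8)
            vst1q_u16(p + x, v);
#endif
        /* Remaining pixels (or the whole row without NEON). */
        for (; x < w; x++)
            p[x] = color;
    }
}
```

This does not remove the need to save and restore the VFP/NEON state in interrupt handlers or task switches that also use those registers.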
