I think the GPU has floating point hardware: from http://www.raspberrypi.org/faqs
The GPU is capable of 1Gpixel/s, 1.5Gtexel/s or 24 GFLOPs of general purpose compute and features a bunch of texture filtering and DMA infrastructure.
GPGPU on the R-PI.
- Posts: 944
- Joined: Tue Nov 22, 2011 11:51 pm
This is straight from JamesH: http://www.raspberrypi.org/phpBB3/viewtopic.php?p=14633#p14633
- Posts: 18
- Joined: Sat Dec 10, 2011 4:49 pm
jdobmeier wrote:This is straight from JamesH: http://www.raspberrypi.org/phpBB3/viewtopic.php?p=14633#p14633
Wow, interesting thread. An integer vector lib would be really cool though.
- Posts: 248
- Joined: Tue Jan 24, 2012 4:54 am
Floats can just be emulated with integer arithmetic, and there are pretty good options out there for converting. The question is whether the speed increase from the GPU is completely nullified in the process. My guess is that there is some headroom left over. Consider n floats: if it takes 4*n operations (wild guess) to convert, that is 8*n both ways. I thought jamesh said there are 16 cores on the GPU, which leaves (8n)/(16n) = 1/2, or a 2X speedup. I'm trying to get those tutorials from gpgpu.org going, but I'm new to GLSL. I have OpenCL experience, so it makes some sense.
- Posts: 18
- Joined: Sat Dec 10, 2011 4:49 pm
Huh? The GPU most definitely has native float support. In fact, the lack of proper integer support in GLSL ES is problematic. You can emulate integer-like behavior with floats, but it's so slow it becomes uninteresting.
But in any case, GPGPU on the Pi is generally not viable with the existing API support. And it wouldn't be a panacea either, even if we had great OpenCL support.
- Posts: 190
- Joined: Sat Jan 28, 2012 8:07 pm
lb, are you possibly referring to the lack of fixed-point real numbers on the iPhone?
GLfixed: Fixed point numbers are a way of storing real numbers using integers. This was a common optimization in 3D systems used because most computer processors are much faster at doing math with integers than with floating-point variables. Because the iPhone has vector processors that OpenGL uses to do fast floating-point math, we will not be discussing fixed-point arithmetic or the GLfixed datatype.
http://iphonedevelopment.blogspot.com/2009/04/opengl-es-from-ground-up-part-1-basic.html
If not, I would certainly be interested in where to find a reference to this missing functionality, or perhaps a test case to demonstrate it...
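For anyone who has not met fixed point before, here is a minimal Python sketch of the idea described in the GLfixed quote. It assumes the 16.16 layout that OpenGL ES uses for GLfixed (16 integer bits, 16 fractional bits); the helper names are made up for illustration and are not part of any API.

```python
FRAC_BITS = 16           # assumption: 16.16 layout, as used by GLfixed
ONE = 1 << FRAC_BITS     # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a real number to 16.16 fixed point, rounding to nearest."""
    return int(round(x * ONE))


def to_float(f: int) -> float:
    """Convert a 16.16 fixed-point value back to a real number."""
    return f / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values.

    The raw product carries 32 fractional bits, so shift it back to 16.
    Python ints never overflow; real 32-bit hardware needs a 64-bit
    intermediate for this step.
    """
    return (a * b) >> FRAC_BITS


a = to_fixed(1.5)
b = to_fixed(2.25)
product = fixed_mul(a, b)
total = a + b            # addition needs no correction at all

print(to_float(product), to_float(total))  # prints "3.375 3.75"
```

Everything after the conversion is plain integer arithmetic, which is why this was attractive on processors without fast floating point.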
- Posts: 18
- Joined: Sat Dec 10, 2011 4:49 pm
No, I am referring to the fact that GLSL ES (the stripped down version of the OpenGL Shader Language) does not have proper support for integers. See paragraph 4.1.3 in the GLSL ES specification. Integers are only supported as a programming aid (in loops, for instance). They have very loosely defined semantics and precision, and many typical integer operations are not supported. Most OpenGL ES implementations simply map int to float.
Applications that rely on integers simply won't work on the GPU. Good examples for that are bitcoin mining, cryptography in general and compression.
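To illustrate what goes wrong when int is simply mapped to float, here is a small Python sketch. It assumes the GPU float behaves like an IEEE 754 single-precision value with a 24-bit significand; that is an assumption for the sake of the example, and an implementation may offer less. The helper name is made up.

```python
import struct


def as_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


LIMIT = 2 ** 24  # a 24-bit significand holds every integer up to here exactly

# Below the limit, integers survive the round trip through float32.
exact = as_float32(LIMIT - 1) == LIMIT - 1

# Above it, odd integers are no longer representable: 16777217 is rounded.
lost = as_float32(LIMIT + 1)

# A counter stored in a float32 therefore gets stuck at the limit.
stuck = as_float32(as_float32(LIMIT) + 1) == LIMIT

print(exact, int(lost), stuck)  # prints "True 16777216 True"
```

A 32-bit hash or cipher needs every bit of a 32-bit word, so anything above 24 bits is already out of reach under this assumption.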
- Posts: 190
- Joined: Sat Jan 28, 2012 8:07 pm
... there is no requirement that integers in the language map to an integer type in hardware. It is not expected that underlying hardware has full support for a wide range of integer operations. An OpenGL ES Shading Language implementation may convert integers to floats to operate on them.
This is hardly an open-and-shut case about what the capabilities of the Broadcom chip actually are. Since I, for one, care little about portability to other chip designs, I will continue to move forward.
- Posts: 18
- Joined: Sat Dec 10, 2011 4:49 pm
It doesn't matter what the GPU hardware is actually capable of. The GLSL ES restrictions are the same, no matter what hardware you have. Even if you can assume int maps to a native integer type with certain semantics, you'll find that there are no bitwise operators or modulo available in GLSL ES. Oh, and there's no unsigned int either.
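As a rough sketch of what working around the missing operators looks like, the following Python stand-in emulates modulo and bitwise AND using only floor, multiply, add and divide on floats. It is not shader code, the function names are hypothetical, and it assumes all values are non-negative integers small enough to be held exactly in a float.

```python
import math


def float_mod(x: float, y: float) -> float:
    """Modulo built from floor and multiply: x - y * floor(x / y)."""
    return x - y * float(math.floor(x / y))


def float_and(a: float, b: float, bits: int = 16) -> float:
    """Bitwise AND of two non-negative integers stored as floats.

    Peels off one bit per iteration, so a single AND costs `bits` loop
    iterations instead of one instruction.
    """
    result = 0.0
    place = 1.0                      # value of the current bit position
    for _ in range(bits):
        a_bit = float_mod(a, 2.0)    # lowest bit of a (0.0 or 1.0)
        b_bit = float_mod(b, 2.0)    # lowest bit of b (0.0 or 1.0)
        result += place * a_bit * b_bit
        a = float(math.floor(a / 2.0))   # shift right by one bit
        b = float(math.floor(b / 2.0))
        place *= 2.0
    return result


print(float_mod(17.0, 5.0), float_and(12.0, 10.0))  # prints "2.0 8.0"
```

The loop in `float_and` is the reason this kind of emulation is so slow: every bitwise operation turns into dozens of floating-point operations.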
- Posts: 190
- Joined: Sat Jan 28, 2012 8:07 pm
how is the standard restrictive? On the contrary, it seems if anything more flexible in that both native floats and ints are not even required as long as one or the other is present. If there is hardware support for both then all the better performance wise.
As far as the illegal operations of 5.1 in the standard, yeah the lack of hardware modulo is going to hurt the crypto guys for sure but again that is no guarantee there is no support, it is just not required by the implementation in order to conform to the standard. Really my application does not need the bitwise operators or modulo arithmetic anyway.
- Posts: 18
- Joined: Sat Dec 10, 2011 4:49 pm
jdobmeier wrote:how is the standard restrictive? On the contrary, it seems if anything more flexible in that both native floats and ints are not even required as long as one or the other is present. If there is hardware support for both then all the better performance wise.
Uh... so basically you're saying, lots of undefined and platform-specific behavior is *good*? That's crazy.
As far as the illegal operations of 5.1 in the standard, yeah the lack of hardware modulo is going to hurt the crypto guys for sure but again that is no guarantee there is no support, it is just not required by the implementation in order to conform to the standard. Really my application does not need the bitwise operators or modulo arithmetic anyway.
Most OpenGL ES implementations stick to the standard as strictly as possible, and implement few or no extras. The implementations I know all require constant loop expressions, for example. I say it's quite unlikely the VideoCore IV OpenGL ES supports native integers, non-constant loop expressions or any extra operators in GLSL ES. Maybe some of the people from Broadcom can enlighten us...
Anyway, if GLSL ES is good enough for your application, that's fine. However, GPGPU is generally not viable on the Pi. It's too restrictive, well beyond just being inconvenient. And the GPU isn't that fast anyway.
- Posts: 190
- Joined: Sat Jan 28, 2012 8:07 pm
Hi,
So, as far as I understood the previous discussion, the only viable option for using GPGPU on the Raspberry Pi would be to have Broadcom port OpenCL to this GPU. Is this correct?
Does anyone have connections to these guys? Any chance to have them discuss this issue with us? We could initiate a Kickstarter project to raise some money to make this possible!
Anyone with more info?
Greetings, Chris
- Posts: 2
- Joined: Sun Jun 03, 2012 9:49 pm
Apparently GPGPU is a controversial term, and some purists believe it is out of reach for all mobile devices, so I propose to define what I am doing as PiGPU, which is not to be confused with GPGPU proper. However, for the purposes of this forum, I would like to stipulate that whenever GPGPU is mentioned it is understood to mean PiGPU.
That said, I have ported tutorial 0 from http://gpgpu.org/developer/legacy-gpgpu-graphics-apis to the RPi. I started with the code examples from the OpenGL ES 2.0 Programming Guide which were kindly provided by Ben O'Steen's blog: http://benosteen.wordpress.com/ The updated code can be found here: http://pastebin.com/mKW0YbE0 for the source code and http://pastebin.com/m2XWQmWD for the new makefile. Also you could compile like this:
gcc -DRPI_NO_X ./Common/esShader.c ./Common/esTransform.c ./Common/esShapes.c ./Common/esUtil.c ./Chapter_9/helloPiGPU/helloPiGPU_GLESSL.c -o ./Chapter_9/helloPiGPU/helloPiGPU_GLESSL -I./Common -I/opt/vc/include -lGLESv2 -lEGL -lm -lbcm_host -L/opt/vc/lib
I just added a directory /Raspi/Chapter_9/helloPiGPU where I put the source. The makefile goes in the Raspi directory. BTW, I'm using Raspbian.
- Posts: 18
- Joined: Sat Dec 10, 2011 4:49 pm
Something I'm not following, and I'm possibly being slow here.
OpenGL ES is effectively a standard API to embedded GPUs.
The RPi GPU supports OpenGL ES, and OpenCL hooks into OpenGL and OpenGL ES.
Logically, if you're having to do a thousand man-hours patching the port to get it to run, then one of the things in the loop is NOT following the published standard. Otherwise, it should be little more than a cross-compile, surely?
I admit, I'm not a low level hardware chap, and it's well over a decade since I touched API work (inter platform operability - PC to AS400 DB2 to Lotus Notes and SmartSuite,) but I always assumed that it sort of worked the same way...
I
- Posts: 10
- Joined: Mon May 28, 2012 4:00 pm
rodonn wrote:The RPi GPU supports OpenGL ES, and OpenCL hooks into OpenGL and OpenGL ES.
That's your misunderstanding. You can't implement OpenCL on top of OpenGL.
- Moderator
- Posts: 3229
- Joined: Wed Aug 17, 2011 7:41 pm
- Location: Cambridge
Could Broadcom not give the job of creating OpenCL for the GPU to some bright summer students on an internship? Or, if it's a bigger project than that, get a university involved and have people sign the appropriate Non-Disclosure Agreements. Surely it would make a good PhD research project at least.
I would love to run World Community Grid stuff on a Pi. Low cost, low energy and doing good for humanity.
- Posts: 53
- Joined: Tue Dec 27, 2011 9:09 pm
KeithSloan wrote:Could Broadcom not give the job of creating OpenCL for the GPU to some bright summer students on an internship? Or, if it's a bigger project than that, get a university involved and have people sign the appropriate Non-Disclosure Agreements. Surely it would make a good PhD research project at least.
I would love to run World Community Grid stuff on a Pi. Low cost, low energy and doing good for humanity.
I'm not sure exactly, but implementing OpenCL on a device (any device) is quite a few man-years of work, unless you have code you can base a port on, and even then it is a big job.
- Moderator
- Posts: 6400
- Joined: Sat Jul 30, 2011 7:41 pm
jamesh wrote:I'm not sure exactly, but implementing OpenCL on a device (any device) is quite a few man-years of work, unless you have code you can base a port on, and even then it is a big job.
So it would require, say, a university to get involved, rather than just a summer project.
I see they have an internship that would like OpenCL experience https://sjobs.brassring.com/1033/ASP/TG ... l.asp?SID=^XTOdWZVHG6t0DPfq26RpdS3iornjpCUau2MIBU2hHHqG7VxhZxx1F2e8i/xf63fX&jobId=924836&type=search&JobReqLang=1&recordstart=1&JobSiteId=5482&JobSiteInfo=924836_5482&GQId=0
I would have thought they could ask a local university research department to see if they were interested. Isn't that the sort of thing that industrial cooperation with universities is supposed to do?
- Posts: 53
- Joined: Tue Dec 27, 2011 9:09 pm
This would be a lot of work for little more than the learning experience. As good as the VideoCoreIV is for its size, it will not beat a desktop PC graphics card for GPGPU on flops/dollar. It likely will not beat a laptop GPU chip on flops/watt either.
If you want to build an efficient compute platform, you do not start by designing an embedded chip with adequate HD/3D performance, then use lots of them. Performance is mostly limited by the number of gates/transistors. So you must design a GPGPU chip near the chip size where the cost per transistor is the lowest.
lb,
Applications that rely on integers simply won't work on the GPU. Good examples for that are bitcoin mining, cryptography in general and compression.
For crypto, it's not the floating-point math that makes GPUs attractive, it's the fact that you have a large number of cores. Check out http://hashcat.net/oclhashcat-lite/ for a crypto cracking lib that uses OpenCL to take advantage of GPU cores with impressive results.
With oclHashcat I can distribute the processing across many GPU cards in the same box. I can see a low-cost approach with a single 19" tray full of headless RPis cranking out hashes against a distributed password database. Hmmm. Has anyone considered an RPi based on NVIDIA Tegra instead of Broadcom? Could be a case for the Raspberry Mu.
- Posts: 1
- Joined: Tue Jan 22, 2013 1:30 am
it will not beat a desktop PC graphics card for GPGPU on flops/dollar. It likely will not beat a laptop GPU chip on flops/watt either.
But my x86 dual core does not have a GPU that will run World Community Grid. I know some PCs do, but a lot do not. So I don't see it as competing with a modern graphics GPU; it just has to make running BOINC stuff on a Pi worthwhile.
- Posts: 53
- Joined: Tue Dec 27, 2011 9:09 pm
jojopi wrote:This would be a lot of work for little more than the learning experience. As good as the VideoCoreIV is for its size, it will not beat a desktop PC graphics card for GPGPU on flops/dollar. It likely will not beat a laptop GPU chip on flops/watt either.
If you want to build an efficient compute platform, you do not start by designing an embedded chip with adequate HD/3D performance, then use lots of them. Performance is mostly limited by the number of gates/transistors. So you must design a GPGPU chip near the chip size where the cost per transistor is the lowest.
I agree about flops/dollar. Not so sure on flop/watt.
How many flops/watt do you get with a current spec desktop graphics card? The Raspi has about 24Gflops total performance (not all at same time), so let's say half that accessible (made up number) at 500mA on 5v = 2.5W = about 4.8GFlops/watt. (let me reiterate, made up numbers just to get a rough idea)
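Written out as a quick Python calculation, the back-of-envelope estimate looks like this. The inputs are the same rough figures as in the post (the "half accessible" fraction is a made-up number, and the 2.5 W is the supply to the whole board, not the GPU alone), so the result is only an order-of-magnitude guide.

```python
PEAK_GFLOPS = 24.0       # headline figure quoted from the FAQ
USABLE_FRACTION = 0.5    # made-up number: assume half of peak is reachable
SUPPLY_CURRENT_A = 0.5   # 500 mA
SUPPLY_VOLTAGE_V = 5.0   # 5 V

power_w = SUPPLY_CURRENT_A * SUPPLY_VOLTAGE_V     # 2.5 W
usable_gflops = PEAK_GFLOPS * USABLE_FRACTION     # 12 GFLOPS
gflops_per_watt = usable_gflops / power_w         # 4.8 GFLOPS/W

print(f"{power_w} W, {usable_gflops} GFLOPS, {gflops_per_watt} GFLOPS/W")
```

Changing USABLE_FRACTION is the quickest way to see how sensitive the conclusion is to that guess.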
- Moderator
- Posts: 6400
- Joined: Sat Jul 30, 2011 7:41 pm
jamesh wrote:jojopi wrote:This would be a lot of work for little more than the learning experience. As good as the VideoCoreIV is for its size, it will not beat a desktop PC graphics card for GPGPU on flops/dollar. It likely will not beat a laptop GPU chip on flops/watt either.
If you want to build an efficient compute platform, you do not start by designing an embedded chip with adequate HD/3D performance, then use lots of them. Performance is mostly limited by the number of gates/transistors. So you must design a GPGPU chip near the chip size where the cost per transistor is the lowest.
I agree about flops/dollar. Not so sure on flop/watt.
How many flops/watt do you get with a current spec desktop graphics card? The Raspi has about 24Gflops total performance (not all at same time), so let's say half that accessible (made up number) at 500mA on 5v = 2.5W = about 4.8GFlops/watt. (let me reiterate, made up numbers just to get a rough idea)
The Green500 record is 2.5Gflops/watt (but in fairness you don't really have the memory bandwidth to offer 4.5Gflops/watt on a range of industry-standard kernels).
I might also pipe in that a subset of OpenCL can be done for a lower effort than people imagine. And as a counterpoint to jojopi, efficiency can be measured in many ways: if a large number of kids (big and little) already have the Pi, then a mini OpenCL implementation could be an efficient way of educating the next generation on SPMD.
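For anyone unfamiliar with SPMD (single program, multiple data), here is a toy Python sketch of the idea. It is not the OpenCL API: `launch` and `saxpy_kernel` are invented names, and the loop runs serially where a real device would run the work-items in parallel.

```python
def saxpy_kernel(gid, alpha, x, y, out):
    """The single program: every work-item runs this, differing only in gid."""
    out[gid] = alpha * x[gid] + y[gid]


def launch(kernel, global_size, *args):
    """Stand-in for a kernel launch: call the kernel once per work-item id.

    A real device would execute these calls in parallel; this is a serial loop.
    """
    for gid in range(global_size):
        kernel(gid, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # prints "[12.0, 24.0, 36.0, 48.0]"
```

The point for teaching is that the kernel never loops over the data itself; it only says what one work-item does with its own index.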
- Posts: 56
- Joined: Sat Jul 07, 2012 11:21 pm
- Location: Zero Page
I think this library https://github.com/BradLarson/GPUImage could be a good starting point for applying GPGPU image manipulation primitives. It is written in Objective-C, but the OpenGL ES code is the same code that should run on the Raspberry Pi.
- Posts: 17
- Joined: Sun Sep 16, 2012 1:48 pm
What about this, (sorry for the ignorance) does it mean anything regarding GPGPU?:
http://www.khronos.org/conformance/adop ... cts#opencl
Broadcom Corporation 2011-11-11 OpenGL_ES_2_0
BCM7346 (big endian) CPU: MIPS (big endian)
OS: Linux 2.6.37
API pipeline:
GL_VENDOR "Broadcom"
GL_RENDERER "VideoCore IV HW"
GL_VERSION "OpenGL ES 2.0"
GL_SHADING_LANGUAGE_VERSION "OpenGL ES GLSL ES 1.00"
Display: 1920x1080, 32bpp
http://www.khronos.org/conformance/adop ... cts#openvg
Broadcom Corporation 2011-04-03 OpenVG_1_1
CPU: VideoCore IV, OS:Threadx, Pipeline: Broadcom VideoCore IV HW/OpenVG 1.1, Display: 64x64 32bpp
- Posts: 5
- Joined: Thu Jan 24, 2013 8:20 am