paddyg
Posts: 2555
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Pi3d: Conways game of life 1024x1024 at 10FPS

Sun Mar 24, 2013 11:09 pm

After a couple of comments following Amy's talk, I pondered the possibility of pi3d utilising the power of the GPU (https://github.com/tipam/pi3d/tree/integration = OpenGL ES2).

This does indeed seem to be possible. I have used image pixels for cells and it seems to be running at between 5 and 15 FPS. As it has to check the surrounding cells, that's 1024x1024x9x10 texture lookups per second = roughly 94,000,000/s.

http://youtu.be/PsZvsrXkSVw
(filmed with phone I'm afraid and random music from youtube (selected by length alone))

Paddy
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

PeterO
Posts: 5951
Joined: Sun Jul 22, 2012 4:14 pm

Re: Pi3d: Conways game of life 1024x1024 at 10FPS

Mon Mar 25, 2013 9:00 am

What a coincidence, I was only thinking about this yesterday. Google found this for me:
http://translate.google.co.uk/translate ... ife&anno=2
The google translation is OK, but make sure you're looking at the original version when looking at the shader code :-)

I've now looked at your code, and I wonder whether the 3x3 loop in your fragment shader is quicker or slower than the series of conditionals in the other version? I'll have a play later on.

PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

rurwin
Forum Moderator
Posts: 4258
Joined: Mon Jan 09, 2012 3:16 pm

Re: Pi3d: Conways game of life 1024x1024 at 10FPS

Mon Mar 25, 2013 9:19 am

If you want an interesting starting state, you can't beat the r-pentomino. Five easy-to-remember cells give over a thousand generations before stability:

Code:

 XX
XX
 X

Since gliders only move at c/4 (one cell diagonally every four generations), you should be able to contain the whole thing on a 1024x1024 grid.
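
For anyone who wants to try this seed away from the GPU, here is a minimal sketch in plain Python. It is not taken from the pi3d demo: it keeps the live cells in a set on an unbounded grid, and the names `parse`, `step`, `run` and `R_PENTOMINO` are made up for illustration.

```python
# Set-based Game of Life on an unbounded grid (illustrative sketch only).
from collections import Counter


def parse(pattern):
    """Turn rows of 'X' / ' ' characters into a set of live (row, col) cells."""
    return {(r, c)
            for r, line in enumerate(pattern)
            for c, ch in enumerate(line)
            if ch == "X"}


def step(live):
    """Advance one generation: birth on 3 neighbours, survival on 2 or 3."""
    # Count, for every cell next to a live cell, how many live neighbours it has.
    counts = Counter((r + dr, c + dc)
                     for r, c in live
                     for dr in (-1, 0, 1)
                     for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


def run(live, generations):
    """Apply step() the given number of times and return the final live set."""
    for _ in range(generations):
        live = step(live)
    return live


# The five-cell seed from the post above.
R_PENTOMINO = parse([" XX", "XX", " X"])

if __name__ == "__main__":
    cells = run(R_PENTOMINO, 100)
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    print(len(cells), "live cells after 100 generations")
    print("bounding box:", max(rows) - min(rows) + 1, "x", max(cols) - min(cols) + 1)
```

Raising the generation count and watching the bounding box is one way to check how much of a 1024x1024 grid the pattern actually needs.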

paddyg
Posts: 2555
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: Pi3d: Conways game of life 1024x1024 at 10FPS

Mon Mar 25, 2013 10:05 am

Peter, it would be interesting to see what you find. When I was trying to get the pi3d ES2 version to go faster than the ES1 version I discovered (actually Tim@tipam's brother tipped me off) that ifs are very expensive in shaders, as, it seems, are divisions (by variables). That's why I use the step() function in the loop, add all 9 then subtract the middle value, and put the division in the vertex shader. I couldn't think how to make the final ifs into step()s, but it might be possible. However, the compiler certainly does do some optimisation and might have converted everything for me anyway.
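
One possible way to turn the final ifs into step()s, sketched in plain Python rather than GLSL. This is not the code in conway.fs: `step` here is only a Python stand-in for the GLSL built-in, and `next_state` and its 3x3 `window` argument are hypothetical names. The idea is that a product of two step() results is 1.0 only inside a narrow band, which gives a branch-free test for "exactly 2" or "exactly 3" neighbours.

```python
def step(edge, x):
    """Python stand-in for GLSL step(edge, x): 0.0 if x < edge, else 1.0."""
    return 0.0 if x < edge else 1.0


def next_state(window):
    """Next state of the centre cell of a 3x3 window of 0.0/1.0 values."""
    centre = window[1][1]
    # Add all nine cells, then subtract the middle one to get the neighbour count.
    n = sum(sum(row) for row in window) - centre
    # 1.0 only when n is exactly 3 (between 2.5 and 3.5).
    is_three = step(2.5, n) * (1.0 - step(3.5, n))
    # 1.0 only when n is exactly 2 (between 1.5 and 2.5).
    is_two = step(1.5, n) * (1.0 - step(2.5, n))
    # Three neighbours: alive regardless. Two neighbours: alive only if already alive.
    return is_three + centre * is_two


if __name__ == "__main__":
    blinker_centre = [[0.0, 1.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 1.0, 0.0]]
    print(next_state(blinker_centre))
```

Whether this is actually faster than the ifs on the Pi's GPU would need measuring, since the compiler may already be doing something similar.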

Richard, I'll have a go with the pentomino you suggest. To save adding another large image to the demos I used the roof of your silo as the seed in the version I pushed to GitHub. That's 'only' 800x800 so it runs at 25FPS, and you can see more clearly that I'm inadvertently moving the image up and right one pixel each frame. I'm not sure why, or where the new edges appear from (it doesn't seem to wrap, which I thought it would). It's been running now for half an hour or so with no discernible repetition (it occasionally produces very distinctive regular patterns and lines, so it ought to be possible to tell).

rurwin
Forum Moderator
Posts: 4258
Joined: Mon Jan 09, 2012 3:16 pm

Re: Pi3d: Conways game of life 1024x1024 at 10FPS

Mon Mar 25, 2013 10:13 am

It probably does wrap then. 1000 generations is good, and I doubt that a random pattern would do many times better. On the other hand, gliders will be thrown off at intervals. In an infinite universe they just fly away to infinity, but if the universe is wrapped, they will come back round and hit the pattern from the other side.

paddyg
Posts: 2555
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: Pi3d: Conways game of life 1024x1024 at 10FPS

Mon Mar 25, 2013 1:10 pm

I think the drift (and it not quite working properly either) is due to some small rounding error or mismatch between the number of pixels, gl_FragCoord values and uv locations. Scaling by 1/(w-1) and 1/(h-1) stops the drift and makes it wrap properly, but it's still not right: the r-pentomino's behaviour depends on its location in the image, and it only breeds near the middle!
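
A small sketch of how this kind of mismatch could be checked on the CPU. It rests on assumptions that are not confirmed by the shader source: that the texture coordinate is the gl_FragCoord value (pixel index plus 0.5, since fragments are reported at pixel centres) divided by some divisor, that sampling is nearest-neighbour, and that the texture repeats at its edges. `texel_index` is a hypothetical helper.

```python
def texel_index(pixel, width, divisor):
    """Texel hit by nearest-neighbour sampling for one pixel column.

    Assumes u = (pixel + 0.5) / divisor and texel = floor(u * width),
    wrapped at the edge as a repeating texture would. Integer arithmetic
    is used so the floor is exact.
    """
    return ((2 * pixel + 1) * width) // (2 * divisor) % width


WIDTH = 8  # tiny width so the whole mapping fits on one line

# Dividing by the width maps every pixel onto its own texel.
exact = [texel_index(p, WIDTH, WIDTH) for p in range(WIDTH)]
# Dividing by width - 1 stretches the mapping slightly.
scaled = [texel_index(p, WIDTH, WIDTH - 1) for p in range(WIDTH)]

if __name__ == "__main__":
    print("divide by w:    ", exact)
    print("divide by w - 1:", scaled)
```

Under these assumptions, dividing by w - 1 makes every pixel from about the middle of the row onward read the texel one place to its right, so neighbour lookups would behave differently in different parts of the image. That may or may not be what is happening in the real shader.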

arexxk
Posts: 1
Joined: Fri May 24, 2013 11:23 pm

Re: Pi3d: Conways game of life 1024x1024 at 10FPS

Sat May 25, 2013 12:20 am

Hi Paddy

I'm an undergraduate working on a research project at the San Diego Supercomputer Center. The project's goal is to create engaging and informative projects for educating children about parallel computing, using Raspberry Pis as a teaching tool. My research coordinator and I have been trying to figure out ways to parallelize Conway's Game of Life while off-loading vector code onto the Pis' GPU to improve efficiency, when we stumbled across your implementation using pi3d.

I would like to use your implementation, if that is acceptable, but I do not completely understand the code. From what I've reasoned so far, I can see that Conway.py is the staging area where you set up the display, shader, texture, and shape objects/arrays and draw them to the screen. I believe the conway.fs shader is where you perform the calculations on each pixel's evolution, and I am not too sure what the conway.vs shader is doing. Please correct me if I am misguided.

In order to parallelize Conway's, we are letting each Pi have its own grid, with cells in it, and each Pi would display its grid on one screen of a 15-screen display. Each calculation on a single grid would pretty much be local until the edge cells need to be calculated, which requires the state of the corresponding edge cells belonging to the other Pis' grids. In this case we use an MPI request to receive 1d arrays of the edge cells belonging to the other Pis.
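
The edge exchange described above can be sketched in plain Python, with the MPI transfer replaced by a direct hand-off of lists so the example runs on one machine. All names (`edges`, `with_halo`, `step_rows`) are hypothetical, only left and right neighbours are handled, and rows above and below each grid are treated as dead to keep it short. A full 2D layout would also need the top and bottom edges and the corner cells.

```python
def edges(grid):
    """Return the four border lines of a 2-D grid as 1-D lists."""
    return {
        "top": list(grid[0]),
        "bottom": list(grid[-1]),
        "left": [row[0] for row in grid],
        "right": [row[-1] for row in grid],
    }


def with_halo(grid, left, right):
    """Pad each row with one ghost cell taken from the left/right neighbours."""
    return [[l] + list(row) + [r] for l, row, r in zip(left, grid, right)]


def step_rows(padded):
    """One generation for the real (non-ghost) columns of a padded grid."""
    h, w = len(padded), len(padded[0])

    def cell(r, c):
        # Anything above or below the grid counts as dead in this sketch.
        return padded[r][c] if 0 <= r < h else 0

    out = []
    for r in range(h):
        row = []
        for c in range(1, w - 1):  # skip the ghost columns
            n = sum(cell(r + dr, c + dc)
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)) - padded[r][c]
            row.append(1 if n == 3 or (n == 2 and padded[r][c]) else 0)
        out.append(row)
    return out


# Two 3x3 grids side by side with a horizontal blinker straddling the boundary.
left = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
right = [[0, 0, 0], [1, 1, 0], [0, 0, 0]]
dead = [0, 0, 0]  # stands in for a neighbour that does not exist

# Each grid receives the facing edge of its neighbour as a 1-D list.
new_left = step_rows(with_halo(left, dead, edges(right)["left"]))
new_right = step_rows(with_halo(right, edges(left)["right"], dead))

if __name__ == "__main__":
    print(new_left)
    print(new_right)
```

After one step the blinker turns vertical and ends up entirely in the first column of the right-hand grid, which only works because each side saw the other's edge cells.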

In a more vanilla implementation of Conway's, each Pi's grid would be a 2d matrix which I would take chunks of and pass into the MPI comm channel when needed. But in your implementation, and after reading the documentation, I am assuming the grid would correspond to the sprite's "buffer" (containing the state of the live and dead cells). My main question, among others, is how is the buffer initially populated and how can I directly access/manipulate the buffer, so I can pass it to other Pis? Any help would be greatly appreciated. More info on our project can be found here: http://sdsc-sandbox.blogspot.com/

paddyg
Posts: 2555
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: Pi3d: Conways game of life 1024x1024 at 10FPS

Sat May 25, 2013 8:44 am

arexxk, glad to help if I can and you are welcome to use whatever you want (can).

The basis of my system is to do the absolute minimum in python and the maximum in the shader. I use an array of two textures both initially filled with a starting image. In the loop I toggle an index from one picture to the other. The shader draws to the display using the live image and the conway algorithm. The displayed image is then saved to the non-live image and the process repeated.
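
The two-texture toggle can be illustrated on the CPU with a short sketch. This is not the pi3d code: the shader and the save-to-texture step are replaced by an ordinary Python function, edges are assumed to wrap, and `life_step` and `run` are made-up names.

```python
def life_step(src, dst):
    """Read one generation from src and write the next into dst (edges wrap)."""
    h, w = len(src), len(src[0])
    for r in range(h):
        for c in range(w):
            # Sum the 3x3 block around the cell, then subtract the cell itself.
            n = sum(src[(r + dr) % h][(c + dc) % w]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)) - src[r][c]
            dst[r][c] = 1 if n == 3 or (n == 2 and src[r][c]) else 0


def run(seed, generations):
    """Ping-pong between two buffers, toggling which one is live each frame."""
    # Both buffers start out filled with the starting image.
    buffers = [[row[:] for row in seed], [row[:] for row in seed]]
    live = 0  # index of the buffer currently being read
    for _ in range(generations):
        life_step(buffers[live], buffers[1 - live])
        live = 1 - live  # the freshly written buffer becomes the live one
    return buffers[live]


if __name__ == "__main__":
    seed = [[0] * 8 for _ in range(8)]
    for r, c in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:  # a glider
        seed[r][c] = 1
    for row in run(seed, 4):
        print("".join("X" if v else "." for v in row))
```

Reading from one buffer while writing to the other matters because every cell must see the previous generation, never a half-updated one.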

In OpenGL ES2 and above you have to provide your own shaders to do the graphics work. These consist of two parts: a vertex shader and a fragment shader. In pi3d we keep them as two files with vs and fs extensions. The shaders go off and sort out what to do in a slightly boggling way but essentially the vs runs for each vertex (so only six times for a rectangle made from two triangles) and generates an interpolation formula for all the points between the vertices. The fs runs for each pixel. You can pass information to each part of the shader in the 'attribute' or 'uniform' variables and the vs sends info to the fs via the 'varying' variables. Search around on the internet for some explanations and play around with the code.

I'm not sure that you can get at the image info from the CPU without getting the GPU to provide it for you via opengles.glReadPixels (as in pi3d/util/Screenshot.py). I've not checked, but I think the original image array in Texture.image will remain unchanged (otherwise why have a glReadPixels function?), but it's worth looking into. You will also have to put the modified texture back into the GPU using glTexImage2D. Whatever you do in terms of accessing/manipulating the image buffer will have *catastrophic* consequences on the speed of the program. Even if you do it all in C I doubt that 15 Pis will be as fast as one Pi doing everything in the GPU (numpy is pretty fast at the array stuff anyway), but I suppose that's not the point!

There's also some sorting out needed on my program to do with edges and scaling. You can test this by using a starting image that consists of one of the classic conway 'starters'. It will behave differently depending on where it is, which it obviously shouldn't!

Let me know how you get on, or if you have further questions.

Paddy
