Ravenous
Posts: 1956
Joined: Fri Feb 24, 2012 1:01 pm
Location: UK

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 8:00 am

Supercomputing-RPi wrote:I am planning on using FCOE (Fiber (In UK, Fibre?) Channel over Ethernet).
10 Gbit/s connectivity SFP+ Optical Cables and a $5,100 Cisco 16 port FCOE Switch...
Just an incidental comment - for 100 CPUs does this mean you plan to buy seven switches at 5K each? Do the boards you're using support that transfer rate? Will simple cheap switches give you almost the same performance for much less money?

You see, to old farts like me who actually have the money, something like the above sounds like concentrating too much on the wrong side of the project. You could just buy seven high-end PCs for that sort of cash.
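For anyone checking the switch arithmetic above, here is a rough sketch in Python. It assumes every node needs its own port and ignores uplink ports between switches, so the real count could be higher. The function name `switch_cost` is made up for illustration; the 100 nodes, 16 ports and $5,100 figures are the ones quoted in this thread.

```python
import math

def switch_cost(nodes, ports_per_switch, price_per_switch):
    """Return (switch_count, total_cost), assuming one port per node
    and no ports reserved for uplinks between switches."""
    count = math.ceil(nodes / ports_per_switch)
    return count, count * price_per_switch

# Figures quoted in the thread: 100 nodes, 16-port switch, $5,100 each.
count, total = switch_cost(nodes=100, ports_per_switch=16, price_per_switch=5100)
print(count, total)  # 7 35700
```

Under those assumptions the switches alone come to more than the whole planned budget mentioned later in the thread.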

AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK
Contact: Website

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 9:48 am

Supercomputing-RPi wrote:I am planning to have 4 different sectors...
Astronomy/Physics
Microbiology/Chemistry
Artificial Intelligence
Ecology

I am planning to do the neuroscience in conjunction with the AI and MicroBio sectors...
Each sector would have a group of researchers all running processes at the same time..
Have you asked your researchers what features they'd find useful in a supercomputer cluster?

Some tasks might benefit from fast CPUs, other tasks might benefit from fast storage, other tasks might benefit from fast memory, other tasks might benefit from massive amounts of (less fast) storage, other tasks might benefit from fast networking, other tasks might benefit from fast GPUs.... it becomes a bit of a balancing act, deciding which feature(s) are most important, and then optimising for those.

r3d4
Posts: 967
Joined: Sat Jul 30, 2011 8:21 am
Location: ./

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 10:13 am

Supercomputing-RPi wrote: I am trying to rapidify this because I have less time for this (I want to do more in less time).
I might be mad as a hatter but there may be something to be said for doing less in more time :roll:

There is a Linux distro called CHAOS...
6mb in size, this package could convert any linux compatible device to a clustered supercomputer....
CHAOS stands for Clustered High Availability Operating System.
https://code.google.com/p/chaos-release/
This looks interesting, thanks for the link. :)

also i feel it would be wrong to miss this opportunity to wish you
"A Very Merry Unbirthday "
Real life is, to most, a long second-best, a perpetual compromise between the ideal and the possible.
-
Meanwhile, the sysadmin who accidentally nuked the data reckons "its best not run anything more with sudo today"
-
what about spike milligan?

Ravenous
Posts: 1956
Joined: Fri Feb 24, 2012 1:01 pm
Location: UK

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 10:31 am

Some interesting info here (note there's a very simple example at the end of the introductory lecture):
https://computing.llnl.gov/tutorials/agenda/index.html

Also the introductory lecture has some stuff on scalability, which I was going to talk about. They say it better than me though. :?
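One common way to reason about scalability is Amdahl's law: if only a fraction of a job can run in parallel, the serial remainder caps the speedup no matter how many nodes you add. The sketch below is only an illustration; the function name `amdahl_speedup` and the example fractions are assumptions, not numbers taken from the lecture.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup when `parallel_fraction` of the runtime is spread
    evenly over `workers` and the rest stays serial (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Hypothetical workloads on a 100-node cluster.
for p in (0.50, 0.90, 0.99):
    print(p, round(amdahl_speedup(p, 100), 1))
# 0.5 2.0
# 0.9 9.2
# 0.99 50.3
```

So even a job that is 90% parallel gets less than a 10x speedup from 100 nodes, before any network overhead is counted.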

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 4:22 pm

Thanks Jim KLA!

I appreciate your admiration, but along the way I am also finding things in the RPi that make it unsuitable for HPC environments..

Right now I found from my calculations that if I were to put a FCOE (10 Gbit/s Fiber optical) switch in the system my project would go 13K USD off the 20k planned.

Attacking the CPU issues now
Ravenous wrote: Really serious supercomputers need both fast processors, and fast network access to each other and to storage.
Serious HPCs do have fast processors; Tianhe-2 has Intel Xeons (about 2K dollars each).

That is why I am choosing to go with AMD A10-7850K w/ 12 compute cores (856 GFLOP/chip)
and Nvidia GTX Titan Z (6.5 TFLOPS/GPU).

Couldn't NAS boxes solve the backup issues?

I am not going to use FCOE w/ RPi (least not yet)....

I am considering an addition to my list of sectors...
Supercomputer Development and Management

In this sector I am going to look into RPis, find all of their disadvantages and build an RPi equivalent.
I am also going to look into parts we could use for the supercomputer...

Buying 5 of those cisco routers would be a big financial mistake..

As Ravenous said, I need serious networking...

I need to get that top-speed stuff out there...

I did look at LLNL and their recommendations for clustering software, I saw CRAY offering Lustre solutions...
Didn't know that they offered courses!! Thnx!

Looks like Lustre is famous in usage...

About the CHAOS...
It is being used at some Australian university to crack computer passwords.
Why don't we alter it to make it a BOINC alternative?

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 4:38 pm

I just looked into who is using this (supercomputers)

It fits the purposes of all my sectors!!!

Image

Another similar and increasingly popular example of a hybrid model is using MPI with GPU (Graphics Processing Unit) programming.
GPUs perform computationally intensive kernels using local, on-node data
Communications between processes on different nodes occurs over the network using MPI
Last edited by Supercomputing-RPi on Thu Aug 07, 2014 4:39 pm, edited 1 time in total.

AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK
Contact: Website

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 4:39 pm

Supercomputing-RPi wrote:As Ravenous said, I need serious networking...
Again, it all depends on the particular workload (algorithm) you're running, and whether it needs fast networking. Some will, some won't, all depends on the application...
About the CHAOS...
It is being used at some Australian university to crack computer passwords.
Why don't we alter it to make it a BOINC alternative?
BOINC is distributed computing, which is quite different to what's usually associated with cluster computing.

Ravenous
Posts: 1956
Joined: Fri Feb 24, 2012 1:01 pm
Location: UK

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 5:06 pm

Supercomputing-RPi wrote:Couldn't NAS boxes solve the backup issues?
Oh, I mentioned that backup statistic just to highlight some of the unexpected problems that emerge with these huge systems.

The problem isn't storing the data when they backup, it's EVERY processor writing ALL its data out to central disk (or maybe even tape?) units as soon as possible. If it's thousands of machines each with a few GB of memory, it becomes a large job.

So the Network itself, not the storage, slows down the operation in that case. (This might also be a problem if the 100 processors need "priming" with a lot of special data at the start of the job. If so, that will slow things down as most of the supercomputer will be sitting idle for a long time while the initial data is copied out.)

They might even have a fancy network topology where machines are connected to several networks, to prevent bottlenecks. For example in the pic you just posted above, there are lots of CPUs sitting on one network cable. If those CPUs have to transfer a lot of data to each other while the job is running - or each need a lot of data to start the job - the network will slow it down.
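To put rough numbers on the bottleneck described above, here is a back-of-envelope sketch. It assumes all nodes dump their memory through one shared link to central storage, with no protocol overhead and storage that can keep up. The 2 GB per node figure and the function name `checkpoint_seconds` are assumptions for illustration only.

```python
def checkpoint_seconds(nodes, gb_per_node, link_gbit_per_s):
    """Seconds to push every node's memory through one shared link,
    ignoring protocol overhead and assuming storage keeps up."""
    total_gbit = nodes * gb_per_node * 8  # gigabytes -> gigabits
    return total_gbit / link_gbit_per_s

# Hypothetical: 100 nodes with 2 GB each.
print(checkpoint_seconds(100, 2, 1))   # shared 1 Gbit/s link  -> 1600.0
print(checkpoint_seconds(100, 2, 10))  # shared 10 Gbit/s link -> 160.0
```

The same arithmetic applies in reverse when the nodes need "priming" with data at the start of a job: most of the machine waits while the shared link is busy.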

AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK
Contact: Website

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 5:45 pm

Ravenous wrote:The problem isn't storing the data when they backup, it's EVERY processor writing ALL its data out to central disk (or maybe even tape?) units as soon as possible. If it's thousands of machines each with a few GB of memory, it becomes a large job.
http://www.quickmeme.com/meme/3s4t7c
;)

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 6:24 pm

So Ravenous,

You are saying that I need to write bandwidth-tolerant code of some kind to be programmed into the FCOE switch to maintain and balance network speeds, right?

Or are you saying that I might need more memory in the central HDD or SSD?

About my distributed server memory research...

I just found an open-source program called memcached....
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. (memcached.org)
http://memcached.org/
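To make the "in-memory key-value store" idea concrete, here is a minimal in-process sketch of the caching pattern. It is NOT the memcached API or protocol: `TinyCache`, `slow_lookup` and `cached_lookup` are hypothetical names, and a real deployment would talk to a memcached server over the network through a client library. The least-recently-used eviction here is only meant to show the general idea of a bounded cache.

```python
from collections import OrderedDict

class TinyCache:
    """Hypothetical stand-in for a key-value cache (not the memcached API).
    Holds at most `max_items` entries and evicts the least recently used."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._items = OrderedDict()

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)          # mark as most recently used
        if len(self._items) > self.max_items:
            self._items.popitem(last=False)   # drop the oldest entry

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)          # a read also counts as a use
        return self._items[key]

def slow_lookup(n):
    """Placeholder for an expensive database or API call."""
    return n * n

cache = TinyCache(max_items=2)

def cached_lookup(n):
    """Return the cached result if present, else compute and store it."""
    hit = cache.get(n)
    if hit is None:
        hit = slow_lookup(n)
        cache.set(n, hit)
    return hit
```

Calling `cached_lookup(3)` twice only runs `slow_lookup` once. Note that this kind of cache speeds up repeated reads of small results; it does not by itself pool the memory of many nodes into one large address space.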

AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK
Contact: Website

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 6:30 pm

Supercomputing-RPi wrote:You are saying that I need to write bandwidth-tolerant code of some kind to be programmed into the FCOE switch to maintain and balance network speeds, right?
Or are you saying that I might need more memory in the central HDD or SSD?
http://www.raspberrypi.org/forums/viewt ... 03#p594003
http://www.raspberrypi.org/forums/viewt ... 56#p594256

:roll:

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 6:41 pm

I also found two high-speed data stores....

Druid http://www.druid.io
High Capacity/Speed/Scalability
Used by Netflix (They ingest about 2TB/hr with this)..

Hypertable http://www.hypertable.com
Same as above: no need for lots of hardware, pushes the hardware to an all-time high, an open-source alternative to Google's Bigtable
Used by Baidu (China)....

I think Druid would be the best solution for this...
Image

Right click open in new tab to see enlarged...
Last edited by Supercomputing-RPi on Thu Aug 07, 2014 6:48 pm, edited 2 times in total.

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 6:46 pm


AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK
Contact: Website

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 7:07 pm

Supercomputing-RPi wrote:@Andrew S
http://www.raspberrypi.org/forums/viewt ... 16#p588316
sentences 7-9
You need more detail than a one-line summary of your project's aims (i.e. you need to do detailed in-depth analysis) in order to work out what type of cluster computing setup will best suit your application (e.g. no point spending thousands of dollars on networking hardware, if your application would be better improved by spending that money on extra RAM).
But that obviously doesn't fit in with your plans of "trying to rapidify this because I have less time for this (I want to do more in less time)" :|

I can't fault your enthusiasm though ;)

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 8:07 pm

I have the schematics done!!!!! :D :D :D :D :D :D :D :D :D :D :D :D :mrgreen: :) :) :) :) :) :)
https://bubbl.us/?h=21e08e/44ded6/22XNq ... 1794637702

I am making edits so please refresh tab to see changes..

drgeoff
Posts: 9763
Joined: Wed Jan 25, 2012 6:39 pm

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 8:49 pm

If you think that "slideware" is a schematic, you have a long way to go.

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 9:00 pm

drgeoff wrote:If you think that "slideware" is a schematic, you have a long way to go.
It is supposed to be only used for planning purposes...

That is how I visualize my ideas...
It's not meant to be professional with official blueprints...

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 11:11 pm

I have a question..

Should I buy a chassis for each node?
What do you think?

AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK
Contact: Website

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 11:27 pm

Supercomputing-RPi wrote:Should I buy a chassis for each node?
That obviously depends entirely on how you're planning to mount and cool them... :? With so many CPUs (and/or GPUs) packed into such a small space, supercomputers generate a lot of heat.

I agree with drgeoff that the diagram you've linked to is only a very high-level overview, and is something very different to what most people would count as "schematics".
For example while "Nvidia Jetson TK1 Cluster" and "ITX Compute Cluster" are both linked to "GigE Hub/Switch", they're also (somehow) directly linked to each other?!

Supercomputing-RPi
Posts: 52
Joined: Tue Jul 29, 2014 5:01 pm
Location: Inside the Foundation's Servers :)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Thu Aug 07, 2014 11:38 pm

Select ITX nodes would have a PCIe Gbit card added.

These ports would be directly wired to the GigE Hub/Switch.

The GigE Hubs/Switches would have direct connections to the Parallella boards and Nvidia TK1s..

Thanks for Input!!
I am going to edit it based off your inputs!!

P.S. It's getting harder to map this (notice how close two adjacent arrows are in incoming connections into modules, that is how complex this thing is going to be)..

Ravenous
Posts: 1956
Joined: Fri Feb 24, 2012 1:01 pm
Location: UK

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Fri Aug 08, 2014 7:07 am

Supercomputing-RPi wrote:So Ravenous,

You are saying that I need to write bandwidth-tolerant code of some kind to be programmed into the FCOE switch to maintain and balance network speeds, right?
I didn't say that - I don't even know what it means :?

I am suggesting you scale down. Do not spend this weekend researching - build something, right now, even if it's just downloading a simple C compiler and writing a multi-threaded program on whatever PC you happen to have.

The sooner you build something, even if it's not quite the right thing, the sooner you'll actually achieve the project. Believe me I know!
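As a starting point along those lines, here is a small multi-threaded sketch. It is in Python rather than C purely to keep it short, and the function names are made up for illustration. It splits a sum of squares into chunks, hands each chunk to a worker thread, and combines the partial results, which is the same split/compute/combine shape a cluster job has.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(start, stop):
    """Serial work for one chunk: sum of i*i for start <= i < stop."""
    return sum(i * i for i in range(start, stop))

def parallel_sum_of_squares(n, workers=4):
    """Split range(n) into roughly equal chunks, one task per chunk,
    then add up the partial sums."""
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: sum_of_squares(*c), chunks)
        return sum(partials)

print(parallel_sum_of_squares(1000))  # 332833500
```

One caveat: in standard CPython, threads do not speed up CPU-bound work like this because of the global interpreter lock, so treat it as a way to learn the structure. Swapping in `ProcessPoolExecutor` (with a named function instead of the lambda) is one way to use several cores.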

Ravenous
Posts: 1956
Joined: Fri Feb 24, 2012 1:01 pm
Location: UK

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Fri Aug 08, 2014 7:15 am

AndrewS wrote:
Ravenous wrote:The problem isn't storing the data when they backup, it's EVERY processor writing ALL its data out to central disk (or maybe even tape?) units as soon as possible. If it's thousands of machines each with a few GB of memory, it becomes a large job.
http://www.quickmeme.com/meme/3s4t7c
;)
:lol:

And I do sound a bit like the grumpy old man don't I! Funnily enough Star Wars special effects led to GPU parallelism of the '90s and dropped us in the current very complicated development environment :)

AndrewS
Posts: 3625
Joined: Sun Apr 22, 2012 4:50 pm
Location: Cambridge, UK
Contact: Website

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Fri Aug 08, 2014 12:14 pm

Supercomputing-RPi wrote:P.S. It's getting harder to map this (notice how close two adjacent arrows are in incoming connections into modules, that is how complex this thing is going to be)..
Once you start getting down into the proper details you'll find it becomes much more complex... :ugeek:

Can you see yet why we're all trying (in vain!) to suggest that you start small and then scale up, rather than aiming high right at the beginning? ;)

W. H. Heydt
Posts: 10774
Joined: Fri Mar 09, 2012 7:36 pm
Location: Vallejo, CA (US)

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Fri Aug 08, 2014 5:33 pm

Ravenous wrote: The sooner you build something, even if it's not quite the right thing, the sooner you'll actually achieve the project. Believe me I know!

That rather depends on whether you work like Thomas Edison or Nikola Tesla.

(E.g. Edison tried--what?--1000 different ways to make a light bulb filament before he found one that worked...and that one was superseded by something else entirely when the bulbs went into everyday use. Tesla conceived of his AC motor and gave the machinists the specs for each of the parts. When he got the parts for the very first time, he assembled the motor (all the parts fit together properly on the first try) and the motor worked.)

drgeoff
Posts: 9763
Joined: Wed Jan 25, 2012 6:39 pm

Re: 100 RPi supercomputer w/ 20 GPU boards, appreciate for h

Fri Aug 08, 2014 6:20 pm

W. H. Heydt wrote:
Ravenous wrote: The sooner you build something, even if it's not quite the right thing, the sooner you'll actually achieve the project. Believe me I know!
That rather depends on whether you work like Thomas Edison or Nikola Tesla.

(E.g. Edison tried--what?--1000 different ways to make a light bulb filament before he found one that worked...and that one was superseded by something else entirely when the bulbs went into everyday use. Tesla conceived of his AC motor and gave the machinists the specs for each of the parts. When he got the parts for the very first time, he assembled the motor (all the parts fit together properly on the first try) and the motor worked.)
Yes, but Tesla had a reasonable understanding of what he was doing, whereas ....
