manili
Posts: 6
Joined: Thu Jun 14, 2018 9:59 am

Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 10:23 am

Hello everyone.

I'm new to the RPi world. I see many people here use their RPis to build clusters for parallel computing. I'm wondering why they don't use AWS instead, at much lower cost and with better performance?

Best regards,
Manili

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 19546
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 10:53 am

Because they are generally used for teaching purposes or prototyping rather than real work. For example, one of the nuclear labs built a Pi cluster simply so they could run and test their code on a cheap distributed device before running it on their huge supercomputer, which is very expensive to run.

I don't know whether AWS could be used in that way.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Please direct all questions to the forum, I do not do support via PM.

Heater
Posts: 9221
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 1:10 pm

manili,

What services are you using for AWS? And how much are you paying for them?

I ask because... a while back I took up Amazon's offer of a free trial period. I obtained a small instance and put up a simple node.js web server on it which did almost nothing. A sort of "Hello World" test server. Over a year later, I had totally forgotten about Amazon and my AWS instance, until one day the wife asked me who was this "Amazon" and why is she taking 70 Euro per month out of our account? (I shudder to think what the wife was thinking there). I checked my little AWS instance. It was still doing nothing and had never seen any traffic worth noticing. I cancelled that immediately and have not thought about Amazon since.

I'd say that was a very expensive experiment. I could have had a decent Pi cluster for all the money Amazon got for nothing.

As it happens I can get a public IP address on my internet connection so I can run servers at home that I can reach from anywhere in the world. Not much need for Amazon or whatever cloud services.

Besides, I would need very long wires to connect to the GPIO on an AWS instance :)

mattmiller
Posts: 1832
Joined: Thu Feb 05, 2015 11:25 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 2:05 pm

I see many people here use their RPis to make a cluster for parallel computing.
There's not "many" people doing this :)

Almost all the ones doing it are just doing it as a fun/learning project

droleary
Posts: 139
Joined: Fri Feb 09, 2018 3:45 am
Location: Minneapolis, MN USA

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 2:24 pm

Amazon is an unethical company (sadly, as are many large tech companies these days), so I wouldn't use them even if they were free. They are probably the largest contributor to the US entries in my firewall, and they don't seem to care at all that they sell their services to attackers. I'd have to open myself up to their mess again if I wanted to use AWS, and then I'd just be offering myself up as a human shield for their bad actors. Large cloud providers all make terrible business partners.

There's also no guarantee you'll actually get a price/performance that beats an RPi. I have a VPS with a smaller provider that offers better deals than Amazon, but it's still not good enough to replace an RPi for prototyping stuff. Yeah, if you were just fiddling around with a "cluster" for a month you could get away with paying $5 for each node, but if you're making longer-term plans on the scale of a year or more, you're better off going with some RPi.
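A rough way to check the month-versus-year claim above is to compute the break-even point. This is a minimal sketch, not anyone's actual billing: the $5 per node per month comes from the post, while the $35 purchase price per Pi node is an assumed figure, and power, SD cards, and labor are ignored. The function names are made up for illustration.

```python
import math


def break_even_months(node_purchase_cost, node_monthly_rent):
    """First month at which cumulative rent reaches the purchase price."""
    if node_monthly_rent <= 0:
        raise ValueError("monthly rent must be positive")
    return math.ceil(node_purchase_cost / node_monthly_rent)


def cumulative_costs(nodes, node_purchase_cost, node_monthly_rent, months):
    """Return (cost of buying, cost of renting) for a cluster over `months`."""
    bought = nodes * node_purchase_cost          # one-off hardware cost
    rented = nodes * node_monthly_rent * months  # recurring rental cost
    return bought, rented


# Assumed prices: $35 per Pi node (hypothetical), $5 per rented node per month.
print(break_even_months(35, 5))
print(cumulative_costs(nodes=5, node_purchase_cost=35, node_monthly_rent=5, months=12))
```

With these assumed prices, renting catches up with buying after 7 months, and a five-node cluster kept for a year costs 175 to buy versus 300 to rent. Different prices shift the break-even point, so the numbers are only as good as the assumptions.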

And my RPi are right here with me. They're real things. I can see what they're doing physically. They're on my LAN, not on the Internet (unless I want them talking to the outside world). Trying to figure out where the problem is for cloud services can be a huge waste of resources.

And that hosting/hardware cost itself usually represents the smallest fraction of the costs of a clustering project. The cost of human labor to set up and manage the computers is going to be 1-2 orders of magnitude greater. Even given all the automation that exists for provisioning these days, there is no single way to just flip a switch and get a custom cluster that can be used by anyone for anything.

manili
Posts: 6
Joined: Thu Jun 14, 2018 9:59 am

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 3:00 pm

Thanks a lot guys for the replies and sharing your "expensive experiments" ;) .

1. The learning aspect of clustering RPis, which @jamesh mentioned, was something I had not noticed. When you build a cluster of tiny computers with your own hands, of course you understand how every single part works much better than when you use AWS.
2. @Heater, what you shared was unbelievable!!! I can't believe how I escaped that trap :o ! I myself do not have any experience with AWS yet (however, I was going to test the free trial EC2 service). I had heard about AWS from ads, different people, some famous websites/forums, etc. Maybe I need to research AWS prices more.
3. @mattmiller, I read (here in the forum) about people who liked to build their own clusters for ML, DL, Big Data, Data Mining, Parallel Computing, etc. However, I admit that the DIY articles (on building a cluster) are mostly written by hobbyists.
4. @droleary, very, very good points. Thanks so much.

I'm starting this topic because I'd like to know whether the community would benefit if I kickstarted a project to customize or create a cloud OS based on RPis. Could such a project attract users of its own despite the existence of cloud services such as AWS? What do you think?

P.S. I did not find any cloud OS for managing a cluster of RPis, did you?

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 19546
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 3:33 pm

droleary wrote:
Thu Jun 14, 2018 2:24 pm
Amazon is an unethical company (sadly, as are many large tech companies these days), so I wouldn't use them even if they were free. They are probably the largest contributor to the US entries in my firewall, and they don't seem to care at all that they sell their services to attackers. I'd have to open myself up to their mess again if I wanted to use AWS, and then I'd just be offering myself up as a human shield for their bad actors. Large cloud providers all make terrible business partners.
No place for politics or this sort of opinion here, please refrain from posting this sort of thing.

fruitoftheloom
Posts: 16597
Joined: Tue Mar 25, 2014 12:40 pm
Location: Bognor Regis UK

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 3:37 pm

""understanding the theoretical, but not always the end goal""

My other Devices are the ChromeBit CS10, ChromeCast & Huawei Mate 10 Pro SmartPhone

manili
Posts: 6
Joined: Thu Jun 14, 2018 9:59 am

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 8:24 pm

Dear @fruitoftheloom,

Thanks for sharing the link. As far as I understand, Dramble is a cluster of RPis managed with Ansible. It does not provide core cloud services such as virtualization. Please correct me if I'm wrong.
I'd love to know if you've found a very tiny cloud-center/data-center project based on RPis.
Last edited by manili on Thu Jun 14, 2018 8:59 pm, edited 1 time in total.

ejolson
Posts: 1545
Joined: Tue Mar 18, 2014 11:47 am

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 8:54 pm

mattmiller wrote:
Thu Jun 14, 2018 2:05 pm
I see many people here use their RPis to make a cluster for parallel computing.
There's not "many" people doing this :)

Almost all the ones doing it are just doing it as a fun/learning project
Almost everything on this forum, whether about parallel processing or something else, is about fun and learning. When an engineer posts a question related to a commercial product, the answer is usually that the Pi was designed as a real computer that children can own and not as a mission critical device for industrial control.

There have been quite a few posts about parallel processing and Pi clusters in the last couple weeks. Maybe those posts are getting in the way of the other posts. With the ongoing forum reorganization, I wonder whether it is time for a "Parallel and Cluster Computing" topic.

Z80 Refugee
Posts: 167
Joined: Sun Feb 09, 2014 1:53 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 9:20 pm

ejolson wrote:
Thu Jun 14, 2018 8:54 pm
When an engineer posts a question related to a commercial product, the answer is usually that the Pi was designed as a real computer that children can own and not as a mission critical device for industrial control.
Except, as an engineer, I was criticised for saying that and told that the Compute Module version of the RPi is fully spec'ed for industrial applications. However, for the original RPi and immediate descendants, I maintain that remains the case.

Heater
Posts: 9221
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 10:21 pm

The original Pi, and I believe Pis to this day, do not come with the kind of detailed specification that a typical engineering company would accept for industrial applications.

I think it was jamesh who recently mentioned that such rigorous specifications were being worked on and may be with us soon. Which is good news.

In the meantime, real engineers, doing their own testing and evaluation, not tied by corporate dictate, have been using Pis for all kinds of industrial applications.

All of which has nothing to do with building Pi clusters. Unless the cluster is for use industrially.

Me, I experimented with my CockroachDB cluster on a bunch of Pi 3. So cheap, so convenient, so easy. Now we use CockroachDB deployed in 3 different data centers spread around the world.

It's all good.

The Traveler
Posts: 135
Joined: Sat Oct 21, 2017 3:48 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 10:28 pm

Almost all the ones doing it are just doing it as a fun/learning project
This. :)

Cheers.
Retired IT professional, programmer and "beardie weirdie".
RPi interests: Artificial neural networks using clustered devices.
“We are stuck with technology when what we really want is just stuff that works.” ― Douglas Adams

Heater
Posts: 9221
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Thu Jun 14, 2018 10:41 pm

The Traveler,
This.
"This" what exactly?

droleary
Posts: 139
Joined: Fri Feb 09, 2018 3:45 am
Location: Minneapolis, MN USA

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 2:19 am

jamesh wrote:
Thu Jun 14, 2018 3:33 pm
No place for politics or this sort of opinion here, please refrain from posting this sort of thing.
Erm, which part is the problem? I don't see how there's anything inherently "political" about having an opinion on how a business behaves. What is wrong with being a values-based consumer? Please be more specific about the type of self-censorship you're requesting. Thanks.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 19546
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 6:37 am

droleary wrote:
Fri Jun 15, 2018 2:19 am
jamesh wrote:
Thu Jun 14, 2018 3:33 pm
No place for politics or this sort of opinion here, please refrain from posting this sort of thing.
Erm, which part is the problem? I don't see how there's anything inherently "political" about having an opinion on how a business behaves. What is wrong with being a values-based consumer? Please be more specific about the type of self-censorship you're requesting. Thanks.
You stated a possibly controversial opinion of Amazon as if factual. It may not be, and is certainly off topic for this thread.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 19546
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 6:39 am

Z80 Refugee wrote:
Thu Jun 14, 2018 9:20 pm
ejolson wrote:
Thu Jun 14, 2018 8:54 pm
When an engineer posts a question related to a commercial product, the answer is usually that the Pi was designed as a real computer that children can own and not as a mission critical device for industrial control.
Except, as an engineer, I was criticised for saying that and told that the Compute Module version of the RPi is fully spec'ed for industrial applications. However, for the original RPi and immediate descendants, I maintain that remains the case.
We've sold millions of standard Pis into industrial or commercial applications, but as I have said before, I would not recommend them for critical applications without very thorough assessment.

ejolson
Posts: 1545
Joined: Tue Mar 18, 2014 11:47 am

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 7:19 am

droleary wrote:
Fri Jun 15, 2018 2:19 am
I don't see how there's anything inherently "political" about having an opinion on how a business behaves.
Maybe the Raspberry Pi is being considered for a new type of cloud instance that doesn't melt down.

In my world politics relates to the government or the public affairs of a country. On the other hand Amazon is either a really great river, an online retailer or a timesharing service.

While renting someone else's computer may seem like a good idea, there are often hidden costs that make it operationally more expensive than maintaining your own hardware. In the case of Raspberry Pi, you can set up a super-cheap five-node cluster for around 100 US dollars. The same amount of money will not go far when launching even the most minimally provisioned five-node cluster on Amazon.

Amazon's multiple-nines guaranteed uptime is great for outward facing web services. For inward facing, the weakest link is usually the company's or individual's own upstream connection. Mission critical onsite production tasks are probably best relegated to in-house servers and clusters with proper ECC memory and backed-up redundant storage. For learning and development, a cluster of Pi computers is so attractive that Los Alamos National Laboratory has purchased thousands of them to use as a testbed for developing the methods necessary to achieve exascale supercomputing.

Heater
Posts: 9221
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 8:26 am

ejolson,
In my world politics relates to the government or the public affairs of a country. On the other hand Amazon is either a really great river, an online retailer or a timesharing service.
Don't forget the Amazon women. They were brutal and aggressive, and their main concern in life was war. https://en.wikipedia.org/wiki/Amazons. Make what you will of the connection between the name of that tribe of Greek mythology and the online merchant of today.
Mission critical onsite production tasks are probably best relegated to in-house servers and clusters with proper ECC memory and backed-up redundant storage.
Depends exactly what one means by "mission critical" and what the tasks are. Keeping things in house is fine until the house burns down!

Perhaps what we need is to distribute our system over multiple redundant servers. In different buildings, different cities, perhaps even different countries, then the probability of them all getting taken out or getting disconnected is made much lower.
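The effect of spreading nodes across sites can be sketched with a small calculation. It assumes (an idealization) that sites fail independently and that each is unavailable with the same probability; real outages are often correlated, so treat the result as an optimistic estimate. The function name and the 1% figure are illustrative only.

```python
def all_sites_down_probability(p_site_down: float, sites: int) -> float:
    """Probability that every site is down at once, assuming independence."""
    if not 0.0 <= p_site_down <= 1.0:
        raise ValueError("p_site_down must be between 0 and 1")
    if sites < 1:
        raise ValueError("need at least one site")
    # Independent events: multiply the per-site probabilities together.
    return p_site_down ** sites


# Assumed figure: each site is unreachable 1% of the time.
for n in (1, 2, 3):
    print(n, all_sites_down_probability(0.01, n))
```

Under these assumptions, going from one site to three takes the chance of a total outage from one in a hundred to roughly one in a million, which is the intuition behind distributing across buildings, cities, and providers.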

Hmm... that is a bit tricky to do in house unless you happen to be part of a global, multinational corporation. How about we rent the machines we need from somebody who has such a distributed system spread around the world? Better still, use different providers as well.

Now, the details of ECC memory and such are not so important to us.

To this end I have been experimenting with such things as the NATS distributed messaging system, https://nats.io/, and the Cockroach distributed SQL database https://www.cockroachlabs.com and others.

Enter the Pi... With a bunch of Pi hooked up one can start learning about and experimenting with such things very easily and cheaply. All from the comfort of home with hands on access to all the nodes if need be.

bensimmo
Posts: 2709
Joined: Sun Dec 28, 2014 3:02 pm
Location: East Yorkshire

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 11:19 am

How does MythicBeasts run its Pi's?
Clustered or just individual access?
They were used for the PiWheels iirc and as an Apt Pi mirror.

Either way you can rent a Pi in the Cloud.

bensimmo
Posts: 2709
Joined: Sun Dec 28, 2014 3:02 pm
Location: East Yorkshire

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 11:24 am

A blog from a bit ago on the RPi front page.
https://www.raspberrypi.org/blog/raspbe ... me-of-age/

and read the comments too.

manili
Posts: 6
Joined: Thu Jun 14, 2018 9:59 am

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 1:21 pm

Thank you all for participating in this topic. Your posts taught me a lot of new things through reading about cool projects. When I started the topic I did not expect to learn this much.
However, unfortunately I have not found my answer yet. Maybe it's better to ask my question more clearly:
Are there any benefits to design an IaaS cloud based upon RPis?

Almost all of the examples you showed me were about:
1. Ansible as a management tool
2. A cluster of thousands of Pis for parallelism, but without any kind of virtualization
3. MythicBeasts, which can provide you with a dedicated RPi with a suitable amount of memory, but not any kind of IaaS based upon RPis
4. NATS, which is a wonderful idea that can be used in cloud centers as a tool, but as far as I understand it's not a cloud by itself (imagine one of the nodes shutting down due to some problem).

Please correct me if I did not understand something correctly ...

P.S. @bensimmo thanks a lot for the link about the LANL cluster, very inspiring blog post.

droleary
Posts: 139
Joined: Fri Feb 09, 2018 3:45 am
Location: Minneapolis, MN USA

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 1:29 pm

jamesh wrote:
Fri Jun 15, 2018 6:37 am
You stated a possibly controversial opinion of Amazon as if factual. It may not be, and is certainly off topic for this thread.
I'm hard-pressed to see how it can be off topic when the OP specifically mentions AWS. And I'm not sure how anyone could mistake a values-based statement as factual. My negative opinion is indeed backed by evidence, but it seems genuinely off topic to post a list of abuse incidents that have come (and continue to come) from Amazon servers. They are welcome to come after me legally for libel if they dispute my evidence and if they do not value freedom of speech.

And that brings us back to me trying to determine what exactly I'm being asked to self-censor in these forums, because it's still not clear what the actual problem is. Am I not allowed to say anything remotely disagreeable about anyone or anything? Is it just Amazon that gets this special protection? What constitutes "factual" to a degree sufficient to allow an exception?

droleary
Posts: 139
Joined: Fri Feb 09, 2018 3:45 am
Location: Minneapolis, MN USA

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 1:58 pm

Heater wrote:
Fri Jun 15, 2018 8:26 am
Perhaps what we need is to distribute our system over multiple redundant servers. In different buildings, different cities, perhaps even different countries, then the probability of them all getting taken out or getting disconnected is made much lower.

Hmm... that is a bit tricky to do in house unless you happen to be part of a global, multinational corporation. How about we rent the machines we need from somebody who has such a distributed system spread around the world? Better still, use different providers as well.
Why not go the extra measure: crowd source all the existing connected RPi computers into an on-demand cluster service. I certainly have a few spare ones lying around that are mostly idle. I'd be happy to kick them into service every once in a while to run some sort of small-ish tasks that can be highly distributed. Or virtually "trade" a 0W with someone on the other side of the world for some redundancy.

I've said many times that the RPF would benefit from highlighting more projects that use multiple RPi to get a job done. It'd be nice if they standardized on some particular methods to start bringing the community together to share the computing power that's out there. It may not be practical from a utility standpoint, but it'd be interesting to see exactly what the size of such a beast would be.
manili wrote: Are there any benefits to design an IaaS cloud based upon RPis?
Not really. Anything abstracted to "as a service" means evaluating the components from a commodity perspective. The RPi has a lot of strengths, but not as a generic cloud-based compute cluster. They'll work well for prototyping concepts, but there are other platforms that will scale up much higher when you need to move something into production.

manili
Posts: 6
Joined: Thu Jun 14, 2018 9:59 am

Re: Why do some people prefer a cluster of RPis over AWS?

Fri Jun 15, 2018 2:15 pm

droleary wrote:
Fri Jun 15, 2018 1:58 pm
They'll work well for prototyping concepts, but there are other platforms that will scale up much higher when you need to move something into production.
Thank you very much for the reply.
Would you mind giving me some examples of these platforms? I'm especially concerned about the cost.
droleary wrote:
Fri Jun 15, 2018 1:58 pm
Why not go the extra measure: crowd source all the existing connected RPi computers into an on-demand cluster service. I certainly have a few spare ones lying around that are mostly idle. I'd be happy to kick them into service every once in a while to run some sort of small-ish tasks that can be highly distributed. Or virtually "trade" a 0W with someone on the other side of the world for some redundancy.

I've said many times that the RPF would benefit from highlighting more projects that use multiple RPi to get a job done. It'd be nice if they standardized on some particular methods to start bringing the community together to share the computing power that's out there. It may not be practical from a utility standpoint, but it'd be interesting to see exactly what the size of such a beast would be.
Such a great idea. Are there any OSes out there to manage such highly distributed infrastructure?
Last edited by manili on Fri Jun 15, 2018 2:26 pm, edited 1 time in total.
