The first idea I got when I saw this project was an HA web service cluster. Don't ask me why.
You have 2 devices that distribute the requests (round robin) and a number of mini servers that process the web pages. When a device fails, you just replace it with a new one. That way you can build an HA cluster for almost nothing, and one that doesn't consume much energy. Double profit.
This tutorial shows how it can be done:
http://www.howtoforge.com/high.....he_cluster
I have read the Clustering topic, but I didn't want to post there, because that thread is more about distributed computing systems, which is not what I am after.
I don't have much experience with real clustering, so my questions are: Is this a good idea? Can it be done? Would it be fast? What do you guys think?
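A minimal sketch of the round-robin idea described above, assuming the front devices keep a list of backend names and a set of backends currently known to be down. The class name, method names, and the "pi-a" style hostnames are hypothetical and only there for illustration; real setups would normally use an existing load balancer rather than hand-written code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in turn, skipping any that are marked as failed."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.failed = set()
        self._ring = cycle(self.backends)

    def mark_failed(self, backend):
        # Called when a health check or a request to this backend fails.
        self.failed.add(backend)

    def mark_healthy(self, backend):
        # Called when a replaced or recovered backend answers again.
        self.failed.discard(backend)

    def next_backend(self):
        # Look at each backend at most once per call, so a fully dead
        # cluster raises instead of looping forever.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate not in self.failed:
                return candidate
        raise RuntimeError("no healthy backends available")
```

With three backends the balancer simply walks the list in order; once one is marked as failed, requests keep flowing to the remaining two until the device is replaced and marked healthy again.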
Re: High availability webservice cluster
I spoke to some colleagues about my idea and they all liked it. Unfortunately they found 2 likely issues:
1) Static pages: no problem, but what about a heavy PHP website that needs a lot of CPU? Will there be enough power?
(I will try to test this on a single machine)
2) What about your database?
Almost every website needs a database server. I don't think MySQL will run on these devices. Maybe use SQLite?
So much stuff to think about while I wait for the first Pis to arrive.
As always, share your opinion.
Re: High availability webservice cluster
Yes, MySQL will run, but you would have to compile it from source as there are no binaries.
You wouldn't have to do this on the RPi itself; just cross-compile on a higher-end CPU.
Re: High availability webservice cluster
Nice. I wanted to add another aspect and turn the RPi into an SSL accelerator (SSL reverse proxy) using Nginx.
SQLite WILL run on the RPi without any problems. MySQL, I am sure, will work with some massaging.
RaspberryPi's galore
Solid run CuBox
ODroid U2
Re: High availability webservice cluster
In principle, what you have described can absolutely be done. However, as you have already worked out in your other posts in this thread, the question is how much performance each node can provide, and whether that will be sufficient if your web service relies on several other 'performance-sapping' services.
There comes a point with clusters where reducing the 'power of the engine' and expanding the 'number of engines' stops saving you power and money, and starts costing you both while sacrificing performance.
With such a new and untried device it is very difficult to give you a firm answer. I would say that if you have the money and time, please do give it a go. Post here, and maybe set up a blog somewhere with some pictures, code extracts and the like (again, if you have time), as it would be invaluable to many folks to see what the performance characteristics of the RPi are.
We still don't have a clear picture of how USB reads and writes will affect network performance and vice versa, since the two are on a shared bus. Some hard numbers and results from the projects being discussed will enable people to make much better decisions and, at the same time, help guide the Foundation on which aspects to improve, money allowing, if they do decide on a Type C/D in 1 or 2 years.
Re: High availability webservice cluster
If I have some spare time, I will definitely try to build this project. First with a standalone web server, and benchmark it. At that point we will have a better view of whether the project will perform well or not.
I know there is a point where the benefit of this system collapses, but I hope that point is very high in the sky.
Re: High availability webservice cluster
For the people who are interested in this subject: HP is busy developing a very similar system.
HP is now testing a new server line in which a 4U server chassis holds 288 mini servers. Each device has a CPU (also ARM based) and memory.
http://www.tomshardware.com/ne.....13884.html
So for people who are interested in the commercial use of this project: contact HP.
Re: High availability webservice cluster
From what I have read, the HP systems are basically a group of ARM chips on what appears to be a pseudo-daughterboard (expansion card) which plugs into a motherboard. The motherboard is very bare bones and offloads all its workload onto those expansion cards, which share the work between them (individually they do not have much CPU power). On top of that, this method allows for lower power consumption, because a chip that is not in use sits in LP (Low Power) mode.
So essentially the power saving comes from the fact that the work is spread across many smaller chips rather than a few larger ones. I would imagine (I have not done the math) that if ALL of the chips were in HP (High Power) mode, running calculations, the total energy use would be close to (or higher than) that of a regular server. The saving comes when the server is not using every processor. With this type of design, and depending on how the expansion system works, you could effectively keep adding processors for minimal cost, and hot-swap faulty ones for new ones with zero downtime.
I could be completely wrong, but that's how it looks from what I've read and seen.
Re: High availability webservice cluster
HA web and mail clustering is what interested me as well, mainly the performance value vs. an Atom-based cluster. I'm thinking network boot (if possible), iSCSI, and a separate Cassandra cluster (on enterprise hardware) for the DB.
Re: High availability webservice cluster
@justin : You're probably right, it is just 1 motherboard with a couple of ARM CPUs on it.
@chris813 : If I have time I will benchmark one device with Apache/lighttpd on it, so we will have an idea of whether this device is capable of clustering.
If web clustering is possible, I suppose mail clustering will be too. Certainly something to think about.
I hope I will have time next weekend to look into how to handle the database servers. MySQL Cluster looks very nice, but the system requirements are just insane.
Re: High availability webservice cluster
Update: unfortunately I wasn't able to buy a Pi, so the benchmark results will take a little longer.
Re: High availability webservice cluster
I'm sure someone can do some benchmark tests when they get boards.
I don't think I'm in the first batch lottery but if by some fluke I am I will be doing lots of tests on the machine when I get one anyway
Re: High availability webservice cluster
I think the process of putting this together would be fun, and you'd learn a lot of what you'd need to know to set up a "real" HA scalable service.
I'm guessing if you're trying to run a web server on a Pi you're not likely to be putting it in a data centre, probably just running it off your home broadband connection. So one thing to think about - unless the RPi is for some reason really unreliable (and I certainly hope it's not!), the dominant availability issue is likely to be your network connection.
However, I think you could set up a pretty high availability system if you have a set of Pis and a friend you can convince to run half of them (preferably on a different internet provider).
Unfortunately that means the load balancing mechanism in the post you linked to isn't going to work - you can't share an IP between your two broadband connections. You also may have the problem of your IP changing if you don't have a static IP. I don't think there's a brilliant solution to this, but you could look at Amazon's Route53 DNS service - it's pretty cheap and you can have low time to live on your DNS records which means you can get away with using DNS for failover. (By pretty cheap I mean $0.50 for a domain/month, plus a few cents for actual requests, assuming your traffic is pretty low - note that doesn't include the registration for the domain).
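To make the DNS failover idea above a bit more concrete, here is a small sketch under stated assumptions. It does not call any real DNS provider API; it only shows the decision of which addresses to publish and a rough upper bound on how long clients may keep hitting a dead site. The site names, the documentation-range IP addresses, and both function names are hypothetical.

```python
def healthy_addresses(sites, is_up):
    """Return the addresses that should currently be published in DNS.

    sites: dict mapping a site name to its public IP address.
    is_up: callable taking a site name and returning True if it answers.
    """
    live = [addr for name, addr in sites.items() if is_up(name)]
    # If every check fails, the checker itself may be the problem, so keep
    # publishing all addresses rather than an empty record set.
    return live or list(sites.values())


def worst_case_failover_seconds(check_interval, failures_before_removal, ttl):
    """Rough upper bound: time to notice the outage plus time for cached
    DNS answers to expire. Resolvers that ignore the TTL can take longer."""
    return check_interval * failures_before_removal + ttl
```

For example, checking every 30 seconds, removing a site after 3 failed checks, and using a 60 second TTL gives a bound of about 150 seconds, which is why a low TTL matters for this approach.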
Re: High availability webservice cluster
hvc123 said:
yer mysql will run but you would have to compile it from source as there is no binaries.
There are definitely MySQL armel binaries in Debian; I don't know about Fedora.
But more generally, the database is the main issue in making a website highly available. If you have a load of web servers fronting a single DB, you haven't really improved availability over a single combined DB/web server, and trying to cluster databases opens a MASSIVE can of worms. AFAICT the usual approach, at least with MySQL, is a master/slave system with admins on hand to quickly promote the slave should the master die. You probably DON'T want to do this automatically, because if both DBs somehow end up as masters you have a potential nightmare merging the two back together.
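The caution about automatic promotion can be sketched as a simple guard. This is an illustration only, not MySQL tooling: the `DbNode` type, the `promotion_decision` function, and the notion of an "operator confirmed" and "fenced" flag are assumptions made for the example. The point it shows is that "the master looks unreachable" is not enough on its own, since a network split can make a live master look dead.

```python
from dataclasses import dataclass


@dataclass
class DbNode:
    name: str
    role: str        # "master" or "slave"
    reachable: bool  # as seen from the monitoring host


def promotion_decision(slave, old_master, operator_confirmed, old_master_fenced):
    """Return (allowed, reason) for promoting `slave` to master."""
    if slave.role != "slave" or not slave.reachable:
        return False, "candidate is not a reachable slave"
    if old_master.reachable:
        return False, "old master still answers; nothing to fail over from"
    if not old_master_fenced:
        # Unreachable is not the same as dead: without fencing (power off,
        # unplugged, firewalled) both nodes could end up taking writes.
        return False, "old master is unreachable but not fenced"
    if not operator_confirmed:
        return False, "waiting for an admin to confirm the promotion"
    return True, "safe to promote"
```

Keeping a human confirmation in the loop trades a slower failover for a much lower chance of ending up with two masters.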
- Jim Manley
Re: High availability webservice cluster
Since the R-Pi is for education first and foremost, this is a good idea for learning the concepts of a multi-processor architecture, but the R-Pi's hardware configuration is overkill on a per-processor level in the real world, compared with the architecture of the HP designed-to-purpose product. Processors should communicate over a high-speed interprocessor bus (as in the HP product), not 100 Mbps Ethernet, which will be a severe internal bottleneck. As has already been pointed out, the 100 Mbps Ethernet on a "master" node will also be a severe external bottleneck, as would a single-instance database running on one node. A truly distributed database running in a failover mode across multiple processors gets very complicated very quickly, especially if users are allowed to create database records, as locking and unlocking database elements at various levels of granularity within the database comes into play. The Linuxen that will run on the R-Pi (initially) are oriented toward single-processor operation. To do this right would really require something like a Unix Mach implementation, such as that used by Apple's OS X, some versions of BSD, and the server versions of Linuxen (e.g., Fedora Server). It might be possible to use one of the Beowulf cluster packages, but that's really designed more for massively parallel mathematical computation, and each R-Pi's GPU is already a screaming 24 GFLOPS floating-point powerhouse internally (it could beat the pants off single-processor supercomputers of 15 ~ 20 years ago).
However, not to rain too hard on your parade, you would learn a great deal by implementing this on a handful of R-Pi's. Until the one-board-per-person rule is lifted, get some friends also lucky enough to receive their boards interested to work on it together and test what happens as their R-Pi's are inserted and removed from the network, literally as they take their boards home with them. You'll learn about failover, dynamic allocation of resources, resource balancing issues among multiple processors, the difficulties of detecting infinite circular logic loops, deadly embrace when more than one processor wants to access the same resource and the random back-off schemes for resolving this, shifting of master/slave tasks when various nodes fail or become disconnected from the network, etc.
Most of all, have fun and let us know what you're able to cobble together, if any of us ever receives our boards
The best things in life aren't things ... but, a Pi comes pretty darned close! 
"Education is not the filling of a pail, but the lighting of a fire." -- W.B. Yeats
In theory, theory & practice are the same - in practice, they aren't!!!
Re: High availability webservice cluster
A small update:
On the day of the release I wasn't able to order one. After much trouble I ordered 1 Pi on the Farnell website. It has been over 2 months since then, and still no sign of my device.
I presume some other people with a similar idea have already started to build an HA cluster. If so, please feel free to share your impressions and thoughts.
I already found 1 article about this subject:
http://www.clusterdb.com/mysql-cluster/ ... pberry-pi/
Re: High availability webservice cluster
I've been working on a few custom clustering/HA options with the Pi (as are at least one or two other people) and am making some progress.
My first step has been booting Pis over a network via iSCSI (I've put a beginner-friendly guide up on elinux giving start-to-finish info on running an iSCSI root on Debian). I have expanded on that with some additional initramfs scripting which can tell Pis apart and boots from a different volume depending on which Pi is in use, so a single SD card image can be used to boot multiple Pis to different iSCSI roots. iSCSI booting a cluster in this way provides performance benefits, allows for redundant backing data, and reduces costs, since a whole cluster of Pis can run from one SD card image. It obviously does require that you have an iSCSI target of some form available.
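The post does not say how the initramfs script tells Pis apart, so the following is only a guess at one possible approach: look up the board's serial number from the "Serial" line in /proc/cpuinfo and map it to an iSCSI target name. The serial numbers, the IQN strings, and the function names are all hypothetical, and a real initramfs hook would more likely be a shell script; Python is used here purely to show the lookup logic.

```python
# Hypothetical mapping from board serial number to iSCSI target name.
ROOT_VOLUMES = {
    "00000000aaaa0001": "iqn.2012-06.lan.example:pi-web1",
    "00000000aaaa0002": "iqn.2012-06.lan.example:pi-web2",
}


def read_serial(cpuinfo_text):
    """Extract the value of the 'Serial' line from /proc/cpuinfo contents."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Serial":
            return value.strip()
    return None


def pick_root_volume(cpuinfo_text, volumes=ROOT_VOLUMES, default=None):
    """Return the iSCSI target for this board, or `default` if unknown."""
    return volumes.get(read_serial(cpuinfo_text), default)
```

On a board, the text would come from reading /proc/cpuinfo; passing it in as a string keeps the lookup easy to test on any machine.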
From there, clustering just becomes a question of what exactly you want to do. A MySQL cluster is perfectly achievable. MySQL is rather heavyweight for the Pi, and while some proper tuning helps, you still won't want to run workloads that are too demanding. An ideal solution would likely be along the lines of 2 Pis running database backends as MySQL master/slave, with 3 web servers accessing them (5 Pis in total). This is something I'll be exploring in detail, but I need 3 more Pis first. A more lightweight approach might be to use SQLite as the database backend, with unison or similar to replicate the database out. This would again be a master/slave approach; real-time active/active distribution of a database across multiple Pis is always going to be difficult.
That said, more out of curiosity than anything else, I have been able to successfully set up OpenStack Nova + Swift across two Pis on the Wheezy alpha build. Nova is obviously totally impractical for actual use on a Pi (with Nova running, my Pi takes the best part of 6 minutes just to boot), but Swift could potentially be used quite easily to distribute storage across multiple Pis.
Re: High availability webservice cluster
Hi All,
Just thought I'd share my experiences with clustering and high-availability.
MySQL is the worst for HA and clustering, as things like distributed data and replication were afterthoughts to the initial database design. It may be worth looking at Postgres, which has developed with distributed data from very early on. Better yet, some of the new "Big Data" solutions like Redis and MongoDB might also be worth looking into, although their base requirements may be a little too high for our little RPis.
Re: storage, using an iSCSI target with DRBD is a good way to get redundancy in shared storage. The other option is to build a storage cluster across many nodes using GlusterFS (just don't use it for mail, as Gluster is awful for atomic, high-concurrency operations). A benefit of GlusterFS is that you can do things like RAID5 across nodes, which is pretty cool and means you don't have to put up with a 1:2 storage ratio.
For automated failover of clustered resources, look into Pacemaker and Corosync, both excellent glue and CRM implementations. They are not necessary for things like web servers; they are more for automating things that can only be active on one node at a time.
For open source load balancing, check out a combination of Pound (fast layer 7 load balancing) and HAProxy (TCP/layer 4 and HTTP/layer 7 load balancing). You can combine these with VRRP or LVS for redundancy at the load-balancer level.
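As a rough illustration of how VRRP gives redundancy at the load-balancer level: among the routers that are still alive, the one with the highest priority owns the shared virtual IP, and a tie is broken in favour of the higher primary IP address. The sketch below only models that election rule; the function name, the tuple layout, and the addresses are assumptions for the example, and it leaves out timers, advertisements, and preemption settings.

```python
import ipaddress


def elect_master(routers):
    """Pick the VRRP master from (ip, priority, alive) tuples.

    Returns the IP of the live router with the highest priority, using the
    numerically higher IP address as the tie-breaker, or None if all are down.
    """
    live = [
        (priority, ipaddress.ip_address(ip))
        for ip, priority, alive in routers
        if alive
    ]
    if not live:
        return None
    _, winner = max(live)
    return str(winner)
```

With two load balancers at priorities 150 and 100, the first one holds the virtual IP; if it stops responding, the second takes over, and clients keep using the same address.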
I can't wait till the one-per-person limit is lifted so I can get working on this. It should all be possible, but boils down to choosing the right software for the job.
Cheers,
Dave
Re: High availability webservice cluster
First RasPi cluster with step-by-step guide: http://www.southampton.ac.uk/~sjc/raspberrypi/
Re: High availability webservice cluster
Has anyone been able to get LVS (Linux Virtual Server) working on a Raspberry Pi yet? Do I have to recompile the kernel?
Re: High availability webservice cluster
Not a programmer (yet), but I'm interested in the idea and in wherever the other roads lead... whatever!!
joomla with NginX http://youtu.be/u2MFQCoexD0
openstack.prov12n.com/openstack-on-raspberry-pi-part-2-getting-started
all the best ,
too late !
Re: High availability webservice cluster
Hi ,
Here is a guide for Nginx with deb packages (latest versions): https://raspberry-hosting.com/en/faq/wh ... install-it
Here is a guide for HAProxy + VRRP (keepalived) with deb packages (latest versions): https://raspberry-hosting.com/en/faq/wh ... stall-high
You can find all packages (Nginx, Keepalived, HAProxy) in their latest versions directly on https://packages.raspberry-hosting.com
Enjoy!