Do you remember the distributed denial of service (DDoS) attacks this website was undergoing a few months ago? They made the news (partly because it was just so bizarre to see someone attacking an educational computing charity) – if you want to refresh your memory, see this, this, or this.
Pete Stevens, who runs marathons and our hosting company, Mythic Beasts, thought you’d be interested in what he’s been doing to try to ensure this can’t happen again. (Famous last words, Pete.) Here’s what he did. Over to Pete!
In the past we’ve had occasional trouble with denial of service attacks against the Raspberry Pi website. In particular, simply overflowing us with traffic has proved not that difficult – the server only has a 1Gbps uplink. When your admin (me) cocks up, it turns out you can saturate a core calculating syncookies leaving the other cores idle because he should have configured IRQ balancing properly.
We briefly investigated cloud based DDoS protection which we still hold in reserve, but it has a habit of declaring that Liz can’t post things because apparently she’s a spambot. We also had to switch off IPv6 access to the website to use them, which, for an educational project, was unfortunate as there is going to eventually be a large network transition to IPv6 and allowing people to learn about it and use it is desirable.
So we’ve scaled the hosting infrastructure out to a distributed cluster of machines. We’ve installed four additional little dual core machines: two in our Telecity Sovereign House site and two in our Telecity Harbour Exchange site. Each of these runs a load balancer and forwards connections back to the main webserver. This means the inbound load is now shared over four separate 1Gbps links, and there’s rather more CPU available to calculate syncookies when required and rather more bandwidth to saturate.
We load balance over the load balancers using DNS round robin, as you can see from our public DNS:
$ dig www.raspberrypi.org AAAA +short
lb.raspberrypi.org.
2a00:1098:0:80:1000:13:0:5
2a00:1098:0:80:1000:13:0:6
2a00:1098:0:82:1000:13:0:5
2a00:1098:0:82:1000:13:0:6
$ dig www.raspberrypi.org A +short
lb.raspberrypi.org.
126.96.36.199
188.8.131.52
184.108.40.206
220.127.116.11
Now, everybody knows that this is a stupid way of load balancing, and you don’t get anything like even usage across your sites. This isn’t even slightly borne out by the bandwidth figures for the last few days:
18.104.22.168    347.04 GB
22.214.171.124   341.61 GB
126.96.36.199    349.58 GB
188.8.131.52     347.88 GB
That’s agreement to within 2%, which is a pretty even split. So much for commonly held wisdom; we prefer science.
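The 2% figure can be checked from the four totals above. Here is a minimal sketch in Python, reading "agreement" as each balancer's deviation from the mean; that reading is an assumption, as the post doesn't say how the spread was measured.

```python
# Bandwidth totals in GB for the four load balancers, copied from the figures above.
totals_gb = [347.04, 341.61, 349.58, 347.88]

mean_gb = sum(totals_gb) / len(totals_gb)

# Largest relative deviation of any single balancer from the mean.
# (Assumption: "agreement" is measured against the mean, not max versus min.)
max_deviation = max(abs(t - mean_gb) for t in totals_gb) / mean_gb

print(f"mean: {mean_gb:.2f} GB")
print(f"largest deviation from mean: {max_deviation:.2%}")
```

On these numbers, the busiest and quietest balancers both sit comfortably within 2% of the average.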
We’ve set the entire internal network up with IPv6. So when you connect to one of the front end machines, it’ll connect back to the main webserver over IPv6. One of the reasons for this is so we’re now running mission critical services over IPv6 – a move to IPv6 worldwide is happening at a glacial pace, and we want to make sure that our support works well. Having an angry Liz phone you up if it doesn’t is a very effective motivator.
You may have seen odd forum and comment issues while we were setting this up. One of the forum spam filters allows filtering people based on source IP address. The move to the new setup meant that clicking the filter by IP address feature resulted in dropping all new comments from that load balancer – a quarter of our traffic. *Oops*. We had to fix that to read the forwarded-by headers.
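The general idea behind that kind of fix is to trust a forwarded client address only when the connection really came from one of your own load balancers. A rough sketch in Python, not the actual forum code: the header name `X-Forwarded-For`, the function `client_ip`, and the balancer addresses (taken from documentation ranges) are all assumptions for illustration.

```python
import ipaddress

# Hypothetical load balancer addresses (documentation ranges, not the real ones).
TRUSTED_BALANCERS = {
    ipaddress.ip_address("2001:db8::5"),
    ipaddress.ip_address("192.0.2.5"),
}


def client_ip(peer_addr, headers):
    """Return the address a spam filter should act on.

    peer_addr is the address of the TCP peer; headers is a dict of
    request headers with lower-cased names.
    """
    peer = ipaddress.ip_address(peer_addr)
    if peer not in TRUSTED_BALANCERS:
        # Direct connection: anything in the header could be forged.
        return str(peer)

    forwarded = headers.get("x-forwarded-for", "")
    # Each proxy appends the address it saw, so the last entry is the
    # one written by the trusted balancer itself.
    candidates = [part.strip() for part in forwarded.split(",") if part.strip()]
    if not candidates:
        return str(peer)
    try:
        return str(ipaddress.ip_address(candidates[-1]))
    except ValueError:
        # Malformed header: fall back to the balancer address.
        return str(peer)


# A request relayed by a trusted balancer, then one arriving directly.
print(client_ip("192.0.2.5", {"x-forwarded-for": "203.0.113.9"}))
print(client_ip("198.51.100.7", {"x-forwarded-for": "203.0.113.9"}))
```

Taking the last entry rather than the first matters: earlier entries are supplied by the client and are trivially spoofed, which is exactly what a spam filter must not rely on.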
Now of course the real question is, why aren’t we fronting the site with a massive cluster of Pis? Testing with hping3 suggests that a Pi starts to struggle at around 2500 syns/sec. The front-ends we have are absolutely fine at 50,000 syns/sec (reading roughly 10% CPU), so with four of them we can probably handle around 1,000,000+ syns/sec. That’d require 400 Pis to keep up, so it’d be a very very large cluster of Pis, not to mention 5 switches in each site.
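The cluster sizing is simple arithmetic, sketched below in Python. The 48-port switch size is an assumption (the post doesn't say what switches were in mind), and the sketch ignores ports needed for uplinks.

```python
import math

PI_SYNS_PER_SEC = 2_500          # where a Pi starts to struggle (hping3 test)
TARGET_SYNS_PER_SEC = 1_000_000  # estimated capacity of the four front ends
SITES = 2                        # Sovereign House and Harbour Exchange
PORTS_PER_SWITCH = 48            # assumption: not stated in the post

pis_needed = math.ceil(TARGET_SYNS_PER_SEC / PI_SYNS_PER_SEC)
pis_per_site = math.ceil(pis_needed / SITES)
switches_per_site = math.ceil(pis_per_site / PORTS_PER_SWITCH)

print(f"Pis needed: {pis_needed}")
print(f"Pis per site: {pis_per_site}")
print(f"Switches per site: {switches_per_site}")
```

With those assumptions the numbers come out at 400 Pis, 200 per site, and 5 switches per site.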
Of course a very stern warning has been given out to people who have access to the front end machines – not only can they receive a million syns/sec, they can also send them, and that could seriously upset other internet users if it was directed at them.
Now, a side effect of this scale-out is we’re left with a bunch of machines that have a reasonable amount of excess CPU. Eben has *strong views* about wasting CPU cycles, it makes him very sad. So we’ve put them to use.
Rob Bishop and Gordon Hollingworth at Raspberry Pi spend quite a lot of time building software. Compiling it is time-consuming, and their laptops get hot and make fan noises. So we’ve installed a set of dual core VMs on the five core servers running under KVM. When everything is fully operational the software team can kick off a build from the master VM which will then use distcc to farm out the compile across all five machines. This means there’s effectively 10 cores available most of the time for building software. When the website gets busy, the lower priority VMs slow down and hand the cycles back to the load balancer/Apache/PHP/MySQL.
Now, the Raspberry Pi is an educational project. It’s not just about educating children: adults still need to learn things, and that includes me. We’ve run many dual stack IPv4/IPv6 machines before, but we thought we’d try IPv6 only machines and discover the difficulties in order to improve our support for IPv6. So the distcc VMs are IPv6 only – they can’t access anything on the internet that isn’t accessible over IPv6. In reality this means they can see Google, Facebook, lots of mirror servers and a small fraction of other sites.
In the process of setting this up I discovered that I was unable to get the Debian Squeeze network installer to install from an IPv6 only network, so I had to do the initial install to the VMs from a full install image rather than the cut down one. I then realised that Mythic Beasts doesn’t have an IPv6 aware resolver yet, which we need to sort out, so I had to use Google’s public resolver. This is still on my todo list along with full DNSSEC resolver support.
Happily Debian appears to work fine with IPv6 only. The mirrors are v6 enabled, so the VMs receive updates and can install packages fine, and so far it appears to be going well.
There are still some things to do and questions to answer: should we move Apache/PHP processing to the front end nodes? Will WordPress Supercache and the other plugins cope in a distributed environment? Will file uploads still work? Can we solve that with NFS? Does NFS even work over IPv6? Should we install a Varnish cache on the front end nodes and disable WordPress Supercache? Should we do both? Will it confuse people if we have two layers of caching that expire at different times? Is that better than what we have now? Instead of having TCP syncookies on the whole time, should we only enable them when under attack? Have we made a dreadful mistake with the build VMs, and is it all going to go offline when Rob tries to compile OpenOffice? Should we stop worrying about all of these questions and instead work out whose job it is to buy the first round at the Cambridge beer festival?
If this is the sort of thing you’d find interesting, and you would like to be paid to solve exactly these sorts of questions, Mythic Beasts is recruiting.
We’re looking for both junior and senior people. We very strongly like bright motivated people who get things done, and we’re not overly impressed by certifications. We’d really like a full time person or two but are not averse to taking on summer or gap year students, providing they’re smart and they get things done.