Linux Cluster vs. Keepalived in 2015

Some simple factors that may better inform your decision regarding a high-availability service architecture.

Why choose Keepalived?

Keepalived is a VRRP application: essentially a daemon that detects when a node fails, based on tests you write, then moves a virtual IP address that is “in front” of the node’s service to another node using scripts you write.  It can be a very loosely coupled system or a fairly tightly coupled one, depending on how you write the tests.

Keepalived also has a load balancer built into the daemon.

  • The only legitimate solution for a two-node architecture.  You could run more nodes, but managing failover could get complicated.
  • Some downtime needs to be tolerated. 10 seconds should be enough time to fail anything over.
  • Easy to run on truly commodity hardware
  • Two-node services can easily be configured active/passive
  • Most services can easily run some kind of active/passive setup.
  • Built-in load balancer
  • No need for shared storage.
  • Failover scripting is not difficult, but the flip side is that you have to script for every possibility.
  • Failover scripting is as simple as you make it.  Failure is “exit 1”, success is “exit 0.”
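The “exit 0” / “exit 1” contract from the last bullet can be sketched as a small shell function.  This is only a sketch under assumptions: the name check_pidfile and the idea of judging health from a pid file are made up for illustration.  Keepalived itself only looks at the exit status of whatever script you point it at.

```shell
#!/bin/sh
# Hypothetical health check: "healthy" means the pid file names a live process.
# check_pidfile is an illustrative name, not part of Keepalived.
check_pidfile() {
    pidfile="$1"
    # No readable pid file: treat the service as failed.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Empty pid file: also a failure.
    [ -n "$pid" ] || return 1
    # kill -0 sends no signal; it only asks whether the process exists
    # and whether this user is allowed to signal it.
    kill -0 "$pid" 2>/dev/null || return 1
    return 0
}
```

A real check script would finish with something like “check_pidfile /var/run/myservice.pid; exit $?” (the path is a placeholder).  A tighter test, such as actually talking to the service, gives you the more tightly coupled setup described above.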

Why choose a Pacemaker cluster?

A Pacemaker cluster does much more to ensure that a service it manages runs somewhere, and scaling to many nodes (tens) is relatively painless.  In some sense it’s easier, especially if you are setting up frequently used clustered services.

A Pacemaker cluster is not a load balancer.  You can add a load balancer to a Pacemaker cluster, but that’s different from Keepalived’s VRRP.

  • You want the most robust solution.  10 nodes is no problem.  The practical limit is how many services the cluster manages, not so much the number of nodes.
  • You’ve got a supported method of STONITH.  It is possible to add your own, but you will need to learn how pacemaker works to be sure it is reliable.  If STONITH fails, your cluster isn’t much good.
  • You’ve got enterprise hardware sufficient for three or more nodes.
  • You’ve got “enterprise” shared storage. (iSCSI, FC, FCIP)
  • You want to deploy clustered file systems.
  • Plenty of others are running the same common clusters, for example a MySQL/MariaDB “cloud” service.

Each has its own strengths.  Good luck!

Debian Pacemaker Cluster with Stonith ILO

I had some issues setting up stonith on an HP DL380 cluster running Debian Squeeze.  (As of June 2013, that’s oldstable.)

For the record, the incantation that worked for me and my older DL380 cluster:

primitive shoot-Node1 stonith:external/riloe \
        params hostlist="node1" ilo_user="SomeUser" ilo_password="SomeSecret" \
        ilo_powerdown_method="button" ilo_can_reset="1" stonith-timeout="20s" \
        ilo_hostname="Node1ILO" ilo_protocol="1.9"

1-That incantation requires an entry in /etc/hosts so the machine can get to the Node1ILO interface.

2-It also requires iLO credentials to an account with sufficient privilege to reboot the machine.  I have both the ssh and web service enabled.

3-Be sure to set a location constraint for the resource so that a node cannot shoot itself, which would leave no nodes.
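For point 3, a location constraint along these lines keeps the fencing resource off the node it is meant to shoot.  Treat it as a sketch: it assumes the crm shell is installed, reuses the node name “node1” and the resource name “shoot-Node1” from the primitive above, and the constraint id “loc-shoot-Node1” is made up for the example.

```shell
# Never run the resource that fences node1 on node1 itself.
# "loc-shoot-Node1" is an arbitrary constraint id chosen for this example.
crm configure location loc-shoot-Node1 shoot-Node1 -inf: node1
```

The other node needs the mirror-image primitive and constraint.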

Finally, you may need to alter the ilo_protocol version to fit your machine.  If the service does not come online, rule out the other settings first and fiddle with this one last.  I used another fence agent, one I could not get working for fencing, to figure out the version:

fence_ilo_mp -a HOSTNAME -x -o status -l SomeUSER -p SomePassword -v

The output should have the version running on the second line from the top.  For example, “iLO  1.94 at 12:11:27 Mar 19 2013”.  That’s how I got 1.9.  See how that works?
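If you would rather not eyeball the output, the same “1.94 becomes 1.9” step can be sketched in shell.  The helper name ilo_protocol_from_banner is made up, and the banner format is assumed to match the example line above; other firmware may print something different.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin, find "iLO <major>.<minor>",
# and print <major> plus the first minor digit only.
ilo_protocol_from_banner() {
    sed -n 's/.*iLO[[:space:]]*\([0-9][0-9]*\.[0-9]\).*/\1/p' | head -n 1
}

# Example using the banner line quoted above.
echo "iLO  1.94 at 12:11:27 Mar 19 2013" | ilo_protocol_from_banner   # prints 1.9
```

In practice you would pipe the verbose fence_ilo_mp output into the function instead of the echo.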

Pacemaker Cluster Node Recovery

I had a nasty stonith problem with a 2-node Pacemaker cluster I’m running and got into a situation where the second node would not rejoin the cluster.

It’s not enough sometimes to follow the instructions posted here.

If things are really messed up like mine were, then get rid of all the extra messages and cluster tasks that prevented the problem node from rejoining.

Do the following on the problem node:

1. service pacemaker stop, service corosync stop
2. rm /var/lib/heartbeat/crm/cib-*
3. rm /var/lib/pengine/*
4. service corosync start, service pacemaker start
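The four steps can be wrapped in a small script so you can preview them before wiping anything.  This is a sketch, not a supported tool: the DRY_RUN variable and the function names are made up, and it assumes the CIB files live under /var/lib/heartbeat/crm, as they did with the Pacemaker packages of that era.

```shell
#!/bin/sh
# Run as root on the problem node.
# With DRY_RUN set to anything non-empty, commands are printed, not executed.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stops at the first failing step so state is never deleted
# while the daemons may still be running.
reset_cluster_state() {
    run service pacemaker stop &&
    run service corosync stop &&
    run rm -f /var/lib/heartbeat/crm/cib-* &&
    run rm -f /var/lib/pengine/* &&
    run service corosync start &&
    run service pacemaker start
}
```

Set DRY_RUN=1 and call reset_cluster_state to see the commands, then unset it and call the function again to run them for real.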

If it still doesn’t work, then you may need to do the same on the other node.

If things are really bad, run the programs in the foreground in a couple of different terminals. Both corosync and pacemaker have verbose modes that don’t switch to daemon.

WordPress and Scheduled Posts Not Working

The server hosting this blog has a hostname lookup problem that I haven’t bothered to fix, and the low-end firewall I use doesn’t support DNS loopback.

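Unfortunately, this combination of issues causes follow-on problems for WordPress’ scheduling feature.  A simple solution is to set a cron job that runs wp-cron.php directly.  The entry below fires every minute during the 23:00 hour; put 0 in the first field instead of * if once a day is enough.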

* 23 * * * /usr/bin/php /path/to/wp-cron.php

Works for me!

International Olympic Committee Supports Doping

According to Senate testimony from a former French Sports Minister, when the city of Paris was bidding to host the Summer Games, Hein Verbruggen, an IOC site selection committee member AND the guy who runs the International Cycling Federation, demanded that France suspend its stricter doping laws for the duration of the potential Summer Games.

So, the basis of the drama of the Summer Olympic Games is very likely sports fraud.  The IOC is enabling doping.  If I’m aware of it, then elite IOC-sanctioned athletes must be aware of it too.

A number of IOC-sanctioned athletes have admitted to doping for their entire careers, after years of denials and hiding behind “never tested positive.”  Now we know the most important part of the anti-doping system, the IOC, plays a vital role in athletes doping.

Specific to elite cycling, the guy running the UCI is promoting doping!  It’s now clear that Armstrong and other cyclists caught doping are only a small part of the doping problem.  The federation’s leaders are actively engaged in enabling doping.

Mysteriously, sportswriters who cover the Olympic games (Bob Fausulo, to name two) are not covering the story.


Chris Carmichael, Liar and Teen Doper

If there was ever a guy that deserves a WADA ban, it’s got to be Chris Carmichael.

That guy has poisoned cycling for decades and has reappeared at another cycling publication, Road Bike Action, with this interview.

The most obvious lie is this one: “RBA: Speaking of Lance, what do you say about the accusations that are there about your own history with doping?
While I know that it did occur, I never saw it occurring. Also, I never doped any of the athletes I worked with.”

Hmm, well, it seems Greg Strock and others feel differently, as you bought your way out of their lawsuit against USA Cycling, where you were head of the Junior development program at the time. 60 Minutes did a great story on the whole sordid affair.

It’s a shame because it’s the coordinated efforts of people like Tim Maloney, presumably Zap, and Carmichael that discredit cycling as a legitimate sport in the U.S.  Not to mention what it does to what little credibility Road Bike Action ever had.

Solvang Century 2013

A big, belated thank-you to everyone who worked the 2013 Solvang Century.  Special thanks go out to the law enforcement and military personnel working the event.

I heard some grumbling about waiting to enter and leave the Air Force base.  I waited along with everyone else and had no problem.  The service people handled everyone in an efficient manner.

It was a tough day, with a headwind right from the start and then for most of the ride.  By the time all the climbing on Foxen Canyon Road started, my legs were toast.

I find it interesting that only a limited number of GPS results are posted on either Strava or RidewithGPS.  My guess is that it’s because times were slow for everyone.  I estimate the wind added 1.5 hours or more to everyone’s ride.

My own time was ridiculously long, at about 8 hours, which is okay given the four or so hours I work out every week.

It seemed like attendance was down a bit.  Maybe the weather scared people away?

Spain has a National Doping Program?

As mentioned in this article,

Eufemiano Fuentes got approval to distribute numerous doping products in Spain from the Spanish Government.

Given the way Fuentes has behaved at the trial and outside the trial, he appears ready to do a Floyd Landis and get specific with doping accusations should the trial end badly for him.  Unlike Landis, he talks like he has been central to quite a bit of doping in Spain.

For those not following along, this means bigger sports like football (soccer in the U.S.) and tennis could finally have their own doping programs uncovered, along with the inevitable cooperation of the sports federations.

USA Cycling Ignores the USADA?

It probably helps having your business partner from Tailwind running the UCI’s American cycling federation.  Did USA Cycling forget to vacate Lance Armstrong’s results?

Will the USOC or even the IOC have a problem with that?

It may be a case of some editing never getting done, because the results from the 1993 World Championships have no mention of Armstrong:

1993: Oslo, Norway
1. Jan Ullrich (GER)
2. Kaspars Ozers (LAT)
3. Lubor Tesar (CZE)