====== Geo-diverse resources with pacemaker ======

Condensed summary - if you need more detail, follow the links:

  * [[http://corosync.github.io/corosync/|Corosync]] is a cluster communication / co-ordination service
  * [[https://wiki.clusterlabs.org/wiki/Pacemaker|Pacemaker]] is a cluster resource management system
  * [[https://github.com/ClusterLabs/booth|Booth]] is a cluster ticket manager

Corosync joins several machines together in a cluster, without specifying what they're supposed to do.  Pacemaker specifies what they're supposed to do - what applications and services are supposed to be running where.  Booth is a layer on top which allows multiple clusters to manage resources between them.

===== Rule number one =====

**You must have an odd number of machines in a cluster.**

You cannot call one machine a cluster, therefore the minimum number of machines needed to create a cluster is three.

If you want to go and try building a two-machine cluster, by all means go ahead, but don't complain to me when the two machines lose sight of each other (but keep talking to everything else) and it all goes horribly wrong™.

===== Basic corosync =====

Here's a simple corosync configuration file (you often don't actually need much more than this):

<code>
totem {
        version: 2
        cluster_name: pleiades
        token: 3000
        token_retransmits_before_loss_const: 10
        clear_node_high_bit: yes
        crypto_cipher: aes256
        crypto_hash: sha1
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: no
        logfile: /var/log/corosync/corosync.log
        to_syslog: yes
        debug: off
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

quorum {
        provider: corosync_votequorum
}

nodelist {
        node {
                ring0_addr: 198.51.100.1
                nodeid: 1
                name: sterope
        }
        node {
                ring0_addr: 198.51.100.2
                nodeid: 2
                name: merope
        }
        node {
                ring0_addr: 198.51.100.3
                nodeid: 3
                name: electra
        }
}
</code>

This sets up a 3-node cluster in which at least two of the machines need to be in communication and running corosync for the cluster to be "active".  They don't yet __do__ anything useful; that comes next with pacemaker.

===== Basic pacemaker =====

Here's a simple pacemaker configuration file supporting a floating IP address which will be managed on one (and only one) of the three machines in the cluster.  If that machine dies, another one takes over the IP address, and you have "a high-availability IP address" (on which you could run some application such as Apache or Asterisk if you wanted to):

<code>
primitive IP-float4 IPaddr2 params ip=198.51.100.42 cidr_netmask=24 meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart

group floater IP-float4 resource-stickiness=100

property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop start-failure-is-fatal=false cluster-recheck-interval=60s
</code>

You use the above configuration file with the command:

  * crm configure load replace cluster.cib

Note that I did not really need to create the group definition "floater", since it only contains a single resource, but in practice I think it's unlikely that you want only one resource to be managed on its own, therefore I've included a group definition which can then be extended by listing the resources one after another (which, incidentally, also determines the order in which they get started and stopped - they're started left to right, and stopped right to left).
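Roughly, bringing this up and checking it looks something like the following - a sketch assuming a systemd-based distribution with crmsh installed, and that the pacemaker configuration above is saved as cluster.cib (use whatever filename you like):

<code>
# on one machine, generate the authentication key which the crypto_cipher / crypto_hash
# settings above require, then copy /etc/corosync/corosync.conf and /etc/corosync/authkey
# (identical copies) to all three machines
corosync-keygen

# on each machine
systemctl restart corosync pacemaker

# check that all three nodes are members and that the cluster is quorate
corosync-quorumtool -s

# load the pacemaker configuration (from any one node) and see where the floating IP ended up
crm configure load replace cluster.cib
crm_mon -1
</code>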
===== Taking things a step further =====

I had the following setup:

  * A three-node cluster in a data centre in Manchester, managing a floating IP address and several applications
  * Another three-node cluster in a data centre in York, managing the same set of resources, although based on a different floating IP address

I then had a requirement to run one more application __either__ in Manchester __or__ in York, but not both.

My system was already designed so that if the machines at Manchester, for example, became unavailable to the Internet, York would run everything I needed and I was happy.  It was fine if both Manchester and York were running all their respective resources at the same time, too.

I asked on the [[https://www.clusterlabs.org/|ClusterLabs]] [[https://lists.clusterlabs.org/mailman/listinfo/users|mailing list]] about how to achieve this extra resource which ran either in Manchester or in York but not both, and people started pointing me at [[https://github.com/ClusterLabs/booth|booth]], which when I looked at it seemed like a big over-complication for what I needed.  Fortunately some other people said "you don't need that, try location constraints", so I investigated that, and came up with the following solution, which works nicely:

  - a single big cluster of seven machines, comprising the three in Manchester plus the three in York, plus one more somewhere else
    * remember Rule One - you can't just join two three-node clusters together and expect the six nodes to work properly, because six is not an odd number
    * I chose a data centre in London for the seventh machine, and it's just a cheap tiny virtual server which can communicate with Manchester and York.  It runs corosync and pacemaker but never hosts any resources
  - a "site" attribute for each of the Manchester and York machines, showing which city they are in
  - a "location" preference for each of the existing resources, to ensure they continued running where they were supposed to
    * I can't have a floating IP address from the Manchester network assigned to a machine in York; it just doesn't work
  - a "colocation" preference for the application I needed running __somewhere__, but only once

The result is the following pacemaker configuration file:

<code>
node tom attribute site=Man
node dick attribute site=Man
node harry attribute site=Man
node fred attribute site=York
node george attribute site=York
node ron attribute site=York

primitive Man-float4 IPaddr2 params ip=198.51.100.42 cidr_netmask=24 meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
primitive York-float4 IPaddr2 params ip=203.0.113.42 cidr_netmask=24 meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart

group Man Man-float4 resource-stickiness=100
group York York-float4 resource-stickiness=100
group Any Asterisk resource-stickiness=100

location use_Man Man rule -inf: site ne Man
location use_York York rule -inf: site ne York

location not_Man Man resource-discovery=never -inf: bert
location not_York York resource-discovery=never -inf: bert

colocation once 100: Any [ Man York ]

property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop start-failure-is-fatal=false cluster-recheck-interval=60s
</code>

Obviously you need to define all seven machines in your corosync.conf file as well (this file **must** be identical on all servers in the cluster), but that's just a matter of extending the "nodelist" section with more "node" definitions.
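For illustration, the seven-node "nodelist" might look something like this - the addresses are placeholders from documentation ranges (only the node names matter to the pacemaker configuration above), so substitute your own:

<code>
nodelist {
        node {
                ring0_addr: 198.51.100.1
                nodeid: 1
                name: tom
        }
        node {
                ring0_addr: 198.51.100.2
                nodeid: 2
                name: dick
        }
        node {
                ring0_addr: 198.51.100.3
                nodeid: 3
                name: harry
        }
        node {
                ring0_addr: 203.0.113.1
                nodeid: 4
                name: fred
        }
        node {
                ring0_addr: 203.0.113.2
                nodeid: 5
                name: george
        }
        node {
                ring0_addr: 203.0.113.3
                nodeid: 6
                name: ron
        }
        node {
                ring0_addr: 192.0.2.1
                nodeid: 7
                name: bert
        }
}
</code>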
The line **location use_Man Man rule -inf: site ne Man** means "I have a [[https://crmsh.github.io/man-2.0/#cmdhelp_configure_location|location preference]] which I choose to call 'use_Man' for the resource group named 'Man', where the rule is 'definitely do not run if the site attribute is not Man'" ("-inf:" means "minus infinity" to pacemaker, and basically means "this is a no-no").

The line **colocation once 100: Any [ Man York ]** means "I have a [[https://crmsh.github.io/man-2.0/#cmdhelp_configure_colocation|co-location preference]] which I choose to call 'once' for the resource group named 'Any', such that I want it to be running alongside the resource group 'Man' or alongside the resource group 'York', but I do not care which".  The square brackets are [[https://crmsh.github.io/man-2.0/#topics_Features_Resourcesets|significant]].

The line **location not_Man Man resource-discovery=never -inf: bert** means "bert (the seventh machine, which runs no resources) should not even try to find out whether it is running any resources, just in case they should be turned off".  In general, you won't even have the commands on this machine which are needed to check whether resources are running, so allowing pacemaker to do this simply results in complaining messages about "command not installed".  Adding these lines keeps it quiet.

PS: Note also that in the location preference rules, "rule -inf: site ne Man" is apparently **not** the same as "rule inf: site eq Man".  You might think so, and it might seem easier to read, but specifying an infinite preference for a resource to run where the site label is "Man" turns out not to be the same (for the people who wrote pacemaker, anyway) as an infinite preference for a resource not to run where the site label is not "Man".  I have no idea why [[https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/constraints.html|this is considered sensible]], but the above works for me.

===== The result =====

All the resources which were previously defined on the 3-node Manchester cluster now run on one of the (same) three machines in Manchester - assuming that these machines are available.  If those machines go down, then these resources do not run at all (just the same as before).  Same thing for the resources at York.

The resource which I want running just once, in either Manchester or York, runs on the same machine as all the Manchester resources, or on the equivalent machine in York, but not both.

==== Bonuses ====

Previously I had to have at least two of the three machines at Manchester up and running in order for Manchester's resources to be running.  Similarly for two out of three at York.

Under the new arrangement, I need only one machine at Manchester, plus one machine at York, plus any two other machines (one of which can be the one in London) for __all__ resources to be running.  That can mean one machine at Manchester and three at York, or one at Manchester, two at York and one in London.  Previously it was impossible to have the Manchester resources available with only one working machine at Manchester, and vice versa for York.

The "colocation" requirement ensures that all the resources which are running in Manchester (possibly including the one "anywhere" resource) are running on a single machine.  There are other ways of specifying where you want this "anywhere" resource to run which can result in it running on one of the other two machines in Manchester.  This may or may not be a problem for you, but having everything running on one server is much neater for me.
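If you want to see what pacemaker has decided and why (for example, to confirm that the -inf location rules and the colocation preference are doing what you expect), something like this is handy - it only inspects the live cluster, it isn't part of the configuration:

<code>
# one-shot view of which node every resource is currently running on
crm_mon -1

# show the allocation scores pacemaker has calculated for each resource on each node,
# which makes the effect of the -inf rules and the colocation score visible
crm_simulate -sL
</code>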
----

[[.:|Go up]]\\
Return to [[:|main index]].