Discussion:
[Pacemaker] Master-Slave role stickiness
brook davis
2015-01-21 20:06:47 UTC
Hi,

I've got a master-slave resource and I'd like to achieve the following
behavior with it:

* Only ever run (as master or slave) on 2 specific nodes (out of N
possible nodes). These nodes are predetermined and are specified at
resource creation time.
* Prefer one specific node (of the 2 selected for running the resource)
for starting in the Master role.
* Upon failover event, promote the secondary node to master.
* Do not re-promote the failed node back to master, should it come back
online.

The last requirement is the one I'm currently struggling with. I can
force the resource to run on only the 2 nodes I want (out of 3 possible
nodes), but I can't get it to "stick" on the secondary node as master
after a failover and recovery. That is, when I take the original
master offline, the resource promotes correctly on the secondary, but if
I bring the original node back online, the resource is demoted on the
secondary and promoted back to master on the original node. I'd like to
avoid that last bit.

Here are the relevant bits of my CRM configuration:

primitive NIMHA-01 ocf:heartbeat:nimha \
        op start interval="0" timeout="60s" \
        op monitor interval="30s" role="Master" \
        op stop interval="0" timeout="60s" \
        op monitor interval="45s" role="Slave"
ms NIMMS-01 NIMHA-01 \
        meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" notify="true" target-role="Started" is-managed="true"
location prefer-elmy-inf NIMMS-01 5: elmyra
location prefer-elmyra-ms NIMMS-01 \
        rule $id="prefer-elmyra-rule" $role="Master" 10: #uname eq elmyra
location prefer-pres-inf NIMMS-01 5: president
location prefer-president-ms NIMMS-01 \
        rule $id="prefer-president-rule" $role="Master" 5: #uname eq president
property $id="cib-bootstrap-options" \
        dc-version="1.1.10-42f2063" \
        cluster-infrastructure="corosync" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1421798334" \
        default-resource-stickiness="200" \
        symmetric-cluster="false"


I've set symmetric-cluster="false" to get "opt-in" behavior, per the
Pacemaker docs. As I understand it, these location constraints should
allow the resource to run on the two nodes, preferring 'elmyra'
initially as Master. My question then becomes: is there a way to apply
stickiness to the Master role? I've tried adding explicit stickiness
settings (high numbers and INF) to default-resource-stickiness, the
actual "ms" resource, and the primitive, all to no avail.
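
For reference, the opt-in behavior described above can be modelled in a few lines of Python. This is only an illustrative sketch, not Pacemaker code: the eligible_nodes helper and the node name "third-node" are hypothetical (the third node is never named here), and the model simply assumes that with symmetric-cluster="false" a node can host the resource only if a location constraint gives it a non-negative score.

```python
# Simplified model of an opt-in (symmetric-cluster="false") cluster.
# Not Pacemaker's real algorithm; all names below are illustrative only.

NODES = ["elmyra", "president", "third-node"]  # "third-node" is a hypothetical name

# (constraint id, node, score) for the plain location constraints in the config
LOCATION = [
    ("prefer-elmy-inf", "elmyra", 5),
    ("prefer-pres-inf", "president", 5),
]

def eligible_nodes(nodes, constraints):
    """Return the nodes that at least one constraint opts in with a score >= 0."""
    opted_in = {node for _, node, score in constraints if score >= 0}
    return [n for n in nodes if n in opted_in]

print(eligible_nodes(NODES, LOCATION))
```

Under that assumption only elmyra and president are candidates, which matches the "2 nodes out of 3" behavior reported above.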

Does anyone have any ideas on how to achieve stickiness on the master
role in such a configuration?

Thanks for any and all help in advance,

brook

ps. please ignore/forgive the no-quorum-policy and stonith-enabled
settings in my configuration... I know it's bad and not best practice.
I don't think it should affect the answer to the above question, though,
based on my understanding of the system.

_______________________________________________
Pacemaker mailing list: ***@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Andrei Borzenkov
2015-01-22 13:44:06 UTC
Post by brook davis
Hi,
I've got a master-slave resource and I'd like to achieve the following
* Only ever run (as master or slave) on 2 specific nodes (out of N possible
nodes). These nodes are predetermined and are specified at resource
creation time.
* Prefer one specific node (of the 2 selected for running the resource) for
starting in the Master role.
* Upon failover event, promote the secondary node to master.
* Do not re-promote the failed node back to master, should it come back
online.
The last requirement is the one I'm currently struggling with. I can force
the resource to run on only the 2 nodes I want (out of 3 possible nodes),
but I can't get it to "stick" on the secondary node as master after a
failover and recovery. That is, when I take the original master offline,
the resource promotes correctly on the secondary, but if I bring the origin
node back online, the resource is demoted on the secondary and promotes back
to master on the origin. I'd like to avoid that last bit.
It sounds like default-resource-stickiness does not kick in; and with
default resource-stickiness=1 it is expected (10 > 6). The documentation
says default-resource-stickiness is deprecated, so maybe it is ignored
in your version altogether? What does "ptest -L -s" show?
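
One way to read the "10 > 6" remark: after failover, president holds the Master role with a role preference of 5 plus a stickiness of 1, while the recovered elmyra has a Master preference of 10. The Python sketch below is a deliberately simplified model of that comparison, not Pacemaker's actual scoring code; the preferred_master function is hypothetical.

```python
# Toy model of the "10 > 6" comparison; not Pacemaker's real scoring.
MASTER_PREFERENCE = {"elmyra": 10, "president": 5}  # from the $role="Master" rules

def preferred_master(current_master, stickiness):
    """Pick the node with the highest score, crediting stickiness to the current master."""
    scores = {
        node: pref + (stickiness if node == current_master else 0)
        for node, pref in MASTER_PREFERENCE.items()
    }
    return max(scores, key=scores.get), scores

# After failover president is master; elmyra has just come back online.
print(preferred_master("president", stickiness=1))    # elmyra wins: 10 > 6
print(preferred_master("president", stickiness=200))  # president would stay: 205 > 10
```

The second call only describes what would happen if a stickiness of 200 were counted toward the Master role; whether the installed version does that is exactly what the scores from ptest should reveal.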
brook davis
2015-01-22 22:13:57 UTC
< snip >
Post by Andrei Borzenkov
It sounds like default-resource-stickiness does not kick in; and with
default resource-stickiness=1 it is expected (10 > 6). The documentation
says default-resource-stickiness is deprecated, so maybe it is ignored
in your version altogether? What does "ptest -L -s" show?
I see now that default-resource-stickiness has been marked deprecated.
Thanks for the tip on ptest, that's helpful. It looks like the Ubuntu
14.04 I'm using ships with crm_simulate instead, so I'm using that.

I appear to have successfully set the default stickiness in the
resource defaults section using the crm_attribute command, as you can
see in my updated config here:

***@elmyra:~# crm configure show
node $id="168430537" elmyra \
        attributes standby="off"
node $id="168430539" president \
        attributes standby="off" maintenance="off"
primitive NIMHA-01 ocf:heartbeat:nimha \
        op start interval="0" timeout="60s" \
        op monitor interval="30s" role="Master" \
        op stop interval="0" timeout="60s" \
        op monitor interval="45s" role="Slave"
ms NIMMS-01 NIMHA-01 \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started" is-managed="true"
location prefer-elmy-inf NIMMS-01 5: elmyra
location prefer-elmyra-ms NIMMS-01 \
        rule $id="prefer-elmyra-rule" $role="Master" 10: #uname eq elmyra
location prefer-pres-inf NIMMS-01 5: president
location prefer-president-ms NIMMS-01 \
        rule $id="prefer-president-rule" $role="Master" 5: #uname eq president
property $id="cib-bootstrap-options" \
        dc-version="1.1.10-42f2063" \
        cluster-infrastructure="corosync" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1421964175" \
        default-resource-stickiness="200" \
        symmetric-cluster="false"
rsc_defaults $id="rsc_defaults-options" \
        resource-stickiness="200"
***@elmyra:~#


And here's the output of ptest/crm_simulate:

***@elmyra:~# crm_simulate -L -s

Current cluster status:
Online: [ elmyra president ]

Master/Slave Set: NIMMS-01 [NIMHA-01]
Masters: [ elmyra ]
Slaves: [ president ]

Allocation scores:
clone_color: NIMMS-01 allocation score on elmyra: 5
clone_color: NIMMS-01 allocation score on president: 5
clone_color: NIMHA-01:0 allocation score on elmyra: 205
clone_color: NIMHA-01:0 allocation score on president: 5
clone_color: NIMHA-01:1 allocation score on elmyra: 5
clone_color: NIMHA-01:1 allocation score on president: 205
native_color: NIMHA-01:0 allocation score on elmyra: 205
native_color: NIMHA-01:0 allocation score on president: 5
native_color: NIMHA-01:1 allocation score on elmyra: -INFINITY
native_color: NIMHA-01:1 allocation score on president: 205
NIMHA-01:0 promotion score on elmyra: 14
NIMHA-01:1 promotion score on president: 9

Transition Summary:
***@elmyra:~#


So, am I correct in my assessment that stickiness does not apply to the
promotion score? The 200 value I set the default resource stickiness to
seems to be taking effect in the allocation scores. I'm not sure I
entirely understand the scoring, or at least the way crm_simulate
represents it, however.
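
To make the scores above easier to compare, the output can be split into allocation and promotion scores. The following Python sketch is only an illustration: the parse_scores helper is hypothetical, the regular expression is an assumption based on the line format shown in this post rather than on any documented crm_simulate format, and the sample text is copied from the output above.

```python
import re

# A few lines copied from the crm_simulate -L -s output in this post.
SAMPLE = """\
native_color: NIMHA-01:0 allocation score on elmyra: 205
native_color: NIMHA-01:1 allocation score on president: 205
NIMHA-01:0 promotion score on elmyra: 14
NIMHA-01:1 promotion score on president: 9
"""

# Assumed line shape: "[<func>: ]<resource> <kind> score on <node>: <score>"
LINE = re.compile(
    r"^(?:\w+: )?(?P<rsc>\S+) (?P<kind>allocation|promotion) score on "
    r"(?P<node>\S+): (?P<score>-?INFINITY|-?\d+)$"
)

def parse_scores(text):
    """Return {kind: {(resource, node): score}} with scores kept as strings."""
    scores = {"allocation": {}, "promotion": {}}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            scores[m["kind"]][(m["rsc"], m["node"])] = m["score"]
    return scores

scores = parse_scores(SAMPLE)
for kind, entries in scores.items():
    for (rsc, node), score in entries.items():
        print(f"{kind:10} {rsc:12} {node:10} {score}")
```

Read this way, the 200 stickiness shows up in the allocation scores (205, which looks like the location score of 5 plus 200), while the two promotion scores differ only by 5, the same gap as between the two Master rules (10 and 5).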

Any insights, ideas, thoughts, help would be much appreciated.

Thanks,

brook




Andrew Beekhof
2015-03-30 01:38:26 UTC
Post by brook davis
So, am I correct in my assessment that stickiness does not apply to the promotion score?
You are correct for the version you have (1.1.10), but I'm reasonably sure stickiness does apply to the promotion score in later versions.