[Pacemaker] How to live migrate the kvm vm

Discussion:

邱志刚

2011-12-12 10:22:51 UTC

Hi all,

I have a 2-node Pacemaker cluster. I want to migrate a KVM VM with the
"migrate" command, but I found that the VM isn't migrated: it is actually
shut down and then started on the other node. I checked the log and found
that the VM is stopped rather than migrated.

How can I live migrate the VM? The configuration:

crm(live)configure# show
node h10_145
node h10_151
primitive test1 ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/test1.xml" hypervisor="qemu:///system" \
    meta allow-migrate="ture" priority="100" target-role="Started" is-managed="true" \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s" \
    op monitor interval="10s" timeout="30s" depth="0" \
    op migrate_from interval="0" timeout="120s" \
    op migrate_to interval="0" timeout="120"
primitive test2 ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/test2.xml" hypervisor="qemu:///system" \
    meta allow-migrate="ture" priority="100" target-role="Started" is-managed="true" \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s" \
    op monitor interval="20s" timeout="30s" depth="0" \
    op migrate_from interval="0" timeout="120s" \
    op migrate_to interval="0" timeout="120s"
property $id="cib-bootstrap-options" \
    dc-version="1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    no-quorum-policy="ignore" \
    last-lrm-refresh="1323683481" \
    symmetric-cluster="true" \
    cluster-recheck-interval="1m" \
    start-failure-is-fatal="false" \
    stonith-enabled="false"
rsc_defaults $id="rsc-options" \
    resource-stickiness="1000"
rsc_defaults $id="rsc_defaults-options" \
    multiple-active="stop_start"

The following is the log from when I executed the migrate command.

Dec 12 18:04:10 h10_145 lrmd: [5520]: info: cancel_op: operation monitor[15] on ocf::VirtualDomain::test2 for client 5523, its parameters: hypervisor=[qemu:///system] CRM_meta_depth=[0] config=[/etc/libvirt/qemu/test2.xml] depth=[0] crm_feature_set=[3.0.2] CRM_meta_name=[monitor] CRM_meta_timeout=[30000] CRM_meta_interval=[20000] cancelled
Dec 12 18:04:10 h10_145 crmd: [5523]: info: do_lrm_rsc_op: Performing key=7:41:0:2673a006-012b-44e3-9329-087245782771 op=test2_stop_0 )
Dec 12 18:04:10 h10_145 lrmd: [5520]: info: rsc:test2:16: stop
Dec 12 18:04:10 h10_145 crmd: [5523]: info: process_lrm_event: LRM operation test2_monitor_20000 (call=15, status=1, cib-update=0, confirmed=true) Cancelled
Dec 12 18:04:10 h10_145 cib: [5519]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-50.raw
Dec 12 18:04:11 h10_145 cib: [5519]: info: write_cib_contents: Wrote version 0.858.0 of the CIB to disk (digest: ba9e311049d3a3ff19ad12325cf329f5)
Dec 12 18:04:11 h10_145 VirtualDomain[8238]: INFO: Issuing graceful shutdown request for domain test2.
Dec 12 18:04:11 h10_145 cib: [5519]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xzk7wg (digest: /var/lib/heartbeat/crm/cib.oKKQ9P)
Dec 12 18:04:11 h10_145 lrmd: [5520]: info: RA output: (test2:stop:stdout) Domain test2 is being shutdown
Dec 12 18:04:28 h10_145 kernel: sw1: port 2(t2v1) entering disabled state
Dec 12 18:04:28 h10_145 kernel: device t2v1 left promiscuous mode
Dec 12 18:04:28 h10_145 kernel: sw1: port 2(t2v1) entering disabled state
Dec 12 18:04:28 h10_145 kernel: sw1: port 3(t2v2) entering disabled state
Dec 12 18:04:28 h10_145 kernel: device t2v2 left promiscuous mode
Dec 12 18:04:28 h10_145 kernel: sw1: port 3(t2v2) entering disabled state
Dec 12 18:04:29 h10_145 crmd: [5523]: info: process_lrm_event: LRM operation test2_stop_0 (call=16, rc=0, cib-update=31, confirmed=true) ok

Best Regards,
Jackie
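
For anyone reading a log like the one above, a quick way to confirm whether the cluster attempted a stop or a migration is to pull the operation key out of the crmd "do_lrm_rsc_op: Performing" lines. The following is only a rough sketch: the helper names (split_op_key, operations_in) are made up for illustration, and it assumes the key has the form <resource>_<operation>_<interval>, as in op=test2_stop_0 above. A live migration would be expected to show migrate_to and migrate_from keys instead of stop.

```python
import re

# Operation names that appear in this thread's configuration.
KNOWN_OPS = ("migrate_from", "migrate_to", "monitor", "start", "stop")

# Matches the crmd line even when the mail client wrapped it after "Performing".
OP_KEY = re.compile(r"do_lrm_rsc_op: Performing\s+key=\S+\s+op=(\S+)")


def split_op_key(op_key):
    """Split e.g. 'test2_stop_0' into (resource, operation, interval_ms)."""
    head, _, interval = op_key.rpartition("_")
    for op in KNOWN_OPS:
        if head.endswith("_" + op):
            return head[: -len(op) - 1], op, int(interval)
    raise ValueError("unrecognised operation key: " + op_key)


def operations_in(log_text):
    """Return (resource, operation) pairs in the order crmd started them."""
    return [split_op_key(m.group(1))[:2] for m in OP_KEY.finditer(log_text)]


# The crmd line from the log in this post.
LOG = (
    "Dec 12 18:04:10 h10_145 crmd: [5523]: info: do_lrm_rsc_op: Performing "
    "key=7:41:0:2673a006-012b-44e3-9329-087245782771 op=test2_stop_0 )\n"
)

ops = operations_in(LOG)
print(ops)
print("migration attempted:", any(op.startswith("migrate") for _, op in ops))
```

Run against the log in this post, the only operation found is a stop on test2, which matches the observed shutdown-then-start behaviour.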

Arnold Krille

2011-12-12 11:52:10 UTC

Hi,

Post by 邱志刚
I have 2-node cluster of pacemaker,I want to migrate the kvm vm with
command "migrate", but I found the vm isn't migrated,
actually it is shutdown and then start on other node. I checked the log and
found the vm is stopped but not migrated.
crm(live)configure# show
primitive test1 ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/test1.xml" hypervisor="qemu:///system" \
    meta allow-migrate="ture" priority="100" target-role="Started" is-managed="true" \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s" \
    op monitor interval="10s" timeout="30s" depth="0" \
    op migrate_from interval="0" timeout="120s" \
    op migrate_to interval="0" timeout="120"

I hope that "ture" is only a typo made when writing the email. Otherwise it's
probably the reason why your machine does a stop-start instead of a nice
migration. Try with 'allow-migrate="true"' and see if that helps.

Have fun,

Arnold
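
A small check along the lines of Arnold's suggestion can catch this kind of typo before it reaches the cluster. This is a sketch under assumptions, not Pacemaker's own parser: the word lists below are the boolean spellings Pacemaker is understood to accept (case-insensitive), and check_allow_migrate is a hypothetical helper name. It scans "crm configure show" style text and reports any allow-migrate value that is not a recognised boolean.

```python
import re

# Assumption: spellings Pacemaker accepts as booleans (case-insensitive).
TRUE_WORDS = {"true", "on", "yes", "y", "1"}
FALSE_WORDS = {"false", "off", "no", "n", "0"}

ALLOW_MIGRATE = re.compile(r'allow-migrate="([^"]*)"')


def check_allow_migrate(config_text):
    """Return a (value, verdict) pair for each allow-migrate setting found."""
    results = []
    for value in ALLOW_MIGRATE.findall(config_text):
        word = value.lower()
        if word in TRUE_WORDS:
            verdict = "true"
        elif word in FALSE_WORDS:
            verdict = "false"
        else:
            verdict = "not a boolean"
        results.append((value, verdict))
    return results


# The meta line from the configuration quoted in this thread.
CONFIG = 'meta allow-migrate="ture" priority="100" target-role="Started"'

for value, verdict in check_allow_migrate(CONFIG):
    print(value, "->", verdict)
```

A value flagged as "not a boolean" would plausibly leave migration disabled, which fits the stop-then-start seen in the original log.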

Qiu Zhigang

2011-12-13 04:11:59 UTC

Hi,

Thank you, you are right. I corrected it to 'allow-migrate="true"', but now I have found another problem: the migration fails.
The following is the log.

Dec 13 12:10:03 h10_151 kernel: type=1400 audit(1323749403.251:623): avc: denied { search } for pid=27201 comm="virsh" name="libvirt" dev=dm-0 ino=2098071 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=system_u:object_r:virt_var_run_t:s0 tclass=dir
Dec 13 12:10:04 h10_151 kernel: type=1400 audit(1323749404.067:624): avc: denied { search } for pid=27218 comm="VirtualDomain" name="" dev=0:1c ino=13825028 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=system_u:object_r:nfs_t:s0 tclass=dir
Dec 13 12:10:04 h10_151 kernel: type=1400 audit(1323749404.252:625): avc: denied { read } for pid=27242 comm="virsh" name="random" dev=devtmpfs ino=3585 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=system_u:object_r:random_device_t:s0 tclass=chr_file

[root@h10_145 ~]# crm
crm(live)# status
============
Last updated: Tue Dec 13 12:09:06 2011
Stack: openais
Current DC: h10_145 - partition with quorum
Version: 1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ h10_151 h10_145 ]

test2 (ocf::heartbeat:VirtualDomain): Started h10_151 (unmanaged) FAILED
test1 (ocf::heartbeat:VirtualDomain): Started h10_145 (unmanaged) FAILED

Failed actions:
test1_stop_0 (node=h10_145, call=19, rc=1, status=complete): unknown error
test2_stop_0 (node=h10_151, call=14, rc=1, status=complete): unknown error

Best Regards,

Dan Frincu

2011-12-13 08:42:40 UTC

Hi,

Post by Qiu Zhigang
Hi,
Thank you, you are right, I correct the 'allow-migrate="true"', but now I found another problem when migrate, migrate failed.
The following is the log.
Dec 13 12:10:03 h10_151 kernel: type=1400 audit(1323749403.251:623): avc: denied { search } for pid=27201 comm="virsh" name="libvirt" dev=dm-0 ino=2098071 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=system_u:object_r:virt_var_run_t:s0 tclass=dir
Dec 13 12:10:04 h10_151 kernel: type=1400 audit(1323749404.067:624): avc: denied { search } for pid=27218 comm="VirtualDomain" name="" dev=0:1c ino=13825028 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=system_u:object_r:nfs_t:s0 tclass=dir
Dec 13 12:10:04 h10_151 kernel: type=1400 audit(1323749404.252:625): avc: denied { read } for pid=27242 comm="virsh" name="random" dev=devtmpfs ino=3585 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=system_u:object_r:random_device_t:s0 tclass=chr_file

You need to take a look at the SELinux context.

Regards,
Dan
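
To make the denials quoted above easier to read, the fields that matter (which command was blocked, under which SELinux type, against which target type) can be pulled out with a short script. This is an illustrative sketch only; parse_avc is a hypothetical helper, and it assumes the usual user:role:type:level layout of the scontext and tcontext fields.

```python
import re

# key=value pairs; values are either quoted strings or bare tokens.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')
# The permission list between the braces, e.g. "{ search }".
AVC = re.compile(r"avc:\s+denied\s+\{\s*([^}]*?)\s*\}")


def parse_avc(line):
    """Return the key fields of one 'avc: denied' line, or None if it is not one."""
    match = AVC.search(line)
    if match is None:
        return None
    fields = {key: value.strip('"') for key, value in FIELD.findall(line)}
    # Contexts look like user:role:type:level; the type is the third part.
    return {
        "permission": match.group(1),
        "comm": fields.get("comm"),
        "source_type": fields["scontext"].split(":")[2],
        "target_type": fields["tcontext"].split(":")[2],
        "tclass": fields.get("tclass"),
    }


# First denial from the log in this thread.
LINE = (
    'Dec 13 12:10:03 h10_151 kernel: type=1400 audit(1323749403.251:623): '
    'avc: denied { search } for pid=27201 comm="virsh" name="libvirt" '
    'dev=dm-0 ino=2098071 scontext=unconfined_u:system_r:corosync_t:s0 '
    'tcontext=system_u:object_r:virt_var_run_t:s0 tclass=dir'
)

print(parse_avc(LINE))
```

All three denials in the log carry the source type corosync_t, meaning the commands the cluster stack launched (virsh and the VirtualDomain agent) are the ones being blocked.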

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

--
Dan Frincu
CCNA, RHCE

Qiu Zhigang

2011-12-13 09:13:16 UTC

Hi,

Post by Dan Frincu
You need to take a look at the SELinux context.

I'm not familiar with SELinux contexts, but I have disabled SELinux.

[root@h10_151 ~]# cat /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disable
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted

How can I solve this issue? Is there any other information you need to help me?

Best Regards,

Dan Frincu

2011-12-13 09:17:57 UTC

Hi,

Post by Qiu Zhigang
I'm not familiar with SELinux contexts, but I have disabled SELinux.
[root@h10_151 ~]# cat /etc/sysconfig/selinux
SELINUX=disable
SELINUXTYPE=targeted
How can I solve this issue? Is there any other information you need to help me?

Try getenforce on both nodes; it should return Disabled. If it doesn't,
you need to check that SELinux is disabled on both nodes and then
reboot the nodes.

HTH,
Dan
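
One thing worth checking alongside getenforce: the pasted file reads SELINUX=disable, while the comments in that same file list enforcing, permissive and disabled as the accepted values. Whether that is a paste typo cannot be told from the thread, but the AVC denials in the earlier log show a policy was still active when they were recorded. The sketch below validates the SELINUX= line against those three values; selinux_setting is a hypothetical helper name, and it only inspects text, it does not query the running system.

```python
# The three values named in the comments of the pasted file.
VALID_MODES = {"enforcing", "permissive", "disabled"}


def selinux_setting(config_text):
    """Return (value, is_valid) for the SELINUX= line, or None if absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip comments and anything that is not a key=value line.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "SELINUX":
            value = value.strip()
            return value, value.lower() in VALID_MODES
    return None


# Relevant lines of the file as pasted in this thread.
PASTED = """\
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
SELINUX=disable
SELINUXTYPE=targeted
"""

print(selinux_setting(PASTED))
```

If the value is flagged as invalid on a node, correcting it to one of the listed values and rebooting, as suggested above, would be the next thing to try before re-running getenforce.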
