Discussion:
[Pacemaker] pacemaker-1.1.12 - lots of Could not establish cib_ro
Nikola Ciprich
2015-02-09 09:06:00 UTC
Hello,

I'd like to ask about the following problem, which has troubled me for some time
and which I wasn't able to find a solution for:

I've got a cluster with quite a lot of resources, and when I try to do
multiple operations at a time, I get a lot of resource failures (i.e.
failed starts).

The only related information I was able to find is the following snippet of the log:

crmd: notice: process_lrm_event: Operation vmtnv03_start_0: unknown error (node=v1b, call=748, rc=1, cib-update=211, confirmed=true)
crmd: notice: process_lrm_event: v1b-vmtnv03_start_0:748 [ Error: 'Could not establish cib_ro connection: Resource temporarily unavailable (11)'\n ]

The OCF script does not do anything special; the start action basically just runs a
Python command.

Does anybody have a tip on what to do about this problem, or how to debug it further?

My system is the latest CentOS 6, with pacemaker-1.1.12-4.el6, cman-3.0.12.1-68.el6
and resource-agents-3.9.5-12.

thanks a lot in advance!

with best regards

nik
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ***@linuxbox.cz
-------------------------------------
Andrew Beekhof
2015-02-23 01:10:55 UTC
Post by Nikola Ciprich
The OCF script does not do anything special; the start action basically just runs a
Python command.

So the Python command has nothing to do with the cluster, and no reason to connect to the CIB?
_______________________________________________
Pacemaker mailing list: ***@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Nikola Ciprich
2015-03-14 06:53:24 UTC
Hello Andrew,

I'm really sorry for replying so late.
Post by Andrew Beekhof
So the Python command has nothing to do with the cluster, and no reason to connect to the CIB?

Well, the Python script actually executes crm_mon to do some internal sanity checks.
Is this a problem?

nik
Andrew Beekhof
2015-03-16 00:38:53 UTC
Post by Nikola Ciprich
Well, the Python script actually executes crm_mon to do some internal sanity checks.
Is this a problem?

It certainly explains the log message.
Do you have a lot of these resources querying the CIB? Perhaps it's overloaded.
Nikola Ciprich
2015-03-18 08:01:30 UTC
Hello Andrew,
Post by Andrew Beekhof
It certainly explains the log message.
Do you have a lot of these resources querying the CIB? Perhaps it's overloaded.

Well, it keeps happening when I try to start many of those resources at the same moment
(by many I mean, for example, 10), meaning that they execute that many crm_mon instances at the same time.

Is there some internal CIB limit I could increase to prevent those errors?

nik
Andrew Beekhof
2015-03-18 09:33:24 UTC
Post by Nikola Ciprich
Is there some internal CIB limit I could increase to prevent those errors?

No, you'd need to retry on the client side.
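
A minimal sketch of what retrying on the client side could look like, assuming the sanity check is a Python 3.5+ script that shells out to crm_mon. Error 11 on Linux is EAGAIN, a transient condition, so the sketch re-runs the command with exponential backoff only when the output contains the "Resource temporarily unavailable" text seen in the log. The name run_with_retry, its parameters and the default values are illustrative and not part of Pacemaker, and "crm_mon -1" (one-shot mode) merely stands in for whatever the actual script runs.

```python
import random
import subprocess
import time

# Text of the transient failure reported in the crmd log (errno 11, EAGAIN).
TRANSIENT_MARKER = "Resource temporarily unavailable"


def run_with_retry(cmd, attempts=5, base_delay=0.2,
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying with exponential backoff on the transient CIB error.

    Returns the last CompletedProcess, whether or not it succeeded.
    The run and sleep parameters exist only so the helper can be tested
    without a cluster; they are an assumption of this sketch.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for attempt in range(attempts):
        result = run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                     universal_newlines=True)
        # It is not known here which stream carries the message, so check both.
        output = (result.stdout or "") + (result.stderr or "")
        if result.returncode == 0 or TRANSIENT_MARKER not in output:
            # Success, or a failure that retrying is unlikely to fix.
            return result
        if attempt < attempts - 1:
            # Exponential backoff with jitter, so that several resources
            # started together do not all retry at the same instant.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
    return result


# Hypothetical usage inside the start action's sanity check:
#   result = run_with_retry(["crm_mon", "-1"])
#   if result.returncode != 0:
#       ...  # treat as a real failure
```

With these defaults the helper waits roughly 0.2, 0.4, 0.8 and 1.6 seconds between the five attempts. The total should probably stay well below the timeout configured for the start operation, since otherwise the start would fail with a timeout instead.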