[AGENT++] Synchronized mutex_destroy failed in ATM exampleagent
Jean Li-Kam-Tin
jean.li-kam-tin at vega.co.uk
Tue May 15 11:01:39 CEST 2007
Hi Jeremy,
Which versions of AGENT++ are your findings based on? The current
versions appear to be:
SNMP++ v3.2.22
AGENT++ v3.5.28a
We are still using:
SNMP++ v3.2.20
AGENT++ v3.5.25
I think I need to recommend upgrading AGENT++ for my project, but it
would be interesting to hear the views of other AGENT++ users.
Thanks & regards,
Jean.
-----Original Message-----
From: Jeremy Nimmer [mailto:jnimmer at vanu.com]
Sent: 14 May 2007 19:05
To: agentpp at agentpp.org
Cc: Jean Li-Kam-Tin
Subject: Re: [AGENT++] Synchronized mutex_destroy failed in ATM exampleagent
On Mon, 2007-05-14 at 18:35 +0100, Jean Li-Kam-Tin wrote:
> When I compile and run the agent in agent++/examples/atm_mib and then
> end the program with a Control-C I get a mutex_destroy error (see
> below).
>
> Does anyone else get this? And how did they deal with it?
Agent(X)++ is buggy. It unlocks some mutexes twice, which causes
pthread_mutex_destroy to report this error under certain pthread
implementations. There are multiple paths through the code that can
cause this issue -- it's not just a simple, single bug.
I discovered this problem some weeks ago, but haven't yet found time
to finish writing up my full bug report to send in. I'll post what I
have so far, though.
Gdb investigation yielded at least two cases:
------------------------------------------------------------------------------
20070220.15:30:04: 12047: (1)DEBUG : Synchronized unlock (ptr): (135336932)
20070220.15:30:04: 12047: (1)DEBUG : Synchronized unlock (ptr): (135336932)
20070220.15:30:04: 12047: (2)ERROR : Synchronized mutex_destroy failed with (result)(ptr): (16), (135336932)
(gdb) p/x 135336932
$1 = 0x81113e4
XXXX first unlock
Breakpoint 3, 0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
(gdb) bt
#0 0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
#1 0x0808d3ea in Agentpp::Synchronized::unlock (this=0x81113e4) at threads.cpp:469
#2 0x0808c000 in Agentpp::ThreadManager::end_synch (this=0x81113e0) at threads.cpp:150
#3 0x08088292 in Agentpp::Mib::unlock_mib (this=0x81113c8) at mib.cpp:3211
#4 0x0808689a in ~Mib (this=0x81113c8) at mib.cpp:2725
#5 0x0806d2e0 in ~SubAgentXMib (this=0x81113c8) at agentx_subagent.cpp:384
#6 0x0804a144 in main ()
XXXX second unlock
Breakpoint 3, 0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
(gdb) bt
#0 0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
#1 0x0808d3ea in Agentpp::Synchronized::unlock (this=0x81113e4) at threads.cpp:469
#2 0x0808bf18 in ~ThreadManager (this=0x81113e0) at threads.cpp:130
#3 0x08086954 in ~Mib (this=0x81113c8) at mib.cpp:2733
#4 0x0806d2e0 in ~SubAgentXMib (this=0x81113c8) at agentx_subagent.cpp:384
#5 0x0804a144 in main ()
The same idiom (~ThreadManager) also occurs when called from:
#2 0x0809cd39 in ~MibContext (this=0x8111500) at mib_context.cpp:308
------------------------------------------------------------------------------
And the second case:
20070220.15:29:59: 12047: (1)DEBUG : Synchronized ctor (ptr): (135347740)
20070220.15:29:59: 12047: (1)DEBUG : SubAgentXMib: registration success (oid): (1.3.6.1.4.1.8072.2.4.1.1.2.0)
20070220.15:29:59: 12047: (1)DEBUG : Synchronized unlock (ptr): (135347740)
20070220.15:29:59: 12047: (2)ERROR : Synchronized mutex_destroy failed with (result)(ptr): (16), (135347740)
20070220.15:29:59: 12047: (1)DEBUG : Synchronized dtor (ptr): (135347740)
(gdb) p/x 135347740
$1 = 0x8113e1c
(gdb) bt
#0 Agentpp::Synchronized::unlock (this=0x8113f3c) at threads.cpp:462
#1 0x08069154 in Agentpp::AgentXRequest::unlock (this=0x8113cc8) at agentx_request.cpp:153
#2 0x08068b2b in ~AgentXRequest (this=0x8113cc8) at agentx_request.cpp:65
#3 0x08069b70 in Agentpp::AgentXRequestList::remove (this=0x8111898, req=0x8113cc8) at agentx_request.cpp:308
#4 0x0806fa0f in Agentpp::SubAgentXMib::process_response (this=0x81113c8, r=0x8113cc8) at agentx_subagent.cpp:878
#5 0x08073741 in Agentpp::AgentXResponse::run (this=0x81139d8) at agentx_subagent.cpp:55
#6 0x0808e627 in Agentpp::TaskManager::run (this=0x8111c78) at threads.cpp:799
#7 0x0808d66d in Agentpp::thread_starter (t=0x8111cc4) at threads.cpp:543
#8 0x40030b63 in start_thread () from /lib/tls/libpthread.so.0
#9 0x401f918a in clone () from /lib/tls/libc.so.6
(gdb) bt
#0 0x0808c9a0 in ~Synchronized (this=0x8113f3c) at threads.cpp:244
#1 0x08068b58 in ~AgentXRequest (this=0x8113cc8) at agentx_request.cpp:65
#2 0x08069b70 in Agentpp::AgentXRequestList::remove (this=0x8111898, req=0x8113cc8) at agentx_request.cpp:308
#3 0x0806fa0f in Agentpp::SubAgentXMib::process_response (this=0x81113c8, r=0x8113cc8) at agentx_subagent.cpp:878
#4 0x08073741 in Agentpp::AgentXResponse::run (this=0x81139d8) at agentx_subagent.cpp:55
#5 0x0808e627 in Agentpp::TaskManager::run (this=0x8111c78) at threads.cpp:799
#6 0x0808d66d in Agentpp::thread_starter (t=0x8111cc4) at threads.cpp:543
#7 0x40030b63 in start_thread () from /lib/tls/libpthread.so.0
#8 0x401f918a in clone () from /lib/tls/libc.so.6
------------------------------------------------------------------------------
To work around this problem, I changed the mutexes to be error-checking,
so that the second unlock has no effect other than an error return code
(EPERM) from the call -- that is, the mutex remains in a usable state.
Double-unlocking a default (non-error-checking) mutex results in
undefined behavior, often very bad. Here is the error-checking patch
(the mailing list doesn't accept attachments, so it is inline):
------------------------------------------------------------------------------
Index: agent++/src/threads.cpp
===================================================================
--- agent++/src/threads.cpp (revision 63313)
+++ agent++/src/threads.cpp (revision 63314)
@@ -206,8 +206,26 @@
#ifdef POSIX_THREADS
int result;
+ pthread_mutexattr_t attr;
+ memset(&attr, 0, sizeof(attr));
+ result = pthread_mutexattr_init(&attr);
+ if (result) {
+ LOG_BEGIN(ERROR_LOG | 0);
+ LOG("Synchronized mutexattr_init failed with (result)");
+ LOG(result);
+ LOG_END;
+ }
+
+ result = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
+ if (result) {
+ LOG_BEGIN(ERROR_LOG | 0);
+ LOG("Synchronized mutexattr_settype failed with (result)");
+ LOG(result);
+ LOG_END;
+ }
+
memset(&monitor, 0, sizeof(monitor));
- result = pthread_mutex_init(&monitor, 0);
+ result = pthread_mutex_init(&monitor, &attr);
if (result) {
LOG_BEGIN(ERROR_LOG | 0);
LOG("Synchronized mutex_init failed with (result)");
@@ -223,6 +241,15 @@
LOG(result);
LOG_END;
}
+
+ result = pthread_mutexattr_destroy(&attr);
+ if (result) {
+ LOG_BEGIN(ERROR_LOG | 0);
+ LOG("Synchronized mutexattr_destroy failed with (result)");
+ LOG(result);
+ LOG_END;
+ }
+
#else
#ifdef WIN32
// Semaphore initially auto signaled, auto reset mode, unnamed
@@ -421,7 +448,15 @@
void Synchronized::unlock() {
#ifdef POSIX_THREADS
- pthread_mutex_unlock(&monitor);
+ int result;
+ result = pthread_mutex_unlock(&monitor);
+ if (result) {
+ LOG_BEGIN(ERROR_LOG | 2);
+ LOG("Synchronized unlock failed with (result)(ptr)");
+ LOG(result);
+ LOG((int)this);
+ LOG_END;
+ }
#else
#ifdef WIN32
isLocked = FALSE;
------------------------------------------------------------------------------
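The effect of the error-checking mutex type can be seen in isolation
with a small standalone program. This is only a sketch and is not
AGENT++ code: the names DoubleUnlockResult and double_unlock_errorcheck
are hypothetical, chosen for illustration, and it assumes a POSIX
threads implementation that supports PTHREAD_MUTEX_ERRORCHECK.

```cpp
// Standalone sketch, not part of AGENT++: double-unlock an
// error-checking mutex and record what each call returns.
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper type: return codes of the calls of interest.
struct DoubleUnlockResult {
    int first_unlock;   // unlock while the mutex is held by this thread
    int second_unlock;  // unlock again, mutex no longer held
    int destroy;        // destroy after the double unlock
};

// Hypothetical helper: lock once, unlock twice, then destroy.
DoubleUnlockResult double_unlock_errorcheck() {
    DoubleUnlockResult r = {0, 0, 0};

    // Request the error-checking type, as the patch does.
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);

    pthread_mutex_t monitor;
    rc = pthread_mutex_init(&monitor, &attr);
    assert(rc == 0);

    // The attribute object is no longer needed once the mutex exists.
    rc = pthread_mutexattr_destroy(&attr);
    assert(rc == 0);

    rc = pthread_mutex_lock(&monitor);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined

    r.first_unlock = pthread_mutex_unlock(&monitor);
    // With an error-checking mutex this call fails cleanly instead of
    // invoking undefined behavior.
    r.second_unlock = pthread_mutex_unlock(&monitor);
    r.destroy = pthread_mutex_destroy(&monitor);
    return r;
}
```

Calling double_unlock_errorcheck() should give first_unlock == 0,
second_unlock == EPERM, and destroy == 0: the redundant unlock is
reported but leaves the mutex intact, so the later destroy succeeds.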
I haven't tried to understand the threading and synchronization model
of Agent++ well enough to fix this properly; perhaps someone else would
be interested in doing so.
In general, I'd also be happy to see AgentX++ support for operating in a
single-threaded environment, working off of select-based event dispatch,
instead of internally launching and/or dispatching requests into worker
threads, outside of the control of the calling code.
Thanks,
- Jeremy