[AGENT++] Synchronized mutex_destroy failed in ATM exampleagent

Jean Li-Kam-Tin jean.li-kam-tin at vega.co.uk
Tue May 15 11:01:39 CEST 2007


Hi Jeremy,

Which versions of AGENT++ are your findings based on? The current
versions appear to be:

SNMP++ v3.2.22
AGENT++ v3.5.28a

We are still using:

SNMP++ v3.2.20
AGENT++ v3.5.25

I think I need to recommend upgrading AGENT++ for my project, but it
would be interesting to hear the views of other AGENT++ users.

Thanks & regards,

Jean.

-----Original Message-----
From: Jeremy Nimmer [mailto:jnimmer at vanu.com] 
Sent: 14 May 2007 19:05
To: agentpp at agentpp.org
Cc: Jean Li-Kam-Tin
Subject: Re: [AGENT++] Synchronized mutex_destroy failed in ATM exampleagent

On Mon, 2007-05-14 at 18:35 +0100, Jean Li-Kam-Tin wrote:
> When I compile and run the agent in agent++/examples/atm_mib and then 
> end the program with a Control-C I get a mutex_destroy error (see 
> below).
> 
> Does anyone else get this? And how did they deal with it?

Agent(X)++ is buggy.  It doubly-unlocks mutexes, which causes
pthread_mutex_destroy to report this error under certain pthread
implementations.  There are multiple paths through the code that can
cause this issue -- it's not just a simple, single bug.

I discovered this problem some weeks ago, but haven't yet found time to
finish writing up a full bug report to send in.  I'll post what I have
so far, though.

Gdb investigation yielded at least two cases:

------------------------------------------------------------------------------

20070220.15:30:04: 12047: (1)DEBUG  : Synchronized unlock (ptr): (135336932)
20070220.15:30:04: 12047: (1)DEBUG  : Synchronized unlock (ptr): (135336932)
20070220.15:30:04: 12047: (2)ERROR  : Synchronized mutex_destroy failed with (result)(ptr): (16), (135336932)

(gdb) p/x 135336932
$1 = 0x81113e4

XXXX first unlock
Breakpoint 3, 0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
(gdb) bt
#0  0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
#1  0x0808d3ea in Agentpp::Synchronized::unlock (this=0x81113e4) at threads.cpp:469
#2  0x0808c000 in Agentpp::ThreadManager::end_synch (this=0x81113e0) at threads.cpp:150
#3  0x08088292 in Agentpp::Mib::unlock_mib (this=0x81113c8) at mib.cpp:3211
#4  0x0808689a in ~Mib (this=0x81113c8) at mib.cpp:2725
#5  0x0806d2e0 in ~SubAgentXMib (this=0x81113c8) at agentx_subagent.cpp:384
#6  0x0804a144 in main ()

XXXX second unlock
Breakpoint 3, 0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
(gdb) bt
#0  0x40032a58 in pthread_mutex_unlock () from /lib/tls/libpthread.so.0
#1  0x0808d3ea in Agentpp::Synchronized::unlock (this=0x81113e4) at threads.cpp:469
#2  0x0808bf18 in ~ThreadManager (this=0x81113e0) at threads.cpp:130
#3  0x08086954 in ~Mib (this=0x81113c8) at mib.cpp:2733
#4  0x0806d2e0 in ~SubAgentXMib (this=0x81113c8) at agentx_subagent.cpp:384
#5  0x0804a144 in main ()

The same idiom (an unlock inside ~ThreadManager) also shows up when the
destructor is called from:
#2  0x0809cd39 in ~MibContext (this=0x8111500) at mib_context.cpp:308

------------------------------------------------------------------------------

And the second case:

20070220.15:29:59: 12047: (1)DEBUG  : Synchronized ctor (ptr): (135347740)
20070220.15:29:59: 12047: (1)DEBUG  : SubAgentXMib: registration success (oid): (1.3.6.1.4.1.8072.2.4.1.1.2.0)
20070220.15:29:59: 12047: (1)DEBUG  : Synchronized unlock (ptr): (135347740)
20070220.15:29:59: 12047: (2)ERROR  : Synchronized mutex_destroy failed with (result)(ptr): (16), (135347740)
20070220.15:29:59: 12047: (1)DEBUG  : Synchronized dtor (ptr): (135347740)

(gdb) p/x 135347740
$1 = 0x8113e1c

(gdb) bt
#0  Agentpp::Synchronized::unlock (this=0x8113f3c) at threads.cpp:462
#1  0x08069154 in Agentpp::AgentXRequest::unlock (this=0x8113cc8) at agentx_request.cpp:153
#2  0x08068b2b in ~AgentXRequest (this=0x8113cc8) at agentx_request.cpp:65
#3  0x08069b70 in Agentpp::AgentXRequestList::remove (this=0x8111898, req=0x8113cc8) at agentx_request.cpp:308
#4  0x0806fa0f in Agentpp::SubAgentXMib::process_response (this=0x81113c8, r=0x8113cc8) at agentx_subagent.cpp:878
#5  0x08073741 in Agentpp::AgentXResponse::run (this=0x81139d8) at agentx_subagent.cpp:55
#6  0x0808e627 in Agentpp::TaskManager::run (this=0x8111c78) at threads.cpp:799
#7  0x0808d66d in Agentpp::thread_starter (t=0x8111cc4) at threads.cpp:543
#8  0x40030b63 in start_thread () from /lib/tls/libpthread.so.0
#9  0x401f918a in clone () from /lib/tls/libc.so.6

(gdb) bt
#0  0x0808c9a0 in ~Synchronized (this=0x8113f3c) at threads.cpp:244
#1  0x08068b58 in ~AgentXRequest (this=0x8113cc8) at agentx_request.cpp:65
#2  0x08069b70 in Agentpp::AgentXRequestList::remove (this=0x8111898, req=0x8113cc8) at agentx_request.cpp:308
#3  0x0806fa0f in Agentpp::SubAgentXMib::process_response (this=0x81113c8, r=0x8113cc8) at agentx_subagent.cpp:878
#4  0x08073741 in Agentpp::AgentXResponse::run (this=0x81139d8) at agentx_subagent.cpp:55
#5  0x0808e627 in Agentpp::TaskManager::run (this=0x8111c78) at threads.cpp:799
#6  0x0808d66d in Agentpp::thread_starter (t=0x8111cc4) at threads.cpp:543
#7  0x40030b63 in start_thread () from /lib/tls/libpthread.so.0
#8  0x401f918a in clone () from /lib/tls/libc.so.6

------------------------------------------------------------------------------

To work around this problem, I changed the mutexes to be error-checking,
so that the second unlock has no effect other than an error return code
from the call.  That is, the mutex remains in a usable state.
Double-unlocking a standard mutex results in undefined behavior, often
very bad.  Here is the error-checking patch (the mailing list doesn't
accept attachments, so it is inline):

------------------------------------------------------------------------------

Index: agent++/src/threads.cpp
===================================================================
--- agent++/src/threads.cpp     (revision 63313)
+++ agent++/src/threads.cpp     (revision 63314)
@@ -206,8 +206,26 @@
 #ifdef POSIX_THREADS
        int result;
 
+       pthread_mutexattr_t attr;
+       memset(&attr, 0, sizeof(attr));
+       result = pthread_mutexattr_init(&attr);
+       if (result) {
+               LOG_BEGIN(ERROR_LOG | 0);
+               LOG("Synchronized mutexattr_init failed with (result)");
+               LOG(result);
+               LOG_END;
+       }
+
+       result = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
+       if (result) {
+               LOG_BEGIN(ERROR_LOG | 0);
+               LOG("Synchronized mutexattr_settype failed with (result)");
+               LOG(result);
+               LOG_END;
+       }
+
        memset(&monitor, 0, sizeof(monitor));
-       result = pthread_mutex_init(&monitor, 0);
+       result = pthread_mutex_init(&monitor, &attr);
        if (result) {
                LOG_BEGIN(ERROR_LOG | 0);
                LOG("Synchronized mutex_init failed with (result)");
@@ -223,6 +241,15 @@
                LOG(result);
                LOG_END;
        }
+
+       result = pthread_mutexattr_destroy(&attr);
+       if (result) {
+               LOG_BEGIN(ERROR_LOG | 0);
+               LOG("Synchronized mutexattr_destroy failed with (result)");
+               LOG(result);
+               LOG_END;
+       }
+
 #else
 #ifdef WIN32
        // Semaphore initially auto signaled, auto reset mode, unnamed
@@ -421,7 +448,15 @@
 
 void Synchronized::unlock() {
 #ifdef POSIX_THREADS
-       pthread_mutex_unlock(&monitor);
+       int result;
+       result = pthread_mutex_unlock(&monitor);
+       if (result) {
+               LOG_BEGIN(ERROR_LOG | 2);
+               LOG("Synchronized unlock failed with (result)(ptr)");
+               LOG(result);
+               LOG((int)this);
+               LOG_END;
+       }
 #else
 #ifdef WIN32
        isLocked = FALSE;

------------------------------------------------------------------------------
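
For anyone who wants to see the effect of the error-checking attribute
in isolation, here is a minimal standalone sketch.  It is not part of
the patch and does not use any Agent++ code; the helper
init_errorcheck_mutex, the struct DoubleUnlockResult and the function
double_unlock_demo are names made up for this illustration.  It assumes
a POSIX threads implementation that supports PTHREAD_MUTEX_ERRORCHECK,
where the second unlock should return EPERM instead of corrupting the
mutex, so the later destroy succeeds.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not part of Agent++): initialise *m as an
// error-checking mutex.  Returns 0 on success, else a pthread error code.
static int init_errorcheck_mutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int result = pthread_mutexattr_init(&attr);
    if (result != 0) return result;
    result = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (result == 0) result = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);  // the attribute object is no longer needed
    return result;
}

// Return codes collected by the demo below (hypothetical type).
struct DoubleUnlockResult {
    int first_unlock;
    int second_unlock;
    int destroy;
};

// Lock once, unlock twice, then destroy, recording each return code.
static DoubleUnlockResult double_unlock_demo() {
    pthread_mutex_t m;
    int rc = init_errorcheck_mutex(&m);
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined

    DoubleUnlockResult r;
    r.first_unlock = pthread_mutex_unlock(&m);   // legitimate unlock
    r.second_unlock = pthread_mutex_unlock(&m);  // the double unlock
    r.destroy = pthread_mutex_destroy(&m);       // mutex is unlocked, so this should succeed
    return r;
}
```

Calling double_unlock_demo() and logging the three fields is enough to
check how a given pthread implementation behaves; with a default
(non-error-checking) mutex the same sequence is undefined behavior, so
it should not be used for comparison.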

I haven't tried to understand the threading and synchronization model
of Agent++ well enough to fix this properly; perhaps someone else would
be interested in doing so.

In general, I'd also be happy to see AgentX++ support for operating in a
single-threaded environment, working off of select-based event dispatch,
instead of internally launching and/or dispatching requests into worker
threads, outside of the control of the calling code.

Thanks,
- Jeremy




