[AGENT++] Patch to fix Mib::cleanup() and prevent possible deadlock
Claus Klein
claus.klein at arcormail.de
Tue Dec 16 22:09:06 CET 2014
Hi Frank,
After a timeout, the master agent closes the connection.
We see a SIGPIPE error and close our session too.
Then we call reqList->terminate_set_requests();
After reconnecting to the master agent, we may call cleanup(), but the lock on the MibTable from the last update() is still held!
As a result, we hang forever at this point in the main loop:
void MibTable::remove_unused_rows()
{
start_synch();
// … the subagent is no longer usable. Is this a deadlock or not?
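The situation described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not AGENT++ code: TableSketch, begin_update(), end_update(), cleanup_with_timeout() and run_scenario() are hypothetical names, and std::timed_mutex merely stands in for whatever lock start_synch() takes. One thread keeps the table lock held, as an unfinished update() would, while the cleanup path waits for a bounded time so the caller can see that the lock is still held instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a table guarded by one non-recursive lock.
// None of these names are AGENT++ API.
class TableSketch {
public:
    // Models an update() that takes the table lock and keeps it
    // until the request is finished with end_update().
    void begin_update() { lock_.lock(); }
    void end_update() { lock_.unlock(); }

    // Models the cleanup path, but waits only for a bounded time.
    // Returns true if the lock was acquired and the cleanup ran.
    bool cleanup_with_timeout(std::chrono::milliseconds wait) {
        if (!lock_.try_lock_for(wait)) {
            return false;  // lock still held by an unfinished request
        }
        ++cleanups_done_;
        lock_.unlock();
        return true;
    }

    int cleanups_done() const { return cleanups_done_; }

private:
    std::timed_mutex lock_;
    int cleanups_done_ = 0;
};

struct ScenarioResult {
    bool cleanup_while_held;     // attempt while the update lock is held
    bool cleanup_after_release;  // attempt after the lock was released
};

ScenarioResult run_scenario() {
    using namespace std::chrono_literals;
    TableSketch table;
    std::promise<void> locked, release;
    std::future<void> locked_f = locked.get_future();
    std::future<void> release_f = release.get_future();

    // The holder thread plays the request that locked the table in update().
    std::thread holder([&] {
        table.begin_update();
        locked.set_value();  // tell the main thread the lock is now held
        release_f.wait();    // keep the lock until told to release it
        table.end_update();
    });

    locked_f.wait();
    assert(holder.joinable());  // the holder is still running its request
    ScenarioResult r{};
    r.cleanup_while_held = table.cleanup_with_timeout(50ms);
    release.set_value();
    holder.join();
    r.cleanup_after_release = table.cleanup_with_timeout(1000ms);
    return r;
}
```

With a plain blocking lock in place of try_lock_for(), the first cleanup attempt would wait until the holder releases the lock, which is the hang described above if that release never happens.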
Best Regards,
Claus
: (4)DEBUG : AgentXSlave: received something on ports
20141216.21:45:50: 10 A3 EA 7B FF 7F 00 00 ...{....
: (2)EVENT : AgentXRequestList: request received (context)(tid)(pid)(siz)(type)(err)(status): (subagent), (37), (39), (1), (8), (0), (0)
20141216.21:45:50: 10 A3 EA 7B FF 7F 00 00 ...{....
: (2)DEBUG : LockQueue: adding lock request (ptr): (140319072986512)
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (2)EVENT : SubAgentXMib: starting thread execution
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (2)EVENT : SubAgentXMib: CLEANUPSET (tid)(pid)(oid)...: (35), (38), (1.3.6.1.4.1.8072.2.2.2.1.5.5.116.101.115.116.50)
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (3)EVENT : Agent: cleaning up set request: (35)
virtual void Agentpp::netSnmpHostsEntry::cleanup_set_request(Agentpp::Request *, int &)
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (2)DEBUG : LockQueue: adding release request (ptr): (140319061504408)
20141216.21:45:50: 00 30 60 03 01 00 00 00 .0`.....
: (8)DEBUG : Synchronized: trylock success (id): (2187)
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (2)DEBUG : LockQueue: adding release request (ptr): (140319064617872)
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (8)DEBUG : Synchronized: trylock success (id): (244)
20141216.21:45:50: 10 A3 EA 7B FF 7F 00 00 ...{....
: (1)DEBUG : TaskManager: task manager found
20141216.21:45:50: 10 A3 EA 7B FF 7F 00 00 ...{....
: (2)DEBUG : TaskManager: after notify
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (2)EVENT : SubAgentXMib: starting thread execution
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (2)EVENT : SubAgentXMib: TESTSET (tid)(pid)(oid)...: (37), (39), (1.3.6.1.4.1.8072.2.2.2.1.5.5.116.101.115.116.50)
20141216.21:45:50: 00 10 59 03 01 00 00 00 ..Y.....
: (3)EVENT : Agent: preparing set request: (37)
20141216.21:45:50: 10 A3 EA 7B FF 7F 00 00 ...{....
: (4)DEBUG : AgentXSlave: received something on ports
20141216.21:45:50: 10 A3 EA 7B FF 7F 00 00 ...{....
: (1)ERROR : AgentXSlave: lost connection with master
20141216.21:45:50: 10 A3 EA 7B FF 7F 00 00 ...{....
: (2)EVENT : Mib::cleanup()
virtual void Agentpp::netSnmpHostsEntry::update(Agentpp::Request *)
20141216.21:45:55: 00 10 59 03 01 00 00 00 ..Y.....
: (2)DEBUG : LockQueue: adding lock request (ptr): (140319061504408)
20141216.21:45:55: 00 30 60 03 01 00 00 00 .0`.....
: (8)DEBUG : Synchronized: trylock success (id): (213)
virtual int Agentpp::netSnmpHostsEntry::prepare_set_request(Agentpp::Request *, int &)
20141216.21:45:55: 00 10 59 03 01 00 00 00 ..Y.....
: (4)EVENT : RequestListAgentX: request answered (id)(status)(tid)(err)(removed)(sz): (37), (257), (37), (12), (0), (1)
20141216.21:45:55: 00 10 59 03 01 00 00 00 ..Y.....
: (2)DEBUG : LockQueue: adding release request (ptr): (140319072986512)
20141216.21:45:55: 00 10 59 03 01 00 00 00 ..Y.....
: (2)EVENT : SubAgentXMib: finished thread execution
On 16.12.2014, at 00:22, Frank Fock <fock at agentpp.com> wrote:
> Hi Claus,
>
> It is not a deadlock, because when you continue (end the sleep) everything
> works again. It is simply a global lock which is necessary at that point.
>
> Best regards,
> Frank