[SNMP4J] Threads are being blocked in snmp4j 1.8 RC1

Frank Fock fock at agentpp.com
Tue Jan 23 00:20:07 CET 2007


Hi Jamie,

OK, I got the point. There does indeed seem to be a race
condition with sync requests and multiple retries.
Please change the send method of the Snmp class that has
the signature shown below as follows:

   public ResponseEvent send(PDU pdu, Target target,
                             TransportMapping transport)
       throws IOException {
     if (!pdu.isConfirmedPdu()) {
       sendMessage(pdu, target, transport, null);
       return null;
     }
     SyncResponseListener syncResponse = new SyncResponseListener();
     PendingRequest retryRequest = null;
     synchronized (syncResponse) {
       PduHandle handle = null;
       PendingRequest request =
           new PendingRequest(syncResponse, target, pdu, target, transport);
       handle = sendMessage(pdu, target, transport, request);
       try {
         syncResponse.wait();
         retryRequest = (PendingRequest) pendingRequests.remove(handle);
         if (logger.isDebugEnabled()) {
           logger.debug("Removed pending request with handle: "+handle);
         }
         request.setFinished();
         request.cancel();
       }
       catch (InterruptedException iex) {
         logger.warn(iex);
         // ignore
       }
     }
     if (retryRequest != null) {
       synchronized (retryRequest) {
         retryRequest.setFinished();
         retryRequest.cancel();
       }
     }
     return syncResponse.response;
   }

This should avoid the deadlock.
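
To illustrate why the retry request is now cancelled outside the
synchronized (syncResponse) block, here is a minimal, self-contained
sketch. It is not SNMP4J code: Listener, Request and runOnce are
hypothetical stand-ins, and the lock roles are an assumption based on
the thread dump quoted below (the timer thread owns the request monitor
and wants the listener monitor, while the sender owned the listener
monitor and wanted the request monitor).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins for illustration only; not SNMP4J classes.
class LockOrderSketch {
    static final class Listener { boolean responded; }
    static final class Request { boolean cancelled; }

    /** Returns true if both threads finish within timeoutMillis. */
    static boolean runOnce(long timeoutMillis) throws InterruptedException {
        final Listener listener = new Listener();
        final Request request = new Request();
        final CountDownLatch done = new CountDownLatch(2);

        // "Timer" thread: holds the request monitor while it notifies
        // the listener (request first, then listener).
        final Thread timer = new Thread(() -> {
            synchronized (request) {
                synchronized (listener) {
                    listener.responded = true;
                    listener.notifyAll();
                }
            }
            done.countDown();
        });

        // "Sender" thread: waits on the listener, then touches the
        // request only after the listener monitor has been released.
        Thread sender = new Thread(() -> {
            synchronized (listener) {
                timer.start();
                while (!listener.responded) {
                    try {
                        listener.wait(); // releases the listener monitor
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            } // listener monitor released here
            synchronized (request) {
                request.cancelled = true;
            }
            done.countDown();
        });

        sender.start();
        return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

Because the sender never holds the listener monitor while asking for
the request monitor, the two threads cannot end up waiting on each
other, whichever one runs first.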

Nevertheless, I would rethink the design of your discovery
and move towards async requests with fewer threads.
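
As a rough sketch of that pattern: one thread fires all requests
without blocking, and responses are collected in callbacks. The code
below does not use SNMP4J. AsyncSender, fakeSender and discover are
hypothetical names, and the fake sender simply "answers" every address
after a short delay. In a real discovery the callback role would be
played by a ResponseListener passed to the asynchronous send variant of
the Snmp class.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

class AsyncDiscoverySketch {
    /** Hypothetical async send: returns at once, calls back later. */
    interface AsyncSender {
        void send(String address, BiConsumer<String, String> onResponse);
    }

    /** Fake sender for this sketch: replies after 10 ms on the dispatcher thread. */
    static AsyncSender fakeSender(ScheduledExecutorService dispatcher) {
        return (address, onResponse) ->
            dispatcher.schedule(
                () -> onResponse.accept(address, "sysDescr of " + address),
                10, TimeUnit.MILLISECONDS);
    }

    /** Sends to every address from the calling thread and gathers the replies. */
    static Map<String, String> discover(List<String> addresses, long timeoutMillis)
            throws InterruptedException {
        ScheduledExecutorService dispatcher =
            Executors.newSingleThreadScheduledExecutor();
        try {
            AsyncSender sender = fakeSender(dispatcher);
            Map<String, String> results = new ConcurrentHashMap<>();
            CountDownLatch pending = new CountDownLatch(addresses.size());
            for (String address : addresses) {
                // No thread blocks per request; the callback records the reply.
                sender.send(address, (addr, response) -> {
                    results.put(addr, response);
                    pending.countDown();
                });
            }
            // Wait once for all replies (or the timeout), not once per request.
            pending.await(timeoutMillis, TimeUnit.MILLISECONDS);
            return results;
        } finally {
            dispatcher.shutdownNow();
        }
    }
}
```

The number of requests in flight is then independent of the number of
threads, so a thread pool of 250 is not needed just to wait for
replies.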

Best regards,
Frank

Jamie Bisotti wrote:

> 
> Frank,
> 
> I work with Varma and I'm looking into this problem as well.  We are 
> doing a network device discovery, and performance is very important 
> (interrogating 65K addresses in 15 minutes is a use case).  We've played 
> with the size of our thread pool and 250 seems to be the "sweet spot"; 
> many more or many fewer threads result in a slower discovery.  To answer 
> one of your questions though, in some cases the code will be executing 
> on a single processor and in some it will be on multiple processors.
> 
> Getting back to the issue at hand, in your last response, you mentioned 
> the internal timer, so I went back to JConsole and here's what I found:
> 
> 
> Name: Timer-1
> State: BLOCKED on org.snmp4j.Snmp$SyncResponseListener@1d040d2 owned by: pool-1-thread-150
> Total blocked: 1,211  Total waited: 329,080
> 
> Stack trace:
> org.snmp4j.Snmp$SyncResponseListener.onResponse(Unknown Source)
> org.snmp4j.Snmp$PendingRequest.run(Unknown Source)
> java.util.TimerThread.mainLoop(Timer.java:512)
> java.util.TimerThread.run(Timer.java:462)
> 
> 
> As you can see, it is being blocked by pool-1-thread-150; so I took a 
> look at that thread:
> 
> 
> Name: pool-1-thread-150
> State: BLOCKED on org.snmp4j.Snmp$PendingRequest@838c6e owned by: Timer-1
> Total blocked: 7,622  Total waited: 127,540
> 
> Stack trace:
> org.snmp4j.Snmp.send(Unknown Source)
> org.snmp4j.Snmp.send(Unknown Source)
> ...
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
> java.lang.Thread.run(Thread.java:595)
> 
> 
> So, Timer-1 is blocked by pool-1-thread-150, which is blocked by 
> Timer-1...dead-lock.
> 
> Any more ideas?  Thanks.
> 
> 
> -- 
> Jamie Bisotti
> 

-- 
AGENT++
http://www.agentpp.com
http://www.mibexplorer.com
http://www.mibdesigner.com



