[SNMP4J] Problem with timeout on SNMPv3 message

Wed Dec 9 23:55:40 CET 2009

Hello Peter,

Please use SNMP4J 1.10.2 which might have fixed your issue.
See the change log at: http://www.snmp4j.org/CHANGES.txt

Best regards,
Frank

Peter Verthez wrote:
> Hello,
> 
> We are currently using SNMP4J version 1.10.1, and we are facing a
> problem with SNMPv3 messages that never seem to return from the SNMP4J
> library.    We are using synchronous messages with a custom timeout
> model of 8 + 16 + 32 seconds (i.e. 2 retries, with a timeout that
> doubles on each retry).
> 
> The exact circumstance is the following:
> - we are setting some attribute on an agent that will cause the agent to
> restart; this set request is successful
> - directly afterwards we are retrieving some attributes from that agent,
> and typically that should give a timeout (it does in SNMPv2)
> - but in SNMPv3, this agent is at some point responding with an SNMP
> report with the snmpUnknownContexts variable, presumably because it is
> at that moment restarting
> --> due to this, the synchronous message never returns from the SNMP4J
> library
> 
> I have the following tracing: first a Wireshark list with some comments,
> showing the exact sequence (unrelated messages are removed from this list):
> 
> No. Time Source Destination Protocol Info
>       1 0.000000 172.31.110.52 172.21.185.14 SNMP set-request
>                 -- this is the original set request (request id
> 214435902; SNMPv3 msg id 543834127)
>       2 0.237779 172.21.185.14 172.31.110.52 SNMP get-response
>                 -- response on the set request (request id 214435902;
> SNMPv3 msg id 543834127)
>       3 0.423487 172.31.110.52 172.21.185.14 SNMP get-request
>                 -- the get request to reload the object details (request
> id 214435904; SNMPv3 msg id 543834128)
>       4 8.425570 172.31.110.52 172.21.185.14 SNMP get-request
>                 -- first retry on the get request (request id 214435904;
> SNMPv3 msg id 543834129)
>       6 24.431112 172.31.110.52 172.21.185.14 SNMP get-request
>                 -- second retry on the get request (request id
> 214435904; SNMPv3 msg id 543834130)
>       9 31.247375 172.21.185.14 172.31.110.52 SNMP report
>                 -- a report of snmpUnknownContexts = 8 on SNMPv3 msg id
> 543834129
>      11 31.253862 172.21.185.14 172.31.110.52 SNMP report
>                 -- a report of snmpUnknownContexts = 9 on SNMPv3 msg id
> 543834130
> 
> As you see, the first retry is sent after 8 seconds, the second after 8
> + 16 = 24 seconds, and the report that causes the problem is coming at
> 31 seconds (the first report is ignored properly because that SNMPv3
> message already timed out).
> 
> I see the following in our logs after setting org.snmp4j logs to DEBUG
> (this is for the second SNMP report):
> 
> 2009-12-09 15:20:30,912 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.transport.DefaultUdpTransportMapping] 
> 
> Received message from /172.21.185.14/161 with length 112:
> 30:6e:02:01:03:30:11:02:04:20:6a:40:12:02:03:00:ff:ff:04:01:00:02:01:03:04:26:30:24:04:0b:80:00:02:7d:03:00:90:d0:6d:fa:b0:02:01:01:02:02:1e:40:04:0a:61:6d:73:61:6e:74:77:65:72:70:04:00:04:00:30:2e:04:0b:80:00:02:7d:03:00:90:d0:6d:fa:b0:04:02:6e:74:a8:1b:02:01:00:02:01:00:02:01:00:30:10:30:0e:06:09:2b:06:01:06:03:0c:01:05:00:41:01:09 
> 
> 
> 2009-12-09 15:20:30,912 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3] SNMPv3
> header decoded: msgId=543834130, msgMaxSize=65535, msgFlags=00, secModel=3
> 2009-12-09 15:20:30,912 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.security.USM]
> getUser(engineID=80:00:02:7d:03:00:90:d0:6d:fa:b0, securityName=amsantwerp)
> 2009-12-09 15:20:30,912 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3]
> Removed cache entry:
> StateReference[msgID=543834130,pduHandle=PduHandle[214435904],securityEngineID=80:00:02:7d:03:00:90:d0:6d:fa:b0,securityModel=org.snmp4j.security.USM at 1733af8,securityName=amsantwerp,securityLevel=1,contextEngineID=80:00:02:7d:03:00:90:d0:6d:fa:b0,contextName=xvps2] 
> 
> 
> 2009-12-09 15:20:30,995 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3]
> RFC3412 §7.2.10 - Received PDU (msgID=543834130) is a response or an
> internal class message. PduHandle.transactionID = 214435904
> 2009-12-09 15:20:30,995 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3] MPv3
> finished
> 2009-12-09 15:20:30,996 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.Snmp] Searching
> pending request with handlePduHandle[214435904]
> 2009-12-09 15:20:30,996 INFO
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.Snmp] Received
> late report from 172.21.185.14/161 with request ID 0
> 
> 
> This seems to say that the second report is also a report that is not
> in-time, but the report is coming well before the second retry should
> time out.   Is this a bug in SNMP4J?    I tried to trace into the SNMP4J
> code, but I couldn't find anything obvious...
> 
> Note that for the first report, the following is printed, which seems to
> be correct:
> 
> 2009-12-09 15:20:30,912 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.security.USM]
> getUser(engineID=80:00:02:7d:03:00:90:d0:6d:fa:b0, securityName=amsantwerp)
> 2009-12-09 15:20:30,912 DEBUG
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3]
> RFC3412 §7.2.10 - Received PDU (msgID=543834129) is a response or
> internal class message, but cached information for the msgID could not
> be found
> 2009-12-09 15:20:30,912 WARN
> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.MessageDispatcherImpl] 
> 
> noError
> 
> Best regards,
> Peter.
> 

-- 
AGENT++
http://www.agentpp.com
http://www.snmp4j.com
http://www.mibexplorer.com
http://www.mibdesigner.com