[SNMP4J] Problem with timeout on SNMPv3 message

Peter Verthez Peter.Verthez at alcatel-lucent.com
Thu Dec 10 10:17:27 CET 2009


Hello Frank,

The problem is indeed fixed by SNMP4J 1.10.2.

Thanks for your help!

Best regards,
Peter.


Frank Fock wrote:
> Hello Peter,
>
> Please use SNMP4J 1.10.2 which might have fixed your issue.
> See the change log at: http://www.snmp4j.org/CHANGES.txt
>
> Best regards,
> Frank
>
> Peter Verthez wrote:
>   
>> Hello,
>>
>> We are currently using SNMP4J version 1.10.1, and we are facing a
>> problem with SNMPv3 messages that never seem to return from the SNMP4J
>> library.    We are using synchronous messages with a custom timeout
>> model of 8 + 16 + 32 seconds (i.e. 2 retries, with a timeout that
>> doubles on each retry).
>>
>> The exact circumstance is the following:
>> - we are setting some attribute on an agent that will cause the agent to
>> restart; this set request is successful
>> - directly afterwards we are retrieving some attributes from that agent,
>> and typically that should give a timeout (it does in SNMPv2)
>> - but in SNMPv3, this agent is at some point responding with an SNMP
>> report with the snmpUnknownContexts variable, presumably because it is
>> at that moment restarting
>> --> due to this, the synchronous message never returns from the SNMP4J
>> library
>>
>> I have the following tracing: first a Wireshark list with some comments,
>> showing the exact sequence (unrelated messages are removed from this list):
>>
>> No. Time Source Destination Protocol Info
>>       1 0.000000 172.31.110.52 172.21.185.14 SNMP set-request
>>                 -- this is the original set request (request id
>> 214435902; SNMPv3 msg id 543834127)
>>       2 0.237779 172.21.185.14 172.31.110.52 SNMP get-response
>>                 -- response on the set request (request id 214435902;
>> SNMPv3 msg id 543834127)
>>       3 0.423487 172.31.110.52 172.21.185.14 SNMP get-request
>>                 -- the get request to reload the object details (request
>> id 214435904; SNMPv3 msg id 543834128)
>>       4 8.425570 172.31.110.52 172.21.185.14 SNMP get-request
>>                 -- first retry on the get request (request id 214435904;
>> SNMPv3 msg id 543834129)
>>       6 24.431112 172.31.110.52 172.21.185.14 SNMP get-request
>>                 -- second retry on the get request (request id
>> 214435904; SNMPv3 msg id 543834130)
>>       9 31.247375 172.21.185.14 172.31.110.52 SNMP report
>>                 -- a report of snmpUnknownContexts = 8 on SNMPv3 msg id
>> 543834129
>>      11 31.253862 172.21.185.14 172.31.110.52 SNMP report
>>                 -- a report of snmpUnknownContexts = 9 on SNMPv3 msg id
>> 543834130
>>
>> As you see, the first retry is sent after 8 seconds, the second after 8
>> + 16 = 24 seconds, and the report that causes the problem is coming at
>> 31 seconds (the first report is ignored properly because that SNMPv3
>> message already timed out).
>>
>> I see the following in our logs after setting org.snmp4j logs to DEBUG
>> (this is for the second SNMP report):
>>
>> 2009-12-09 15:20:30,912 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.transport.DefaultUdpTransportMapping] 
>>
>> Received message from /172.21.185.14/161 with length 112:
>> 30:6e:02:01:03:30:11:02:04:20:6a:40:12:02:03:00:ff:ff:04:01:00:02:01:03:04:26:30:24:04:0b:80:00:02:7d:03:00:90:d0:6d:fa:b0:02:01:01:02:02:1e:40:04:0a:61:6d:73:61:6e:74:77:65:72:70:04:00:04:00:30:2e:04:0b:80:00:02:7d:03:00:90:d0:6d:fa:b0:04:02:6e:74:a8:1b:02:01:00:02:01:00:02:01:00:30:10:30:0e:06:09:2b:06:01:06:03:0c:01:05:00:41:01:09 
>>
>>
>> 2009-12-09 15:20:30,912 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3] SNMPv3
>> header decoded: msgId=543834130, msgMaxSize=65535, msgFlags=00, secModel=3
>> 2009-12-09 15:20:30,912 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.security.USM]
>> getUser(engineID=80:00:02:7d:03:00:90:d0:6d:fa:b0, securityName=amsantwerp)
>> 2009-12-09 15:20:30,912 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3]
>> Removed cache entry:
>> StateReference[msgID=543834130,pduHandle=PduHandle[214435904],securityEngineID=80:00:02:7d:03:00:90:d0:6d:fa:b0,securityModel=org.snmp4j.security.USM at 1733af8,securityName=amsantwerp,securityLevel=1,contextEngineID=80:00:02:7d:03:00:90:d0:6d:fa:b0,contextName=xvps2] 
>>
>>
>> 2009-12-09 15:20:30,995 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3]
>> RFC3412 §7.2.10 - Received PDU (msgID=543834130) is a response or an
>> internal class message. PduHandle.transactionID = 214435904
>> 2009-12-09 15:20:30,995 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3] MPv3
>> finished
>> 2009-12-09 15:20:30,996 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.Snmp] Searching
>> pending request with handlePduHandle[214435904]
>> 2009-12-09 15:20:30,996 INFO
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.Snmp] Received
>> late report from 172.21.185.14/161 with request ID 0
>>
>>
>> This seems to say that the second report is also a report that is not
>> in-time, but the report is coming well before the second retry should
>> time out.   Is this a bug in SNMP4J?    I tried to trace into the SNMP4J
>> code, but I couldn't find anything obvious...
>>
>> Note that for the first report, the following is printed, which seems to
>> be correct:
>>
>> 2009-12-09 15:20:30,912 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.security.USM]
>> getUser(engineID=80:00:02:7d:03:00:90:d0:6d:fa:b0, securityName=amsantwerp)
>> 2009-12-09 15:20:30,912 DEBUG
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.mp.MPv3]
>> RFC3412 §7.2.10 - Received PDU (msgID=543834129) is a response or
>> internal class message, but cached information for the msgID could not
>> be found
>> 2009-12-09 15:20:30,912 WARN
>> [DefaultUDPTransportMapping_172.31.110.52/0]-[org.snmp4j.MessageDispatcherImpl] 
>>
>> noError
>>
>> Best regards,
>> Peter.
>>
>>     
>
>   


-- 
Peter Verthez
Systems Engineer Network Mgt.
Tel: (+32) 3 240 84 50 | Alcanet:
Fax: (+32) 3 240 84 59 | (6)2605

Alcatel-Lucent Bell NV
Copernicuslaan 50, 2018 Antwerp, Belgium
Fortis 220-0002334-42
VAT BE 0404 621 642 Register of Legal Entities Antwerp

***
This message (including any attachments) contains confidential information intended for a specific individual and purpose, and is protected by law.  If you are not the intended recipient, you should delete this message.  Any disclosure, copying, or distribution of this message, or the taking of any action based on it, is strictly prohibited without the prior consent of its author.




More information about the SNMP4J mailing list