[SNMP4J] snmp v3 timeliness management

Tjip Pasma tjip.pasma at ericsson.com
Tue Apr 1 13:47:57 CEST 2008


Hi

During implementation and testing of my SNMP management application I
ran into some issues with regard to timeliness management.

1. The snmpEngineTime calculation uses "System.currentTimeMillis()" as
its reference.
"System.currentTimeMillis()" is based on system time, which may jump
forwards or backwards at any time (a user setting the time / ntp
activation / daylight saving / ....)
I made a simple test sequence: 1) snmp.get 2) modify system time
3) snmp.get
snmp.get is configured with retries=1 and timeout=6s.

1a The effect of setting system time backwards in step 2 is that the
snmpEngineTime used in messages is increased by the amount that the
system time was set back.
This results in a usmStatsNotInTimeWindows report from the agent, and
the manager then recovers fine from having used the wrong
snmpEngineTime.

1b The effect of setting system time forward in step 2 is that the
snmpEngineTime used in messages is decreased by the amount that the
system time was set forward.
This results in a usmStatsNotInTimeWindows report from the agent, but in
this case the manager "ignores" the report from the agent and
retransmits the message, thus ending in a timeout at the API level.
All future communication with the snmp agent will now fail with
timeouts.
The attached Wireshark capture shows this communication; the system time
is increased by 160 seconds between lines 6 and 7 in the capture.
 <<systemtimechange_plus160s.pcap>> 

A solution to the above could be to base the calculation on
"System.nanoTime()" instead of "System.currentTimeMillis()". (I tried
this with good results, but I haven't considered all aspects of such a
change.)
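
To illustrate the nanoTime idea, here is a minimal sketch. It is not
SNMP4J code: the class MonotonicEngineClock and its methods are
hypothetical names made up for this example, and it only assumes that
the manager stores the last snmpEngineTime it learned from the agent
and adds the elapsed time to it. The clock is injected so that the
example can be run with a fake clock.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch, not SNMP4J code: estimates a remote engine's
 * snmpEngineTime from a monotonic clock instead of the wall clock.
 */
final class MonotonicEngineClock {
    private final LongSupplier nanoClock; // e.g. System::nanoTime
    private long baseEngineTime;          // last engine time learned from the agent (seconds)
    private long baseNanos;               // monotonic reading taken when it was learned

    MonotonicEngineClock(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.baseNanos = nanoClock.getAsLong();
    }

    /** Records the snmpEngineTime (in seconds) received from the agent. */
    synchronized void update(long engineTimeSeconds) {
        baseEngineTime = engineTimeSeconds;
        baseNanos = nanoClock.getAsLong();
    }

    /** Estimated engine time now: learned value plus elapsed monotonic seconds. */
    synchronized long currentEngineTime() {
        // Only the difference of two nanoTime readings is meaningful.
        long elapsedNanos = nanoClock.getAsLong() - baseNanos;
        return baseEngineTime + elapsedNanos / 1_000_000_000L;
    }
}

final class MonotonicEngineClockDemo {
    public static void main(String[] args) {
        long[] fakeNanos = {0L}; // stands in for System.nanoTime() in this demo
        MonotonicEngineClock clock = new MonotonicEngineClock(() -> fakeNanos[0]);
        clock.update(1000);               // agent reported snmpEngineTime = 1000
        fakeNanos[0] += 5_000_000_000L;   // five seconds pass on the monotonic clock
        System.out.println(clock.currentEngineTime()); // prints 1005
    }
}
```

In real use the constructor argument would be System::nanoTime. Since
the wall clock is never read, changing the system time has no effect
on the estimate.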



2. A change of system time may just as well happen on the agent side, so
I decided to make a similar test for such cases.
The agent I tested is based on jdmk. It turned out that this agent
implementation is also affected by changes to system time. (The
manager application is still based on snmp4j as above.)
The test sequence is similar to before: 1) snmp.get 2) modify time in
agent 3) snmp.get

2a) Agent time is adjusted forwards by some value larger than 150
seconds.
A usmStatsNotInTimeWindows report is sent from the agent to the manager
in step 3, and the manager then recovers fine.

2b) Agent time is adjusted backwards by some value larger than 150
seconds.
Behaviour is similar to 1b, and the end result is also that all future
communication fails with timeouts.
Currently I'm not able to identify that this has happened, since the
communication simply fails with a timeout.
The attached Wireshark capture shows this communication; the agent's
system time is decreased by ~5 minutes between lines 6 and 7 in the
capture (and two snmp.gets with retries=1 and timeout=6s are made after
that).
 <<systemtimechange_plus160s.pcap>> 
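
For reference, the difference between 2a and 2b matches the timeliness
rules that RFC 3414 (section 3.2, step 7b) gives for a non-authoritative
engine such as a manager. The sketch below only restates those rules;
the class and method names are hypothetical, the numbers in the demo are
made up, and I have not checked that SNMP4J or jdmk implement the check
exactly this way.

```java
/** Sketch of the RFC 3414 timeliness rules as seen by the manager (hypothetical names). */
final class TimeWindowCheck {
    static final int WINDOW_SECONDS = 150;
    static final int MAX_BOOTS = 2147483647;

    /** True if a received message is outside the time window from the manager's view. */
    static boolean notInTimeWindow(int localBoots, long localTime, int msgBoots, long msgTime) {
        return localBoots == MAX_BOOTS
            || msgBoots < localBoots
            // same boots, but message time more than 150 s behind the local notion
            || (msgBoots == localBoots && msgTime < localTime - WINDOW_SECONDS);
    }

    /** True if the manager should adopt boots/time from an authentic message. */
    static boolean shouldUpdate(int localBoots, long latestReceivedTime, int msgBoots, long msgTime) {
        return msgBoots > localBoots
            || (msgBoots == localBoots && msgTime > latestReceivedTime);
    }
}

final class TimeWindowCheckDemo {
    public static void main(String[] args) {
        int boots = 3;           // illustrative values only
        long managerTime = 1000; // manager's notion of the agent's snmpEngineTime
        // 2a: agent clock moved forward by 160 s, report carries time 1160
        System.out.println(TimeWindowCheck.notInTimeWindow(boots, managerTime, boots, 1160)); // false
        // 2b: agent clock moved back by 300 s, report carries time 700
        System.out.println(TimeWindowCheck.notInTimeWindow(boots, managerTime, boots, 700));  // true
    }
}
```

Under these rules the report in 2a is accepted and its newer time is
adopted, while the report in 2b looks like an old message to the manager
and is dropped, so the manager never resynchronizes.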

Solutions?
*	Make sure agents aren't affected by changes to system time.
This is hard to do, as I have to deal with an installed base of these
agents.

*	Extend snmp4j so that an error status is given for such a case
(jdmk has such a detector and gives an error message).
When the error message is received, the manager application would have
to reset snmpEngineTime / snmpEngineBoots for that agent.

*	Whenever a timeout occurs, I could simply reset snmpEngineTime /
snmpEngineBoots for that agent.
This is an easy workaround, but I'd rather have that error message so I
wouldn't have to wait for timeouts.
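
The timeout workaround could look roughly like the sketch below. It is
only an illustration of the idea: EngineTimeCache and its methods are
hypothetical names, not SNMP4J classes, and the engine ID is a plain
string here to keep the example self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical manager-side cache of per-agent timeliness values (not an SNMP4J class). */
final class EngineTimeCache {
    /** Cached snmpEngineBoots / snmpEngineTime pair for one agent. */
    static final class Entry {
        final int boots;
        final long time;
        Entry(int boots, long time) { this.boots = boots; this.time = time; }
    }

    private final Map<String, Entry> byEngineId = new ConcurrentHashMap<>();

    /** Stores the values learned from an authentic message from the agent. */
    void learn(String engineId, int boots, long time) {
        byEngineId.put(engineId, new Entry(boots, time));
    }

    /** True while timeliness values are cached for this agent. */
    boolean isSynchronized(String engineId) {
        return byEngineId.containsKey(engineId);
    }

    /** Workaround: forget the cached values so the next request resynchronizes. */
    void onTimeout(String engineId) {
        byEngineId.remove(engineId);
    }
}

final class EngineTimeCacheDemo {
    public static void main(String[] args) {
        EngineTimeCache cache = new EngineTimeCache();
        cache.learn("agent-1", 3, 1000);                      // illustrative values
        cache.onTimeout("agent-1");                           // a request timed out
        System.out.println(cache.isSynchronized("agent-1"));  // prints false
    }
}
```

The drawback stays the same as described above: the reset only happens
after the full retry and timeout period has passed.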




Kind Regards
Tjip Pasma
System Engineer
Ericsson


