[SNMP4J] Router loop causes all communication to stop

Tjip Pasma tjip.pasma at ericsson.com
Mon Apr 27 15:06:19 CEST 2009


Hi Frank

I spent some time last week coming up with a solution for this.
I have attached my solution (for DefaultUdpTransportMapping) to this mail
(although this doesn't work for the mailing list?)

Basically, I use a ReadWriteLock, which ensures that no one is using
the socket while it is being restarted. The new socket uses the same
port as the previous socket, to avoid losing too much traffic.
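
For readers without the attachment, the idea can be sketched roughly as follows. This is not the attached patch and not SNMP4J code: the class name RestartableUdpSocket and all of its methods are hypothetical, and it uses only java.net and java.util.concurrent. Senders share the read lock; restart() takes the write lock, so the socket is only replaced when nobody is using it.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketException;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper (not part of SNMP4J): a UDP socket that can be
// replaced in place while other threads keep calling send().
class RestartableUdpSocket {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final int port;          // remembered so the new socket reuses it
    private DatagramSocket socket;   // guarded by lock

    RestartableUdpSocket(int requestedPort) throws SocketException {
        socket = new DatagramSocket(requestedPort);
        // With requestedPort == 0 the OS picks a port; keep the real one.
        port = socket.getLocalPort();
    }

    // Many senders may hold the read lock at the same time.
    void send(DatagramPacket packet) throws IOException {
        lock.readLock().lock();
        try {
            socket.send(packet);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock waits until no sender is inside send(), then the old
    // socket is closed and a new one is bound to the same local port.
    void restart() throws SocketException {
        lock.writeLock().lock();
        try {
            socket.close();
            socket = new DatagramSocket(port);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int getLocalPort() {
        lock.readLock().lock();
        try {
            return socket.getLocalPort();
        } finally {
            lock.readLock().unlock();
        }
    }

    void close() {
        lock.writeLock().lock();
        try {
            socket.close();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

In this sketch only send() is guarded. A blocking receive() must not hold the read lock, because restart() would then wait forever for the write lock; the assumption is that the listening thread itself calls restart() after receive() has thrown.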

This solves the problem on Windows Vista. 

I investigated this problem further on other platforms:
SUSE Linux Enterprise Server 10 SP2: does not have this bug at all.
Solaris version ??: does not have this bug.
Windows 2003 Server: does not have this bug.

Kind Regards
Tjip Pasma



-----Original Message-----
From: Frank Fock [mailto:fock at agentpp.com] 
Sent: 25. marts 2009 01:21
To: Tjip Pasma
Cc: snmp4j at agentpp.org
Subject: Re: [SNMP4J] Router loop causes all communication to stop

Hi Tjip,

Whether A) (ignore the exception) works definitely depends on the JDK implementation.

B) should work when you call close() and then listen() on the DefaultUdpTransportMapping instance. Of course, the small gap during which no socket exists could cause an NPE too.
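
The gap mentioned above can be illustrated with a minimal sketch. It is a stand-in written with plain java.net, not the SNMP4J implementation: the class ReopeningUdpEndpoint and its trySend() and localPort() methods are hypothetical, and only the names listen() and close() are borrowed from the transport mapping. Between close() and listen() the socket field is null, so a sender that dereferences it directly would hit an NPE; here the sender reads the field once and reports failure instead.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketException;

// Hypothetical stand-in for a transport mapping; NOT the SNMP4J class.
class ReopeningUdpEndpoint {
    private final int port;                  // 0 lets the OS pick a free port
    private volatile DatagramSocket socket;  // null while no socket exists

    ReopeningUdpEndpoint(int port) {
        this.port = port;
    }

    // Opens a new socket; after close() this ends the gap.
    void listen() throws SocketException {
        socket = new DatagramSocket(port);
    }

    // Closes the socket; from here until listen() the field is null.
    void close() {
        DatagramSocket s = socket;
        socket = null;
        if (s != null) {
            s.close();
        }
    }

    // Returns -1 while no socket exists.
    int localPort() {
        DatagramSocket s = socket;
        return s == null ? -1 : s.getLocalPort();
    }

    // Returns false instead of throwing an NPE when called during the gap.
    boolean trySend(DatagramPacket packet) throws IOException {
        DatagramSocket s = socket;   // read the field only once
        if (s == null) {
            return false;            // gap between close() and listen()
        }
        s.send(packet);              // may still throw if closed concurrently
        return true;
    }
}
```

The null check removes the NPE but not the lost traffic: packets offered during the gap are simply rejected, and a send racing with close() can still fail with a SocketException.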

I will think about a clean solution to such problems for the next release...

Best regards,
Frank

Tjip Pasma wrote:
> Hi
>  
> I was building a small SNMP test application this Friday when I stumbled into this problem.
> Part of my application runs an SNMP discovery. This was working just fine until I changed the IP range that the discovery should run within.
> Suddenly nothing was working anymore. Using Wireshark, I discovered that I was only sending to one IP address within the range.
> I enabled the SNMP4J logging features and noticed this:
>  Socket for transport mapping org.snmp4j.transport.DefaultUdpTransportMapping$ListenThread@17431b9 error: socket closed
> This exception was thrown from the socket.receive() call. This explained why all further communication had stopped, but I was still clueless as to the reason for the exception until I noticed the ICMP message "Time-to-live exceeded" in Wireshark (from now on I'm using "udp port 161 or icmp" as my Wireshark filter :-).
>  
> In this case the "Time-to-live exceeded" message is triggered by a router loop misconfiguration in my network, but I would prefer to be robust against this case (discovery features are part of our main applications, and router misconfiguration may also happen in a production setup).
>  
> So far I can only see two solutions to this.
> A) Ignore this exception.
> Currently the flag "stop" is set to true when this exception is thrown, which causes the listening thread to be shut down and the socket to be closed.
> I did a few tests with this, and apparently the socket keeps on running even though the exception message is "socket closed".
> This solution worries me, since the socket state may be corrupt.
>  
> B) Start a new socket.
> As I see it, this would be a larger rewrite of this class, requiring some blocking mechanism while the socket is being replaced.
>  
>  
> Can anyone help with other suggestions to solve this?
>  
>  
>
> Best Regards
>
> Tjip Pasma
> System Engineer
>
> Ericsson Danmark A/S
> Fælledvej 17
> 7600 Struer
> Denmark
> www.ericsson.com
>
> Office: +45 97 86 92 45
> Mobile: +45 51 16 71 91
> Fax: +45 33 88 31 21
> tjip.pasma at ericsson.com

-- 
AGENT++
http://www.agentpp.com
http://www.mibexplorer.com
http://www.mibdesigner.com


