[SNMP4J] SNMP4J patch contribution

Brett Wooldridge bwooldridge at alterpoint.com
Fri Jan 5 21:49:25 CET 2007


Hi,
   I work for AlterPoint, Inc. (www.alterpoint.com), but am a fulltime open
source developer on the sponsored ZipTie network management project
 (www.ziptie.org).  We¹re using SNMP4J in ZipTie as part of a network
auto-discovery engine, which is how I came across an issue that probably
only manifests itself in certain usage scenarios.  In an auto-discovery
scenario, you execute SNMP query operations against a device to discover
network peers in ARP caches, CDP neighbor tables, etc. and then initiate
similar SNMP queries against those devices.  We actually do this on many
threads against many devices simultaneously.

What we noticed in a profiler was what appeared to be excessive thread
creation and destruction coming from the SNMP4J library.  Further
investigation revealed that each constructed Snmp object instance creates
it¹s own Timer (java.util.Timer), which results in the creation of a Timer
thread and therefore a thread per-Snmp instance.  With hundreds of
simultaneous Snmp instances (all short-lived), this result is a fairly high
load of thread creation and destruction.

We realized it was possible to make a minor change (four lines) to the Snmp
class to reuse one Timer across instances of Snmp.  Timer is designed to
handle a large number of tasks, as noted in the documentation for the class:

   Implementation note: This class scales to large numbers of concurrently
scheduled tasks (thousands should
   present no problem). Internally, it uses a binary heap to represent its
task queue, so the cost to schedule a
   task is O(log n), where n is the number of concurrently scheduled tasks.

   Implementation note: All constructors start a timer thread.

So, now rather than creating and canceling a Timer for each Snmp instance,
the Timer is created once and only tasks (PendingRequest¹s) are cancelled
and purged from the Timer.  This results in much cleaner execution when a
large number of Snmp instances may simultaneously be in existence.

Note that this is not just an issue for short-lived Snmp instances as in our
auto-discovery scenario, but would also present a scalability barrier for
long-lived Snmp instances if they are large in number.  If I wanted to
maintain a pool of 1000 (or even 5000) live Snmp objects, I would need 1000
(or 5000) threads ‹ most of which are idle.  Some Oses will have issues with
such large numbers of threads.  Additionally, Java allocates 1Meg of stack
memory to each thread by default resulting in a 1Gig to 5Gig footprint.
Even setting stack sizes to something like 128Meg would result in a 100Meg
to 500Meg footprint just for pending request threads.  Excessive
context-switching overhead is also an issue.

Sorry for such a long post for a proposed 4-line change, but I wanted to
make sure the issue and merits of the proposal were well understood.  We
would very much like this change to be picked up by SNMP4J as we would
prefer not to maintain a separate patch that requires us to re-patch and
build whenever SNMP4J revs.

Brett Wooldridge
Project Lead
www.ziptie.org

Here is a simple diff of the proposed change (to Snmp.java):

146a147,149
>   // Timer for retrying pending requests
>   private static Timer timer = new Timer("SNMP pending request timer", true);
> 
162,164d164
<   // Timer for retrying pending requests
<   private Timer timer = new Timer(true);
< 
431c431
<     timer.cancel();
---
> 
433a434
>       pending.cancel();
439a441,442
>     timer.purge();
>  


Here is a unified diff of the proposed change:

@@ -144,6 +144,9 @@
 
   private static final LogAdapter logger =
LogFactory.getLogger(Snmp.class);
 
+  // Timer for retrying pending requests
+  private static Timer timer = new Timer("SNMP pending request timer",
true);
+
   // Message processing implementation
   private MessageDispatcher messageDispatcher;
 
@@ -159,9 +162,6 @@
    */
   private Hashtable asyncRequests = new Hashtable(50);
 
-  // Timer for retrying pending requests
-  private Timer timer = new Timer(true);
-
   // Listeners for request and trap PDUs
   private transient Vector commandResponderListeners;
 
@@ -428,15 +428,18 @@
          it.hasNext(); ) {
       ((TransportMapping) it.next()).close();
     }
-    timer.cancel();
+
     for (Iterator it = pendingRequests.values().iterator(); it.hasNext(); )
{
       PendingRequest pending = (PendingRequest) it.next();
+      pending.cancel();
       ResponseEvent e =
           new ResponseEvent(this, null, pending.pdu, null,
pending.userObject,
                             new InterruptedException(
           "Snmp session has been closed"));
       pending.listener.onResponse(e);
     }
+    timer.purge();
+    
     // close all notification listeners
     if (notificationDispatcher != null) {
       notificationDispatcher.closeAll();




More information about the SNMP4J mailing list