Unexplained SNMP crash during startup...

Ram Krishnaswamy RKrishnaswamy____pathfire.com
Sat Nov 9 04:25:10 CET 2002


Hi Nick,

I have run into similar crashes on Solaris 2.8 on machines with more than
one processor. One thing you can try is to comment out all of your own code
and run the agent. If it no longer crashes, then the problem is in your
code, most likely a threading problem.

Another thing you can try is to bind your agent process to a specific
processor using pbind. It is available on Solaris; I do not know whether it
is available on other Unix OSes.
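
For Linux, which the original poster is running, a similar effect can
probably be had from inside the program with sched_setaffinity(). The
following is only a minimal sketch, assuming a Linux system whose kernel and
glibc provide sched_setaffinity() and the CPU_* macros; the helper names
first_allowed_cpu() and pin_to_cpu() are made up for illustration and are
not part of agent++ or snmp++.

```cpp
#include <sched.h>   // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper: return the lowest-numbered CPU the caller is
// currently allowed to run on, or -1 if the mask cannot be read.
int first_allowed_cpu()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}

// Hypothetical helper: restrict the calling thread to a single CPU.
// Threads created afterwards inherit the mask, so this should be called
// early, before the agent starts any worker threads.
bool pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        std::fprintf(stderr, "sched_setaffinity: %s\n", std::strerror(errno));
        return false;
    }
    return true;
}
```

Calling pin_to_cpu(first_allowed_cpu()) at the top of main() would keep the
whole agent on one processor. If the crashes stop when pinned, that points
towards a race rather than fixing it, so this is a diagnostic aid only.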

HTH.

Ram
-----Original Message-----
From: Nick Woods
To: agentpp-dl____agentpp.com
Sent: 11/8/2002 5:44 PM
Subject: Unexplained SNMP crash during startup...

For the past month I have been running into some intermittent crashes
(sig 11's) during the startup of my system and mib entry creation.  The
following is a backtrace from my program.  I was getting some different
backtraces for a while but it now consistently seems to have this one
problem.  I have a class derived from Agentpp::MibLeaf that is being
created here.  It often works but will crash when the system seems to
get a little more active.  This never seems to be a problem after a
fresh reboot with no one else on the machine.  This is also a dual CPU
machine running Linux so it can make context switch threading bugs more
apparent.  Is there any state that I need to protect with a mutex when
creating these objects or making any agent++ calls?  Has anyone else run
into anything similar?  I have also tried increasing my thread stack
size AGENTPP_DEFAULT_STACKSIZE to 150000 to see if it would help but it
did not.  I turned on all of the agent++ logging but it gave me no more
useful information.  I never seemed to have this problem before thread
support was added to agent++ and have made no code changes since
upgrading so I'm wondering if there is a possible race condition that I
am exposing, or something else I need to be aware of regarding the
threads in the system.  I am currently running agent++ v3.5.6 and
snmp++ v3.2.1b.  Any help would be much appreciated.

#0  0x40a5fb11 in __kill () at __kill:-1
#1  0x4084679b in raise (sig=6) at signals.c:65
#2  0x40a61092 in abort () at ../sysdeps/generic/abort.c:88
#3  0x0805e2a6 in signalHandler (signo=11) at test.cxx:551
#4  0x40846b53 in pthread_sighandler_rt (signo=11, si=0xbfffe070, uc=0xbfffe0f0) at signals.c:121
#5  <signal handler called>
#6  chunk_alloc (ar_ptr=0x40b66300, nb=49) at malloc.c:2993
#7  0x40ab1828 in __libc_malloc (bytes=40) at malloc.c:2811
#8  0x409f215d in __builtin_new (sz=40) at ../../gcc/cp/new1.cc:-1
#9  0x409f22a0 in __builtin_vec_new (sz=40) at ../../gcc/cp/new2.cc:-1
#10 0x4098a543 in Oid::Oid (this=0x80909c8, oid=@0xbfffe650) at oid.cpp:132
#11 0x4085c84f in Agentpp::Oidx::Oidx (this=0x80909c8, _ctor_arg=@0xbfffe650) from /home/test/dev/wa/cache/publish/lib/libpvmonitor.so
#12 0x408cd32b in Agentpp::MibEntry::MibEntry (this=0x80909a0, o=@0xbfffe650, a=READONLY) at mib_entry.cpp:128
#13 0x408b61a8 in Agentpp::MibLeaf::MibLeaf (this=0x80909a0, o=@0xbfffe650, a=READONLY, s=66) at mib.cpp:134

I have stepped through this code, and the original calls to create the
Oid (from a static string) look fine, but it seems like some of the values
might be corrupted by the end, although I am not 100% sure.
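
On the question above about protecting object creation with a mutex, one
way to test the theory is to serialize every creation through a single
lock and see whether the crash goes away. The following is only a sketch
of that idea: FakeLeaf is a hypothetical stand-in for the class derived
from Agentpp::MibLeaf, LeafRegistry and create_from_threads() are made-up
names, and nothing here claims that agent++ itself requires such a lock.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a class derived from Agentpp::MibLeaf.
struct FakeLeaf {
    explicit FakeLeaf(const std::string& o) : oid(o) {}
    std::string oid;
};

// Owns the leaves and lets only one thread at a time create or count them.
class LeafRegistry {
public:
    void create(const std::string& oid)
    {
        std::lock_guard<std::mutex> guard(lock_);
        leaves_.push_back(std::make_unique<FakeLeaf>(oid));
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> guard(lock_);
        return leaves_.size();
    }

private:
    mutable std::mutex lock_;
    std::vector<std::unique_ptr<FakeLeaf>> leaves_;
};

// Create leaves from several threads at once and report how many exist
// afterwards. The OID prefix is an arbitrary placeholder.
std::size_t create_from_threads(int threads, int per_thread)
{
    LeafRegistry registry;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&registry, t, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                registry.create("1.3.6.1.4.1.99999." + std::to_string(t) +
                                "." + std::to_string(i));
        });
    }
    for (auto& w : workers)
        w.join();
    return registry.size();
}
```

If the crash persists even with all creation funnelled through one lock,
the corruption is probably happening somewhere else before the allocation
that faults.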

Thanks,

Nick


