persistence database causes crash [Re: varbind management/construction problem]

Dave Mason dmason at transat-tech.com
Wed Oct 30 02:18:13 CET 2002


Hi,
I believe this or something related may be back. :(

A little background: My agent is fairly typical.  I use the persistence 
mechanism provided by agent++, and I added a new thread for trap 
processing.  It receives events through a message queue and generates 
SNMP traps from them.  I use the signal handler provided in the example 
agent to stop all threads.
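
A minimal sketch of the kind of event queue described above, assuming a
C++17 compiler.  EventQueue and its members are hypothetical names for
illustration, not agent++ classes.  The point is that a close() call
wakes a consumer blocked in pop(), so the trap thread can leave its loop
and exit on its own instead of needing "kill -9".

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical event queue for a trap thread; not part of agent++.
class EventQueue {
public:
    // Enqueue an event. Returns false once the queue has been closed.
    bool push(std::string event) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return false;
            events_.push(std::move(event));
        }
        ready_.notify_one();
        return true;
    }

    // Block until an event is available. Returns std::nullopt once the
    // queue is closed and every pending event has been handed out.
    std::optional<std::string> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !events_.empty(); });
        if (events_.empty()) return std::nullopt;
        std::string event = std::move(events_.front());
        events_.pop();
        return event;
    }

    // Wake all waiting consumers so they can drain the queue and exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> events_;
    bool closed_ = false;
};
```

The trap thread would loop on pop() until it returns std::nullopt.  Note
that close() takes a mutex, so it should not be called directly from a
signal handler; the handler would only set a flag, and a normal thread
would call close() when it sees the flag.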

Everything runs fine if I start the agent with an empty persistence 
directory.  If I start the agent with existing persistence data, it 
starts up fine, but sometimes when the trap thread receives an event and 
attempts to build the varbind array, it crashes in a way similar to the 
earlier problem I described.  It's intermittent for me, but someone else 
here can reproduce it every time he starts the agent with data in the 
persistence directory.

My theory is that the persistence data somehow gets corrupted.  This may 
be most likely when the agent is not shut down cleanly.  Sometimes I 
notice that not all the threads terminate, so you have to kill them with 
"-9".  I wonder if that leaves the persistence files in a bad state?  I 
added MibGroup::start_synch() and end_synch() to protect my 
save_to_file() calls as you described in a recent mail, but it didn't help.
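
One general way to keep a file from being left half-written when a
process is killed mid-save is to write to a temporary file and rename it
over the real one.  The sketch below illustrates that technique only; it
is not how agent++ stores its persistence data, and save_atomically is a
hypothetical helper.  It assumes a POSIX system, where rename() replaces
an existing target atomically.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper, not an agent++ API.
// Writes data to "<path>.tmp", then renames it over path, so a reader
// sees either the complete old file or the complete new file.
bool save_atomically(const std::string& path, const std::string& data) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        out.flush();
        if (!out) {
            out.close();
            std::remove(tmp.c_str());  // discard the partial temp file
            return false;
        }
    }  // the stream is closed here, before the rename

    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        std::remove(tmp.c_str());
        return false;
    }
    return true;
}
```

If the process is killed before the rename, the previous file is left
untouched and only a stale ".tmp" file remains.  This guards against a
killed process, not against power loss; that would also need an fsync()
on the file before the rename.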

Another theory I have is that this is somehow related to the allocation 
of heap memory.  If the wrong object gets stepped on, the agent 
crashes in a destructor when it tries to delete an old value and add a 
new one, for example.  This kind of crash happened all the time until I 
increased AGENTPP_DEFAULT_STACKSIZE.  Now it happens only if I have 
persistence data.  Maybe there was some underlying problem that the 
bigger stack helped but didn't solve?
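
For reference, this is roughly how a per-thread stack size is requested
with plain POSIX threads.  It is a sketch under assumptions: 
start_thread_with_stack is a hypothetical helper, and nothing here is 
meant to describe how agent++ itself applies AGENTPP_DEFAULT_STACKSIZE.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical helper, not an agent++ API.
// Starts fn(arg) on a new thread whose stack is stack_bytes large.
// Returns 0 on success, or the pthread error code on failure.
int start_thread_with_stack(pthread_t* thread, std::size_t stack_bytes,
                            void* (*fn)(void*), void* arg) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;

    // Fails with EINVAL if stack_bytes is below PTHREAD_STACK_MIN.
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) rc = pthread_create(thread, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return rc;
}
```

Checking the return code matters here: if the requested size is rejected
and the error is ignored, it is easy to end up running with a different
stack than intended.  Passing 100000 as stack_bytes would correspond to
the value reported to work in the quoted exchange.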

Is there any way to verify the state of the persistence data?  It must 
not be too bad or it wouldn't load, but it is suspicious that everything 
works if there is no persistence data on startup.

Regards,
Dave

Dave Mason wrote:

> Woohoo!  That did it.  The stack size in threads.h was 10000.  I tried 
> 100000 and it ran fine.  Do you have any suggestion as to how to 
> determine the correct stack size?
>
> Thanks - I was really stumped on that one.
> Dave
>
> Jochen Katz wrote:
>
>> Hi,
>>
>> this could be a stack problem. Try to increase 
>> AGENTPP_DEFAULT_STACKSIZE in threads.h. Or did you solve the crash 
>> already?
>>
>> Kind regards,
>>   Jochen
>>
>>>>>> I checked the Vbx methods, and they all appear to delete the OID 
>>>>>> and value if the vb is reassigned, so this looked OK.  The 
>>>>>> problem occurs when the second event arrives.  Vb::set_oid gets 
>>>>>> called to set the new OID, and the crash happens at the delete 
>>>>>> for the old OID.  The crash looks like this:
>>>>>>
>>>>>> (gdb) where
>>>>>> #0  0x40410e46 in chunk_free (ar_ptr=0x404c4620, p=0x80d3998) at malloc.c:3242
>>>>>> #1  0x40410bf4 in __libc_free (mem=0x80d39a0) at malloc.c:3154
>>>>>> #2  0x4027a1f6 in __builtin_delete (ptr=0x80d39a0) from /usr/lib/libstdc++-libc6.2-2.so.3
>>>>>> #3  0x4027a21f in __builtin_vec_delete (ptr=0x80d39a0) from /usr/lib/libstdc++-libc6.2-2.so.3
>>>>>> #4  0x40369d60 in Oid::operator= (this=0x80d0bb8, oid=@0x4094b8cc) 
>>>>>>
>>>>>> I tweaked it a few ways, but it always seems to crash in a delete 
>>>>>> operator.  The activateAlarm code only reads the PDU without 
>>>>>> changing it, so I don't suspect it.
>>>>>>
>>>>>> The biggest change I made was to move the new and delete for vbs 
>>>>>> inside the while loop.  In that case, the crash happens in the 
>>>>>> new operator when it's trying to set a new value into an existing 
>>>>>> Vbx:
>>>>>>
>>>>>> (gdb) where
>>>>>> #0  chunk_alloc (ar_ptr=0x404c4620, nb=32) at malloc.c:2983
>>>>>> #1  0x40410028 in __libc_malloc (bytes=24) at malloc.c:2811
>>>>>> #2  0x4027a00d in __builtin_new (sz=24) from /usr/lib/libstdc++-libc6.2-2.so.3
>>>>>> #3  0x4036f2a3 in Vb::set_value (this=0x80d3a4c, ptr=0x80c3940 "descr") at vb.cpp:224

-- 
--------------------------------------------------------------------------------
Dave Mason  (817)481-4412 x139 voice, (817)481-4461 fax, dmason at transat-tech.com
Transat Technologies                180 State St, Suite 240, Southlake, TX 76092
Integrating 3GSM and WLAN                            http://www.transat-tech.com
--------------------------------------------------------------------------------





