starting a new agent with old persistent data

Dave Mason dmason____transat-tech.com
Sat Apr 26 01:55:30 CEST 2003


Hi Frank,
I believe your thought about a problem with the persistence data write 
is a good bet.  We've had a recurring problem where we send a SIGTERM to 
terminate the agent and it doesn't exit gracefully.  In these cases, the 
agent continues to hang around, and our watchdog script eventually does 
a kill -9 to get rid of it.  In other cases, one or more threads hit a 
segmentation fault on the way out, though the agent does terminate.  In 
any case, I call save_to_file on the appropriate MibGroup after every 
SET, so I thought the persistence files would be OK.  However, if the 
agent tries to do a final write as part of its shutdown process, and 
that process either doesn't complete or a kill -9 comes along and 
terminates it, maybe the files get corrupted?

I think this is another iteration of a problem I had last year that I 
discussed in my "persistence database causes crash" thread.  Any 
suggestions about what to look for that might cause these shutdown 
problems?  I can send more detail if you like.  Short of that, some way 
to protect the persistence data files would be good too.
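
One common way to protect a data file against an interrupted write is to 
write the new contents to a temporary file, flush it to disk, and then 
rename it over the old file.  The following is only a minimal sketch of 
that idea using POSIX calls; it does not use any agent++ API, and 
write_file_atomically, read_file and the ".tmp" suffix are hypothetical 
names chosen for illustration.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper, not part of agent++: replaces `path` with `data`
// so that a reader finds either the complete old file or the complete
// new file, even if the process is killed in the middle of the write.
bool write_file_atomically(const std::string& path, const std::string& data) {
    const std::string tmp = path + ".tmp";  // assumed naming scheme
    int fd = ::open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;

    std::size_t done = 0;
    while (done < data.size()) {
        ssize_t n = ::write(fd, data.data() + done, data.size() - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            ::close(fd);
            ::unlink(tmp.c_str());
            return false;
        }
        done += static_cast<std::size_t>(n);
    }

    // Force the data to disk before the rename makes it visible.
    if (::fsync(fd) != 0) {
        ::close(fd);
        ::unlink(tmp.c_str());
        return false;
    }
    if (::close(fd) != 0) {
        ::unlink(tmp.c_str());
        return false;
    }

    // On POSIX file systems, rename() replaces the target atomically.
    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        ::unlink(tmp.c_str());
        return false;
    }
    return true;
}

// Reads a whole file into a string; returns an empty string on failure.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With this scheme, a kill -9 during the write would at worst leave a stale 
".tmp" file behind, while the previous persistence file stays intact.  
Whether the agent's own save path can be routed through something like 
this is an open question.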

Regards,
Dave

Frank Fock wrote:

> Hi Dave,
>
> The persistence files contain the serializable objects of a
> MibGroup, BER encoded. What matters for the deserialization
> process is the order of the MIB objects in the group, which must
> match the order in which the objects were serialized.
>
> That said, objects can be added at the end of a MibGroup or removed
> from the end of a MibGroup without causing problems with the
> deserialization of existing persistence files.
>
> Since the order of the objects in a MibGroup can be freely
> chosen, it should not be a problem to retain compatibility
> with existing persistence files (-> if objects have to be
> removed from within the MibGroup, those objects still have to do
> their deserialization job and can then be unregistered from
> the Mib instance before the agent goes online).
>
> Other comments follow inline:
>
> Dave Mason wrote:
>
>> Hi,
>> We have a problem with restarting an agent with old persistence data. 
>> Am I right in assuming that the persistence files are just the object 
>> values written sequentially?  That would explain why, if we add an 
>> object to a MIB and rebuild the agent, object values contain garbage 
>> when we restart.  The strange thing is, we have a problem now where 
>> we can rebuild and run an agent, with the only MIB change being a new 
>> default value for an existing object, and garbage values get loaded. 
>
>
> That's indeed strange. Are you sure that the order of the
> objects added to the MibGroup has not been changed?
>
>> Maybe there is something else affecting the data that gets read.
>>
> There might also have been a problem when the data was
> written, i.e. a crash during serialization.
>
>> We're still using agent++ v3.5.3a and snmp++ v3.1.6c with no other 
>> observed problems.  Has the persistence mechanism changed since then? 
>
> No.
>
>> Ideally it should try to match up object values with OIDs.  On restart 
>
> Yes, ideally. It would require a little bit more disk space
> and change the current format. Improving the serialization concept
> is one of the things planned for version 3.6.
>
>> it should ignore values for missing OIDs or use a default value for a 
>> new OID that's not in the file.
>>
> Yes, but as described above, this behavior can already be
> achieved.
>
> Best regards,
> Frank
>
>
>
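
The difference between the order-dependent format described in the quoted 
reply and the OID-matching idea can be illustrated with a small sketch.  
This is a simplified model, not agent++ code: plain strings stand in for 
BER-encoded values, and ObjectDef, Values, load_positional and load_keyed 
are hypothetical names.  It assumes that an object with no stored value 
keeps its default.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a MIB object: an OID plus its default value.
struct ObjectDef {
    std::string oid;
    std::string default_value;
};

using Values = std::map<std::string, std::string>;  // OID -> value

// Order-dependent restore: the i-th stored value goes to the i-th object.
// Objects beyond the end of the stored data keep their defaults.
Values load_positional(const std::vector<ObjectDef>& group,
                       const std::vector<std::string>& stored) {
    Values out;
    for (std::size_t i = 0; i < group.size(); ++i) {
        out[group[i].oid] =
            i < stored.size() ? stored[i] : group[i].default_value;
    }
    return out;
}

// OID-matching restore: stored values for unknown OIDs are ignored and
// objects without a stored value keep their defaults.
Values load_keyed(const std::vector<ObjectDef>& group, const Values& stored) {
    Values out;
    for (const ObjectDef& obj : group) {
        auto it = stored.find(obj.oid);
        out[obj.oid] = it != stored.end() ? it->second : obj.default_value;
    }
    return out;
}
```

In this model, if the stored data holds values for two objects and a third 
object is appended to the group, load_positional restores the first two 
correctly and leaves the new one at its default.  If the new object is 
inserted between the two instead, it receives the value that belonged to 
the second object, which is the kind of garbage described in the thread.  
load_keyed gives the expected result in both cases, at the cost of 
storing the OID with every value.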



