[AGENT++] Broken pipe error with subagent

Claus Klein claus.klein at arcormail.de
Sun Nov 25 22:44:55 CET 2012


Hi Frank,

Thanks, that works better, at least on my Apple machine.
I will test it on our target tomorrow.

     reqList = new AgentXRequestList(agentx);
     // register requestList for outgoing requests
     mib->set_request_list(reqList);
     mib->set_default_priority(120);

     mib->init();	// fails if master agent is not yet running!
     init(*mib);
     module_init(context, mib);

     Request* req;
     int loopCount = 0;
     do {
         while ((run) && (!mib->get_agentx()->quit())) {
             req = reqList->receive(4 * 1000);  // ms (4 sec)
             if (req) {
                 mib->process_request(req);
             } else {
                 mib->cleanup();

                 if (loopCount++ > 5) {
                     loopCount = 0;
                     // ping the master on every 7th receive timeout (about every 28 sec)
                     if (mib->ping_master()) {
                         printf("Ping master agent\n");
                     }
                 }

             }
         }

         printf("Lost Connected to master agent ... \n");
         mib->save_all();
         loopCount = 0;
         // Make sure that all pending set requests which may
         // have locked any resources are terminated and resources
         // are freed before connection to master is reestablished.
         reqList->terminate_set_requests();
         while ((run) && (!mib->init())) {
             printf("Not Connected to master agent ... \n");
             Thread::sleep(2 * 1000);    // ms (2 sec)
         }

         if (!mib->get_agentx()->quit()) {
             printf("Connected to master agent\n");
         }

     } while ((run));
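
The loop above only pings the master; it does not yet count failed pings. Frank's advice (quoted further down) was to close the session after, say, three consecutive pings without a response. A minimal sketch of that counting logic follows. PingWatchdog is a hypothetical helper and not part of AGENT++, and how "ping answered" is obtained from the library is left open here, because the thread does not say.

```cpp
// Hypothetical helper, not part of AGENT++: counts consecutive failed
// pings and reports when a configured limit has been reached.
class PingWatchdog {
public:
    explicit PingWatchdog(int maxFailures)
        : maxFailures_(maxFailures), failures_(0) {}

    // Record the outcome of one ping.
    // Returns true when the limit of consecutive failures is reached,
    // i.e. when the session should be closed and re-established.
    bool record(bool pingAnswered) {
        if (pingAnswered) {
            failures_ = 0;          // any answer resets the counter
            return false;
        }
        ++failures_;
        return failures_ >= maxFailures_;
    }

    // Call after a new session has been established.
    void reset() { failures_ = 0; }

    int failures() const { return failures_; }

private:
    int maxFailures_;
    int failures_;
};
```

In the inner loop one would call record() after each ping and leave the loop when it returns true, then call reset() once mib->init() has succeeded again.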


###################################
Connected to master agent
virtual void Agentpp::netSnmpHostsEntry::update(Agentpp::Request*)()
virtual void Agentpp::netSnmpHostsEntry::update(Agentpp::Request*)()
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
Ping master agent
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
Ping master agent
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
Ping master agent
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
Ping master agent
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
20121125.22:22:48: 6426: (1)ERROR  : AgentX: receive unknown error (errno): (Connection reset by peer)
20121125.22:22:48: 6426: (1)ERROR  : AgentX: send failed due to socket EOF
Ping master agent
20121125.22:22:48: 6426: (1)ERROR  : AgentXSlave: lost connection with master
Lost Connected to master agent ...
20121125.22:22:48: 6426: (1)EVENT  : AgentXSlave: connecting on TCP (socket)(addr)(port): (3), (127.0.0.1), (705)
20121125.22:22:48: 6426: (1)EVENT  : AgentXSlave: connnected TCP (socket)(port): (3), (705)
20121125.22:22:48: 6426: (1)EVENT  : SubAgentXMib: contacting master, please wait
20121125.22:22:48: 6426: (1)EVENT  : SubAgentXMib: connected - now registering..
20121125.22:22:48: 6426: (1)EVENT  : SubAgentXMib: nodes registering...
Connected to master agent
virtual void Agentpp::netSnmpHostsEntry::update(Agentpp::Request*)()

...

###############################

I found a small issue when the connection cannot be established by the
first mib->init() call:
mib->get_agentx()->quit() then always returns false, so the first
while loop never terminates.


This patch helps:




Best regards,
Claus

On 25.11.2012, at 20:36, Frank Fock wrote:

> Hi Claus,
>
> If you need to handle such cases where the broken connection
> cannot be detected through OS events, then you will have
> to use the AgentXPing PDU to periodically check from the
> subagent if the master is still there. If you do not receive
> a response for, let's say three, consecutive ping requests
> then close the session (and connection) and start a new
> session.
>
> Best regards,
> Frank
>
> On 25.11.2012 10:32, Claus Klein wrote:
>> Hi Frank,
>>
>> We also have an embedded OS without SIGPIPE.
>> When the peer has closed the connection, writing to the socket
>> does not report an error there either.
>>
>> The master agent (net-snmp) then closes the session, but the
>> subagent does not re-register the MIB.
>>
>>     do {
>>         while((run) && (!mib->get_agentx()->quit())) {
>>             req = reqList->receive(4 * 1000);  // ms (4 sec)
>>             if(req) {
>>                 mib->process_request(req);
>>             } else {
>>                 mib->cleanup();
>>             }
>>         }
>>
>>         mib->save_all();
>>         loopCount = 0;
>>         // Make sure that all pending set requests which may
>>         // have locked any resources are terminated and resources
>>         // are freed before connection to master is reestablished.
>>         reqList->terminate_set_requests();
>>         while((run) && (!mib->init())) {
>> #ifdef _WIN32
>>             Sleep(2 * 1000); // ms
>> #else
>>             sleep(2);  // sec (posix)
>> #endif
>>             if(loopCount++ > 10) {
>>                 loopCount = 0;
>>                 // ping the master every 20 sec
>>                 mib->ping_master();
>>             }
>>         }
>>     } while((run));
>>
>> What can I do? quit() never returns true here, so the first while
>> loop is never left.
>>
>> It is not an option to restart the system in this case.
>>
>>
>> Thanks in advance
>> Best Regards
>> Claus
>>
>> On 19.11.2012, at 20:15, Frank Fock wrote:
>>
>>> Hi Claus,
>>>
>>> The SIGPIPE signal does not exist on Windows, so it does not
>>> have to be handled there.
>>>
>>> Best regards,
>>> Frank
>>>
>>> On 18.09.2012 06:36, Claus Klein wrote:
>>>> Hi,
>>>>
>>>> I have found that the SIGPIPE signal can also occur in an AgentX
>>>> subagent when the response is delayed and the master agent
>>>> (net-snmp in this case) closes the session after a timeout:
>>>>
>>>> agentx/master: close transport
>>>> agentx/master: response too late on session 0x39e660
>>>> ...
>>>>
>>>> Do the subagent examples not handle this signal?
>>>>
>>>> So I added the following code to main():
>>>>
>>>> // Catch SIGPIPE and ignore it. It can occur when the master agent
>>>> // dies while we are trying to send a delayed response.
>>>> #ifndef _WIN32
>>>>  signal(SIGPIPE, SIG_IGN);
>>>> #endif
>>>>
>>>>
>>>> But how can this be handled correctly under Windows?
>>>>
>>>> Can anyone please help?
>>>>
>>>> Thanks in advance
>>>> Best Regards
>>>> Claus
>>>>
>>>>
>>>> _______________________________________________
>>>> AGENTPP mailing list
>>>> AGENTPP at agentpp.org
>>>> http://lists.agentpp.org/mailman/listinfo/agentpp
>>>
>>> -- 
>>> ---
>>> AGENT++
>>> Maximilian-Kolbe-Str. 10
>>> 73257 Koengen, Germany
>>> https://agentpp.com
>>> Phone: +49 7024 8688230
>>> Fax:   +49 7024 8688231
>>
>
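
Regarding the SIGPIPE handler from the first message in this thread: on POSIX systems, once SIGPIPE is ignored, a write to a stream socket whose peer has already closed no longer terminates the process; write() fails and sets errno to EPIPE instead. The following is a minimal sketch of that behaviour using a local socketpair(). It is not AGENT++ code, the function name is made up for illustration, and it does not apply to the embedded OS without SIGPIPE mentioned above.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper for illustration; not part of AGENT++.
// Ignores SIGPIPE, writes to a socket whose peer is already closed,
// and returns the errno reported by write() (0 if the write succeeded).
int errno_after_write_to_closed_peer() {
    std::signal(SIGPIPE, SIG_IGN);   // same call as in the original post

    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
        return errno;                // could not create the socket pair
    }
    close(fds[1]);                   // simulate the master closing the session

    const char byte = 'x';
    ssize_t n = write(fds[0], &byte, 1);
    int err = (n < 0) ? errno : 0;   // capture errno before any other call

    close(fds[0]);
    return err;
}
```

A caller can compare the returned value with EPIPE to confirm that the error is delivered through the return path rather than as a signal.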


