[AGENT++] Re: recvfrom's blocking and process grinding to a halt

John McCaskey jmccaskey at gmail.com
Thu Mar 23 00:32:07 CET 2006


Hi Jochen,

On 3/22/06, Jochen Katz <katz at agentpp.com> wrote:
>
> Hi,


<snip>

> Each Snmp object opens its own socket (random port if not specified). So
> they cannot interfere with each other and no thread can receive data for
> another thread.


Ahh... I'm stupid.  I wasn't thinking through the UDP send/recv process
correctly, and was not accounting for the fact that each thread has bound a
socket to a specific port for its recvfrom calls.  This makes sense now, and
I agree it should not be causing me issues.
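
To illustrate the point with plain sockets (this is not the SNMP++ code
itself): a minimal sketch, assuming POSIX UDP sockets on loopback, where each
socket is bound to port 0 so the kernel assigns it its own port.  The helper
name open_udp_on_ephemeral_port is made up for this example.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper (not part of SNMP++): open a UDP socket bound to an
// OS-chosen port on loopback. Returns the descriptor, or -1 on failure,
// and stores the assigned port in *port.
int open_udp_on_ephemeral_port(uint16_t* port) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // 0 asks the kernel to pick a free port

    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);  // the port replies must be addressed to
    return fd;
}
```

Two sockets opened this way end up on different ports, so a reply addressed
to one of them is never queued on the other.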

> > To take this to the race condition issue: if the lock is per Snmp object,
> > then it seems like it does me no good.  What if 2 threads both do a
> > request to 10.0.0.1 and then both select to see if there is data from
> > that address at once (they can do this since they have different
> > locks)?  Each of them sees that there is data, but in reality there is
> > only one datagram waiting for reading.  Now whichever thread gets to
> > recvfrom first gets that datagram and the other blocks.  Am I missing
> > something here or is that a potential issue?
>
> This is no issue, as each thread / Snmp object uses its own socket.
>
> >     Do you get timeouts although the agent sends a reply?
> >
> >
> > It's hard to know (since I'm polling the same agent hundreds of times as
> > fast as I can), but I don't seem to be getting any unexpected timeouts.
> > The threads just block sometimes.
>
> From what you wrote (each thread has its own Snmp object and uses sync
> requests, so no two threads are within the event code of the same
> object), I can now only think of the following:
> select() reports that data has arrived on socket n. The OS throws
> away the data (some kind of buffer overflow; do you have large PDUs?)
> and recvfrom() waits because nothing is there. So the solution would
> be to add the MSG_DONTWAIT flag to the recvfrom() call.


I've put in the MSG_DONTWAIT flag, and I've added a check afterwards for an
EAGAIN error, on which I will abort and print out an error.  So I'll let it
run a bit and see if that occurs.  Otherwise I'll have to rethink the entire
issue, as you now have me convinced there should be no race condition
between the threads and no reason for the recvfroms to block.

Thanks,

John


