[SNMP4J] help - possible thread-unsafe code in snmp4j

Frank Fock fock at agentpp.com
Sat Feb 2 01:33:48 CET 2013


Hi Syed,

I cannot follow your argumentation. I know that the ++ is not atomic, 
but that is not relevant
in the processReport method, because the value of the incremented 
integer is not read
(after it was incremented).

What could happen is, that the agent sends two identical reports at the 
same time and
SNMP4J (Snmp) processes it concurrently by using a 
MultithreadMessageDispatcher.
Then a two stage engine ID and time discovery would fail, but I cannot 
see why
Snmp should send more than two reports per request.

Best regards,
Frank


Am 02.02.2013 01:04, schrieb Ali, Syed F:
> Hi,
>
> We use snmp4j in one of our applications and while tracking down an issue reported by our customer, I came across some code in snmp4j that I believe may be thread-unsafe and that MIGHT occasionally be causing a device to be flooded with too many GET requests.
>
> Code snippet as follows
> In org.snmp4j.Snmp.java, org.snmp4j.Snmp.ReportProcessor class
>
> 1294 public void processReport(PduHandle handle, CommandResponderEvent e) {
> 1295 PDU pdu = e.getPDU();
> 1296 logger.debug("Searching pending request with handle" + handle);
> ..
> ..
> ...
> ...
> ...
> ..
> 1313 boolean resend = false;
> 1314 if (request.requestStatus < request.maxRequestStatus) {
> 1315 switch (request.requestStatus) {
> 1316 case 0:
> 1317 if (SnmpConstants.usmStatsUnknownEngineIDs.equals(firstOID)) {
> 1318        resend = true;
> 1319 }
> 1320 else if (SnmpConstants.usmStatsNotInTimeWindows.equals(firstOID)) {
> 1321        request.requestStatus++;
> 1322        resend = true;
> 1323 }
> 1324 break;
> 1325 case 1:
> 1326 if (SnmpConstants.usmStatsNotInTimeWindows.equals(firstOID)) {
> 1327        resend = true;
> 1328 }
> 1329 break;
> 1330 }
> 1331 }
> 1332 // if legal report PDU received, then resend request
> 1333 if (resend) {
> 1334 logger.debug("Send new request after report.");
> 1335 request.requestStatus++;
> 1336 try {
> 1337        // We need no callback here because we already have an equivalent
> 1338        // handle registered.
> 1339 PduHandle resentHandle =
> 1340 sendMessage(request.pdu, request.target, e.getTransportMapping(),
> 1341 null);
> 1342 // make sure reference to handle is hold until request is finished,
> 1343 // because otherwise cache information may get lost (WeakHashMap)
> 1344 request.key = resentHandle;
> 1345 }
> 1346 catch (IOException iox) {
> 1347        logger.error("Failed to send message to " + request.target + ": " +
> 1348        iox.getMessage());
> 1349        return;
> 1350 }
> 1351 }
> 1352 else {
> ..
> ....
>
> The value of the resend boolean is based on the value of requestStatus. But the use of request.requestStatus++ is not threadsafe in this method above - because
> a) the ++ operation is not atomic on a primitive integer and
> b) the value is read on one line and modified on another.
> If multiple threads enter the processReport() method in order to process a report for the same request object, they may end up seeing the cached value of 0 over and over again and resending the request several times until the request.requestStatus object value finally hits a memory barrier and the real value is flushed.
> Also, since the value of requestStatus is read first on line 1314, and then incremented either on line 1335, line 1321, if multiple threads got in here for the same request object, you wouldn't get atomic behavior.
>
> The customer reported a scenario (and provided a wireshark capture to prove it) that showed the exact same SNMP GET request (with the same requestID) going out to a particular snmp device approximately 180 times in ~9 seconds. We are using timeout=3000 and retries = 2 - so we expect to see 3 requests in 9 seconds with the same request ID. So in the span of approximately 20 milliseconds, we see about 60 identical GET requests are being issued with the same requestID. I can provide the pcap file showing this behavior if necessary. Also, these GETs are interspersed with many REPORTs coming in from the device so this situation may have been induced by a bug on the agent - however I'm writing to check whether the code should protect itself from such agent bugs. Perhaps I am wrong and this code is synchronized somewhere at a higher layer so that a REPORT for a request with a particular handle can only be processed on a single thread at a time.
>
> Thanks for any assistance / clarification in advance!
>
> Syed F. Ali
>
>
> _______________________________________________
> SNMP4J mailing list
> SNMP4J at agentpp.org
> http://lists.agentpp.org/mailman/listinfo/snmp4j

-- 
---
AGENT++
Maximilian-Kolbe-Str. 10
73257 Koengen, Germany
https://agentpp.com
Phone: +49 7024 8688230
Fax:   +49 7024 8688231





More information about the SNMP4J mailing list