[SNMP4J] max-bindings with big tables

Steffen Brüntjen Steffen.Bruentjen at macmon.eu
Thu Jul 5 17:04:58 CEST 2018


Hi Frank

I believe I found an issue in the TableUtils class. In certain scenarios, the List<TableEvent> returned by getTable(Target target, OID[] columnOIDs, OID lowerBoundIndex, OID upperBoundIndex) contains incomplete and duplicate rows.


Here's an extract of an example List<TableEvent> for a "good" result:

[1.3.6.1.2.1.31.1.1.1.1.278 = VLAN105, [...], 1.3.6.1.2.1.31.1.1.1.18.278 = service]
[1.3.6.1.2.1.31.1.1.1.1.279 = VLAN106, [...], 1.3.6.1.2.1.31.1.1.1.18.279 = reception]
[1.3.6.1.2.1.31.1.1.1.1.283 = VLAN110, [...], 1.3.6.1.2.1.31.1.1.1.18.283 = voice]
[1.3.6.1.2.1.31.1.1.1.1.373 = VLAN200, [...], 1.3.6.1.2.1.31.1.1.1.18.373 = clients]
[1.3.6.1.2.1.31.1.1.1.1.774 = VLAN601, [...], 1.3.6.1.2.1.31.1.1.1.18.774 = VLAN601]
[1.3.6.1.2.1.31.1.1.1.1.783 = VLAN610, [...], 1.3.6.1.2.1.31.1.1.1.18.783 = lab6]


But in some specific circumstances, I get results like these:

[ ... 75 normal rows ... ]
[1.3.6.1.2.1.31.1.1.1.1.278 = VLAN105, [...], 1.3.6.1.2.1.31.1.1.1.18.278 = service] 
[1.3.6.1.2.1.31.1.1.1.1.279 = VLAN106, [...], 1.3.6.1.2.1.31.1.1.1.18.279 = reception]
[null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.283 = 2, 1.3.6.1.2.1.31.1.1.1.15.283 = 0, 1.3.6.1.2.1.31.1.1.1.18.283 = voice]
[null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.373 = 2, 1.3.6.1.2.1.31.1.1.1.15.373 = 0, 1.3.6.1.2.1.31.1.1.1.18.373 = clients]
[null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.774 = 2, 1.3.6.1.2.1.31.1.1.1.15.774 = 0, 1.3.6.1.2.1.31.1.1.1.18.774 = VLAN601]
[null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.783 = 2, 1.3.6.1.2.1.31.1.1.1.15.783 = 0, 1.3.6.1.2.1.31.1.1.1.18.783 = lab6]
[1.3.6.1.2.1.31.1.1.1.1.283 = VLAN110, 1.3.6.1.2.1.31.1.1.1.17.283 = 2, 1.3.6.1.2.1.31.1.1.1.6.283 = 0, 1.3.6.1.2.1.31.1.1.1.10.283 = 0, null, null, null]
[1.3.6.1.2.1.31.1.1.1.1.373 = VLAN200, 1.3.6.1.2.1.31.1.1.1.17.373 = 2, 1.3.6.1.2.1.31.1.1.1.6.373 = 0, 1.3.6.1.2.1.31.1.1.1.10.373 = 0, null, null, null]
[1.3.6.1.2.1.31.1.1.1.1.774 = VLAN601, 1.3.6.1.2.1.31.1.1.1.17.774 = 2, 1.3.6.1.2.1.31.1.1.1.6.774 = 0, 1.3.6.1.2.1.31.1.1.1.10.774 = 0, null, null, null]
[1.3.6.1.2.1.31.1.1.1.1.783 = VLAN610, 1.3.6.1.2.1.31.1.1.1.17.783 = 2, 1.3.6.1.2.1.31.1.1.1.6.783 = 0, 1.3.6.1.2.1.31.1.1.1.10.783 = 0, null, null, null]
[ ... everything normal ... ]


Here we find some rows split into two: one block with the first 4 columns set to null, and another block with the last 3 columns set to null.


Here are the settings that produce the second result:

- max-bindings is set to 4 - TableUtils.setMaxNumColumnsPerPDU(int)
- max-repetitions is set to 30 - TableUtils.setMaxNumRowsPerPDU(int)
- the device returns many rows (around 120)
- the table request contains more columns than max-bindings
- the number of requested columns is not a multiple of max-bindings
- the problem also depends on the MTU size, but that's not important here


This is what happens:

1. TableUtils requests the first 4 columns
2. the device returns 60 variable bindings, that's 15 cells per column
3. TableUtils requests the remaining 3 columns
4. the device returns 60 variable bindings, that's 20 cells per column
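
The per-column arithmetic in these steps can be checked with a few lines of Java. This is only a sketch: it assumes that every response carries exactly 60 variable bindings and that the table has 120 rows, as in the example. The class and method names are made up for illustration and are not part of SNMP4J.

```java
// Illustrative arithmetic only; nothing here calls SNMP4J.
class ChunkDrift {

    /** Rows covered by one response that carries varBindsPerResponse cells. */
    static int rowsPerResponse(int varBindsPerResponse, int columnsInChunk) {
        return varBindsPerResponse / columnsInChunk;
    }

    /** Rows fetched for one column chunk after the given number of rounds. */
    static int rowsAfterRounds(int rounds, int varBindsPerResponse,
                               int columnsInChunk, int totalRows) {
        int rows = rounds * rowsPerResponse(varBindsPerResponse, columnsInChunk);
        return Math.min(totalRows, rows);
    }

    public static void main(String[] args) {
        int varBinds = 60;    // assumption: fixed response size, as in the mail
        int totalRows = 120;  // assumption: table size, as in the mail
        for (int round = 1; round <= 8; round++) {
            int head = rowsAfterRounds(round, varBinds, 4, totalRows);
            int tail = rowsAfterRounds(round, varBinds, 3, totalRows);
            System.out.printf("round %d: 4-column chunk at row %d, "
                    + "3-column chunk at row %d, lead %d%n",
                    round, head, tail, tail - head);
        }
    }
}
```

Under these assumptions the 3-column chunk gains 5 rows on the 4-column chunk with every round, until it reaches the end of the table.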

These steps repeat until all bindings are retrieved. So far, so good. The problem is that every second request (step 3) receives more rows per response, so these requests reach index 283 (as in the example above) earlier than the requests for the first 4 columns.

I did some debugging and I think I found the reason: when the first results for index 283 are received (step 3), TableUtils creates a row for this index. That row is padded with null values for the first 4 columns, so its size equals 7 (and not 3). With size=7, the row is considered finished too soon. TableUtils then prunes these incomplete but supposedly finished rows from rowCache. When TableUtils later receives the other 4 columns for row 283, it creates a new row with the same index.
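
To make the suspected mechanism easier to follow, here is a small stand-alone model of it. It is not the TableUtils source: RowCacheModel, receive and flush are hypothetical names, and the model assumes only what is described above, namely that a row is padded with leading nulls and is treated as finished as soon as its size equals the number of requested columns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a row cache; NOT the SNMP4J implementation.
class RowCacheModel {
    static final int NUM_COLUMNS = 7;

    final Map<Integer, List<String>> rowCache = new LinkedHashMap<>();
    final List<String> emitted = new ArrayList<>();

    /** Stores values for one row index, starting at column firstColumn. */
    void receive(int index, int firstColumn, String... values) {
        List<String> row = rowCache.computeIfAbsent(index, k -> new ArrayList<>());
        while (row.size() < firstColumn) {
            row.add(null); // pad the columns that have not arrived yet
        }
        for (int i = 0; i < values.length; i++) {
            int col = firstColumn + i;
            if (col < row.size()) {
                row.set(col, values[i]);
            } else {
                row.add(values[i]);
            }
        }
        // Size-only completeness check: padding nulls count as cells.
        if (row.size() == NUM_COLUMNS) {
            emitted.add(index + "=" + row);
            rowCache.remove(index);
        }
    }

    /** Emits whatever is left in the cache, padded with trailing nulls. */
    void flush() {
        for (Map.Entry<Integer, List<String>> entry : rowCache.entrySet()) {
            List<String> row = entry.getValue();
            while (row.size() < NUM_COLUMNS) {
                row.add(null);
            }
            emitted.add(entry.getKey() + "=" + row);
        }
        rowCache.clear();
    }

    public static void main(String[] args) {
        RowCacheModel cache = new RowCacheModel();
        cache.receive(283, 4, "2", "0", "voice");        // last 3 columns first
        cache.receive(283, 0, "VLAN110", "2", "0", "0"); // first 4 columns later
        cache.flush();
        cache.emitted.forEach(System.out::println);
    }
}
```

In this model the main method emits two rows for index 283, one with four leading nulls and one with three trailing nulls, which matches the shape of the broken result above. If the first 4 columns arrive before the last 3, the same model emits a single complete row.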


How to fix?

I believe a moderately easy, but not very good, way to fix this is to have the smaller chunk contain the first 3 columns instead of the last 3 columns:

max-bindings = 4
columns: .1, .2, .3, .4, .5, .6, .7
1st packet should contain: .1, .2, and .3
2nd packet should contain: .4, .5, .6, and .7

Number of columns for the first packet is NumColumnsTotal % maxBindings.
Number of columns for the other packets is maxBindings.
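
As a sketch, the proposed split could be computed like this. ColumnChunks and chunkSizes are hypothetical names, not existing SNMP4J API, and the handling of a remainder of zero (no short packet at all) is my assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the proposed split; not SNMP4J API.
class ColumnChunks {

    /** Returns the number of columns per packet, short packet first. */
    static List<Integer> chunkSizes(int numColumnsTotal, int maxBindings) {
        if (numColumnsTotal < 0 || maxBindings <= 0) {
            throw new IllegalArgumentException("invalid column or binding count");
        }
        List<Integer> sizes = new ArrayList<>();
        int remainder = numColumnsTotal % maxBindings;
        if (remainder > 0) {
            sizes.add(remainder); // the short packet goes first
        }
        for (int i = 0; i < numColumnsTotal / maxBindings; i++) {
            sizes.add(maxBindings); // all other packets are full
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(7, 4)); // the example from this mail
    }
}
```

For the example above (7 columns, max-bindings = 4) this yields a first packet of 3 columns followed by one packet of 4 columns.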


Please tell me if you need more information or if my method invocation is wrong.


Best regards
Steffen Brüntjen

