[SNMP4J] max-bindings with big tables

Frank Fock fock at agentpp.com
Wed Jul 18 09:31:00 CEST 2018


Hi Steffen,

The latest SNMP4J-3.0.0-SNAPSHOT contains a getTable method in TableUtils with a new SparseTableMode parameter, which controls the behaviour for tables that do not contain Null (not-accessible) cells (= dense tables).

A solution can be implemented even for the situation where rows appear or disappear during retrieval. In this case, a GET fetches the missing columns. If they cannot be fetched, the row is ignored (because it has disappeared).
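
The repair step described above could be sketched roughly as follows. This is only an illustration and does not use the SNMP4J API: the class name, the `repair` method, and the `fetcher` callback (a stand-in for a single GET on one instance OID) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class RowRepairSketch {
    /**
     * Tries to complete a row of a dense table by fetching every missing (null) cell.
     * `fetcher` is a hypothetical stand-in for an SNMP GET; it returns
     * Optional.empty() when the instance does not exist (any more).
     * Returns the completed row, or Optional.empty() if the row should be ignored.
     */
    static Optional<List<String>> repair(String[] columnOids, String index,
                                         List<String> cells,
                                         Function<String, Optional<String>> fetcher) {
        List<String> result = new ArrayList<>(cells);
        for (int i = 0; i < result.size(); i++) {
            if (result.get(i) == null) {
                // Instance OID = column OID + row index
                Optional<String> value = fetcher.apply(columnOids[i] + "." + index);
                if (!value.isPresent()) {
                    return Optional.empty(); // row disappeared: ignore it
                }
                result.set(i, value.get());
            }
        }
        return Optional.of(result);
    }
}
```

A row whose missing cells can all be fetched is returned complete; a single failed fetch drops the whole row.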

I hope this helps.

Best regards,
Frank


> On 12. Jul 2018, at 08:40, Frank Fock <fock at agentpp.com> wrote:
> 
> Hi Steffen,
> 
> If the agent sends a tooBig error on a GETBULK request, then this is an error in the agent. See RFC 3416, section 4.2.3:
>
>         If the size of the message encapsulating the Response-PDU
>         containing the requested number of variable bindings would be
>         greater than either a local constraint or the maximum message
>         size of the originator, then the response is generated with a
>         lesser number of variable bindings.  This lesser number is the
>         ordered set of variable bindings with some of the variable
>         bindings at the end of the set removed, such that the size of
>         the message encapsulating the Response-PDU is approximately
>         equal to but no greater than either a local constraint or the
>         maximum message size of the originator.  Note that the number
>         of variable bindings removed has no relationship to the values
>         of N, M, or R.
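
The truncation rule quoted above can be illustrated with a minimal sketch. It is not agent code from any real implementation: the class and method names are hypothetical, and the size model (a fixed overhead plus one encoded size per binding) is a simplifying assumption.

```java
import java.util.List;

class BulkTruncationSketch {
    /**
     * Returns how many leading variable bindings fit into one response.
     * Bindings are only ever dropped from the end of the ordered set,
     * so the result is the length of the longest prefix that fits.
     * Assumption: message size = overhead + sum of the binding sizes.
     */
    static int bindingsThatFit(List<Integer> encodedSizes, int overhead, int maxMessageSize) {
        int used = overhead;
        int count = 0;
        for (int size : encodedSizes) {
            if (used + size > maxMessageSize) {
                break; // this binding and all following ones are removed
            }
            used += size;
            count++;
        }
        return count;
    }
}
```

Under this rule the agent answers with fewer bindings instead of returning tooBig.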
> 
> For the issue you reported, there is no general solution, because it interferes with sparse tables.
> A solution would either decrease the performance for sparse tables or filter out sparse rows.
> The latter is not acceptable for intentionally sparse tables.
> For dense tables, filtering could be the best option, although it would hide new rows even though the command generator has already detected them.
> 
> I am about to add an option to getDenseTable that activates filtering of new rows that appear during the table retrieval and are therefore received incompletely. Would that help you?
> 
> Best regards,
> Frank 
> 
>> On 9. Jul 2018, at 19:45, Steffen Brüntjen <Steffen.Bruentjen at macmon.eu> wrote:
>> 
>> Hi Frank
>> 
>> Thank you for having a look at it. I agree, the performance with many bindings is indeed *much* higher and yes, values should be retrieved row-by-row in order to avoid data inconsistencies. But there are also problems with many bindings:
>> 
>> 1. Since the agent cannot decide how many values to send (in contrast to max-repetition-count), the packet size might get too big if you have a table with many (big) columns.
>> 
>> 2. There are agents that get into trouble when many columns are requested. This often results in timeouts (no tooBig error), and then there is no option other than requesting fewer bindings.
>> 
>> Maybe the proposed change is the way to go; it's unobtrusive, but effective (I believe).
>> 
>> Best regards
>> Steffen 
>> 
>> 
>> -----Original Message-----
>> From: Frank Fock [mailto:fock at agentpp.com] 
>> Sent: Friday, 6 July 2018 18:55
>> To: Steffen Brüntjen <Steffen.Bruentjen at macmon.eu>
>> Cc: snmp4j at agentpp.org
>> Subject: Re: [SNMP4J] max-bindings with big tables
>> 
>> Hi Steffen,
>> I will try to reproduce this issue. 
>> Independent of the result, the parameters for TableUtils are not suitable for your setup. The maxNumColumnsPerPDU has to be as large as possible. Otherwise the overall performance will be bad and the likelihood of incomplete table rows increases significantly (through changes in the agent while TableUtils operates).
>> Best regards 
>> Frank
>> 
>>> On 06.07.2018 at 10:20, Steffen Brüntjen <Steffen.Bruentjen at macmon.eu> wrote:
>>> 
>>> Hi!
>>> 
>>> I'm using SNMP4J version 2.6.2.
>>> 
>>> Best regards
>>> Steffen
>>> 
>>> -----Original Message-----
>>> From: Frank Fock [mailto:fock at agentpp.com] 
>>> Sent: Thursday, 5 July 2018 19:37
>>> To: Steffen Brüntjen <Steffen.Bruentjen at macmon.eu>
>>> Cc: snmp4j at agentpp.org
>>> Subject: Re: [SNMP4J] max-bindings with big tables
>>> 
>>> Hi Steffen 
>>> What SNMP4J version are you using?
>>> Best regards 
>>> Frank
>>> 
>>>> On 05.07.2018 at 17:04, Steffen Brüntjen <Steffen.Bruentjen at macmon.eu> wrote:
>>>> 
>>>> Hi Frank
>>>> 
>>>> I believe I found an issue in the TableUtils class. In certain scenarios, the returned List<TableEvent> from getTable(Target target, OID[] columnOIDs, OID lowerBoundIndex, OID upperBoundIndex) will contain incomplete and duplicate rows.
>>>> 
>>>> 
>>>> Here's an extract of an example List<TableEvent> for a "good" result:
>>>> 
>>>> [1.3.6.1.2.1.31.1.1.1.1.278 = VLAN105, [...], 1.3.6.1.2.1.31.1.1.1.18.278 = service]
>>>> [1.3.6.1.2.1.31.1.1.1.1.279 = VLAN106, [...], 1.3.6.1.2.1.31.1.1.1.18.279 = reception]
>>>> [1.3.6.1.2.1.31.1.1.1.1.283 = VLAN110, [...], 1.3.6.1.2.1.31.1.1.1.18.283 = voice]
>>>> [1.3.6.1.2.1.31.1.1.1.1.373 = VLAN200, [...], 1.3.6.1.2.1.31.1.1.1.18.373 = clients]
>>>> [1.3.6.1.2.1.31.1.1.1.1.774 = VLAN601, [...], 1.3.6.1.2.1.31.1.1.1.18.774 = VLAN601]
>>>> [1.3.6.1.2.1.31.1.1.1.1.783 = VLAN610, [...], 1.3.6.1.2.1.31.1.1.1.18.783 = lab6]
>>>> 
>>>> 
>>>> But in some specific circumstances, I get results like these:
>>>> 
>>>> [ ... 75 normal rows ... ]
>>>> [1.3.6.1.2.1.31.1.1.1.1.278 = VLAN105, [...], 1.3.6.1.2.1.31.1.1.1.18.278 = service] 
>>>> [1.3.6.1.2.1.31.1.1.1.1.279 = VLAN106, [...], 1.3.6.1.2.1.31.1.1.1.18.279 = reception]
>>>> [null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.283 = 2, 1.3.6.1.2.1.31.1.1.1.15.283 = 0, 1.3.6.1.2.1.31.1.1.1.18.283 = voice]
>>>> [null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.373 = 2, 1.3.6.1.2.1.31.1.1.1.15.373 = 0, 1.3.6.1.2.1.31.1.1.1.18.373 = clients]
>>>> [null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.774 = 2, 1.3.6.1.2.1.31.1.1.1.15.774 = 0, 1.3.6.1.2.1.31.1.1.1.18.774 = VLAN601]
>>>> [null, null, null, null, 1.3.6.1.2.1.31.1.1.1.14.783 = 2, 1.3.6.1.2.1.31.1.1.1.15.783 = 0, 1.3.6.1.2.1.31.1.1.1.18.783 = lab6]
>>>> [1.3.6.1.2.1.31.1.1.1.1.283 = VLAN110, 1.3.6.1.2.1.31.1.1.1.17.283 = 2, 1.3.6.1.2.1.31.1.1.1.6.283 = 0, 1.3.6.1.2.1.31.1.1.1.10.283 = 0, null, null, null]
>>>> [1.3.6.1.2.1.31.1.1.1.1.373 = VLAN200, 1.3.6.1.2.1.31.1.1.1.17.373 = 2, 1.3.6.1.2.1.31.1.1.1.6.373 = 0, 1.3.6.1.2.1.31.1.1.1.10.373 = 0, null, null, null]
>>>> [1.3.6.1.2.1.31.1.1.1.1.774 = VLAN601, 1.3.6.1.2.1.31.1.1.1.17.774 = 2, 1.3.6.1.2.1.31.1.1.1.6.774 = 0, 1.3.6.1.2.1.31.1.1.1.10.774 = 0, null, null, null]
>>>> [1.3.6.1.2.1.31.1.1.1.1.783 = VLAN610, 1.3.6.1.2.1.31.1.1.1.17.783 = 2, 1.3.6.1.2.1.31.1.1.1.6.783 = 0, 1.3.6.1.2.1.31.1.1.1.10.783 = 0, null, null, null]
>>>> [ ... everything normal ... ]
>>>> 
>>>> 
>>>> Here we find some rows split into two: one block with the first 4 columns set to null, and another block with the last 3 columns set to null.
>>>> 
>>>> 
>>>> Here's the setting which produces the second result:
>>>> 
>>>> - max-bindings is set to 4 - TableUtils.setMaxNumColumnsPerPDU(int)
>>>> - max-repetitions is set to 30 - TableUtils.setMaxNumRowsPerPDU(int)
>>>> - the device returns many rows (like 120)
>>>> - the table request contains more columns than max-bindings
>>>> - the number of requested columns is not a multiple of max-bindings
>>>> - the problem will also depend on MTU size, but that's not important here
>>>> 
>>>> 
>>>> This is what happens:
>>>> 
>>>> 1. TableUtils will request the first 4 columns
>>>> 2. device returns 60 variable bindings, that's 15 cells per column
>>>> 3. TableUtils will request the latter 3 columns
>>>> 4. device returns 60 variable bindings, that's 20 cells per column
>>>> 
>>>> This repeats until all bindings are retrieved. So far, so good. The problem is that every second request (step 3) receives more rows per response, so these requests reach index 283 (as in the example above) earlier. I did some debugging and I think I found the reason: when the first results with index 283 are received (step 3), TableUtils creates a row for this index. That row is padded with null values for the first 4 columns so that its size equals 7 (and not 3). Having size=7, the row is considered finished too soon. TableUtils then prunes these incomplete but finished rows from rowCache. When TableUtils receives the other 4 columns for row 283, it creates a new row with the same index.
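
The arithmetic and the completeness check described above can be sketched as follows. This is not the TableUtils source: all names are hypothetical, and the two `finished...` methods only contrast a purely size-based check with one that also looks for null cells.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class RowCompletenessSketch {
    /** Rows covered by one response that carries the given number of variable bindings. */
    static int rowsPerResponse(int varBindsReturned, int columnsInChunk) {
        return varBindsReturned / columnsInChunk;
    }

    /** Pads a partial row with leading nulls so that it spans all requested columns. */
    static List<String> padLeft(List<String> cells, int totalColumns) {
        List<String> row = new ArrayList<>(
                Collections.nCopies(totalColumns - cells.size(), (String) null));
        row.addAll(cells);
        return row;
    }

    /** Size-based check: a padded row looks finished although cells are missing. */
    static boolean finishedBySize(List<String> row, int totalColumns) {
        return row.size() == totalColumns;
    }

    /** Content-based check: a row is finished only if no cell is null (dense tables only). */
    static boolean finishedByContent(List<String> row, int totalColumns) {
        return row.size() == totalColumns && !row.contains(null);
    }
}
```

With 60 bindings per response, the 4-column request covers 15 rows and the 3-column request covers 20, which is why the second request reaches a given index first. Note that the content-based check would misclassify intentionally sparse rows.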
>>>> 
>>>> 
>>>> How to fix?
>>>> 
>>>> I believe a moderately easy, but not very good, way to fix this is to have the smaller chunk contain the first 3 columns rather than the last 3 columns:
>>>> 
>>>> max-bindings = 4
>>>> columns: .1, .2, .3, .4, .5, .6, .7
>>>> 1. packet should contain: .1, .2, and .3
>>>> 2. packet should contain: .4, .5, .6, and .7
>>>> 
>>>> Number of columns for the first packet is NumColumnsTotal % maxBindings.
>>>> Number of columns for the other packets is maxBindings.
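
The proposed split can be sketched as below. The class and method names are hypothetical and nothing here is taken from TableUtils; the `remainderFirst` flag only contrasts the two orderings discussed above. When the column count is an exact multiple of maxBindings, the sketch emits no remainder packet.

```java
import java.util.ArrayList;
import java.util.List;

class ColumnSplitSketch {
    /**
     * Returns the number of columns per request packet.
     * remainderFirst = true : small packet first (the proposal), e.g. 7 columns -> 3, 4
     * remainderFirst = false: small packet last (the behaviour described in the report), 4, 3
     */
    static List<Integer> chunkSizes(int totalColumns, int maxBindings, boolean remainderFirst) {
        List<Integer> sizes = new ArrayList<>();
        int remainder = totalColumns % maxBindings;
        int fullPackets = totalColumns / maxBindings;
        if (remainderFirst && remainder > 0) {
            sizes.add(remainder);
        }
        for (int i = 0; i < fullPackets; i++) {
            sizes.add(maxBindings);
        }
        if (!remainderFirst && remainder > 0) {
            sizes.add(remainder);
        }
        return sizes;
    }
}
```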
>>>> 
>>>> 
>>>> Please tell me if you need more information or if my method invocation is wrong.
>>>> 
>>>> 
>>>> Best regards
>>>> Steffen Brüntjen
>>>> _______________________________________________
>>>> SNMP4J mailing list
>>>> SNMP4J at agentpp.org
>>>> https://oosnmp.net/mailman/listinfo/snmp4j
>>> 
>> 
> 


