Skip to content

Conversation

@Yakiv-Huryk
Copy link
Contributor

What I did

In the 202511, the tunnel ECN mode for Mellanox platforms has changed from standard to copy_from_outer.
This commit adds a migration that aligns APPL_DB to this change.

How I did it

Added a new migration for the Mellanox platform.

How to verify it

New unit test, manual tests.

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@developfast
Copy link
Contributor

@Yakiv-Huryk please make sure STATE and ASIC db are also aligned along with APPL db

if k in keys_to_migrate:
log.log_info(f"Migrating {k}, ecn mode: {v.get('ecn_mode')} -> copy_from_outer")
v['ecn_mode'] = 'copy_from_outer'
self.appDB.set_entry(table_name, k, v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting the entry in the app db is enough? Will asic catch this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, setting it into APPL_DB is enough, it will be configured in the OrchDaemon::warmRestoreAndSyncUp()

Also, ASIC db is flushed, so it's not relevant:

sonic-db-cli ASIC_DB FLUSHDB > /dev/null

Regarding STATE db, I'm not sure I understand why it's relevant for this fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yakiv-Huryk To clarify, the ask here is that values across STATE, ASIC and APPL DB should be synced before and after warm-boot i.e. here after warm-boot ecn_mode is changing to standard after warm-boot, therefore after a warm-boot you are required to ensure that the ecn_mode value is standard across all 3 of these DBs. We have a db consistency check that we run after warm-boot that will fail if these DBs are not synced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@developfast, I've added a migration for the STATE DB. Regarding the ASIC DB, since it's flushed before migration (and will be filled after), there is nothing to migrate. Please correct me if I'm wrong.

@kperumalbfn kperumalbfn requested a review from vaibhavhd December 2, 2025 19:00
if k in keys_to_migrate:
log.log_info(f"Migrating {k}, ecn mode: {v.get('ecn_mode')} -> copy_from_outer")
v['ecn_mode'] = 'copy_from_outer'
self.appDB.set_entry(table_name, k, v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this trigger set attribute SAI API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't trigger the set attribute API, it changes the parameter of a tunnel that will be created during warm boot restore:
|c|SAI_OBJECT_TYPE_TUNNEL:oid:0x2a000000000b61| SAI_TUNNEL_ATTR_TYPE=SAI_TUNNEL_TYPE_IPINIP|SAI_TUNNEL_ATTR_OVERLAY_INTERFACE=oid:0x6000000000b60| SAI_TUNNEL_ATTR_UNDERLAY_INTERFACE=oid:0x600000000000a| SAI_TUNNEL_ATTR_DECAP_ECN_MODE=SAI_TUNNEL_DECAP_ECN_MODE_COPY_FROM_OUTER| SAI_TUNNEL_ATTR_DECAP_TTL_MODE=SAI_TUNNEL_TTL_MODE_PIPE_MODEL| SAI_TUNNEL_ATTR_DECAP_DSCP_MODE=SAI_TUNNEL_DSCP_MODE_UNIFORM_MODEL

The set attribute flow will happen as a part of ipinip.json processing (in apply_ipinip_subset()). However, by that time, the tunnel will already be created with a new ecn mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kperumalbfn, kind reminder to check this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Yakiv-Huryk, makes sense with the above sairedis log.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will happen if bulk routine is used for the tunnel obect?

Copy link
Contributor Author

@Yakiv-Huryk Yakiv-Huryk Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we update the APPL DB config, I don't see how bulk would cause any difference. Just to remind, the ASIC DB is not migrated, we flush it before the warmboot:

sonic-db-cli ASIC_DB FLUSHDB > /dev/null

@vmittal-msft
Copy link
Contributor

@Yakiv-Huryk please help raise pr for 202511

@Yakiv-Huryk Yakiv-Huryk force-pushed the mlnx-tunnel-ecn-mode-202511 branch from e48554e to e9f569a Compare December 5, 2025 17:59
@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@Yakiv-Huryk Yakiv-Huryk force-pushed the mlnx-tunnel-ecn-mode-202511 branch from e9f569a to 48659c5 Compare December 5, 2025 18:06
@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@Yakiv-Huryk Yakiv-Huryk force-pushed the mlnx-tunnel-ecn-mode-202511 branch from 48659c5 to 2d4e5b6 Compare December 8, 2025 12:57
@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

In the 202511, the tunnel ecn mode for mlnx platforms has changed from
standard to copy_from_outer. This commit adds a migration that aligns
APPL_DB to this change.

Signed-off-by: Yakiv Huryk <[email protected]>
@Yakiv-Huryk Yakiv-Huryk force-pushed the mlnx-tunnel-ecn-mode-202511 branch from 2d4e5b6 to f18f344 Compare December 8, 2025 14:32
@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dprital
Copy link
Collaborator

dprital commented Dec 11, 2025

@kperumalbfn , Can you please approve ?

@developfast
Copy link
Contributor

developfast commented Dec 11, 2025

@Yakiv-Huryk bump on my earlier request to check that STATE, ASIC and APPL db all show ecn mode as copy_from_outer after warm-boot with your changes. This PR cannot be merged if there is a mismatch.

@Yakiv-Huryk
Copy link
Contributor Author

@Yakiv-Huryk bump on my earlier request to check that STATE, ASIC and APPL db all show ecn mode as copy_from_outer after warm-boot with your changes. This PR cannot be merged if there is a mismatch.

@developfast, the APPL DB and STATE DB (as requested) are migrated. The ASIC DB is flushed before the warmboot, and then a new tunnel is created based on the migrated APPL_DB. Could you please be more specific about what is missing? I'm not sure I understand the request.

@developfast
Copy link
Contributor

developfast commented Dec 12, 2025

@Yakiv-Huryk I just need you to confirm the values in the tunnel_decap_table entry for ecn_mode has indeed changed to copy_from_outer after warm-boot to 202511 (maybe wait 1 min after warm-boot). Need the output for the following redis entries:

1. redis-cli -n 0 hgetall 'TUNNEL_DECAP_TABLE:IPINIP_TUNNEL'
2. redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_V6_TUNNEL'
3. redis-cli -n 1 KEYS "*TUNNEL*" <- this will output what to look for in (4)
4. redis-cli -n 1 hgetall "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a000000000bfe" <- not exactly this but will have a similar prefix like ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid with a different OID, you'll get this from step 3
5. redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_V6_TUNNEL'
6. redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_TUNNEL'

@Yakiv-Huryk
Copy link
Contributor Author

@Yakiv-Huryk I just need you to confirm the values in the tunnel_decap_table entry for ecn_mode has indeed changed to copy_from_outer after warm-boot to 202511 (maybe wait 1 min after warm-boot). Need the output for the following redis entries:

1. redis-cli -n 0 hgetall 'TUNNEL_DECAP_TABLE:IPINIP_TUNNEL'
2. redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_V6_TUNNEL'
3. redis-cli -n 1 KEYS "*TUNNEL*" <- this will output what to look for in (4)
4. redis-cli -n 1 hgetall "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a000000000bfe" <- not exactly this but will have a similar prefix like ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid with a different OID, you'll get this from step 3
5. redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_V6_TUNNEL'
6. redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_TUNNEL'

Old image:

// APPL_DB
$ redis-cli -n 0 hgetall 'TUNNEL_DECAP_TABLE:IPINIP_TUNNEL'
 1) "decap_dscp_to_tc_map"
 2) "AZURE"
 3) "dscp_mode"
 4) "pipe"
 5) "ecn_mode"
 6) "standard"
 7) "ttl_mode"
 8) "pipe"
 9) "tunnel_type"
10) "IPINIP"

$ redis-cli -n 0 hgetall "TUNNEL_DECAP_TABLE:IPINIP_V6_TUNNEL"
 1) "decap_dscp_to_tc_map"
 2) "AZURE"
 3) "dscp_mode"
 4) "pipe"
 5) "ecn_mode"
 6) "standard"
 7) "ttl_mode"
 8) "pipe"
 9) "tunnel_type"
10) "IPINIP"

// ASIC_DB
$ redis-cli -n 1 KEYS "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:*"
1) "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a0000000005b1"
2) "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a0000000005b3"

$ redis-cli -n 1 hgetall "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a0000000005b1"
 1) "SAI_TUNNEL_ATTR_TYPE"
 2) "SAI_TUNNEL_TYPE_IPINIP"
 3) "SAI_TUNNEL_ATTR_OVERLAY_INTERFACE"
 4) "oid:0x60000000005b0"
 5) "SAI_TUNNEL_ATTR_UNDERLAY_INTERFACE"
 6) "oid:0x600000000000a"
 7) "SAI_TUNNEL_ATTR_DECAP_ECN_MODE"
 8) "SAI_TUNNEL_DECAP_ECN_MODE_STANDARD"
 9) "SAI_TUNNEL_ATTR_DECAP_TTL_MODE"
10) "SAI_TUNNEL_TTL_MODE_PIPE_MODEL"
11) "SAI_TUNNEL_ATTR_DECAP_DSCP_MODE"
12) "SAI_TUNNEL_DSCP_MODE_PIPE_MODEL"
13) "SAI_TUNNEL_ATTR_DECAP_QOS_DSCP_TO_TC_MAP"
14) "oid:0x140000000005a8"

$ redis-cli -n 1 hgetall "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a0000000005b3"
 1) "SAI_TUNNEL_ATTR_TYPE"
 2) "SAI_TUNNEL_TYPE_IPINIP"
 3) "SAI_TUNNEL_ATTR_OVERLAY_INTERFACE"
 4) "oid:0x60000000005b2"
 5) "SAI_TUNNEL_ATTR_UNDERLAY_INTERFACE"
 6) "oid:0x600000000000a"
 7) "SAI_TUNNEL_ATTR_DECAP_ECN_MODE"
 8) "SAI_TUNNEL_DECAP_ECN_MODE_STANDARD"
 9) "SAI_TUNNEL_ATTR_DECAP_TTL_MODE"
10) "SAI_TUNNEL_TTL_MODE_PIPE_MODEL"
11) "SAI_TUNNEL_ATTR_DECAP_DSCP_MODE"
12) "SAI_TUNNEL_DSCP_MODE_PIPE_MODEL"
13) "SAI_TUNNEL_ATTR_DECAP_QOS_DSCP_TO_TC_MAP"
14) "oid:0x140000000005a8"

// STATE_DB
$ redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_V6_TUNNEL'
1) "tunnel_type"
2) "IPINIP"
3) "dscp_mode"
4) "pipe"
5) "ecn_mode"
6) "standard"
7) "ttl_mode"
8) "pipe"

$ redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_TUNNEL'
1) "tunnel_type"
2) "IPINIP"
3) "dscp_mode"
4) "pipe"
5) "ecn_mode"
6) "standard"
7) "ttl_mode"
8) "pipe"

After warmboot:

// APPL_DB
$ redis-cli -n 0 hgetall 'TUNNEL_DECAP_TABLE:IPINIP_TUNNEL'
 1) "decap_dscp_to_tc_map"
 2) "AZURE"
 3) "dscp_mode"
 4) "pipe"
 5) "ecn_mode"
 6) "copy_from_outer"
 7) "ttl_mode"
 8) "pipe"
 9) "tunnel_type"
10) "IPINIP"

$ redis-cli -n 0 hgetall "TUNNEL_DECAP_TABLE:IPINIP_V6_TUNNEL"
 1) "decap_dscp_to_tc_map"
 2) "AZURE"
 3) "dscp_mode"
 4) "pipe"
 5) "ecn_mode"
 6) "copy_from_outer"
 7) "ttl_mode"
 8) "pipe"
 9) "tunnel_type"
10) "IPINIP"

// ASIC_DB
$ redis-cli -n 1 KEYS "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:*"
1) "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a00000000060f"
2) "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a000000000611"

$ redis-cli -n 1 hgetall "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a00000000060f"
 1) "SAI_TUNNEL_ATTR_TYPE"
 2) "SAI_TUNNEL_TYPE_IPINIP"
 3) "SAI_TUNNEL_ATTR_OVERLAY_INTERFACE"
 4) "oid:0x600000000060e"
 5) "SAI_TUNNEL_ATTR_UNDERLAY_INTERFACE"
 6) "oid:0x600000000000e"
 7) "SAI_TUNNEL_ATTR_DECAP_ECN_MODE"
 8) "SAI_TUNNEL_DECAP_ECN_MODE_COPY_FROM_OUTER"
 9) "SAI_TUNNEL_ATTR_DECAP_TTL_MODE"
10) "SAI_TUNNEL_TTL_MODE_PIPE_MODEL"
11) "SAI_TUNNEL_ATTR_DECAP_DSCP_MODE"
12) "SAI_TUNNEL_DSCP_MODE_PIPE_MODEL"
13) "SAI_TUNNEL_ATTR_DECAP_QOS_DSCP_TO_TC_MAP"
14) "oid:0x14000000000606"

$ redis-cli -n 1 hgetall "ASIC_STATE:SAI_OBJECT_TYPE_TUNNEL:oid:0x2a000000000611"
 1) "SAI_TUNNEL_ATTR_TYPE"
 2) "SAI_TUNNEL_TYPE_IPINIP"
 3) "SAI_TUNNEL_ATTR_OVERLAY_INTERFACE"
 4) "oid:0x6000000000610"
 5) "SAI_TUNNEL_ATTR_UNDERLAY_INTERFACE"
 6) "oid:0x600000000000e"
 7) "SAI_TUNNEL_ATTR_DECAP_ECN_MODE"
 8) "SAI_TUNNEL_DECAP_ECN_MODE_COPY_FROM_OUTER"
 9) "SAI_TUNNEL_ATTR_DECAP_TTL_MODE"
10) "SAI_TUNNEL_TTL_MODE_PIPE_MODEL"
11) "SAI_TUNNEL_ATTR_DECAP_DSCP_MODE"
12) "SAI_TUNNEL_DSCP_MODE_PIPE_MODEL"
13) "SAI_TUNNEL_ATTR_DECAP_QOS_DSCP_TO_TC_MAP"
14) "oid:0x14000000000606"

// STATE DB
$ redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_V6_TUNNEL'
1) "tunnel_type"
2) "IPINIP"
3) "dscp_mode"
4) "pipe"
5) "ecn_mode"
6) "copy_from_outer"
7) "ttl_mode"
8) "pipe"

$ redis-cli -n 6 hgetall 'TUNNEL_DECAP_TABLE|IPINIP_TUNNEL'
1) "tunnel_type"
2) "IPINIP"
3) "dscp_mode"
4) "pipe"
5) "ecn_mode"
6) "copy_from_outer"
7) "ttl_mode"
8) "pipe"

Copy link
Contributor

@developfast developfast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for the output and changes @Yakiv-Huryk

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants