Skip to content

Conversation

HaavardRei
Copy link
Contributor

Check whether the connection response parameters both with and without ECRED are within the valid ranges from the Bluetooth Core Specification (part 3.A.4 v6.0).

if (result == BT_L2CAP_LE_SUCCESS) {
if (!L2CAP_LE_CID_IS_DYN(dcid)) {
LOG_ERR("dcid 0x%04x is not dynamic", dcid);
result = BT_L2CAP_LE_ERR_UNACCEPT_PARAMS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretending that the remote peer answered BT_L2CAP_LE_ERR_UNACCEPT_PARAMS may not be optimal.

I'm tempted to say that it would be better to disconnect the ACL connection since there has been a protocol error on the L2CAP signalling channel.

If we don't disconnect, then the peer may believe the L2CAP connection is successful, while our side has closed this channel. But you can argue that the peer invoked undefined behavior so we can leave it hanging.

But telling our application or upper layers (ecred_cb->ecred_conn_rsp) that the peer responded with something different than it actually did is something we should do very carefully. It can make debugging harder.

Even if we didn't propagate result anywhere, clearly this is a just a clever trick to reuse existing code. I think we should avoid clever code if possible. I think it may be a better engineering decision to add new values for result. It may infringe on reserved values in the spec, but I would argue that that is a lesser evil. Or we could make result be a 32-bit integer instead, giving us room for custom codes. But both of these options may be API breaking to a certain extent. I would like some discussion on this.

Since the correct way to handle this case is not obvious, I would like a comment that explains the choice we make here and why it's fine.

Also, if we are to pretend the peer responded with something, then BT_L2CAP_LE_ERR_INVALID_PARAMS would be slightly more appropriate.

Change requested: Please at least use BT_L2CAP_LE_ERR_INVALID_PARAMS and write a comment documenting the engineering decisions. (Alternatively let's continue this discussion.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points!
What if we instead, if any of the checks fail run bt_l2cap_chan_remove(), bt_l2cap_chan_del() and bt_l2cap_chan_disconnect(). Should we/do we need to run l2cap_remove_ident and cancel chan->rtx_work as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to instead return early/do its own disconnecting, which seems fine according to Core 1.E.2.7 (What to do regarding invalid behavior).

Although Core 3.A.4.6 says

Terminating an L2CAP channel requires that an L2CAP_DISCONNECTION_REQ packet be sent and acknowledged by an L2CAP_DISCONNECTION_RSP packet

I'm not quite sure which has precedence (would think the one regarding invalid behavior)?

Copy link
Contributor

@alwa-nordic alwa-nordic Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core 1.E.2.7 allows us to clamp values when that is safe to do:

If a value is out of range, using the nearest permitted value

I think we should clamp MTU, MPS. Then it's up to the peer to disconnect the channel if we send it an unexpected PDU because of that or if the PDU is bigger than it's willing to accept. This is very safe and tosses the hot potato of disconnecting the channel to the peer.

I think we should accept zero credits in a ECRED connection request even if it breaks spec. It is a well defined state that can be reached later in the connection's life and our app-facing API does not differentiate between ECRED and LECRED, so apps have to tolerate zero credits on connection.

In case of a invalid CID, things get more complicated.

I think the easiest safe thing to do is to disconnect the whole ACL connection.

Alternatively we recover the connection to a good state by sending a disconnect request for the newly created channel. If the remote CID is for a fixed channel, there's no state and we are good. If the remote CID is dynamic but aliases an existing channel, then we should disconnect both channels.

Alternatively to disconnecting both channels, we can give all credits (including the credits given in the connect response) to the existing channel. This is because it's much better for UX to have too many credits rather than leaking them. If credits are leaked, a connection can hang. If we have too many credits the peer must close the connection and we fail fast to a good state.

We can ignore the PDU as like explicitly allowed by Core 1.E.2.7, but I think that can be bad UX?

We can close the channel locally and forget the local CID, like the code does as it is. But the network will not have had a chance to recover to a good state, and if re-use the local CID then we are contributing to the confusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline, changed the pattern to:
Receive request with invalid parameters: Send response with corresponding error code as result parameter
Receive response with invalid parameters: Disconnect the ACL connection

@zephyrbot zephyrbot added area: Bluetooth Host Bluetooth Host (excluding BR/EDR) area: Bluetooth labels Jul 17, 2025
@HaavardRei HaavardRei force-pushed the conn_rsp_validation branch 2 times, most recently from 86ec0c2 to e8881fa Compare July 21, 2025 06:34
@HaavardRei HaavardRei marked this pull request as ready for review July 21, 2025 06:38
@HaavardRei HaavardRei force-pushed the conn_rsp_validation branch 2 times, most recently from 5a458e5 to 0f64b95 Compare July 22, 2025 07:17
PavelVPV
PavelVPV previously approved these changes Jul 24, 2025
@HaavardRei HaavardRei force-pushed the conn_rsp_validation branch 2 times, most recently from 972f427 to 2037238 Compare July 28, 2025 12:59
Thalley
Thalley previously approved these changes Jul 28, 2025
PavelVPV
PavelVPV previously approved these changes Jul 29, 2025
@HaavardRei HaavardRei requested a review from alwa-nordic July 29, 2025 06:45
@HaavardRei HaavardRei dismissed stale reviews from PavelVPV and Thalley via 53a6fc6 August 1, 2025 12:18
@HaavardRei HaavardRei force-pushed the conn_rsp_validation branch from 2037238 to 53a6fc6 Compare August 1, 2025 12:18
Check whether the connection response parameters both with and without
ECRED are within the valid ranges from the Bluetooth Core Specification
(part 3.A.4 v6.0). Changes validation checks in requests to match the
same pattern.

Signed-off-by: Håvard Reierstad <[email protected]>
The many_conn l2cap test used a MTU lower than the minimum permitted
value. This commit bumps it to the minimum (23).

Signed-off-by: Håvard Reierstad <[email protected]>
@HaavardRei HaavardRei force-pushed the conn_rsp_validation branch from 53a6fc6 to 899c0b4 Compare August 1, 2025 12:56
Copy link

sonarqubecloud bot commented Aug 1, 2025

@HaavardRei HaavardRei requested review from PavelVPV and Thalley August 4, 2025 05:39
@fabiobaltieri fabiobaltieri merged commit b799d18 into zephyrproject-rtos:main Aug 4, 2025
28 checks passed
@HaavardRei HaavardRei deleted the conn_rsp_validation branch August 4, 2025 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Bluetooth Host Bluetooth Host (excluding BR/EDR) area: Bluetooth
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants