chore(x-goog-spanner-request-id): more updates for batch_write + mockserver tests #1375

Open: wants to merge 9 commits into base: main

odeke-em
Contributor

This change plumbs in some x-goog-spanner-request-id updates for batch_write and some tests too.

Updates #1261

@odeke-em odeke-em requested review from a team as code owners May 18, 2025 07:13
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: spanner Issues related to the googleapis/python-spanner API. labels May 18, 2025
@odeke-em
Contributor Author

Kindly help me run the bots on this @olavloite @surbhigarg92 @harshachinta @sakthivelmanii

@odeke-em odeke-em force-pushed the x-goog-request-id-continuation-2-of-3 branch 3 times, most recently from 19e341a to 2cd1502 Compare May 19, 2025 02:35
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels May 19, 2025
@odeke-em odeke-em force-pushed the x-goog-request-id-continuation-2-of-3 branch 3 times, most recently from 8000c2c to 64bd582 Compare May 19, 2025 02:46
@odeke-em odeke-em changed the title chore(x-goog-spanner-request-id): more updates for batch_write chore(x-goog-spanner-request-id): more updates for batch_write + mockserver tests May 19, 2025
@odeke-em odeke-em force-pushed the x-goog-request-id-continuation-2-of-3 branch from 64bd582 to fa77dd1 Compare May 19, 2025 03:19
@olavloite olavloite added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@odeke-em
Contributor Author

Kindly help me re-run those bots @olavloite

@olavloite olavloite added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@odeke-em odeke-em force-pushed the x-goog-request-id-continuation-2-of-3 branch from 3af40e4 to 4cd0c31 Compare May 19, 2025 07:20
@olavloite olavloite added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@olavloite olavloite added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 19, 2025
@odeke-em odeke-em requested a review from olavloite May 19, 2025 18:30
@odeke-em
Contributor Author

@olavloite @surbhigarg92 @sakthivelmanii @harshachinta @rahul2393 kindly help me re-run the bots. Thank you.

@odeke-em odeke-em force-pushed the x-goog-request-id-continuation-2-of-3 branch from 4ac9fef to 422f858 Compare May 19, 2025 18:32
@olavloite olavloite added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 20, 2025
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 20, 2025
@olavloite olavloite added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 20, 2025
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 20, 2025
Comment on lines 279 to 285
allowed_exceptions={
InternalServerError: _check_rst_stream_error,
ServiceUnavailable: _check_unavailable,
},
Contributor

We should not add new retry logic here (or, in general, anywhere else). The general rules for retries are:

  1. For unary RPCs: We rely on Gax to handle all retries. Any RPC that has been configured as retryable at that level is automatically retried by Gax, so we don't need to do anything in the handwritten client library. Currently, the retry configuration specifies that UNAVAILABLE and RESOURCE_EXHAUSTED are retryable error codes, but the client does not need to know this, as it is handled by Gax.
  2. For streaming RPCs: We have disabled retries by Gax by not adding any retryable error codes to the configuration. Retries of streaming RPCs therefore need to be handled manually in the handwritten client. Streaming RPCs should be retried in case of an UNAVAILABLE error, or if the error is one of the specific INTERNAL errors. (Note, however, that it is fair to consider the current retry implementation complete, even if, for example, specific INTERNAL errors are currently not retried. Adding support for request-id is not about fixing potentially missing retry logic.)
  3. For transactions: We retry the entire transaction if it is aborted by Spanner. This type of retry is 'irrelevant' for the request-id. By 'irrelevant' I mean that these retries do not need any special handling and should not lead to the attempt number being increased.
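
To make rules 2 and 3 concrete, here is a minimal sketch, not the library's actual code. The exception classes, `stream_with_retry`, `run_in_transaction`, and the simplified `<request number>.<attempt>` id format are all hypothetical stand-ins, and the assumption that a rerun transaction issues new request numbers is mine. It only illustrates that a manual streaming retry bumps the attempt number, while a transaction rerun starts again at attempt 1.

```python
class Unavailable(Exception):
    """Hypothetical stand-in for an UNAVAILABLE error."""


class Aborted(Exception):
    """Hypothetical stand-in for a transaction aborted by Spanner."""


def stream_with_retry(start_stream, request_nth, max_attempts=3):
    """Manually retry a streaming call, bumping the attempt number each time."""
    for attempt in range(1, max_attempts + 1):
        # Simplified, assumed id format: "<request number>.<attempt>".
        request_id = f"{request_nth}.{attempt}"
        try:
            return list(start_stream(request_id))
        except Unavailable:
            if attempt == max_attempts:
                raise


def run_in_transaction(work, max_retries=3):
    """Rerun the whole transaction on Aborted; no attempt counter is kept here."""
    for _ in range(max_retries):
        try:
            return work()
        except Aborted:
            continue
    raise Aborted("transaction retries exhausted")


# Demo: a stream that fails once with UNAVAILABLE, then succeeds.
seen_stream_ids = []


def flaky_stream(request_id):
    seen_stream_ids.append(request_id)
    if len(seen_stream_ids) == 1:
        raise Unavailable("try again")
    return iter(["row"])


rows = stream_with_retry(flaky_stream, request_nth=7)

# Demo: a transaction aborted once. The rerun sends a new request
# (assumed here to get a new request number) at attempt 1.
seen_txn_ids = []


def transaction_body():
    request_nth = len(seen_txn_ids) + 1
    seen_txn_ids.append(f"{request_nth}.1")
    if request_nth == 1:
        raise Aborted("aborted by Spanner")
    return "committed"


result = run_in_transaction(transaction_body)
```

In this sketch the streaming call records the ids `7.1` and then `7.2`, whereas the transaction rerun records `1.1` and then `2.1`: the attempt component only changes for the manual streaming retry.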


def test_unary_retryable_error(self):
    add_select1_result()
    add_error(SpannerServicer.BatchCreateSessions.__name__, unavailable_status())
Contributor

I changed the error code here from INTERNAL to UNAVAILABLE, because:

  1. INTERNAL was not retried before for unary RPCs, and this change should not add retries for it. The specific INTERNAL errors that are retried are only relevant for streaming RPCs.
  2. This PR seems to try to add additional retry handling for UNAVAILABLE errors. That handling is never used, however, because the RPC itself is already retried by Gax, and that happens at a lower level. As a result, this test seems to work, but the attempt number is not increased during these retries. Because this test initially used INTERNAL, the additional retry handler was invoked, and that did increase the attempt number.
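
The effect described in point 2 can be modelled with a small sketch. This is a simplified illustration, not Gax or the client library: `low_level_retry`, `call_unary`, and the `<request number>.<attempt>` header value are hypothetical. It assumes the request-id header is built once by the caller before a lower-level retry loop takes over, so every resend carries the same attempt number.

```python
class Unavailable(Exception):
    """Hypothetical stand-in for an UNAVAILABLE error."""


def low_level_retry(rpc, metadata, max_attempts=3):
    """Stand-in for a lower-level retry: resends the call with the same metadata."""
    for attempt in range(max_attempts):
        try:
            return rpc(metadata)
        except Unavailable:
            if attempt == max_attempts - 1:
                raise


def call_unary(rpc, request_nth):
    """Build the header once, then hand the call to the lower retry layer."""
    # Assumed, simplified header value: "<request number>.<attempt>".
    metadata = {"x-goog-spanner-request-id": f"{request_nth}.1"}
    return low_level_retry(rpc, metadata)


# Demo: an RPC that fails once with UNAVAILABLE, then succeeds.
sent_headers = []


def flaky_rpc(metadata):
    sent_headers.append(metadata["x-goog-spanner-request-id"])
    if len(sent_headers) == 1:
        raise Unavailable("try again")
    return "ok"


response = call_unary(flaky_rpc, request_nth=3)
```

Here the call succeeds on the second try, but both tries carry the header value `3.1`. A test that only checks for success would pass without noticing that the attempt number never moved.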

@odeke-em odeke-em force-pushed the x-goog-request-id-continuation-2-of-3 branch from c62dab6 to 092efe5 Compare May 21, 2025 19:35
@odeke-em odeke-em force-pushed the x-goog-request-id-continuation-2-of-3 branch from 3da8c25 to 67d0131 Compare May 21, 2025 20:28
@odeke-em
Contributor Author

@olavloite I've added a TODO so that we can first merge the code and then focus fully on figuring out request-id with unary retries. Kindly take a look at the existing code.

Labels
api: spanner Issues related to the googleapis/python-spanner API. size: l Pull request size is large.
4 participants