Skip to content

Fix S3RepositoryAnalysisRestIT #126593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

DaveCTurner
Copy link
Contributor

  • Translate a 404 during a multipart copy into a FileNotFoundException

  • Use multiple threads in S3HttpHandler to avoid CopyObject/PutObject deadlock

Closes #126576

- Translate a 404 during a multipart copy into a `FileNotFoundException`

- Use multiple threads in `S3HttpHandler` to avoid
  `CopyObject`/`PutObject` deadlock

Closes elastic#126576
@DaveCTurner DaveCTurner added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v9.1.0 labels Apr 10, 2025
@DaveCTurner DaveCTurner requested a review from bcully April 10, 2025 10:53
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Apr 10, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@DaveCTurner
Copy link
Contributor Author

Actually >non-issue since the bug is not released

@DaveCTurner DaveCTurner force-pushed the 2025/04/10/fix-copy-object-repo-analysis branch from e669e4a to ca89288 Compare April 10, 2025 11:04
Copy link
Contributor

@bcully bcully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this!

Comment on lines +623 to +625
if (e instanceof AmazonServiceException ase && ase.getStatusCode() == RestStatus.NOT_FOUND.getStatus()) {
throw new NoSuchFileException(blobName, null, e.getMessage());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd hit this during my own testing and thought I'd fixed it in copyBlob (so that it would translate for single part as well). I am slapping my head noticing now that executeMultipart is wrapping AmazonClientException in IOException so this doesn't actually kick in on the multipart copy path, but I wonder if it's better to unwrap back in copyBlob, or to hoist the wrapping out of executeMultipart into writeBlob?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with that but it ends up being a bit weird catching a 404 in writeBlob and friends since it doesn't really make sense for a single upload, and only makes sense for multipart uploads if they are aborted concurrently. I'm going to leave it as it is for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'd meant not wrapping AmazonServiceException into IOException in executeMultipart, but rather in writeBlob. That way this 404 handler here could just disappear, since the copyBlob handler would then handle it. But it's a trivial difference really.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I guess you mean that multipart upload (not copy) can also return a 404 (I suppose if the upload itself is cancelled while parts are being uploaded). I don't think we were catching that before, but if that's the issue then I see what you mean (though I don't know if a writeBlob caller would make the distinction).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah maybe we can tidy this up a bit but I'm going to keep this PR focussed on just getting the SDKv2 tests passing again

@DaveCTurner DaveCTurner added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Apr 10, 2025
@elasticsearchmachine elasticsearchmachine merged commit b10b35f into elastic:main Apr 10, 2025
17 checks passed
@DaveCTurner DaveCTurner deleted the 2025/04/10/fix-copy-object-repo-analysis branch April 10, 2025 19:41
bcully pushed a commit to bcully/elasticsearch that referenced this pull request Apr 10, 2025
- Translate a 404 during a multipart copy into a `FileNotFoundException`

- Use multiple threads in `S3HttpHandler` to avoid `CopyObject`/`PutObject` deadlock

Closes elastic#126576

(cherry picked from commit b10b35f)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >non-issue Team:Distributed Coordination Meta label for Distributed Coordination team v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] S3RepositoryAnalysisRestIT testRepositoryAnalysis failing
3 participants