
container restarts on upstream connection resets #152


Closed
mvording opened this issue Jul 17, 2023 · 2 comments

mvording commented Jul 17, 2023

Describe the bug
During load testing, the upstream S3 gateway system started returning connection resets due to rate limiting.

This error doesn't seem to be handled gracefully by the gateway's nginx JavaScript, which likely caused the nginx process to exit and, as a result, the container to restart.

To Reproduce
Steps to reproduce the behavior:

  1. Start container
  2. Configure against a S3 compatible backend
  3. S3 backend returns connection reset
  4. javascript error
  5. nginx process exit

Expected behavior
When an HTTP request made by the S3 gateway's server-side JavaScript fails, the nginx process should not exit.

Your environment

  • Version of the repo: latest nginx-s3-gateway image
  • Version of the container used (if downloaded from Docker Hub or GitHub): nginx-s3-gateway:latest
  • S3 backend implementation you are using: internal S3-compatible backend
  • How you are deploying (Docker/stand-alone, etc.): Kubernetes
  • NGINX type (OSS/Plus): OSS
  • Authentication method (IAM, IAM with Fargate, IAM with K8S, AWS Credentials, etc.): AWS key/secret credential auth

Additional context
Relevant log excerpt:

1.2.3.4 - - [17/Jul/2023:15:15:55 +0000] "GET /caching-test/something.mp4 HTTP/1.1" 200 18874715 "-" "Apache-HttpClient/4.5.13 (Java/11.0.15)" "100.100.100.200"
2023/07/17 15:16:01 [info] 79#79: *867 client prematurely closed connection (104: Connection reset by peer), client: 1.2.5.6, server: , request: "GET /caching-test/something.mp4 HTTP/1.1", host: "somehost.com"
1.2.5.6 - - [17/Jul/2023:15:16:01 +0000] "GET /caching-test/something.mp4 HTTP/1.1" 200 11260809 "-" "Apache-HttpClient/4.5.13 (Java/11.0.15)" "100.100.100.201"
2023/07/17 15:16:01 [notice] 79#79: exiting
2023/07/17 15:16:01 [notice] 79#79: exit
2023/07/17 15:16:01 [notice] 1#1: signal 17 (SIGCHLD) received from 79
2023/07/17 15:16:01 [notice] 1#1: worker process 79 exited with code 0
2023/07/17 15:16:01 [notice] 1#1: exit

@mvording mvording changed the title container restart on upstream connection resets container restarts on upstream connection resets Jul 17, 2023
dekobon (Collaborator) commented Jul 18, 2023

To make sure I understand this report, I'd like to restate it: when NGINX is running with the S3 Gateway configuration, upstream connections to S3 that are reset on the S3 side are causing nginx to shut down.
Is that a correct assessment?
I also assume, based on the log messages above, that there is no core dump.

Are you able to consistently reproduce the issue?
If so, can you raise the error log verbosity to debug and post the output?
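For reference, raising the verbosity is a one-line change in nginx.conf; a sketch, assuming the container logs to stderr as the official nginx images do (the debug level only produces output if the running binary was built with --with-debug):

```nginx
# In the main (top-level) context of nginx.conf:
# send debug-level messages to stderr so they show up in container logs.
error_log /dev/stderr debug;
```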

For our reference, at the time this bug was filed, the latest container image was:

ghcr.io/nginxinc/nginx-s3-gateway/nginx-oss-s3-gateway:latest-20230703

mvording (Author) commented

I tried running a load test against an AWS S3 bucket backend and hit the same issue.
Debug logging didn't reveal anything useful.

It turns out the issue is the /health endpoint: used as a liveness check, it fails under load, which kills the pod and produces the "graceful shutdown" messages in the logs.

When I removed the liveness check (keeping the /health readiness check), the load tests completed without the pods shutting down.
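For anyone hitting the same symptom, the fix amounts to dropping (or relaxing) the livenessProbe while keeping the readinessProbe, so that under load the pod is pulled from Service endpoints rather than restarted. A sketch of the container spec, assuming a typical Deployment (the container name, port, and thresholds are illustrative, not from the gateway's manifests):

```yaml
containers:
  - name: nginx-s3-gateway
    # Readiness: under load, failing this probe just removes the pod
    # from the Service endpoints until it recovers.
    readinessProbe:
      httpGet:
        path: /health
        port: 80
      periodSeconds: 5
      failureThreshold: 3
    # If a liveness probe is still wanted, relax it so transient load
    # spikes don't kill the pod:
    # livenessProbe:
    #   httpGet:
    #     path: /health
    #     port: 80
    #   periodSeconds: 30
    #   failureThreshold: 6
```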
