KAFKA-17019: Producer TimeoutException should include root cause #20159

Open: wants to merge 3 commits into base: trunk

@chickenchickenlove chickenchickenlove commented Jul 12, 2025

Changes

  • Add a new exception class, PotentialCauseException.
  • Every org.apache.kafka.common.errors.TimeoutException thrown in
    KafkaProducer now carries a PotentialCauseException as its root
    cause when no underlying exception was caught.
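
A minimal sketch of how the resulting cause chain might look to a caller. The classes below are stand-ins written for illustration: the real TimeoutException lives in org.apache.kafka.common.errors, and the constructor shape of PotentialCauseException is an assumption, not taken from the patch.

```java
public class CauseChainSketch {

    // Stand-in for org.apache.kafka.common.errors.TimeoutException so the
    // sketch compiles without the Kafka client on the classpath.
    static class TimeoutException extends RuntimeException {
        TimeoutException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical shape of the new class; the patch may define it differently.
    static class PotentialCauseException extends RuntimeException {
        PotentialCauseException(String message) {
            super(message);
        }
    }

    // Walks to the innermost cause, which is what a log reader usually wants.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        TimeoutException timeout = new TimeoutException(
            "Timed out waiting for the send to complete",
            new PotentialCauseException("The broker may be unavailable"));
        // The root cause now names a likely reason instead of being absent.
        System.out.println(rootCause(timeout).getMessage());
    }
}
```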

Description

A TimeoutException can be thrown for many reasons, and it is often
difficult to identify the root cause because so many factors can lead
to one.

For example:

  1. The ProducerClient might be busy, so it may not be able to send the
    request in time. As a result, some batches may expire, leading to a
    TimeoutException.
  2. The broker might be unavailable due to network issues or internal
    failures.
  3. A request may be in flight, and although the broker successfully
    handles and responds to it, the response might arrive slightly late.

As shown above, there are many possible causes. In some cases, no
exception is caught in the catch block, and a TimeoutException is
thrown simply because the elapsed time exceeded the limit. However, the
developer who throws a TimeoutException at a given point in
KafkaProducer likely already knows which specific reasons could cause
it in that context. Therefore, I think it would be helpful to attach a
PotentialCauseException that reflects the likely reason, based on the
developer’s knowledge.
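
To illustrate the elapsed-time case, here is a rough sketch of a wait loop in which nothing is ever caught, so the only information available at the throw site is the reason the caller supplies. The waitUntil helper, its parameters, and both exception classes are hypothetical stand-ins, not code from KafkaProducer.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

public class ElapsedTimeTimeoutSketch {

    // Stand-in for org.apache.kafka.common.errors.TimeoutException.
    static class TimeoutException extends RuntimeException {
        TimeoutException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical shape of the class proposed in this PR.
    static class PotentialCauseException extends RuntimeException {
        PotentialCauseException(String message) {
            super(message);
        }
    }

    // Polls until ready or until the deadline passes. No exception is caught
    // here, so the timeout is detected purely by comparing elapsed time.
    static void waitUntil(BooleanSupplier ready, long timeoutMs, String likelyReason) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!ready.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                // The call site knows its context, so it states its best guess.
                throw new TimeoutException(
                    "Timed out after " + timeoutMs + " ms",
                    new PotentialCauseException(likelyReason));
            }
            LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(1));
        }
    }

    public static void main(String[] args) {
        try {
            waitUntil(() -> false, 20, "The response may have arrived after the deadline");
        } catch (TimeoutException e) {
            System.out.println(e.getMessage() + " / possible cause: " + e.getCause().getMessage());
        }
    }
}
```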

@github-actions github-actions bot added the triage, producer, clients, and small labels Jul 12, 2025

@frankvicky frankvicky left a comment

@chickenchickenlove: Thanks for the patch.

this.await(timeout, unit, null);
}

public void await(long timeout, TimeUnit unit, PotentialCauseException potentialCauseException) {

@frankvicky frankvicky Jul 12, 2025

Could we change PotentialCauseException to Supplier<PotentialCauseException>?
Creating an exception instance is expensive; with lazy creation we would only pay that cost when the timeout actually occurs.
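
A rough sketch of the lazy variant, to show the intent. The awaitOrThrow helper and both exception classes are stand-ins for illustration, not the actual method under review; the null check mirrors the existing overload that passes null.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class LazyCauseSketch {

    // Stand-in for org.apache.kafka.common.errors.TimeoutException.
    static class TimeoutException extends RuntimeException {
        TimeoutException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical shape of the class proposed in this PR.
    static class PotentialCauseException extends RuntimeException {
        PotentialCauseException(String message) {
            super(message);
        }
    }

    // Hypothetical helper: the supplier is invoked only on the timeout path,
    // so the successful path never constructs an exception.
    static void awaitOrThrow(CountDownLatch latch, long timeout, TimeUnit unit,
                             Supplier<PotentialCauseException> potentialCause) {
        boolean completed;
        try {
            completed = latch.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        if (!completed) {
            PotentialCauseException cause =
                potentialCause == null ? null : potentialCause.get();
            throw new TimeoutException(
                "Timeout after waiting for " + unit.toMillis(timeout) + " ms.", cause);
        }
    }

    public static void main(String[] args) {
        CountDownLatch neverCompleted = new CountDownLatch(1);
        try {
            awaitOrThrow(neverCompleted, 20, TimeUnit.MILLISECONDS,
                () -> new PotentialCauseException("The broker may be unavailable"));
        } catch (TimeoutException e) {
            System.out.println(e.getMessage() + " Cause: " + e.getCause().getMessage());
        }
    }
}
```

Call sites would then pass a lambda or constructor reference instead of a pre-built instance.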

@chickenchickenlove chickenchickenlove (Author)

Thanks for the comments, that sounds great!
I have addressed the review feedback.
When you have time, please take another look. 🙇‍♂️

@github-actions github-actions bot removed the triage label Jul 13, 2025