KAFKA-19070: Adding task number to user-provided client ID to ensure each consumer has a unique client ID to avoid metric registration conflicts. #19341

Status: Open. 3 commits to merge into base: trunk.

kumarpritam863
Contributor

What
This PR updates the behavior of client.id assignment when a user provides a custom value via override configs in Kafka Connect.

Why
Currently, if a user overrides the client.id, all tasks or consumers inherit the same client ID. While this doesn't cause an immediate failure in Kafka, it leads to the following issues:

  • Metrics and logs are merged or overwritten, making observability inaccurate.

  • Quotas and throttling may be applied incorrectly.

  • Debugging becomes harder due to lack of per-task identity.

According to Kafka core behavior, client.id should be unique per client instance for proper tracking and diagnostics.

How
This PR appends the task number to the user-provided client.id to ensure uniqueness across tasks.

For example:

  • User provides: client.id=my-custom-client

  • Final client.id used by task 2: my-custom-client-2

This approach:

  • Respects the user’s original intent in naming

  • Guarantees unique client.id values per task

  • Improves metrics, logging, and debugging consistency
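As a rough illustration of the suffixing described above, here is a minimal sketch. It is not taken from the PR's diff: the class name TaskClientIds and the method withTaskSuffix are hypothetical, and the sketch assumes the override configs arrive as a plain map keyed by client.id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the code changed by this PR.
final class TaskClientIds {
    private TaskClientIds() {}

    // Returns a copy of the override configs. If the user supplied client.id,
    // "-<taskNumber>" is appended to it; otherwise the configs are copied unchanged.
    static Map<String, Object> withTaskSuffix(Map<String, Object> overrides, int taskNumber) {
        Map<String, Object> result = new HashMap<>(overrides);
        Object userClientId = overrides.get("client.id");
        if (userClientId != null) {
            result.put("client.id", userClientId + "-" + taskNumber);
        }
        return result;
    }
}
```

With client.id=my-custom-client and task number 2, the returned map carries my-custom-client-2, matching the example above. The input map is left untouched so the same overrides can be reused for every task.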

…each consumer has a unique client ID to avoid metric registration conflicts.
@github-actions github-actions bot added triage PRs from the community connect small Small PRs labels Apr 1, 2025
@kumarpritam863
Contributor Author

@C0urante can you please review this.

@kumarpritam863
Contributor Author

@dajac can you please review this.

@kumarpritam863
Contributor Author

Hi @mimaison can you please review this. Thanks.


github-actions bot commented Apr 9, 2025

A label of 'needs-attention' was automatically added to this PR in order to raise the
attention of the committers. Once this issue has been triaged, the triage label
should be removed to prevent this automation from happening again.

@mimaison
Member

Thanks for the PR!

Overall, I agree that we should ensure metrics don't collide so operators can rely on them being accurate. On the other hand, the client ID can be used for authorization and quotas, so if we don't use the exact value provided by the user we may break existing systems.

I'm leaning towards considering this a bug and automatically injecting the task ID, as proposed in this PR. @gharris1727 WDYT?

@kumarpritam863
Contributor Author

Hi @mimaison, thanks for reviewing the PR.

There is an alternative way of ensuring a unique client ID for each task, as follows.

The configuration we provide through the POST/PUT API is the connector-level config. However, the task-level config is generated by the Connect framework via the following method:

public abstract List<Map<String, String>> taskConfigs(int maxTasks);

This method returns a list of configs, one per task, which are then used to instantiate the tasks.

Since we want each task to have a unique client-id, we can modify the value at this point (inside taskConfigs(...)) by appending the task number.

Please let me know if this direction sounds good to you, and I’ll proceed with the change.
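The taskConfigs(...) idea described above could look roughly like the sketch below. ExampleConnector is a hypothetical stand-in that only mirrors the method signature; a real connector would extend the Kafka Connect Connector class. The plain client.id key and numbering tasks from 0 are assumptions, and the thread does not establish whether the framework would actually read a client ID from the task config.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a connector; it only mirrors the taskConfigs signature.
final class ExampleConnector {
    private final Map<String, String> connectorConfig;

    ExampleConnector(Map<String, String> connectorConfig) {
        this.connectorConfig = new HashMap<>(connectorConfig);
    }

    // Produces one config map per task, appending the task index to client.id
    // when the connector-level config contains one (key name is an assumption).
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        List<Map<String, String>> configs = new ArrayList<>(maxTasks);
        for (int task = 0; task < maxTasks; task++) {
            final int taskNumber = task; // effectively final copy for the lambda
            Map<String, String> taskConfig = new HashMap<>(connectorConfig);
            taskConfig.computeIfPresent("client.id", (key, value) -> value + "-" + taskNumber);
            configs.add(taskConfig);
        }
        return configs;
    }
}
```

Each task gets its own copy of the connector config, so all other settings are passed through unchanged and only client.id differs between tasks.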

@mimaison
Member

My point is that whatever mechanism you use, this can potentially break some existing deployments. A number of connectors run only a single task (for example, Debezium), and if users have set a specific client ID, this change would modify it.

It may be worth discussing it on the dev list to see what people think. Can you send an email to the dev mailing list explaining the issue and asking whether this change requires a KIP or not? Thanks
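To make the single-task concern concrete, here is a toy model of a lookup keyed on the exact client ID. It is not Kafka's quota implementation: the class ExactMatchQuotas, its methods, and the numbers are all hypothetical, and it assumes the single task is numbered 0.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model for illustration: per-client quotas matched on the exact client ID,
// with a default applied when no entry matches.
final class ExactMatchQuotas {
    private final Map<String, Long> bytesPerSecondByClientId = new HashMap<>();
    private final long defaultBytesPerSecond;

    ExactMatchQuotas(long defaultBytesPerSecond) {
        this.defaultBytesPerSecond = defaultBytesPerSecond;
    }

    void setQuota(String clientId, long bytesPerSecond) {
        bytesPerSecondByClientId.put(clientId, bytesPerSecond);
    }

    // Returns the quota configured for exactly this client ID, else the default.
    long quotaFor(String clientId) {
        return bytesPerSecondByClientId.getOrDefault(clientId, defaultBytesPerSecond);
    }
}
```

In this model, a quota configured for my-custom-client no longer applies once the task reports itself as my-custom-client-0; the lookup falls back to the default, which is the kind of silent behavior change the comment above warns about.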

@mimaison mimaison removed triage PRs from the community needs-attention labels Jun 18, 2025
@kumarpritam863
Contributor Author

Hi @mimaison,
Thanks for the input — I completely agree it's worth discussing this carefully.

That said, I'm a bit unsure how this would break existing deployments, since client.id is primarily used for logging, metrics, and monitoring. It doesn’t affect Kafka's core delivery semantics like group.id or partition assignment. From what I understand, the only change here would be in how the client.id appears in logs and metrics.

Also, similar changes have been made in the past — for example, we updated the client ID format for consumers of the management topic from --statuses to -statuses, and a few versions ago we modified the default client ID of internal sink consumers.

That said, I agree it's best to bring this to the dev mailing list. I'll send an email shortly summarizing the context and asking whether this requires a KIP.

Thanks again!
