Skip to content

Conversation

@michaeljmarshall
Copy link
Member

@michaeljmarshall michaeljmarshall commented Jul 28, 2023

Fixes: #20862

Motivation

PIP for improving connector configuration security. See content for details.

Modifications

Add the design doc for PIP 289.

Current implementation PRs:

Documentation

  • doc

@michaeljmarshall michaeljmarshall added doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. type/PIP labels Jul 28, 2023
@michaeljmarshall michaeljmarshall added this to the 3.2.0 milestone Jul 28, 2023
@michaeljmarshall michaeljmarshall self-assigned this Jul 28, 2023
@github-actions github-actions bot added doc-label-missing and removed doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. labels Jul 28, 2023
@apache apache deleted a comment from github-actions bot Jul 28, 2023
@michaeljmarshall michaeljmarshall added doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. and removed doc-label-missing labels Jul 28, 2023
@Technoboy- Technoboy- modified the milestones: 3.2.0, 3.1.0 Jul 31, 2023
@shibd
Copy link
Member

shibd commented Aug 5, 2023

On the three problems you mentioned:

  • Elasticsearch Config(Nested configurations): We can improve IOConfigUtils to support it. Right?
  • RabbitMQ Config(Not using IOConfigUtils to load config): We can find and fix connectors for these same problems. Right?
  • Kafka Config(Use kafka properties map):
    • Perhaps we should refine these keys to create configurations separately
    • Or we can enhance annotations to support set sensitive key names, such as:
    @FieldDoc(
            defaultValue = "",
            help = "Config properties to pass to the kafka connector."
            sensitiveNames="password-1, password-2"
    )
    private Map<String, String> kafkaConnectorConfigProperties;

My point is to avoid introducing complicated configurations for users to understand.

Thanks.

@michaeljmarshall
Copy link
Member Author

Elasticsearch Config(Nested configurations): We can improve IOConfigUtils to support it. Right?

It's possible, but it would likely require a change/expansion to the way the FieldDoc annotation works. However, fixing it for Elasticsearch does not solve the general problem.

RabbitMQ Config(Not using IOConfigUtils to load config): We can find and fix connectors for these same problems. Right?

I already proposed this as part of the PIP here: #20902.

  • Kafka Config(Use kafka properties map):

    * Perhaps we should refine these keys to create configurations separately
    * Or we can enhance annotations to support set sensitive key names, such as:
    
    @FieldDoc(
            defaultValue = "",
            help = "Config properties to pass to the kafka connector."
            sensitiveNames="password-1, password-2"
    )
    private Map<String, String> kafkaConnectorConfigProperties;

The Kafka Config is the real reason for this PIP. We need arbitrary interpolation because the Kafka Config works as a wrapper. For example, it is possible to configure it with any connector that works with Kafka. See https://github.com/datastax/pulsar-3rdparty-connector for many third party connectors that are simply plugged into the Kafka Adapter Connector without any additional work. Since it is a wrapper, it is not possible to annotate the relevant fields with a custom Pulsar annotation (FieldDoc). Instead, I propose we find a more general solution.

@shibd
Copy link
Member

shibd commented Aug 11, 2023

Hi, @michaeljmarshall. Thanks for your explanation. I got it.

I think we can avoid adding a new config: interpolate_secrets_into_config_map.

What's the problem with us calling the method by default?

    private static void interpolateSecretsIntoConfigs(SecretsProvider secretsProvider, Map<String, Object> configs)

Only when the user uses placeholders will it merge secrets to the config map, right?

      "username": "${username}",
      "password": "${password}",

This way we just need to explain to the user that we provide a new more general way to set secrets.

@michaeljmarshall
Copy link
Member Author

What's the problem with us calling the method by default?

Great question, thanks @shibd. I wrote this text in the PIP explaining why I didn't want to enable it by default:

The primary security consideration is whether there is any risk in giving users a way to interpolate environment variables into their connector. Note that this kind of feature led to Log4Shell. Unlike Log4Shell, the risk is negligible when running in a containerized environment, like Kubernetes. Further, this feature will be disabled by default, so users can evaluate the security risks on their own.

In taking a closer look, the proposed change will only affect k8s based deployments, which probably means we can enable it by default. However, the only remaining risk could be to users that have created their own extensions and are using these classes in unknown ways. Given your comments, I think we should remove the configuration option and always enable it. Let me know what you think, thanks.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

+1 (binding)

@michaeljmarshall
Copy link
Member Author

@shibd - I updated the text of the PIP. PTAL

Copy link
Member

@shibd shibd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@michaeljmarshall michaeljmarshall merged commit d014713 into apache:master Aug 16, 2023
@michaeljmarshall michaeljmarshall deleted the pip-289-improve-connector-secret-injection branch August 16, 2023 14:20
michaeljmarshall added a commit that referenced this pull request Aug 16, 2023
PIP: #20903
Relates to: #20862 

### Motivation

The primary motivation is to make it possible to configure Pulsar Connectors in a secure, non-plaintext way. See the PIP for background and relevant details. The new interpolation feature only applies when deploying with functions to Kubernetes.

### Modifications

* Add `SecretsProvider#interpolateSecretForValue` method with a default that maintains the current behavior.
* Override `interpolateSecretForValue` in the `EnvironmentBasedSecretsProvider` so that configuration values formatted as `${my-env-var}` will be replaced with the result of `System.getEnv("my-env-var")` if the result is not `null`.
* Implement a recursive string interpolation method that will replace any configuration value that the `interpolateSecretForValue` implementation determines ought to be replaced.

### Verifying this change

Tests are added/modified.

### Documentation

- [x] `doc-required`

### Matching PR in forked repository

PR in forked repository: michaeljmarshall#55
michaeljmarshall added a commit to datastax/pulsar that referenced this pull request Aug 16, 2023
PIP: apache#20903
Relates to: apache#20862

The primary motivation is to make it possible to configure Pulsar Connectors in a secure, non-plaintext way. See the PIP for background and relevant details. The new interpolation feature only applies when deploying with functions to Kubernetes.

* Add `SecretsProvider#interpolateSecretForValue` method with a default that maintains the current behavior.
* Override `interpolateSecretForValue` in the `EnvironmentBasedSecretsProvider` so that configuration values formatted as `${my-env-var}` will be replaced with the result of `System.getEnv("my-env-var")` if the result is not `null`.
* Implement a recursive string interpolation method that will replace any configuration value that the `interpolateSecretForValue` implementation determines ought to be replaced.

Tests are added/modified.

- [x] `doc-required`

PR in forked repository: michaeljmarshall#55

(cherry picked from commit bfde0de)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. type/PIP

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Generic Secret Injection for Connector Configuration

4 participants