Add API details GEP-3779 - E/W Authorization #3891

LiorLieberman · 2025-06-30T17:09:09Z

Add API details for GEP-3779.

/kind gep

Related #3779

Does this PR introduce a user-facing change?:

NONE

k8s-ci-robot · 2025-06-30T17:09:17Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LiorLieberman
Once this PR has been reviewed and has the lgtm label, please assign thockin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

geps/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

robscott

Thanks @LiorLieberman!

robscott · 2025-07-01T00:31:30Z

geps/gep-3779/index.md

@@ -246,10 +246,261 @@ Cilium has the concept of CiliumIdentity. Pods are assigned identities derived f
 More on https://docs.cilium.io/en/stable/internals/security-identities/ & https://docs.cilium.io/en/stable/security/network/identity/


-
 ## API


I recommend moving this above the state of the world section.

robscott · 2025-07-01T00:31:55Z

geps/gep-3779/index.md

+* **Sources:** The source identities to which the rule applies. A request’s identity must match one of the listed sources. Supported source kinds are:  
+  * **SPIFFE ID**
+  * **Kubernetes ServiceAccount**  
+  * **Kubernetes Namespace**  
+* **Attributes:** Conditions on the target workload, at the time of writing this, only port is supported. If no attributes are specified, the rule applies to all traffic toward the target.


The formatting is not working as intended - see https://deploy-preview-3891--kubernetes-sigs-gateway-api.netlify.app/geps/gep-3779/.

robscott · 2025-07-01T00:32:07Z

geps/gep-3779/index.md

+* A request is allowed if:  
+  * It matches at least one rule in any ALLOW policy targeting the workload **and**  
+  * It is not explicitly denied by any DENY policy.  


Same comment as above re: list formatting

geps/gep-3779/index.md

robscott · 2025-07-01T00:38:21Z

geps/gep-3779/index.md

+    // TCPRules defines the list of matching criteria. A request is considered to
+    // match the policy if it matches any of the rules.
+    // +optional
+    TCPRules []AuthorizationTCPRule `json:"tcpRules,omitempty"`


This is a list of lists of lists, does each layer need to have its own distinct list? The exponential nature of this pattern can get pretty awful. We're currently missing max lengths for each of these, but let's just assume that we say 16 across the board. That means that we could end up needing support 4096 distinct sources in just one policy.

Some ideas to potentially simplify this to 256 by removing one of the layers of lists:

Before

tcpRules: - authorizationSources: - identities: [foo] - namespaces: [bar]

After v1

tcpRules: - authorizationSource: - identity: foo - namespace: bar

After v2

tcpRules: - authorizationSource: identities: [foo] namespaces: [bar]

robscott · 2025-07-01T00:41:53Z

geps/gep-3779/index.md

+
+##### **Experimental Pattern**
+
+To mitigate some of the concerns, `LabelSelector` support in policy attachment is designated as an **experimental pattern**.


For the record, I really hate the combination of label selectors and policy attachment because we'd be combining 2 topics that are already quite complicated. With that said, this has been a widely requested feature, and testing it out in a relatively controlled way seems ~ok to me. We need to be ready to pivot to another approach if this doesn't work and/or ends up being too complicated.

robscott · 2025-07-01T00:45:11Z

geps/gep-3779/index.md

+    // While the exact workload identifier structure is implementation-specific,
+    // implementations are encouraged to follow the convention of
+    // `spiffe://<trust_domain>/ns/<namespace>/sa/<serviceaccount>`
+    // when representing Kubernetes workload identities.


Why would someone use this matcher to refer to Kubernetes identities when the serviceAccounts list is below? Is this intended primarily for cross-cluster use cases?

I think cross-cluster or external workloads are reasonable examples which might apply.

It's niche at the moment, but I have seen users express desire to be able to assert policy at a sub-serviceAccount granularity. This seems to leave the door open for an implementation to support that use case.

robscott · 2025-07-01T00:47:56Z

geps/gep-3779/index.md

+
+    // Namespaces specifies a list of Kubernetes Namespaces that are matched
+    // by this rule. A request originating from any pod within one of these
+    // namespaces will match the rule, regardless of its specific Service Account.


So this is basically just another way of saying "match all ServiceAccounts in these namespaces"? Given the proposed format of ServiceAccounts is ns/name above, would ns/* be a reasonable alternative way to express this instead of having a separate field in the API?

Thats valid. Happy to remove and support ns/* if we are fine with supporting "". I am worried we are opening a door for more use of "" or any regex in value fields here though.

robscott · 2025-07-01T00:51:27Z

geps/gep-3779/index.md

+
+The `targetRef` of the policy specifies the workload(s) to which the policy applies. Two options are available for `targetRef`:
+
+#### **Option 1: Targeting a Service**


I seem to remember "ServiceAccount" being raised as a potential option at a previous KubeCon when this topic came up. I think it ended up being deemed impractical/poor UX, but could be good to cover it here for completeness.

aryan16 · 2025-07-01T16:37:38Z

geps/gep-3779/index.md

+
+    // Action specifies the action to take when a request matches the rules.
+    // +kubebuilder:validation:Required
+    Action AuthorizationPolicyAction `json:"action"`


can you also define the UX for DENY_BY_DEFAULT policies and how a users can add allow rules to override deny_by_default posture.
Use Case (Assuming we have ns wide policy support as well) - User has created a deny all policy for the namespace, now workload owners add Allow policies for the workloads.

aryan16 · 2025-07-01T16:38:37Z

geps/gep-3779/index.md

+type AuthorizationPolicySpec struct {
+    // TargetRef identifies the resource this policy is attached to.
+    // +kubebuilder:validation:Required
+    TargetRefs gatewayv1.PolicyTargetReferenceWithLabelSelectors `json:"targetRefs"`


how can we target the entire namespace? ns-wide authz policies are very common for meshes.

And with ns-wide policy we need to think about the clear separation between Gateway and mesh. This could be a valid scenario that Gateway and mesh workload are in the same namespace and user just wants to apply policy on namespace affecting only mesh workloads.

added more section about this. PTAL!

aryan16 · 2025-07-01T16:42:55Z

geps/gep-3779/index.md

+// for policies implemented within Gateway API. It is currently not intended for general-purpose
+// use outside of Gateway API resources.
+// +kubebuilder:validation:XValidation:rule="!(has(self.selector)) || (self.kind == 'Pod' && (self.group == '' || self.group == 'core'))",message="Selector may only be set when targeting Pods."
+type PolicyTargetReferenceWithLabelSelectors struct {


will this "PolicyTargetReferenceWithLabelSelectors" ever be applicable for Gateways?
Gateways can be in-cluster and defined via K8s workload (Gateway may have a pod with some labels). If someone uses the pod labels for the gateway, do we select that Gateway with these semantics?

I think that's a reasonable question that is one of the reasons that we haven't done label selecctors for pods before.

Tricky, a gateway implemented by a pod in k8s is both a workload unto itself, and also a proxy for 0 or more other "entities". In some cases it might be handy to be able to express that certain identities can talk to a gateway (and by extension, may be allowed to talk to entities that gateway proxies for) independent of expressing policy against all the disparate entities which that gateway is a proxy for.

This is also probably a complexity to worry about in the Service reference option as well. If you have a gateway selected by a service and attach policy to that service, what does it mean?

I would say YES. With the option to exclude it if necessary (using the right label selector combination ,e.g with NotIn operator). Though as Ian pointed out there are cases where it will be useful to be able to distribute the policies to the gateway (which is also a workload in this case).

added sections about it, PTAL!

youngnick · 2025-07-02T03:01:00Z

geps/gep-3779/index.md

+##### **Enhanced Discoverability with `gwctl`**
+
+A key challenge with `LabelSelector` is the loss of discoverability. It’s easier to see which policies target a `Service` but difficult to determine which policies might affect a specific pod.
+
+To address this, **investment in tooling is required.** Specifically, the `gwctl` CLI tool should be enhanced to provide insights such as:
+
+```sh
+TODO: complete gwctl commands
+```
+
+Without dedicated tooling, the `LabelSelector` approach could significantly degrade the user experience and observability.


This section is critical for making label selectors actually be understandable. As it stands, they are the very definition of spooky action at a distance - if you don't know to go and look for an AuthorizationPolicy, then requests will randomly fail or succeed based on things outside your control.

geps/gep-3779/index.md

ilrudie · 2025-07-02T18:31:21Z

geps/gep-3779/index.md

+* **No API Extension Required:** Works with the current PolicyAttachment model in Gateway API without modification.  
+* **Simplicity:** Intuitive for users familiar with Kubernetes networking concepts.
+
+**Downsides and Open Questions:**


Does "addressing" of the traffic matter in the service binding case? For instance, if Prometheus sends directly to a pod should service-attached policy still apply? Perhaps this is just a twist of the topic about how pods "belong" to services.

added some more clarifications above. the TL;DR is that I think there is no consistent/good way for a the (dst) component that enforces authorization to know what service was dialed, mainly with sidecar-based meshes.

So if we were to go with Service, the answer to you question would probably be yes, cause we will use the workload selector from the service. But this brings other complications that I described above

ilrudie · 2025-07-02T19:15:35Z

geps/gep-3779/index.md

+
+
+// AuthorizationSource specifies the source of a request.
+// Only one of its fields may be set.


Is the only one restriction necessary? I can envision an ask for:

from: serviceaccounts - monitoring/prometheus - ingress/my-gw namespaces - valid-clients

Per Rob's other comment, allowing serviceaccount: [valid-clients/*] might be a way to express it.

Its tricky, for example, in istio all of the attributes are ANDed together. We can say it will be ORed, but I don't see a reason to do so. We could support [valid-clients/*] if we need to, or in this case, just ask to have another rule for namespaces: valid-clients.

karthikbox · 2025-06-30T23:46:38Z

geps/gep-3779/index.md

+// AuthorizationSource specifies the source of a request.
+// Only one of its fields may be set.
+// +kubebuilder:validation:XValidation:rule="(size(self.identities) > 0 ? 1 : 0) + (size(self.serviceAccounts) > 0 ? 1 : 0) + (size(self.namespaces) > 0 ? 1 : 0) == 1",message="Exactly one of identities, serviceAccounts, or namespaces must be specified."
+type AuthorizationSource struct {


i could be wrong, but i thought a Type string field is used for oneof fields.

for example -

gateway-api/apis/v1alpha3/backendtlspolicy_types.go

Line 181 in 3842b4c

Type SubjectAltNameType `json:"type"`

gateway-api/apis/v1/httproute_types.go

Line 534 in 3842b4c

Type *PathMatchType `json:"type,omitempty"`

karthikbox · 2025-06-30T23:49:57Z

geps/gep-3779/index.md

+    // A request matches if its attributes are present in this list.
+    //
+    // +optional
+    Attributes []AuthorizationTCPAttributes `json:"attributes,omitempty"`


QQ: Should we rename this to DestinationAttributes to be clear that these are destination's attributes?

karthikbox · 2025-06-30T23:57:31Z

geps/gep-3779/index.md

+* The presence of any authorization policy causes the system to default to **deny-by-default** for matching workloads.
+* Another bullet to re-clarify the one above - the default behavior when no policies select a target workload is to allow all traffic. However, **as soon as at least one `AuthorizationPolicy` targets a workload, the model becomes implicitly deny-if-not-allowed**.
+


i'm a bit confused by this. Perhaps an example here might help me understand this better.

doesn't this contradict the statement from ALLOW policy above - If no ALLOW policy exists for a workload, traffic is permitted by default, unless any DENY policy applies. ?

ok, i understood this now. but would help if there's an example.

geps/gep-3779/index.md

karthikbox · 2025-07-07T18:00:48Z

geps/gep-3779/index.md

+  Kind Kind `json:"kind"`
+
+  // Selector is the label selector of target objects of the specified kind.
+  Selector *metav1.LabelSelector `json:"selector"`


should we add a Section Name here too? for example - port can be a section name . When specified, the policy would apply to the pod port specifically. This would be useful for this AuthZ Policy or some other future auth or "workload" policy.

While this is not my main point, the case for supporting this in AuthZPolicy is that it would allow the authz policy to be evaluated only for traffic destined to this port. As opposed to the current proposal, where the whole policy is evaluated for all traffic (perhaps implementations will optimize this behavior) which is going to wasteful.

I perceive Port as an attribute of the request in this case, rather than the target. Our targets are workloads and not Workload*Port.

As an example, assuming we support that, what would it mean to have allow policy thats targeting workload foo on port 8080.

If port is an attribute of the request - all requests to this workload in other ports are going to be denied (see the semantics above, if there is ANY allow policy, we deny anything thats not specifically allowed)

However, with port being an attribute of the target - do we still do that? if yes, how?

LiorLieberman · 2025-07-09T23:09:45Z

/cc @kflynn @mikemorris

k8s-ci-robot requested review from gcs278 and kflynn June 30, 2025 17:09

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 30, 2025

LiorLieberman changed the title ~~Add API details GEP-3779~~ Add API details GEP-3779 - E/W Authorization Jun 30, 2025

add api details gep-3779

a410c68

LiorLieberman force-pushed the gep-3779-apis branch from 2aada9b to a410c68 Compare June 30, 2025 22:14

robscott reviewed Jul 1, 2025

View reviewed changes

aryan16 reviewed Jul 1, 2025

View reviewed changes

youngnick reviewed Jul 2, 2025

View reviewed changes

ilrudie reviewed Jul 2, 2025

View reviewed changes

geps/gep-3779/index.md Show resolved Hide resolved

ilrudie reviewed Jul 2, 2025

View reviewed changes

fmting

776e35c

LiorLieberman force-pushed the gep-3779-apis branch from 12b8a42 to 776e35c Compare July 2, 2025 22:41

karthikbox reviewed Jul 7, 2025

View reviewed changes

some-fixes

4e4d2e9

LiorLieberman requested review from ilrudie and robscott July 9, 2025 23:09

k8s-ci-robot requested a review from mikemorris July 9, 2025 23:09

		@@ -246,10 +246,261 @@ Cilium has the concept of CiliumIdentity. Pods are assigned identities derived f
		More on https://docs.cilium.io/en/stable/internals/security-identities/ & https://docs.cilium.io/en/stable/security/network/identity/



		## API


		##### Experimental Pattern

		To mitigate some of the concerns, `LabelSelector` support in policy attachment is designated as an experimental pattern.


		The `targetRef` of the policy specifies the workload(s) to which the policy applies. Two options are available for `targetRef`:

		#### Option 1: Targeting a Service



		// AuthorizationSource specifies the source of a request.
		// Only one of its fields may be set.

		* The presence of any authorization policy causes the system to default to deny-by-default for matching workloads.
		* Another bullet to re-clarify the one above - the default behavior when no policies select a target workload is to allow all traffic. However, as soon as at least one `AuthorizationPolicy` targets a workload, the model becomes implicitly deny-if-not-allowed.

Add API details GEP-3779 - E/W Authorization #3891

Are you sure you want to change the base?

Add API details GEP-3779 - E/W Authorization #3891

Uh oh!

Conversation

LiorLieberman commented Jun 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

k8s-ci-robot commented Jun 30, 2025

Uh oh!

robscott left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ilrudie Jul 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LiorLieberman Jul 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LiorLieberman commented Jun 30, 2025 •

edited

Loading

ilrudie Jul 2, 2025 •

edited

Loading

LiorLieberman Jul 8, 2025 •

edited

Loading