Commit a78a619

toleration diff and nodeReadinessLabel merge with manifest matchExpressions (zalando#1729)
* include tolerations in statefulset comparison
* provide alternative merge behavior of nodeSelectorTerms for node readiness label
* add config option to change affinity merge behavior
* reworked e2e tests around node affinity
1 parent fe34019 commit a78a619

17 files changed (+245, -104 lines)

charts/postgres-operator/crds/operatorconfigurations.yaml

Lines changed: 5 additions & 0 deletions
@@ -233,6 +233,11 @@ spec:
   type: object
   additionalProperties:
     type: string
+node_readiness_label_merge:
+  type: string
+  enum:
+    - "AND"
+    - "OR"
 oauth_token_secret_name:
   type: string
   default: "postgresql-operator"

charts/postgres-operator/values.yaml

Lines changed: 3 additions & 0 deletions
@@ -132,6 +132,9 @@ configKubernetes:
   # node_readiness_label:
   # status: ready

+  # defines how nodeAffinity from manifest should be merged with node_readiness_label
+  # node_readiness_label_merge: "OR"
+
   # namespaced name of the secret containing the OAuth2 token to pass to the teams API
   # oauth_token_secret_name: postgresql-operator

docs/administrator.md

Lines changed: 75 additions & 0 deletions
@@ -339,6 +339,81 @@ master pods from being evicted by the K8s runtime. To prevent eviction
 completely, specify the toleration by leaving out the `tolerationSeconds` value
 (similar to how Kubernetes' own DaemonSets are configured)

+## Node readiness labels
+
+The operator can watch on certain node labels to detect e.g. the start of a
+Kubernetes cluster upgrade procedure and move master pods off the nodes to be
+decommissioned. Key-value pairs for these node readiness labels can be
+specified in the configuration (option name is in singular form):
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: postgres-operator
+data:
+  node_readiness_label: "status1:ready,status2:ready"
+```
+
+```yaml
+apiVersion: "acid.zalan.do/v1"
+kind: OperatorConfiguration
+metadata:
+  name: postgresql-configuration
+configuration:
+  kubernetes:
+    node_readiness_label:
+      status1: ready
+      status2: ready
+```
+
+The operator will create a `nodeAffinity` on the pods. This makes the
+`node_readiness_label` option the global configuration for defining node
+affinities for all Postgres clusters. You can have both, cluster-specific and
+global affinity, defined and they will get merged on the pods. If
+`node_readiness_label_merge` is configured to `"AND"` the node readiness
+affinity will end up under the same `matchExpressions` section(s) from the
+manifest affinity.
+
+```yaml
+affinity:
+  nodeAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      nodeSelectorTerms:
+      - matchExpressions:
+        - key: environment
+          operator: In
+          values:
+          - pci
+        - key: status1
+          operator: In
+          values:
+          - ready
+        - key: status2
+          ...
+```
+
+If `node_readiness_label_merge` is set to `"OR"` (default) the readiness label
+affinity will be appended with its own expressions block:
+
+```yaml
+affinity:
+  nodeAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      nodeSelectorTerms:
+      - matchExpressions:
+        - key: environment
+          ...
+      - matchExpressions:
+        - key: storage
+          ...
+      - matchExpressions:
+        - key: status1
+          ...
+        - key: status2
+          ...
+```
+
 ## Enable pod anti affinity

 To ensure Postgres pods are running on different topologies, you can use
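
The AND/OR merge described in the new "Node readiness labels" section of docs/administrator.md can be modelled in a few lines of Python, the language of the e2e tests. This is an illustrative sketch, not the operator's implementation: `readiness_expressions` and `merge_node_selector_terms` are hypothetical names, the terms are plain dicts rather than Kubernetes API objects, and both the sorted label order and the fallback for a manifest without terms are assumptions.

```python
import copy


def readiness_expressions(labels):
    """Turn readiness labels into matchExpressions (one `In` requirement per label)."""
    return [
        {"key": key, "operator": "In", "values": [value]}
        for key, value in sorted(labels.items())  # sorted: assumption, for stable output
    ]


def merge_node_selector_terms(manifest_terms, readiness_labels, merge="OR"):
    """Hypothetical helper: combine manifest nodeSelectorTerms with readiness labels."""
    if merge not in ("AND", "OR"):
        raise ValueError(f"unsupported merge strategy: {merge!r}")

    terms = copy.deepcopy(manifest_terms)  # never mutate the caller's manifest
    expressions = readiness_expressions(readiness_labels)
    if not expressions:
        return terms

    if merge == "OR" or not terms:
        # Separate term: Kubernetes ORs the entries of nodeSelectorTerms.
        # (Falling back to this when the manifest has no terms is an assumption.)
        terms.append({"matchExpressions": expressions})
    else:
        # "AND": all expressions inside one term must match, so the readiness
        # requirements are added to every manifest term.
        for term in terms:
            term.setdefault("matchExpressions", []).extend(copy.deepcopy(expressions))
    return terms


manifest = [
    {"matchExpressions": [{"key": "environment", "operator": "In", "values": ["pci"]}]}
]
labels = {"status1": "ready", "status2": "ready"}

and_terms = merge_node_selector_terms(manifest, labels, merge="AND")
or_terms = merge_node_selector_terms(manifest, labels, merge="OR")
print(len(and_terms), len(or_terms))  # prints: 1 2
```

Because Kubernetes treats the entries of `nodeSelectorTerms` as alternatives, the "OR" result also admits nodes that satisfy only the readiness labels, while "AND" keeps the manifest's `environment` requirement mandatory.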

docs/reference/operator_parameters.md

Lines changed: 10 additions & 5 deletions
@@ -344,11 +344,16 @@ configuration they are grouped under the `kubernetes` key.

 * **node_readiness_label**
   a set of labels that a running and active node should possess to be
-  considered `ready`. The operator uses values of those labels to detect the
-  start of the Kubernetes cluster upgrade procedure and move master pods off
-  the nodes to be decommissioned. When the set is not empty, the operator also
-  assigns the `Affinity` clause to the Postgres pods to be scheduled only on
-  `ready` nodes. The default is empty.
+  considered `ready`. When the set is not empty, the operator assigns the
+  `nodeAffinity` clause to the Postgres pods to be scheduled only on `ready`
+  nodes. The default is empty.
+
+* **node_readiness_label_merge**
+  If a `nodeAffinity` is also specified in the postgres cluster manifest
+  it will get merged with the `node_readiness_label` affinity on the pods.
+  The merge strategy can be configured - it can either be "AND" or "OR".
+  See [user docs](../user.md#use-taints-tolerations-and-node-affinity-for-dedicated-postgresql-nodes)
+  for more details. Default is "OR".

 * **toleration**
   a dictionary that should contain `key`, `operator`, `value` and
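
To make the two `node_readiness_label*` options concrete, the following Python sketch reads them from ConfigMap-style string data. The comma-separated `key:value` format comes from the ConfigMap example in docs/administrator.md, and the allowed values and default come from the parameter description above. The function names and the error handling are assumptions for illustration and do not reflect how the operator parses its configuration.

```python
ALLOWED_MERGE = ("AND", "OR")  # enum values from the CRD change in this commit


def parse_label_map(raw):
    """Parse 'key1:value1,key2:value2' into a dict; empty input gives {}."""
    labels = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected key:value, got {pair!r}")
        labels[key.strip()] = value.strip()
    return labels


def load_readiness_config(data):
    """Hypothetical loader for both options from a ConfigMap's `data` dict."""
    merge = data.get("node_readiness_label_merge", "OR")  # documented default
    if merge not in ALLOWED_MERGE:
        raise ValueError(f"node_readiness_label_merge must be one of {ALLOWED_MERGE}")
    return parse_label_map(data.get("node_readiness_label", "")), merge


labels, merge = load_readiness_config(
    {"node_readiness_label": "status1:ready,status2:ready"}
)
print(labels, merge)
```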

docs/user.md

Lines changed: 6 additions & 1 deletion
@@ -671,7 +671,9 @@ configured [default requests](reference/operator_parameters.md#kubernetes-resour

 To ensure Postgres pods are running on nodes without any other application pods,
 you can use [taints and tolerations](https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/)
-and configure the required toleration in the manifest.
+and configure the required toleration in the manifest. Tolerations can also be
+defined in the [operator config](administrator.md#use-taints-and-tolerations-for-dedicated-postgresql-nodes)
+to apply for all Postgres clusters.

 ```yaml
 spec:
@@ -703,6 +705,9 @@ spec:
 - pci
 ```

+If you need to define a `nodeAffinity` for all your Postgres clusters use the
+`node_readiness_label` [configuration](administrator.md#node-readiness-labels).
+
 ## In-place major version upgrade

 Starting with Spilo 13, operator supports in-place major version upgrade to a

e2e/tests/k8s_api.py

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ def get_pg_nodes(self, pg_cluster_name, namespace='default'):

         return master_pod_node, replica_pod_nodes

-    def get_cluster_nodes(self, cluster_labels='cluster-name=acid-minimal-cluster', namespace='default'):
+    def get_cluster_nodes(self, cluster_labels='application=spilo,cluster-name=acid-minimal-cluster', namespace='default'):
         m = []
         r = []
         podsList = self.api.core_v1.list_namespaced_pod(namespace, label_selector=cluster_labels)
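
The new default for `cluster_labels` narrows which pods `get_cluster_nodes` lists, because comma-separated equality requirements in a Kubernetes label selector must all hold. The pure-Python model below illustrates that matching rule without a cluster. The helper names and the sample pod labels are hypothetical, and only the `key=value` selector form is handled.

```python
def parse_selector(selector):
    """Split 'k1=v1,k2=v2' into a dict of required labels (equality form only)."""
    required = {}
    for part in selector.split(","):
        key, _, value = part.strip().partition("=")
        required[key] = value
    return required


def matches(pod_labels, selector):
    """True when the pod carries every label the selector requires."""
    return all(pod_labels.get(k) == v for k, v in parse_selector(selector).items())


old_selector = "cluster-name=acid-minimal-cluster"
new_selector = "application=spilo,cluster-name=acid-minimal-cluster"

# Hypothetical pods: one Spilo pod and one other pod sharing the cluster-name label.
spilo_pod = {"application": "spilo", "cluster-name": "acid-minimal-cluster"}
other_pod = {"application": "other", "cluster-name": "acid-minimal-cluster"}

print(matches(other_pod, old_selector), matches(other_pod, new_selector))  # prints: True False
```

Under the old selector, any pod that shares the `cluster-name` label would be counted. The added `application=spilo` requirement leaves only pods that carry both labels.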
