Commit 3a9378d

Allow configuring the operator via the YAML manifest. (zalando#326)
Up until now, the operator read its own configuration from the configmap. That has a number of limitations, e.g. when the configuration value is not a scalar, but a map or a list. We use custom code based on github.com/kelseyhightower/envconfig to decode non-scalar values out of plain text keys, but that breaks when the data inside the keys contains both YAML-special elements (e.g. commas) and complex quotes; one good example of that is search_path inside `team_api_role_configuration`. In addition, reliance on the configmap forced a flat structure on the configuration, making it hard to write and to read (see zalando#308 (comment)).

The changes allow supplying the operator configuration in a proper YAML file. That required registering a custom CRD to support the operator configuration and providing an example at manifests/postgresql-operator-default-configuration.yaml. At the moment, both the old configmap and the new CRD configuration are supported, so there are no compatibility issues; however, in the future I'd like to deprecate the configmap-based configuration altogether.

Contrary to the configmap-based configuration, the CRD one doesn't embed defaults into the operator code; however, one can use manifests/postgresql-operator-default-configuration.yaml as a starting point in order to build a custom configuration.

Previously, the `ReadyWaitInterval` and `ReadyWaitTimeout` parameters used to create the CRD were taken from the operator configuration. That is not possible if the configuration itself is stored in the CRD object, so I've added the ability to specify them as the environment variables `CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` respectively.

Per review by @zerg-junior and @Jan-M.
1 parent e90a010 commit 3a9378d
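
To make the limitation described in the commit message concrete, here is a minimal sketch of why a comma-separated `key:value` encoding is ambiguous. It is not the operator's envconfig-based decoder: `naiveDecodeMap` and the sample `search_path` value are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveDecodeMap is a hypothetical decoder that splits a flat string into
// key:value pairs on commas. It is NOT the operator's real decoder; it only
// shows why a comma inside a value cannot be told apart from a separator.
func naiveDecodeMap(s string) map[string]string {
	out := make(map[string]string)
	for _, item := range strings.Split(s, ",") {
		kv := strings.SplitN(item, ":", 2)
		if len(kv) != 2 {
			// A fragment without ':' cannot be attributed to any key,
			// so this sketch silently drops it.
			continue
		}
		out[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
	}
	return out
}

func main() {
	// Assumed sample input: the search_path value itself contains a comma.
	decoded := naiveDecodeMap(`log_statement:all,search_path:"$user",public`)
	fmt.Printf("%q\n", decoded)
}
```

In this sketch the `public` part of the search path is lost, because nothing in the flat encoding marks it as belonging to `search_path`. A nested YAML document avoids the problem by giving every value its own scalar.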

14 files changed, +583 -44 lines changed


cmd/main.go

Lines changed: 20 additions & 0 deletions
@@ -7,6 +7,7 @@ import (
 	"os/signal"
 	"sync"
 	"syscall"
+	"time"
 
 	"github.com/zalando-incubator/postgres-operator/pkg/controller"
 	"github.com/zalando-incubator/postgres-operator/pkg/spec"
@@ -20,6 +21,14 @@ var (
 	config spec.ControllerConfig
 )
 
+func mustParseDuration(d string) time.Duration {
+	duration, err := time.ParseDuration(d)
+	if err != nil {
+		panic(err)
+	}
+	return duration
+}
+
 func init() {
 	flag.StringVar(&kubeConfigFile, "kubeconfig", "", "Path to kubeconfig file with authorization and master location information.")
 	flag.BoolVar(&outOfCluster, "outofcluster", false, "Whether the operator runs in- our outside of the Kubernetes cluster.")
@@ -38,6 +47,17 @@ func init() {
 		log.Printf("Fully qualified configmap name: %v", config.ConfigMapName)
 
 	}
+	if crd_interval := os.Getenv("CRD_READY_WAIT_INTERVAL"); crd_interval != "" {
+		config.CRDReadyWaitInterval = mustParseDuration(crd_interval)
+	} else {
+		config.CRDReadyWaitInterval = 4 * time.Second
+	}
+
+	if crd_timeout := os.Getenv("CRD_READY_WAIT_TIMEOUT"); crd_timeout != "" {
+		config.CRDReadyWaitTimeout = mustParseDuration(crd_timeout)
+	} else {
+		config.CRDReadyWaitTimeout = 30 * time.Second
+	}
 }
 
 func main() {

docs/reference/command_line_and_environment.md

Lines changed: 8 additions & 0 deletions
@@ -48,3 +48,11 @@ The following environment variables are accepted by the operator:
 * **SCALYR_API_KEY**
   the value of the Scalyr API key to supply to the pods. Overrides the
   `scalyr_api_key` operator parameter.
+
+* **CRD_READY_WAIT_TIMEOUT**
+  defines the timeout for the complete postgres CRD creation. When not set,
+  the default is 30s.
+
+* **CRD_READY_WAIT_INTERVAL**
+  defines the interval between consecutive attempts waiting for the postgres
+  CRD to be created. The default is 4s, matching the fallback in cmd/main.go.
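
Both variables are parsed with Go's `time.ParseDuration`, so values need a unit suffix such as `4s` or `1m30s`. The sketch below is a variation on the logic in cmd/main.go that returns an error instead of panicking; `waitDuration` is a hypothetical helper, not part of the commit.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// waitDuration is a hypothetical helper: it returns fallback when raw is
// empty, otherwise the parsed duration or a descriptive error.
func waitDuration(raw string, fallback time.Duration) (time.Duration, error) {
	if raw == "" {
		return fallback, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid duration %q: %v", raw, err)
	}
	return d, nil
}

func main() {
	// The fallbacks mirror the values hard-coded in cmd/main.go.
	interval, err := waitDuration(os.Getenv("CRD_READY_WAIT_INTERVAL"), 4*time.Second)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	timeout, err := waitDuration(os.Getenv("CRD_READY_WAIT_TIMEOUT"), 30*time.Second)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("interval:", interval, "timeout:", timeout)
}
```

A bare number such as `30` is rejected because it carries no unit; with the commit's `mustParseDuration` the same input would make the operator panic at startup.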

docs/reference/operator_parameters.md

Lines changed: 98 additions & 6 deletions
@@ -1,9 +1,54 @@
-
-Postgres operator is configured via a ConfigMap defined by the
-`CONFIG_MAP_NAME` environment variable. Variable names are underscore-separated
-words.
+There are two mutually-exclusive methods to set the Postgres Operator
+configuration.
+
+* ConfigMaps-based, the legacy one. The configuration is supplied in a
+  key-value configmap, defined by the `CONFIG_MAP_NAME` environment variable.
+  Non-scalar values, i.e. lists or maps, are encoded in the value strings using
+  the comma-based syntax for lists and comma-separated `key:value` syntax for
+  maps. String values containing ':' should be enclosed in quotes. The
+  configuration is flat; parameter group names below are not reflected in the
+  configuration structure. There is an
+  [example](https://github.com/zalando-incubator/postgres-operator/blob/master/manifests/configmap.yaml).
+
+* CRD-based configuration. The configuration is stored in the custom YAML
+  manifest, an instance of the custom resource definition (CRD) called
+  `postgresql-operator-configuration`. This CRD is registered by the operator
+  at startup when the `POSTGRES_OPERATOR_CONFIGURATION_OBJECT` variable is
+  set to a non-empty value. The CRD-based configuration is a regular YAML
+  document; non-scalar keys are simply represented in the usual YAML way. The
+  usage of the CRD-based configuration is triggered by setting the
+  `POSTGRES_OPERATOR_CONFIGURATION_OBJECT` variable, which should point to the
+  `postgresql-operator-configuration` object name in the operator's namespace.
+  There are no default values built into the operator; each parameter that is
+  not supplied in the configuration receives an empty value. In order to
+  create your own configuration just copy the [default
+  one](https://github.com/zalando-incubator/postgres-operator/blob/wip/operator_configuration_via_crd/manifests/postgresql-operator-default-configuration.yaml)
+  and change it.
+
+CRD-based configuration is more natural and powerful than the one based on
+ConfigMaps and should be used unless there is a compatibility requirement to
+use an already existing configuration. Even in that case, it should be rather
+straightforward to convert the ConfigMap-based configuration into the CRD-based
+one and restart the operator. The ConfigMaps-based configuration will be
+deprecated and subsequently removed in future releases.
+
+Note that for the CRD-based configuration, the configuration groups below
+correspond to the non-leaf keys in the target YAML (e.g. for the Kubernetes
+resources the key is `kubernetes`). The key is mentioned alongside the group
+description. The ConfigMap-based configuration is flat and does not allow non-leaf keys.
+
+Since in the CRD-based case the operator needs to create a CRD first, which is
+controlled by the `resource_check_interval` and `resource_check_timeout`
+parameters, those parameters have no effect and are replaced by the
+`CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` environment variables.
+Those parameters will be deprecated and removed in the future.
+
+Variable names are underscore-separated words.
 
 ## General
+
+Those are top-level keys, containing both leaf keys and groups.
+
 * **etcd_host**
   Etcd connection string for Patroni defined as `host:port`. Not required when
   Patroni native Kubernetes support is used. The default is empty (use
@@ -38,6 +83,10 @@ words.
   period between consecutive sync requests. The default is `5m`.
 
 ## Postgres users
+
+Parameters describing Postgres users. In a CRD-based configuration, they are
+grouped under the `users` key.
+
 * **super_username**
   postgres `superuser` name to be created by `initdb`. The default is
   `postgres`.
@@ -47,6 +96,11 @@ words.
   `standby`.
 
 ## Kubernetes resources
+
+Parameters to configure cluster-related Kubernetes objects created by the
+operator, as well as some timeouts associated with them. In a CRD-based
+configuration they are grouped under the `kubernetes` key.
+
 * **pod_service_account_name**
   service account used by Patroni running on individual Pods to communicate
   with the operator. Required even if native Kubernetes support in Patroni is
@@ -127,6 +181,11 @@ words.
   operator. The default is empty.
 
 ## Kubernetes resource requests
+
+This group allows you to configure resource requests for the Postgres pods.
+Those parameters are grouped under the `postgres_pod_resources` key in a
+CRD-based configuration.
+
 * **default_cpu_request**
   CPU request value for the postgres containers, unless overridden by
   cluster-specific settings. The default is `100m`.
@@ -144,6 +203,13 @@ words.
   settings. The default is `1Gi`.
 
 ## Operator timeouts
+
+This set of parameters defines various timeouts related to some operator
+actions, affecting pod operations and CRD creation. In the CRD-based
+configuration the parameters are grouped under the `timeouts` key, and
+`resource_check_interval` and `resource_check_timeout` have no effect
+there.
+
 * **resource_check_interval**
   interval to wait between consecutive attempts to check for the presence of
   some Kubernetes resource (i.e. `StatefulSet` or `PodDisruptionBudget`). The
@@ -171,6 +237,10 @@ words.
   the timeout for the complete postgres CRD creation. The default is `30s`.
 
 ## Load balancer related options
+
+Those options affect the behavior of load balancers created by the operator.
+In the CRD-based configuration they are grouped under the `load_balancer` key.
+
 * **db_hosted_zone**
   DNS zone for the cluster DNS name when the load balancer is configured for
   the cluster. Only used when combined with
@@ -202,6 +272,12 @@ words.
   No other placeholders are allowed.
 
 ## AWS or GSC interaction
+
+The options in this group configure operator interactions with non-Kubernetes
+objects from AWS or Google cloud. They have no effect unless you are using
+either. In the CRD-based configuration those options are grouped under the
+`aws_or_gcp` key.
+
 * **wal_s3_bucket**
   S3 bucket to use for shipping WAL segments with WAL-E. A bucket has to be
   present and accessible by Patroni managed pods. At the moment, supported
@@ -218,9 +294,12 @@ words.
   [kube2iam](https://github.com/jtblin/kube2iam) project on AWS. The default is empty.
 
 * **aws_region**
-  AWS region used to store ESB volumes.
+  AWS region used to store EBS volumes. The default is `eu-central-1`.
 
 ## Debugging the operator
+
+Options to aid debugging of the operator itself. Grouped under the `debug` key.
+
 * **debug_logging**
   boolean parameter that toggles verbose debug logs from the operator. The
   default is `true`.
@@ -230,7 +309,12 @@ words.
   access to the postgres database, i.e. creating databases and users. The default
   is `true`.
 
-### Automatic creation of human users in the database
+## Automatic creation of human users in the database
+
+Options to automate creation of human users with the aid of the teams API
+service. In the CRD-based configuration those are grouped under the `teams_api`
+key.
+
 * **enable_teams_api**
   boolean parameter that toggles usage of the Teams API by the operator.
   The default is `true`.
@@ -276,6 +360,9 @@ words.
   infrastructure role. The default is `admin`.
 
 ## Logging and REST API
+
+Parameters affecting logging and REST API listener. In the CRD-based configuration they are grouped under the `logging_rest_api` key.
+
 * **api_port**
   REST API listener listens to this port. The default is `8080`.
 
@@ -286,6 +373,11 @@ words.
   number of entries in the cluster history ring buffer. The default is `1000`.
 
 ## Scalyr options
+
+Those parameters define the resource requests/limits and properties of the
+scalyr sidecar. In the CRD-based configuration they are grouped under the
+`scalyr` key.
+
 * **scalyr_api_key**
   API key for the Scalyr sidecar. The default is empty.
 
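
The documentation above describes the two configuration methods as mutually exclusive and says the CRD-based one is triggered by a non-empty `POSTGRES_OPERATOR_CONFIGURATION_OBJECT`. A minimal sketch of that selection rule follows; `configSource` and `chooseConfigSource` are hypothetical names, and the operator's actual controller code may be structured differently.

```go
package main

import (
	"fmt"
	"os"
)

// configSource is a hypothetical enumeration of the two configuration methods.
type configSource int

const (
	sourceConfigMap configSource = iota // legacy, flat key-value ConfigMap
	sourceCRD                           // nested YAML object backed by a CRD
)

// String returns a human-readable name for the source.
func (s configSource) String() string {
	if s == sourceCRD {
		return "CRD"
	}
	return "ConfigMap"
}

// chooseConfigSource applies the documented trigger: a non-empty object name
// selects the CRD-based configuration, anything else keeps the ConfigMap.
func chooseConfigSource(crdObjectName string) configSource {
	if crdObjectName != "" {
		return sourceCRD
	}
	return sourceConfigMap
}

func main() {
	src := chooseConfigSource(os.Getenv("POSTGRES_OPERATOR_CONFIGURATION_OBJECT"))
	fmt.Println("configuration source:", src)
}
```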

manifests/postgresql-operator-default-configuration.yaml

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
+apiVersion: "acid.zalan.do/v1"
+kind: postgresql-operator-configuration
+metadata:
+  name: postgresql-operator-default-configuration
+configuration:
+  etcd_host: ""
+  docker_image: registry.opensource.zalan.do/acid/spilo-cdp-10:1.4-p8
+  workers: 4
+  min_instances: -1
+  max_instances: -1
+  resync_period: 5m
+  #sidecar_docker_images:
+  #  example: "exampleimage:exampletag"
+  users:
+    super_username: postgres
+    replication_username: standby
+  kubernetes:
+    pod_service_account_name: operator
+    pod_terminate_grace_period: 5m
+    pdb_name_format: "postgres-{cluster}-pdb"
+    secret_name_template: "{username}.{cluster}.credentials.{tprkind}.{tprgroup}"
+    oauth_token_secret_name: postgresql-operator
+    pod_role_label: spilo-role
+    cluster_labels:
+      application: spilo
+    cluster_name_label: cluster-name
+    # watched_namespace: ""
+    # node_readiness_label: ""
+    # toleration: {}
+    # infrastructure_roles_secret_name: ""
+    # pod_environment_configmap: ""
+  postgres_pod_resources:
+    default_cpu_request: 100m
+    default_memory_request: 100Mi
+    default_cpu_limit: "3"
+    default_memory_limit: 1Gi
+  timeouts:
+    resource_check_interval: 3s
+    resource_check_timeout: 10m
+    pod_label_wait_timeout: 10m
+    pod_deletion_wait_timeout: 10m
+    ready_wait_interval: 4s
+    ready_wait_timeout: 30s
+  load_balancer:
+    enable_master_load_balancer: false
+    enable_replica_load_balancer: false
+    master_dns_name_format: "{cluster}.{team}.{hostedzone}"
+    replica_dns_name_format: "{cluster}-repl.{team}.{hostedzone}"
+  aws_or_gcp:
+    # db_hosted_zone: ""
+    # wal_s3_bucket: ""
+    # log_s3_bucket: ""
+    # kube_iam_role: ""
+    aws_region: eu-central-1
+  debug:
+    debug_logging: true
+    enable_database_access: true
+  teams_api:
+    enable_teams_api: false
+    team_api_role_configuration:
+      log_statement: all
+    enable_team_superuser: false
+    team_admin_role: admin
+    pam_role_name: zalandos
+    # pam_configuration: ""
+    protected_role_names:
+      - admin
+    # teams_api_url: ""
+  logging_rest_api:
+    api_port: 8008
+    ring_log_lines: 100
+    cluster_history_entries: 1000
+  scalyr:
+    scalyr_cpu_request: 100m
+    scalyr_memory_request: 50Mi
+    scalyr_cpu_limit: "1"
+    scalyr_memory_limit: 1Gi
+    # scalyr_api_key: ""
+    # scalyr_image: ""
+    # scalyr_server_url: ""
+
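
The grouped layout of this manifest maps naturally onto nested Go structs. The sketch below decodes a small subset of the `configuration` section and shows that parameters missing from the document end up with empty (zero) values, as the documentation states. The type names are hypothetical, and JSON stands in for YAML only because the Go standard library has no YAML decoder; the operator's own types may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usersGroup is a hypothetical type mirroring the `users` key.
type usersGroup struct {
	SuperUsername       string `json:"super_username"`
	ReplicationUsername string `json:"replication_username"`
}

// timeoutsGroup is a hypothetical type mirroring part of the `timeouts` key.
type timeoutsGroup struct {
	ResourceCheckInterval string `json:"resource_check_interval"`
	ResourceCheckTimeout  string `json:"resource_check_timeout"`
}

// operatorConfiguration holds top-level leaf keys next to the groups.
type operatorConfiguration struct {
	Workers  int           `json:"workers"`
	Users    usersGroup    `json:"users"`
	Timeouts timeoutsGroup `json:"timeouts"`
}

// decodeConfiguration parses a document without applying any defaults.
func decodeConfiguration(doc []byte) (operatorConfiguration, error) {
	var cfg operatorConfiguration
	err := json.Unmarshal(doc, &cfg)
	return cfg, err
}

func main() {
	// The `timeouts` group is deliberately omitted from this document.
	doc := []byte(`{
		"workers": 4,
		"users": {"super_username": "postgres", "replication_username": "standby"}
	}`)
	cfg, err := decodeConfiguration(doc)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Because nothing fills in the omitted `timeouts` group, copying the default manifest and editing it is the safer way to build a custom configuration.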

pkg/cluster/cluster.go

Lines changed: 1 addition & 1 deletion
@@ -155,7 +155,7 @@ func (c *Cluster) setStatus(status spec.PostgresStatus) {
 
 	_, err = c.KubeClient.CRDREST.Patch(types.MergePatchType).
 		Namespace(c.Namespace).
-		Resource(constants.CRDResource).
+		Resource(constants.PostgresCRDResource).
 		Name(c.Name).
 		Body(request).
 		DoRaw()

pkg/cluster/util.go

Lines changed: 1 addition & 1 deletion
@@ -424,7 +424,7 @@ func (c *Cluster) credentialSecretNameForCluster(username string, clusterName st
 	return c.OpConfig.SecretNameTemplate.Format(
 		"username", strings.Replace(username, "_", "-", -1),
 		"cluster", clusterName,
-		"tprkind", constants.CRDKind,
+		"tprkind", constants.PostgresCRDKind,
 		"tprgroup", constants.CRDGroup)
 }
 
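
The call above expands the placeholders of `secret_name_template` from alternating key/value arguments. A rough sketch of such an expansion follows. `formatTemplate` is a hypothetical stand-in, not the operator's `Format` method, and the sample values for `cluster`, `tprkind` and `tprgroup` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// formatTemplate is a hypothetical helper that replaces every {key}
// placeholder in tmpl with the value that follows key in pairs.
func formatTemplate(tmpl string, pairs ...string) string {
	if len(pairs)%2 != 0 {
		panic("formatTemplate: odd number of key/value arguments")
	}
	oldnew := make([]string, 0, len(pairs))
	for i := 0; i < len(pairs); i += 2 {
		oldnew = append(oldnew, "{"+pairs[i]+"}", pairs[i+1])
	}
	return strings.NewReplacer(oldnew...).Replace(tmpl)
}

func main() {
	name := formatTemplate(
		"{username}.{cluster}.credentials.{tprkind}.{tprgroup}",
		// Underscores are replaced with dashes, as in the diff above.
		"username", strings.Replace("app_user", "_", "-", -1),
		"cluster", "acid-test", // assumed cluster name
		"tprkind", "postgresql", // assumed value of constants.PostgresCRDKind
		"tprgroup", "acid.zalan.do", // assumed value of constants.CRDGroup
	)
	fmt.Println(name)
}
```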
