
Commit 0dc370f

FxKuma and machine424 authored

standby cluster that streams from a remote primary (zalando#1830)

* add the possibility to create a standby cluster that streams from a remote primary
* extending unit tests
* add more docs and e2e test

Co-authored-by: machine424 <[email protected]>
1 parent 2dfb11a commit 0dc370f

12 files changed: +303 −84 lines changed


charts/postgres-operator/crds/postgresqls.yaml

Lines changed: 11 additions & 0 deletions
@@ -460,6 +460,17 @@ spec:
                 type: string
               gs_wal_path:
                 type: string
+              standby_host:
+                type: string
+              standby_port:
+                type: string
+            oneOf:
+            - required:
+              - s3_wal_path
+            - required:
+              - gs_wal_path
+            - required:
+              - standby_host
           streams:
             type: array
             nullable: true
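The `oneOf` validation added above makes the three standby options mutually exclusive: exactly one of `s3_wal_path`, `gs_wal_path`, or `standby_host` must be present. As a rough illustration (a hypothetical helper, not part of the commit), the same rule can be expressed in a few lines of Python:

```python
# Hypothetical sketch: emulate the CRD's oneOf rule, which requires
# exactly one of the three mutually exclusive standby options.
MUTUALLY_EXCLUSIVE = ("s3_wal_path", "gs_wal_path", "standby_host")

def validate_standby(standby: dict) -> None:
    """Raise ValueError unless exactly one required option is present."""
    present = [key for key in MUTUALLY_EXCLUSIVE if key in standby]
    if len(present) != 1:
        raise ValueError(
            "standby must set exactly one of {}, got {}".format(
                MUTUALLY_EXCLUSIVE, present))

# Valid: a single option (standby_port alone does not satisfy the rule).
validate_standby({"standby_host": "acid-minimal-cluster.default",
                  "standby_port": "5433"})
```

Note that `standby_port` is intentionally absent from the exclusive set: it only accompanies `standby_host`.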

docs/administrator.md

Lines changed: 10 additions & 6 deletions
@@ -1087,12 +1087,16 @@ data:
 
 ### Standby clusters
 
-The setup for [standby clusters](user.md#setting-up-a-standby-cluster) is very
-similar to cloning. At the moment, the operator only allows for streaming from
-the S3 WAL archive of the master specified in the manifest. Like with cloning,
-if you are using [additional environment variables](#custom-pod-environment-variables)
-to access your backup location you have to copy those variables and prepend the
-`STANDBY_` prefix for Spilo to find the backups and WAL files to stream.
+The setup for [standby clusters](user.md#setting-up-a-standby-cluster) is
+similar to cloning when they stream changes from a WAL archive (S3 or GCS).
+If you are using [additional environment variables](#custom-pod-environment-variables)
+to access your backup location you have to copy those variables and prepend
+the `STANDBY_` prefix for Spilo to find the backups and WAL files to stream.
+
+Alternatively, standby clusters can also stream from a remote primary cluster.
+You have to specify the host address. The port is optional and defaults to 5432.
+Note that only one of the options (`s3_wal_path`, `gs_wal_path`,
+`standby_host`) can be present under the `standby` top-level key.
 
 ## Logical backups
 
docs/reference/cluster_manifest.md

Lines changed: 10 additions & 4 deletions
@@ -395,16 +395,22 @@ under the `clone` top-level key and do not affect the already running cluster.
 ## Standby cluster
 
 On startup, an existing `standby` top-level key creates a standby Postgres
-cluster streaming from a remote location. So far streaming from S3 and GCS WAL
-archives is supported.
+cluster streaming from a remote location - either from an S3 or GCS WAL
+archive or a remote primary. Exactly one of these options is required
+if the `standby` key is present.
 
 * **s3_wal_path**
   the url to S3 bucket containing the WAL archive of the remote primary.
-  Optional, but `s3_wal_path` or `gs_wal_path` is required.
 
 * **gs_wal_path**
   the url to GS bucket containing the WAL archive of the remote primary.
-  Optional, but `s3_wal_path` or `gs_wal_path` is required.
+
+* **standby_host**
+  hostname or IP address of the primary to stream from.
+
+* **standby_port**
+  TCP port on which the primary is listening for connections. Patroni will
+  use `"5432"` if not set.
 
 ## Volume properties
 

docs/user.md

Lines changed: 35 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -838,15 +838,15 @@ point you should restore.
838838
## Setting up a standby cluster
839839

840840
Standby cluster is a [Patroni feature](https://github.com/zalando/patroni/blob/master/docs/replica_bootstrap.rst#standby-cluster)
841-
that first clones a database, and keeps replicating changes afterwards. As the
842-
replication is happening by the means of archived WAL files (stored on S3 or
843-
the equivalent of other cloud providers), the standby cluster can exist in a
844-
different location than its source database. Unlike cloning, the PostgreSQL
845-
version between source and target cluster has to be the same.
841+
that first clones a database, and keeps replicating changes afterwards. It can
842+
exist in a different location than its source database, but unlike cloning,
843+
the PostgreSQL version between source and target cluster has to be the same.
846844

847845
To start a cluster as standby, add the following `standby` section in the YAML
848-
file. Specify the S3/GS bucket path. Omitting both settings will result in an error
849-
and no statefulset will be created.
846+
file. You can stream changes from archived WAL files (AWS S3 or Google Cloud
847+
Storage) or from a remote primary where you specify the host address and port.
848+
If you leave out the port, Patroni will use `"5432"`. Only one option can be
849+
specfied in the manifest:
850850

851851
```yaml
852852
spec:
@@ -860,32 +860,42 @@ spec:
860860
gs_wal_path: "gs://<bucketname>/spilo/<source_db_cluster>/<UID>/wal/<PGVERSION>"
861861
```
862862

863-
At the moment, the operator only allows to stream from the WAL archive of the
864-
master. Thus, it is recommended to deploy standby clusters with only [one pod](https://github.com/zalando/postgres-operator/blob/master/manifests/standby-manifest.yaml#L10).
865-
You can raise the instance count when detaching. Note, that the same pod role
866-
labels like for normal clusters are used: The standby leader is labeled as
867-
`master`.
863+
```yaml
864+
spec:
865+
standby:
866+
standby_host: "acid-minimal-cluster.default"
867+
standby_port: "5433"
868+
```
869+
870+
Note, that the pods and services use the same role labels like for normal clusters:
871+
The standby leader is labeled as `master`. When using the `standby_host` option
872+
you have to copy the credentials from the source cluster's secrets to successfully
873+
bootstrap a standby cluster (see next chapter).
868874

869875
### Providing credentials of source cluster
870876

871877
A standby cluster is replicating the data (including users and passwords) from
872878
the source database and is read-only. The system and application users (like
873879
standby, postgres etc.) all have a password that does not match the credentials
874-
stored in secrets which are created by the operator. One solution is to create
875-
secrets beforehand and paste in the credentials of the source cluster.
876-
Otherwise, you will see errors in the Postgres logs saying users cannot log in
877-
and the operator logs will complain about not being able to sync resources.
880+
stored in secrets which are created by the operator. You have two options:
878881

879-
When you only run a standby leader, you can safely ignore this, as it will be
880-
sorted out once the cluster is detached from the source. It is also harmless if
881-
you don’t plan it. But, when you created a standby replica, too, fix the
882-
credentials right away. WAL files will pile up on the standby leader if no
883-
connection can be established between standby replica(s). You can also edit the
884-
secrets after their creation. Find them by:
882+
a. Create secrets manually beforehand and paste the credentials of the source
883+
cluster
884+
b. Let the operator create the secrets when it bootstraps the standby cluster.
885+
Patch the secrets with the credentials of the source cluster. Replace the
886+
spilo pods.
885887

886-
```bash
887-
kubectl get secrets --all-namespaces | grep <standby-cluster-name>
888-
```
888+
Otherwise, you will see errors in the Postgres logs saying users cannot log in
889+
and the operator logs will complain about not being able to sync resources.
890+
If you stream changes from a remote primary you have to align the secrets or
891+
the standby cluster will not start up.
892+
893+
If you stream changes from WAL files and you only run a standby leader, you
894+
can safely ignore the secret mismatch, as it will be sorted out once the
895+
cluster is detached from the source. It is also harmless if you do not plan it.
896+
But, when you create a standby replica, too, fix the credentials right away.
897+
WAL files will pile up on the standby leader if no connection can be
898+
established between standby replica(s).
889899

890900
### Promote the standby
891901

e2e/tests/k8s_api.py

Lines changed: 9 additions & 3 deletions
@@ -321,9 +321,15 @@ def get_cluster_leader_pod(self, labels='application=spilo,cluster-name=acid-min
     def get_cluster_replica_pod(self, labels='application=spilo,cluster-name=acid-minimal-cluster', namespace='default'):
         return self.get_cluster_pod('replica', labels, namespace)
 
-    def get_secret_data(self, username, clustername='acid-minimal-cluster', namespace='default'):
-        return self.api.core_v1.read_namespaced_secret(
-            "{}.{}.credentials.postgresql.acid.zalan.do".format(username.replace("_","-"), clustername), namespace).data
+    def get_secret(self, username, clustername='acid-minimal-cluster', namespace='default'):
+        secret = self.api.core_v1.read_namespaced_secret(
+            "{}.{}.credentials.postgresql.acid.zalan.do".format(username.replace("_","-"), clustername), namespace)
+        secret.metadata.resource_version = None
+        secret.metadata.uid = None
+        return secret
+
+    def create_secret(self, secret, namespace='default'):
+        return self.api.core_v1.create_namespaced_secret(namespace, secret)
 
 class K8sBase:
     '''
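The reworked `get_secret` helper returns the whole secret object and nulls `resource_version` and `uid` because those fields are assigned by the API server and must not be posted back when `create_secret` re-creates the copied secret under a new name. A minimal sketch of that idea using plain dicts (illustrative only; the field names follow the Kubernetes `ObjectMeta` JSON form):

```python
# Illustrative sketch (plain dicts instead of kubernetes client objects):
# strip server-assigned metadata so a copied Secret can be POSTed as new.
def strip_server_fields(secret: dict) -> dict:
    """Return a copy of a Secret dict that is safe to re-create."""
    copied = {**secret, "metadata": dict(secret["metadata"])}
    for field in ("resourceVersion", "uid"):
        copied["metadata"].pop(field, None)
    return copied

src = {
    "metadata": {
        "name": "standby.acid-minimal-cluster.credentials.postgresql.acid.zalan.do",
        "uid": "1234",
        "resourceVersion": "42",
    },
    "data": {"password": "c2VjcmV0"},
}
clean = strip_server_fields(src)  # data is kept, server fields are dropped
```

The original object is left untouched, which mirrors how the e2e test mutates only the fetched copy before renaming and re-creating it.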

e2e/tests/test_e2e.py

Lines changed: 44 additions & 8 deletions
@@ -1319,8 +1319,8 @@ def test_password_rotation(self):
         self.eventuallyEqual(lambda: k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")
 
         # check if next rotation date was set in secret
-        secret_data = k8s.get_secret_data("zalando")
-        next_rotation_timestamp = datetime.strptime(str(base64.b64decode(secret_data["nextRotation"]), 'utf-8'), "%Y-%m-%dT%H:%M:%SZ")
+        zalando_secret = k8s.get_secret("zalando")
+        next_rotation_timestamp = datetime.strptime(str(base64.b64decode(zalando_secret.data["nextRotation"]), 'utf-8'), "%Y-%m-%dT%H:%M:%SZ")
         today90days = today+timedelta(days=90)
         self.assertEqual(today90days, next_rotation_timestamp.date(),
             "Unexpected rotation date in secret of zalando user: expected {}, got {}".format(today90days, next_rotation_timestamp.date()))
@@ -1361,9 +1361,9 @@ def test_password_rotation(self):
             "Operator does not get in sync")
 
         # check if next rotation date and username have been replaced
-        secret_data = k8s.get_secret_data("foo_user")
-        secret_username = str(base64.b64decode(secret_data["username"]), 'utf-8')
-        next_rotation_timestamp = datetime.strptime(str(base64.b64decode(secret_data["nextRotation"]), 'utf-8'), "%Y-%m-%dT%H:%M:%SZ")
+        foo_user_secret = k8s.get_secret("foo_user")
+        secret_username = str(base64.b64decode(foo_user_secret.data["username"]), 'utf-8')
+        next_rotation_timestamp = datetime.strptime(str(base64.b64decode(foo_user_secret.data["nextRotation"]), 'utf-8'), "%Y-%m-%dT%H:%M:%SZ")
         rotation_user = "foo_user"+today.strftime("%y%m%d")
         today30days = today+timedelta(days=30)
 
@@ -1396,9 +1396,9 @@ def test_password_rotation(self):
             "Operator does not get in sync")
 
         # check if username in foo_user secret is reset
-        secret_data = k8s.get_secret_data("foo_user")
-        secret_username = str(base64.b64decode(secret_data["username"]), 'utf-8')
-        next_rotation_timestamp = str(base64.b64decode(secret_data["nextRotation"]), 'utf-8')
+        foo_user_secret = k8s.get_secret("foo_user")
+        secret_username = str(base64.b64decode(foo_user_secret.data["username"]), 'utf-8')
+        next_rotation_timestamp = str(base64.b64decode(foo_user_secret.data["nextRotation"]), 'utf-8')
         self.assertEqual("foo_user", secret_username,
             "Unexpected username in secret of foo_user: expected {}, got {}".format("foo_user", secret_username))
         self.assertEqual('', next_rotation_timestamp,
@@ -1644,6 +1644,42 @@ def test_statefulset_annotation_propagation(self):
         self.eventuallyEqual(lambda: k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")
         self.eventuallyTrue(lambda: k8s.check_statefulset_annotations(cluster_label, annotations), "Annotations missing")
 
+    @timeout_decorator.timeout(TEST_TIMEOUT_SEC)
+    def test_standby_cluster(self):
+        '''
+        Create standby cluster streaming from remote primary
+        '''
+        k8s = self.k8s
+        standby_cluster_name = 'acid-standby-cluster'
+        cluster_name_label = 'cluster-name'
+        cluster_label = 'application=spilo,{}={}'.format(cluster_name_label, standby_cluster_name)
+        superuser_name = 'postgres'
+        replication_user = 'standby'
+        secret_suffix = 'credentials.postgresql.acid.zalan.do'
+
+        # copy secrets from remote cluster before operator creates them when bootstrapping the standby cluster
+        postgres_secret = k8s.get_secret(superuser_name)
+        postgres_secret.metadata.name = '{}.{}.{}'.format(superuser_name, standby_cluster_name, secret_suffix)
+        postgres_secret.metadata.labels[cluster_name_label] = standby_cluster_name
+        k8s.create_secret(postgres_secret)
+        standby_secret = k8s.get_secret(replication_user)
+        standby_secret.metadata.name = '{}.{}.{}'.format(replication_user, standby_cluster_name, secret_suffix)
+        standby_secret.metadata.labels[cluster_name_label] = standby_cluster_name
+        k8s.create_secret(standby_secret)
+
+        try:
+            k8s.create_with_kubectl("manifests/standby-manifest.yaml")
+            k8s.wait_for_pod_start("spilo-role=master," + cluster_label)
+
+        except timeout_decorator.TimeoutError:
+            print('Operator log: {}'.format(k8s.get_operator_log()))
+            raise
+        finally:
+            # delete the standby cluster so that the k8s_api.get_operator_state works correctly in subsequent tests
+            k8s.api.custom_objects_api.delete_namespaced_custom_object(
+                "acid.zalan.do", "v1", "default", "postgresqls", "acid-standby-cluster")
+            time.sleep(5)
+
     @timeout_decorator.timeout(TEST_TIMEOUT_SEC)
     def test_taint_based_eviction(self):
         '''

manifests/postgresql.crd.yaml

Lines changed: 11 additions & 0 deletions
@@ -458,6 +458,17 @@ spec:
                 type: string
               gs_wal_path:
                 type: string
+              standby_host:
+                type: string
+              standby_port:
+                type: string
+            oneOf:
+            - required:
+              - s3_wal_path
+            - required:
+              - gs_wal_path
+            - required:
+              - standby_host
           streams:
             type: array
             nullable: true

manifests/standby-manifest.yaml

Lines changed: 4 additions & 2 deletions
@@ -10,6 +10,8 @@ spec:
   numberOfInstances: 1
   postgresql:
     version: "14"
-  # Make this a standby cluster and provide the s3 bucket path of source cluster for continuous streaming.
+  # Make this a standby cluster and provide either the s3 bucket path of source cluster or the remote primary host for continuous streaming.
   standby:
-    s3_wal_path: "s3://path/to/bucket/containing/wal/of/source/cluster/"
+    # s3_wal_path: "s3://mybucket/spilo/acid-minimal-cluster/abcd1234-2a4b-4b2a-8c9c-c1234defg567/wal/14/"
+    standby_host: "acid-minimal-cluster.default"
+    # standby_port: "5432"

pkg/apis/acid.zalan.do/v1/crds.go

Lines changed: 11 additions & 0 deletions
@@ -714,6 +714,17 @@ var PostgresCRDResourceValidation = apiextv1.CustomResourceValidation{
 					"gs_wal_path": {
 						Type: "string",
 					},
+					"standby_host": {
+						Type: "string",
+					},
+					"standby_port": {
+						Type: "string",
+					},
+				},
+				OneOf: []apiextv1.JSONSchemaProps{
+					apiextv1.JSONSchemaProps{Required: []string{"s3_wal_path"}},
+					apiextv1.JSONSchemaProps{Required: []string{"gs_wal_path"}},
+					apiextv1.JSONSchemaProps{Required: []string{"standby_host"}},
 				},
 			},
 			"streams": {

pkg/apis/acid.zalan.do/v1/postgresql_type.go

Lines changed: 5 additions & 3 deletions
@@ -170,10 +170,12 @@ type Patroni struct {
 	SynchronousNodeCount uint32 `json:"synchronous_node_count,omitempty" defaults:"1"`
 }
 
-// StandbyDescription contains s3 wal path
+// StandbyDescription contains remote primary config or s3/gs wal path
 type StandbyDescription struct {
-	S3WalPath string `json:"s3_wal_path,omitempty"`
-	GSWalPath string `json:"gs_wal_path,omitempty"`
+	S3WalPath   string `json:"s3_wal_path,omitempty"`
+	GSWalPath   string `json:"gs_wal_path,omitempty"`
+	StandbyHost string `json:"standby_host,omitempty"`
+	StandbyPort string `json:"standby_port,omitempty"`
 }
 
 // TLSDescription specs TLS properties
