<refsynopsisdiv>
<cmdsynopsis>
- <command>pg_rewind </command>
- <arg choice="plain"><option>-D</option> --target-pgdata=<replaceable>DIRECTORY</replaceable></arg>
- <arg choice="plain"><option>--source-pgdata=<replaceable>DIRECTORY</replaceable></option></arg>
- <arg choice="plain"><option>--source-server=<replaceable>CONNSTR</replaceable></option></arg>
- <arg choice="opt"><option>-v</option></arg>
- <arg choice="opt"><option>-n</option> --dry-run</arg>
- <arg choice="opt"><option>--help</option></arg>
+ <command>pg_rewind</command>
+ <arg rep="repeat"><replaceable>option</replaceable></arg>
+ <group choice="plain">
+ <group choice="req">
+ <arg choice="plain"><option>-D </option></arg>
+ <arg choice="plain"><option>--target-pgdata</option></arg>
+ </group>
+ <replaceable> directory</replaceable>
+ <group choice="req">
+ <arg choice="plain"><option>--source-pgdata=<replaceable>directory</replaceable></option></arg>
+ <arg choice="plain"><option>--source-server=<replaceable>connstr</replaceable></option></arg>
+ </group>
+ </group>
</cmdsynopsis>
</refsynopsisdiv>
<title>Description</title>
<para>
- pg_rewind is a tool for synchronizing a PostgreSQL data directory with another
-PostgreSQL data directory that was forked from the first one. The result is
-equivalent to rsyncing the first data directory (referred to as the old cluster
-from now on) with the second one (the new cluster). The advantage of pg_rewind
-over rsync is that pg_rewind uses the WAL to determine changed data blocks,
-and does not require reading through all files in the cluster. That makes it
-a lot faster when the database is large and only a small portion of it differs
-between the clusters.
+ <application>pg_rewind</> is a tool for synchronizing a PostgreSQL cluster
+ with another copy of the same cluster, after the cluster's timelines have
+ diverged. A typical scenario is to bring an old master server back online
+ after failover, as a standby that follows the new master.
</para>
- </refsect1>
+
+ <para>
+ The result is equivalent to replacing the target data directory with the
+ source one. All files are copied, including configuration files. The advantage
+ of pg_rewind over restoring a base backup, or tools like rsync, is that
+ pg_rewind does not require reading through all unchanged files in the cluster.
+ That makes it a lot faster when the database is large and only a small portion
+ of it differs between the clusters.
+ </para>
+
+ <para>
+ pg_rewind examines the timeline histories of the source and target clusters
+ to determine the point where they diverged, and expects to find WAL in the
+ target cluster's pg_xlog directory reaching all the way back to the point of
+ divergence. In the typical failover scenario where the target cluster was shut
+ down soon after the divergence, that is not a problem, but if the target cluster
+ had run for a long time after the divergence, the old WAL files might not be
+ present anymore. In that case, they can be manually copied from the WAL archive
+ to the pg_xlog directory. Fetching missing files from a WAL archive automatically
+ is currently not supported.
+ </para>
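+
+ <para>
+ As an illustration of the failover scenario above, a dry run followed by
+ the actual rewind might look like the following. The data directory path
+ and the connection string are placeholders, not values from any real
+ installation.
+<programlisting>
+pg_rewind --dry-run --target-pgdata=/path/to/old/master/data --source-server="host=new-master.example.com port=5432 dbname=postgres"
+pg_rewind --target-pgdata=/path/to/old/master/data --source-server="host=new-master.example.com port=5432 dbname=postgres"
+</programlisting>
+ </para>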
+</refsect1>
<refsect1>
<title>Options</title>
<application>pg_rewind</application> accepts the following command-line arguments:
<variablelist>
+ <varlistentry>
+ <term><option>-D</option></term>
+ <term><option>--target-pgdata</option></term>
+ <listitem>
+ <para>
+ This option specifies the target data directory that is synchronized with
+ the source. The target server must be shut down cleanly before running
+ <application>pg_rewind</application>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--source-pgdata</option></term>
+ <listitem>
+ <para>
+ Specifies a source data directory to synchronize the target with. When
+ <option>--source-pgdata</option> is used, the source server must be cleanly shut down.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--source-server</option></term>
+ <listitem>
+ <para>
+ Specifies a libpq connection string to connect to the source PostgreSQL
+ server to synchronize the target with. The server must be up and running,
+ and must not be in recovery mode.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-n</option></term>
+ <term><option>--dry-run</option></term>
+ <listitem>
+ <para>
+ Do everything except actually modify the target directory.
+ </para>
+ </listitem>
+ </varlistentry>
<varlistentry>
<term><option>-v</option></term>
<term><option>--verbose</option></term>
- <listitem><para>enable verbose progress information</para></listitem>
+ <listitem><para>Enable verbose progress information.</para></listitem>
</varlistentry>
<varlistentry>
<term><option>-V</option></term>
<term><option>--version</option></term>
- <listitem><para>display version information, then exit</para></listitem>
+ <listitem><para>Display version information, then exit.</para></listitem>
</varlistentry>
<varlistentry>
<term><option>-?</option></term>
<term><option>--help</option></term>
- <listitem><para>show help, then exit</para></listitem>
+ <listitem><para>Show help, then exit.</para></listitem>
</varlistentry>
</variablelist>
</para>
-
</refsect1>
<refsect1>
- <title>Usage</title>
-
- <para>
-<synopsis>
- pg_rewind --target-pgdata=<replaceable>path</replaceable> --source-server=<replaceable>new server's conn string</replaceable>
-</synopsis>
-The contents of the old data directory will be overwritten with the new data
-so that after pg_rewind finishes, the old data directory is equal to the new
-one.
- </para>
+ <title>Environment</title>
<para>
- pg_rewind expects to find all the necessary WAL files in the pg_xlog
- directories of the clusters. This includes all the WAL on both clusters
- starting from the last common checkpoint preceding the fork. Fetching
- missing files from a WAL archive is currently not supported. However, you
- can copy any missing files manually from the WAL archive to the pg_xlog
- directory.
+ When the <option>--source-server</option> option is used, <command>pg_rewind</command> also uses the
+ environment variables supported by <application>libpq</> (see
+ <xref linkend="libpq-envars">).
</para>
</refsect1>
<title>Theory of operation</title>
<para>
-
-The basic idea is to copy everything from the new cluster to the old cluster,
-except for the blocks that we know to be the same.
-
-1. Scan the WAL log of the old cluster, starting from the last checkpoint before
-the point where the new cluster's timeline history forked off from the old cluster.
-For each WAL record, make a note of the data blocks that were touched. This yields
-a list of all the data blocks that were changed in the old cluster, after the new
-cluster forked off.
-
-2. Copy all those changed blocks from the new cluster to the old cluster.
-
-3. Copy all other files like clog, conf files etc. from the new cluster to old cluster.
-Everything except the relation files.
-
-4. Apply the WAL from the new cluster, starting from the checkpoint created at
-failover. (pg_rewind doesn't actually apply the WAL, it just creates a backup
-label file indicating that when PostgreSQL is started, it will start replay
-from that checkpoint and apply all the required WAL)
+ The basic idea is to copy everything from the new cluster (the source) to the
+ old cluster (the target), except for the blocks that we know to be the same.
</para>
+
+ <procedure>
+ <step>
+ <para>
+ Scan the WAL of the old cluster, starting from the last checkpoint
+ before the point where the new cluster's timeline history forked off
+ from the old cluster. For each WAL record, make a note of the data
+ blocks that were touched. This yields a list of all the data blocks
+ that were changed in the old cluster, after the new cluster forked off.
+ </para>
+ </step>
+ <step>
+ <para>
+ Copy all those changed blocks from the new cluster to the old cluster.
+ </para>
+ </step>
+ <step>
+ <para>
+ Copy all other files, such as clog and configuration files, from the new
+ cluster to the old cluster, that is, everything except the relation files.
+ </para>
+ </step>
+ <step>
+ <para>
+ Apply the WAL from the new cluster, starting from the checkpoint
+ created at failover. (pg_rewind doesn't actually apply the WAL itself; it
+ just creates a backup label file, so that when PostgreSQL is started,
+ it starts replay from that checkpoint and applies all the required
+ WAL.)
+ </para>
+ </step>
+ </procedure>
</refsect1>
<refsect1>
<title>Restrictions</title>
<para>
- <application>pg_reind</> needs that cluster uses either data checksums that
+ <application>pg_rewind</> requires that the cluster use either data checksums that
can be enabled at server initialization with initdb or WAL logging of hint
- bits that can be enabled by settings "wal_log_hints = on" in postgresql.conf.
+ bits that can be enabled by setting <varname>wal_log_hints</> to <literal>on</>
+ in <filename>postgresql.conf</>.
</para>
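+ <para>
+ For example, either of the following satisfies this requirement. The
+ data directory path is a placeholder.
+<programlisting>
+# in postgresql.conf (changing this setting requires a server restart)
+wal_log_hints = on
+
+# or, when creating the cluster
+initdb --data-checksums -D /path/to/data
+</programlisting>
+ </para>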
</refsect1>