Use a fd opened for read/write when syncing slots during startup.
author Andres Freund <[email protected]>
Mon, 27 Apr 2015 22:12:38 +0000 (00:12 +0200)
committer Andres Freund <[email protected]>
Mon, 27 Apr 2015 22:17:43 +0000 (00:17 +0200)
Some operating systems, including the reporter's Windows, return EBADF
or similar when fsync() is invoked on an O_RDONLY file descriptor.
Unfortunately RestoreSlotFromDisk() does exactly that, which causes
failures after restarts in at least some scenarios.

If you hit the bug, the error message will be something like:
ERROR: could not fsync file "pg_replslot/$name/state": Bad file descriptor

Simply use O_RDWR instead of O_RDONLY when opening the relevant file
descriptor to fix the bug.  Unfortunately I have no way of verifying the
fix, but we've seen similar problems in the past.
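
The pattern behind the fix can be sketched outside PostgreSQL with plain
POSIX calls. This is only an illustration under stated assumptions, not the
slot.c code: sync_existing_file() is a hypothetical helper name, and it
calls open() directly where the backend goes through OpenTransientFile().

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): open an existing file
 * read/write and flush it to stable storage.
 *
 * O_RDWR is used even though nothing is written here, because some
 * platforms refuse to fsync() a descriptor that was opened O_RDONLY.
 *
 * Returns 0 on success, -1 on failure.
 */
int
sync_existing_file(const char *path)
{
    int fd;
    int rc;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
    {
        perror("open");
        return -1;
    }

    rc = fsync(fd);
    if (rc != 0)
        perror("fsync");

    /* Report a close() failure only if fsync() itself succeeded. */
    if (close(fd) != 0 && rc == 0)
    {
        perror("close");
        rc = -1;
    }

    return rc;
}
```

A caller would pass the path of a file that already exists, since the
sketch does not use O_CREAT, and treat a return value of -1 as an error.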

This bug goes back to 9.4 where slots were introduced. Backpatch
accordingly.

Reported-By: Patrice Drolet
Bug: #13143
Discussion: 20150424101006[email protected]

src/backend/replication/slot.c

index fa1f07b3f3e9a36a8830c2bd6f5ef00bbd902e01..d2e184237462eb81db949a82d4b494c31f110e79 100644
@@ -1092,7 +1092,7 @@ RestoreSlotFromDisk(const char *name)
 
        elog(DEBUG1, "restoring replication slot from \"%s\"", path);
 
-       fd = OpenTransientFile(path, O_RDONLY | PG_BINARY, 0);
+       fd = OpenTransientFile(path, O_RDWR | PG_BINARY, 0);
 
        /*
         * We do not need to handle this as we are rename()ing the directory into