Fujii Masao [Fri, 30 May 2025 15:08:40 +0000 (00:08 +0900)]
Make XactLockTableWait() and ConditionalXactLockTableWait() more interruptible.
Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter
a non-interruptible loop when they successfully acquired a lock on a transaction
but the transaction still appeared to be running. Since this loop continued
until the transaction completed, it could result in long, uninterruptible waits.
Although this scenario is generally unlikely since XactLockTableWait() and
ConditionalXactLockTableWait() can basically acquire a transaction lock
only when the transaction is not running, it can occur in a hot standby.
In such cases, the transaction may still appear active due to
the KnownAssignedXids list, even while no lock on the transaction exists.
For example, this situation can happen when creating a logical replication
slot on a standby.
The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS()
within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions,
ensuring they can be interrupted safely.
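The shape of the fix can be pictured with a small Python sketch. It is not the PostgreSQL C code: `wait_for_xact`, `still_running`, and `interrupt_pending` are hypothetical names, and the exception only stands in for what CHECK_FOR_INTERRUPTS() does when a cancel request is pending.

```python
class QueryCanceled(Exception):
    """Stand-in for the error raised when a pending interrupt is serviced."""


def wait_for_xact(still_running, interrupt_pending):
    """Poll until the transaction no longer appears to be running.

    Both arguments are hypothetical zero-argument callables. Checking for
    an interrupt on every iteration is what keeps the wait cancellable.
    """
    polls = 0
    while still_running():
        if interrupt_pending():  # analogous to CHECK_FOR_INTERRUPTS()
            raise QueryCanceled("canceled while waiting for transaction")
        polls += 1
    return polls
```

Without the check inside the loop, a transaction that keeps appearing active would keep the caller spinning with no way to cancel.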
Back-patch to all supported branches.
Author: Kevin K Biju <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
Backpatch-through: 13
Álvaro Herrera [Fri, 30 May 2025 14:18:18 +0000 (16:18 +0200)]
Fix broken-FK-detection query in release notes
Commits 53af9491a043 and 2d5fe514052a fixed a number of problems with
foreign keys that reference partitioned tables, and a query to detect
already broken FKs was supplied with the release notes for 17.1, 16.5,
15.9, 14.14, 13.17. However, that query has a bug that causes it to
wrongly report self-referential foreign keys even when they are correct,
so if the user were to drop and rebuild the FKs as indicated, the query
would continue to report them as needing to be repaired. Here we fix
the query to not have that problem.
Reported-by: Paul Foerster <[email protected]>
Discussion: https://postgr.es/m/5456A1D0-CD47-4315-9C65-71B27E7A2906@gmail.com
Backpatch-through: 13-17
Tom Lane [Thu, 29 May 2025 14:39:55 +0000 (10:39 -0400)]
Avoid resource leaks when a dblink connection fails.
If we hit out-of-memory between creating the PGconn and inserting
it into dblink's hashtable, we'd lose track of the PGconn, which
is quite bad since it represents a live connection to a remote DB.
Fix by rearranging things so that we create the hashtable entry
first.
Also reduce the number of states we have to deal with by getting rid
of the separately-allocated remoteConn object, instead allocating it
in-line in the hashtable entries. (That incidentally removes a
session-lifespan memory leak observed in the regression tests.)
There is an apparently-irreducible remaining OOM hazard, which
is that if the connection fails at the libpq level (ie it's
CONNECTION_BAD) then we have to pstrdup the PGconn's error message
before we can release it, and theoretically that could fail. However,
in such cases we're only leaking memory not a live remote connection,
so I'm not convinced that it's worth sweating over.
This is a pretty low-probability failure mode of course, but losing
a live connection seems bad enough to justify back-patching.
Author: Tom Lane <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Discussion: https://postgr.es/m/1346940.1748381911@sss.pgh.pa.us
Backpatch-through: 13
Fujii Masao [Thu, 29 May 2025 08:50:32 +0000 (17:50 +0900)]
Fix assertion failure in pg_prewarm() on objects without storage.
An assertion test added in commit 049ef33 could fail when pg_prewarm()
was called on objects without storage, such as partitioned tables.
This resulted in the following failure in assert-enabled builds:
Failed Assert("RelFileNumberIsValid(rlocator.relNumber)")
Note that, in non-assert builds, pg_prewarm() just failed with an error
in that case, so there was no ill effect in practice.
This commit fixes the issue by having pg_prewarm() raise an error early
if the specified object has no storage. This approach is similar to
the fix in commit 4623d7144 for pg_freespacemap.
Back-patched to v17, where the issue was introduced.
Author: Masahiro Ikeda <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Reviewed-by: Richard Guo <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/e082e6027610fd0a4091ae6d033aa117@oss.nttdata.com
Backpatch-through: 17
Michael Paquier [Thu, 29 May 2025 02:26:23 +0000 (11:26 +0900)]
pg_stat_statements: Fix parameter number gaps in normalized queries
pg_stat_statements anticipates that certain constant locations may be
recorded multiple times and attempts to avoid calculating a length for
these locations in fill_in_constant_lengths().
However, during generate_normalized_query() where normalized query
strings are generated, these locations are not excluded from
consideration. This could increment the parameter number counter for
every recorded occurrence at such a location, leading to an incorrect
normalization in certain cases with gaps in the numbers reported.
For example, take this query:
SELECT WHERE '1' IN ('2'::int, '3'::int::text)
Before this commit, it would be normalized like this, with gaps in the
parameter numbers:
SELECT WHERE $1 IN ($3::int, $4::int::text)
However, the correct, less confusing normalization is:
SELECT WHERE $1 IN ($2::int, $3::int::text)
This commit fixes the computation of the parameter numbers to track the
number of constants replaced with an $n by a separate counter instead of
the iterator used to loop through the list of locations.
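A minimal Python sketch of the numbering change, not the pg_stat_statements C code. The function name, the (offset, length) tuples, the use of a negative length to mark a duplicate location, and the choice of which constant is recorded twice are all assumptions made for illustration.

```python
def normalize(query, locations, use_separate_counter=True):
    """Replace each constant at (offset, length) with a $n placeholder.

    A negative length marks a duplicate location that must be skipped.
    """
    out = []
    last = 0
    n_replaced = 0
    for i, (offset, length) in enumerate(locations):
        if length < 0:
            continue  # duplicate location: nothing to replace
        out.append(query[last:offset])
        # Old behavior numbered by list position; the fix counts replacements.
        number = n_replaced + 1 if use_separate_counter else i + 1
        out.append(f"${number}")
        n_replaced += 1
        last = offset + length
    out.append(query[last:])
    return "".join(out)


query = "SELECT WHERE '1' IN ('2'::int, '3'::int::text)"
a, b, c = (query.index(lit) for lit in ("'1'", "'2'", "'3'"))
# Assumed for illustration: the second constant's location was recorded twice.
locations = [(a, 3), (b, -1), (b, 3), (c, 3)]
print(normalize(query, locations, use_separate_counter=False))
print(normalize(query, locations))
```

With the loop index as the parameter number, the skipped duplicate still consumes a number, which produces the gap; a counter that only advances on an actual replacement does not.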
The underlying query IDs are not changed, neither are the normalized
strings for existing PGSS hash entries. New entries with fresh
normalized queries would automatically get reshaped based on the new
parameter numbering.
Issue discovered while discussing a separate problem for HEAD, but this
affects all the stable branches.
Author: Sami Imseih <[email protected]>
Discussion: https://postgr.es/m/CAA5RZ0tzxvWXsacGyxrixdhy3tTTDfJQqxyFBRFh31nNHBQ5qA@mail.gmail.com
Backpatch-through: 13
Michael Paquier [Wed, 28 May 2025 00:43:45 +0000 (09:43 +0900)]
Adjust regex for test with opening parenthesis in character classes
As written, the test was throwing an error because of an unbalanced
parenthesis. The regex used in the test is adjusted to not fail and to
test the case of an opening parenthesis in a character class after some
nested square brackets.
Oversight in d46911e584d4.
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b[email protected]
Michael Paquier [Tue, 27 May 2025 23:59:22 +0000 (08:59 +0900)]
Fix conversion of SIMILAR TO regexes for character classes
The code that translates SIMILAR TO pattern matching expressions to
POSIX-style regular expressions did not consider that square brackets
can be nested. For example, in an expression like [[:alpha:]%_], the
logic replaced the placeholders '_' and '%' but it should not.
This commit fixes the conversion logic by tracking the nesting level of
square brackets marking character class areas, while considering that
in expressions like []] or [^]] the first closing square bracket is a
regular character. Multiple tests are added to show how the conversions
should or should not be applied while in a character class area, with
specific cases added for all the characters converted outside character
classes like an opening parenthesis '(', dollar sign '$', etc.
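The nesting idea can be sketched in Python as follows. This is a simplified illustration, not the backend's conversion routine: it handles only the '%' and '_' wildcards, ignores escape characters and the other converted characters, and the function name is hypothetical.

```python
def similar_wildcards_to_posix(pattern):
    """Translate '%' and '_' only when outside square brackets."""
    out = []
    depth = 0     # nesting level of square brackets
    at_start = 0  # 1 right after the outer '[', 2 right after '[^', else 0
    for ch in pattern:
        if depth == 0:
            if ch == "[":
                depth, at_start = 1, 1
                out.append(ch)
            elif ch == "%":
                out.append(".*")
            elif ch == "_":
                out.append(".")
            else:
                out.append(ch)
            continue
        out.append(ch)  # inside a character class: copy unchanged
        if ch == "]" and at_start == 0:
            depth -= 1      # closes the current bracket level
        elif ch == "]":
            at_start = 0    # leading ']' is a regular character
        elif ch == "[":
            depth += 1
            at_start = 0
        elif ch == "^" and at_start == 1:
            at_start = 2    # negation; a following ']' is still literal
        else:
            at_start = 0
    return "".join(out)


for p in ("[[:alpha:]%_]", "a%[_]b_", "[]]%", "[^]]_"):
    print(p, "->", similar_wildcards_to_posix(p))
```

A single in-class flag would treat the `]` that ends `[:alpha:]` as the end of the whole class and then wrongly translate the `%` and `_` that follow; counting the depth avoids that.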
Author: Laurenz Albe <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b[email protected]
Backpatch-through: 13
Michael Paquier [Mon, 26 May 2025 08:28:40 +0000 (17:28 +0900)]
Fix race condition in subscription TAP test 021_twophase
The test did not wait for all the subscriptions to have caught up when
dropping the subscription "tab_copy". In a slow environment, it could
be possible for the replay of the COMMIT PREPARED transaction "mygid"
to not be confirmed yet, causing one prepared transaction to be left
around before moving to the next steps of the test.
One failure noticed is a transaction found in pg_prepared_xacts for the
cases where copy_data = false and two_phase = true, but there should be
none after dropping the subscription.
As an extra safety measure, a check is added before dropping the
subscription, scanning pg_prepared_xacts to make sure that no prepared
transactions are left once both subscriptions have caught up.
Issue introduced by a8fd13cab0ba, fixing a problem similar to eaf5321c3524.
Per buildfarm member kestrel.
Author: Vignesh C <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/CALDaNm329QaZ+bwU--bW6GjbNSZ8-38cDE8QWofafub7NV67oA@mail.gmail.com
Backpatch-through: 15
Fujii Masao [Mon, 26 May 2025 03:47:33 +0000 (12:47 +0900)]
doc: Fix documentation for snapshot export in logical decoding.
The documentation for exported snapshots in logical decoding previously
stated that snapshot creation may fail on a hot standby. This is no longer
accurate, as snapshot exporting on standbys has been supported since
PostgreSQL 10. This commit removes the outdated description.
Additionally, the docs referred to the NOEXPORT_SNAPSHOT option to
suppress snapshot exporting in CREATE_REPLICATION_SLOT. However,
since PostgreSQL 15, NOEXPORT_SNAPSHOT is considered legacy syntax
and retained only for backward compatibility. This commit updates
the documentation for v15 and later to use the modern equivalent:
SNAPSHOT 'nothing'. The older syntax is preserved in documentation for
v14 and earlier.
Back-patched to all supported branches.
Reported-by: Kevin K Biju <[email protected]>
Author: Fujii Masao <[email protected]>
Reviewed-by: Kevin K Biju <[email protected]>
Discussion: https://postgr.es/m/174791480466.798.17122832105389395178@wrigleys.postgresql.org
Backpatch-through: 13
Tom Lane [Fri, 23 May 2025 18:43:43 +0000 (14:43 -0400)]
Fix per-relation memory leakage in autovacuum.
PgStat_StatTabEntry and AutoVacOpts structs were leaked until
the end of the autovacuum worker's run, which is bad news if
there are a lot of relations in the database.
Note: pfree'ing the PgStat_StatTabEntry structs here seems a bit
risky, because pgstat_fetch_stat_tabentry_ext does not guarantee
anything about whether its result is long-lived. It appears okay
so long as autovacuum forces PGSTAT_FETCH_CONSISTENCY_NONE, but
I think that API could use a re-think.
Also ensure that the VacuumRelation structure passed to
vacuum() is in recoverable storage.
Back-patch to v15 where we started to manage table statistics
this way. (The AutoVacOpts leakage is probably older, but
I'm not excited enough to worry about just that part.)
Author: Tom Lane <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Backpatch-through: 15
Tom Lane [Fri, 23 May 2025 15:47:33 +0000 (11:47 -0400)]
Fix AlignedAllocRealloc to cope sanely with OOM.
If the inner allocation call returns NULL, we should restore the
previous state and return NULL. Previously this code pfree'd
the old chunk anyway, which is surely wrong.
Also, make it call MemoryContextAllocationFailure rather than
summarily returning NULL. The fact that we got control back from the
inner call proves that MCXT_ALLOC_NO_OOM was passed, so this change
is just cosmetic, but someday it might be less so.
This is just a latent bug at present: AFAICT no in-core callers use
this function at all, let alone call it with MCXT_ALLOC_NO_OOM.
Still, it's the kind of bug that might bite back-patched code pretty
hard someday, so let's back-patch to v17 where the bug was introduced
(by commit 743112a2e).
Author: Tom Lane <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Backpatch-through: 17
Tom Lane [Thu, 22 May 2025 17:52:46 +0000 (13:52 -0400)]
Fix memory leak in XMLSERIALIZE(... INDENT).
xmltotext_with_options sometimes tries to replace the existing
root node of a libxml2 document. In that case xmlDocSetRootElement
will unlink and return the old root node; if we fail to free it,
it's leaked for the remainder of the session. The amount of memory
at stake is not large, a couple hundred bytes per occurrence, but
that could still become annoying in heavy usage.
Our only other xmlDocSetRootElement call is not at risk because
it's working on a just-created document, but let's modify that
code too to make it clear that it's dependent on that.
Author: Tom Lane <[email protected]>
Reviewed-by: Jim Jones <[email protected]>
Discussion: https://postgr.es/m/1358967.1747858817@sss.pgh.pa.us
Backpatch-through: 16
Fujii Masao [Wed, 21 May 2025 02:55:14 +0000 (11:55 +0900)]
Fix incorrect WAL description for PREPARE TRANSACTION record.
Since commit 8b1dccd37c7, the PREPARE TRANSACTION WAL record includes
information about dropped statistics entries. However, the WAL resource
manager description function for PREPARE TRANSACTION record failed to
parse this information correctly and always assumed there were
no such entries.
As a result, for example, pg_waldump could not display the dropped
statistics entries stored in PREPARE TRANSACTION records.
The root cause was that ParsePrepareRecord() did not set the number of
statistics entries to drop on commit or abort. These values remained
zero-initialized and were never updated from the parsed record.
This commit fixes the issue by properly setting those values during parsing.
With this fix, pg_waldump can now correctly report dropped statistics
entries in PREPARE TRANSACTION records.
Back-patch to v15, where commit 8b1dccd37c7 was introduced.
Author: Daniil Davydov <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/CAJDiXgh-6Epb2XiJe4uL0zF-cf0_s_7Lw1TfEHDMLzYjEmfGOw@mail.gmail.com
Backpatch-through: 15
Heikki Linnakangas [Tue, 20 May 2025 07:39:14 +0000 (10:39 +0300)]
Fix cross-version upgrade test failure
Commit 29f7ce6fe7 added another view that needs adjustment in the
cross-version upgrade test. This should fix the XversionUpgrade
failures in the buildfarm.
Backpatch-through: 16
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
Michael Paquier [Tue, 20 May 2025 05:39:10 +0000 (14:39 +0900)]
doc: Clarify use of _ccnew and _ccold in REINDEX CONCURRENTLY
Invalid indexes are suffixed with "_ccnew" or "_ccold". The
documentation failed to mention the initial underscore.
ChooseRelationName() may also append an extra number if indexes with a
similar name already exist; let's add a note about that too.
Author: Alec Cozens <[email protected]>
Discussion: https://postgr.es/m/174733277404.1455388.11471370288789479593@wrigleys.postgresql.org
Backpatch-through: 13
Heikki Linnakangas [Mon, 19 May 2025 15:50:26 +0000 (18:50 +0300)]
Fix deparsing FETCH FIRST <expr> ROWS WITH TIES
In the grammar, <expr> is a c_expr, which accepts only a limited set
of integer literals and simple expressions without parens. The
deparsing logic didn't quite match the grammar rule, and failed to use
parens e.g. for "5::bigint".
To fix, always surround the expression with parens. Would be nice to
omit the parens in simple cases, but unfortunately it's non-trivial to
detect such simple cases. Even if the expression is a simple literal
123 in the original query, after parse analysis it becomes a FuncExpr
with COERCE_IMPLICIT_CAST rather than a simple Const.
Reported-by: yonghao lee
Backpatch-through: 13
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
Amit Kapila [Mon, 19 May 2025 06:25:55 +0000 (11:55 +0530)]
Don't retreat slot's confirmed_flush LSN.
Prevent moving the confirmed_flush backwards, as this could lead to data
duplication issues caused by replicating already replicated changes.
This can happen when a client acknowledges an LSN it doesn't have to do
anything for, and thus didn't store persistently. After a restart, the
client can send the prior LSN that it stored persistently as an
acknowledgement, but we need to ignore such an LSN to avoid retreating
the confirmed_flush LSN.
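As a rough Python sketch of the rule, with LSNs modeled as plain integers and a hypothetical function name (the real logic lives in the server's C code):

```python
def next_confirmed_flush(current_lsn, acked_lsn):
    """Return the slot's new confirmed_flush position.

    An acknowledgement older than the current position is ignored, so the
    value can only stay put or move forward.
    """
    if acked_lsn < current_lsn:
        return current_lsn  # stale acknowledgement after a client restart
    return acked_lsn


lsn = 0x3000
lsn = next_confirmed_flush(lsn, 0x2000)  # older LSN resent by the client
print(hex(lsn))
```

Keeping the position monotonic means changes between the stale LSN and the current one are not sent again.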
Diagnosed-by: Zhijie Hou <[email protected]>
Author: shveta malik <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Tested-by: Nisha Moond <[email protected]>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57164AB5716AF2E477D53F6F9489A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Tom Lane [Sun, 18 May 2025 16:45:55 +0000 (12:45 -0400)]
Make our usage of memset_s() conform strictly to the C11 standard.
Per the letter of the C11 standard, one must #define
__STDC_WANT_LIB_EXT1__ as 1 before including <string.h> in order to
have access to memset_s(). It appears that many platforms are lenient
about this, because we weren't doing it and yet the code appeared to
work anyway. But we now find that with -std=c11, macOS is strict and
doesn't declare memset_s, leading to compile failures since we try to
use it anyway. (Given the lack of prior reports, perhaps this is new
behavior in the latest SDK? No matter, we're clearly in the wrong.)
In addition to the immediate problem, which could be fixed merely by
adding the needed #define to explicit_bzero.c, it seems possible that
our configure-time probe for memset_s() could fail in case a platform
implements the function in some odd way due to this spec requirement.
This concern can be fixed in largely the same way that we dealt with
strchrnul() in 6da2ba1d8: switch to using a declaration-based
configure probe instead of a does-it-link probe.
Back-patch to v13 where we started using memset_s().
Reported-by: Lakshmi Narayana Velayudam <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CAA4pTnLcKGG78xeOjiBr5yS7ZeE-Rh=FaFQQGOO=nPzA1L8yEA@mail.gmail.com
Backpatch-through: 13
Daniel Gustafsson [Fri, 16 May 2025 15:20:07 +0000 (11:20 -0400)]
Align organization wording in copyright statement
This aligns the copyright and legal notice wording with commit
a233a603bab8 and pgweb commit 2d764dbc083ab8. Backpatch down
to all supported versions.
Author: Daniel Gustafsson <[email protected]>
Reviewed-by: Dave Page <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/744E414E-3F52-404C-97FB-ED9B3AA37DC8@yesql.se
Backpatch-through: 13
Richard Guo [Thu, 15 May 2025 08:09:04 +0000 (17:09 +0900)]
Fix Assert failure in XMLTABLE parser
In an XMLTABLE expression, columns can be marked NOT NULL, and the
parser internally fabricates an option named "is_not_null" to
represent this. However, the parser also allows users to specify
arbitrary option names. This creates a conflict: a user can
explicitly use "is_not_null" as an option name and assign it a
non-Boolean value, which violates internal assumptions and triggers an
assertion failure.
To fix, this patch checks whether a user-supplied name collides with
the internally reserved option name and raises an error if so.
Additionally, the internal name is renamed to "__pg__is_not_null" to
further reduce the risk of collision with user-defined names.
Reported-by: Евгений Горбанев <[email protected]>
Author: Richard Guo <[email protected]>
Reviewed-by: Alvaro Herrera <[email protected]>
Discussion: https://postgr.es/m/6bac9886-65bf-4cec-96bd-e304159f28db@basealt.ru
Backpatch-through: 15
Daniel Gustafsson [Tue, 13 May 2025 11:29:14 +0000 (07:29 -0400)]
Fix order of parameters in POD documentation
The documentation for log_check() had the parameters in the wrong
order. Also while there, rename %parameters to %params to better
match the documentation for similar functions which use %params. Backpatch
down to v14 where this was introduced.
Author: Daniel Gustafsson <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/9F503B5-32F2-45D7-A0AE-952879AD65F1@yesql.se
Backpatch-through: 14
Álvaro Herrera [Sun, 11 May 2025 13:47:10 +0000 (09:47 -0400)]
Fix comment of tsquerysend()
The comment describes the order in which fields are sent, and it had one
of the fields in the wrong place.
This has been wrong since e6dbcb72fafa (2008), so backpatch all the way
back.
Author: Emre Hasegeli <[email protected]>
Discussion: https://postgr.es/m/CAE2gYzzf38bR_R=izhpMxAmqHXKeM5ajkmukh4mNs_oXfxcMCA@mail.gmail.com
Tom Lane [Sun, 11 May 2025 00:22:39 +0000 (20:22 -0400)]
Fix incorrect "return NULL" in BumpAllocLarge().
This must be "return MemoryContextAllocationFailure(context, size, flags)"
instead. The effect of this oversight is that if we got a malloc
failure right here, the code would act as though MCXT_ALLOC_NO_OOM
had been specified, whether it was or not. That would likely lead
to a null-pointer-dereference crash at the unsuspecting call site.
Noted while messing with a patch to improve our Valgrind leak
detection support. Back-patch to v17 where this code came in.
Tom Lane [Fri, 9 May 2025 16:29:01 +0000 (12:29 -0400)]
Skip RSA-PSS ssl test when using LibreSSL.
Presently, LibreSSL does not have working support for RSA-PSS,
so disable that test. Per discussion at
https://marc.info/?l=libressl&m=174664225002441&w=2
they do intend to fix this, but it's a ways off yet.
Reported-by: Thomas Munro <[email protected]>
Author: Tom Lane <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
Backpatch-through: 15
Tom Lane [Fri, 9 May 2025 15:50:33 +0000 (11:50 -0400)]
Centralize ssl tests' check for whether we're using LibreSSL.
Right now there's only one caller, so that this is merely
an exercise in shoving code from one module to another,
but there will shortly be another one. It seems better to
avoid having two copies of this highly-subject-to-change test.
Back-patch to v15, where we first introduced some tests that
don't work with LibreSSL.
Reported-by: Thomas Munro <[email protected]>
Author: Tom Lane <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
Backpatch-through: 15
Daniel Gustafsson [Thu, 8 May 2025 11:53:16 +0000 (13:53 +0200)]
doc: Fix title markup for AT TIME ZONE and AT LOCAL
The title for AT TIME ZONE and AT LOCAL was accidentally wrapping the
"and" in the <literal> tag. Backpatch to v17 where it was introduced
in 97957fdbaa42.
Author: Noboru Saito <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Reviewed-by: Tatsuo Ishii <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/CAAM3qn+7QUWW9R6_YwPKXmky0xGE4n63U3EsxZeWE_QtogeU8g@mail.gmail.com
Backpatch-through: 17
Tom Lane [Mon, 5 May 2025 20:28:35 +0000 (16:28 -0400)]
Stamp 17.5.
Tom Lane [Mon, 5 May 2025 15:29:49 +0000 (11:29 -0400)]
Last-minute updates for release notes.
Security: CVE-2025-4207
Noah Misch [Mon, 5 May 2025 11:52:04 +0000 (04:52 -0700)]
With GB18030, prevent SIGSEGV from reading past end of allocation.
With GB18030 as source encoding, applications could crash the server via
SQL functions convert() or convert_from(). Applications themselves
could crash after passing unterminated GB18030 input to libpq functions
PQescapeLiteral(), PQescapeIdentifier(), PQescapeStringConn(), or
PQescapeString(). Extension code could crash by passing unterminated
GB18030 input to jsonapi.h functions. All those functions have been
intended to handle untrusted, unterminated input safely.
A crash required allocating the input such that the last byte of the
allocation was the last byte of a virtual memory page. Some malloc()
implementations take measures against that, making the SIGSEGV hard to
reach. Back-patch to v13 (all supported versions).
Author: Noah Misch <[email protected]>
Author: Andres Freund <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Backpatch-through: 13
Security: CVE-2025-4207
Noah Misch [Mon, 5 May 2025 11:52:04 +0000 (04:52 -0700)]
Refactor test_escape.c for additional ways of testing.
Start the file with static functions not specific to pe_test_vectors
tests. This way, new tests can use them without disrupting the file's
layout. Change report_result() PQExpBuffer arguments to plain strings.
Back-patch to v13 (all supported versions), for the next commit.
Reviewed-by: Masahiko Sawada <[email protected]>
Backpatch-through: 13
Security: CVE-2025-4207
Peter Eisentraut [Mon, 5 May 2025 10:14:36 +0000 (12:14 +0200)]
Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: ff466b83eb6fbf9434ff087426546e3dc988135d
Tom Lane [Sun, 4 May 2025 17:52:59 +0000 (13:52 -0400)]
Release notes for 17.5, 16.9, 15.13, 14.18, 13.21.
Etsuro Fujita [Sat, 3 May 2025 10:10:01 +0000 (19:10 +0900)]
Fix typos in comments.
Also adjust the phrasing in the comments.
Author: Etsuro Fujita <[email protected]>
Author: Heikki Linnakangas <[email protected]>
Reviewed-by: Tender Wang <[email protected]>
Reviewed-by: Gurjeet Singh <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/CAPmGK17%3DPHSDZ%2B0G6jcj12buyyE1bQQc3sbp1Wxri7tODT-SDw%40mail.gmail.com
Backpatch-through: 15
Álvaro Herrera [Fri, 2 May 2025 19:25:50 +0000 (21:25 +0200)]
Handle self-referencing FKs correctly in partitioned tables
For self-referencing foreign keys in partitioned tables, we weren't
handling creation of pg_constraint rows during CREATE TABLE PARTITION OF
as well as ALTER TABLE ATTACH PARTITION. This is an old bug -- mostly,
we broke this in 614a406b4ff1 while trying to fix it (so 12.13, 13.9,
14.6 and 15.0 and up all behave incorrectly). This commit reverts part
of that with additional fixes for full correctness, and installs more
tests to verify the parts we broke, not just the catalog contents but
also the user-visible behavior.
Backpatch to all live branches. In branches 13 and 14, commit
46a8c27a7226 changed the behavior during DETACH to drop a FK
constraint rather than trying to repair it, because the complete fix of
repairing catalog constraints was problematic due to lack of previous
fixes. For this reason, the test behavior in those branches is a bit
different. However, as best as I can tell, the fix works correctly
there.
In release notes we have to recommend that all self-referencing foreign
keys on partitioned tables be recreated if partitions have been created
or attached after the FK was created, keeping in mind that violating
rows might already be present on the referencing side.
Reported-by: Guillaume Lelarge <[email protected]>
Reported-by: Matthew Gabeler-Lee <[email protected]>
Reported-by: Luca Vallisa <[email protected]>
Discussion: https://postgr.es/m/CAECtzeWHCA+6tTcm2Oh2+g7fURUJpLZb-=pRXgeWJ-Pi+VU=_w@mail.gmail.com
Discussion: https://postgr.es/m/18156-a44bc7096f0683e6@postgresql.org
Discussion: https://postgr.es/m/CAAT=myvsiF-Attja5DcWoUWh21R12R-sfXECY2-3ynt8kaOqjw@mail.gmail.com
Tom Lane [Fri, 2 May 2025 19:12:49 +0000 (15:12 -0400)]
Doc: correct spelling of meson switch.
It's --auto-features not --auto_features.
Reported-by: Egor Chindyaskin <[email protected]>
Discussion: https://postgr.es/m/172465652540.862882.17808523044292761256@wrigleys.postgresql.org
Discussion: https://postgr.es/m/1979661.1746212726@sss.pgh.pa.us
Backpatch-through: 16
Tom Lane [Fri, 2 May 2025 16:35:36 +0000 (12:35 -0400)]
Doc: forgot to run add_commit_links.pl.
Tom Lane [Fri, 2 May 2025 16:27:01 +0000 (12:27 -0400)]
First-draft release notes for 17.5.
As usual, the release notes for other branches will be made by cutting
these down, but put them up for community review first.
Noah Misch [Thu, 1 May 2025 23:51:59 +0000 (16:51 -0700)]
Doc: stop implying recommendation of insecure search_path value.
SQL "SET search_path = 'pg_catalog, pg_temp'" is silently equivalent to
"SET search_path = pg_temp, pg_catalog, "pg_catalog, pg_temp"" instead
of the intended "SET search_path = pg_catalog, pg_temp". (The intent
was a two-element search path. With the single quotes, it instead
specifies one element with a comma and a space in the middle of the
element.) In addition to the SET statement, this affects SET clauses of
CREATE FUNCTION, ALTER ROLE, and ALTER DATABASE. It does not affect the
set_config() SQL function.
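The equivalence stated above can be reproduced with a short Python sketch. It models only the implicit-schema rule as described in this message (pg_catalog and pg_temp are searched first when the path does not name them); the function name is hypothetical and the schema list is a simplified stand-in for the real setting.

```python
def effective_search_path(elements):
    """Return the path as searched, with implicit schemas made explicit."""
    path = list(elements)
    if "pg_catalog" not in path:
        path.insert(0, "pg_catalog")  # implicitly searched before the path
    if "pg_temp" not in path:
        path.insert(0, "pg_temp")     # implicitly searched first of all
    return path


# Single-quoted value: one element that contains a comma and a space.
print(effective_search_path(["pg_catalog, pg_temp"]))
# Unquoted value: the intended two-element path.
print(effective_search_path(["pg_catalog", "pg_temp"]))
```

Because the quoted form names neither schema as an element of its own, both are added implicitly and pg_temp ends up first, which is the insecure ordering.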
Though the documentation did not show an insecure command, remove single
quotes that could entice a reader to write an insecure command.
Back-patch to v13 (all supported versions).
Reported-by: Sven Klemm <[email protected]>
Author: Sven Klemm <[email protected]>
Backpatch-through: 13
Dean Rasheed [Thu, 1 May 2025 10:06:21 +0000 (11:06 +0100)]
doc: Warn that ts_headline() output is not HTML-safe.
Add a documentation warning to ts_headline() pointing out that, when
working with untrusted input documents, the output is not guaranteed
to be safe for direct inclusion in web pages. This is because, while
it does remove some XML tags from the input, it doesn't remove all
HTML markup, and so the result may be unsafe (e.g., it might permit
XSS attacks).
To guard against that, all HTML markup should be removed from the
input, making it plain text, or the output should be passed through an
HTML sanitizer.
In addition, document precisely what the default text search parser
recognises as valid XML tags, since that's what determines which XML
tags ts_headline() will remove.
Reported-by: Richard Neill <[email protected]>
Author: Dean Rasheed <[email protected]>
Reviewed-by: Noah Misch <[email protected]>
Backpatch-through: 13
Tom Lane [Wed, 30 Apr 2025 15:13:49 +0000 (11:13 -0400)]
Update time zone data files to tzdata release 2025b.
DST law changes in Chile: there is a new time zone America/Coyhaique
for Chile's Aysén Region, to account for it changing to UTC-03
year-round and thus diverging from America/Santiago.
Historical corrections for Iran.
Backpatch-through: 13
Amit Kapila [Mon, 28 Apr 2025 05:52:07 +0000 (11:22 +0530)]
Fix xmin advancement during fast_forward decoding.
During logical decoding, we advance catalog_xmin of the logical slot too early in
fast_forward mode, resulting in required catalog data being removed by
vacuum. This mode is normally used to advance the slot without processing
the changes, but we still can't let the slot's xmin advance to an
incorrect value.
Commit f49a80c481 fixed a similar issue where the logical slot's
catalog_xmin was getting advanced prematurely during non-fast-forward
mode. During xl_running_xacts processing, instead of directly advancing
the slot's xmin to the oldest running xid in the record, it allowed the
xmin to be held back for snapshots that can be used for
not-yet-replayed transactions, as those might consider older txns as
running too. However, it missed the fact that the same problem can happen
during fast_forward mode decoding, as we won't build a base snapshot in
that mode, and the future call to get_changes from the same slot can miss
seeing the required catalog changes, leading to incorrect results.
This commit allows building the base snapshot even in fast_forward mode to
prevent the early advancement of xmin.
Reported-by: Amit Kapila <[email protected]>
Author: Zhijie Hou <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: shveta malik <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57163087F86621D44D9A72BF94BB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Amit Kapila [Fri, 25 Apr 2025 07:02:00 +0000 (12:32 +0530)]
Fix typo in test file name added in commit 4909b38af0.
Author: Shlok Kyal <[email protected]>
Backpatch-through: 13
Discussion: https://postgr.es/m/CANhcyEXsObdjkjxEnq10aJumDpa5J6aiPzgTh_w4KCWRYHLw6Q@mail.gmail.com
Tom Lane [Wed, 23 Apr 2025 20:04:42 +0000 (16:04 -0400)]
Avoid possibly-theoretical OOM crash hazard in hash_create().
One place in hash_create() used DynaHashAlloc() as a convenient
shorthand for MemoryContextAlloc(). That was fine when it was
written, but it stopped being fine when 9c911ec06 changed
DynaHashAlloc() to use MCXT_ALLOC_NO_OOM (mea culpa). Change
the code to call plain MemoryContextAlloc() as intended.
I think that this bug may be unreachable in practice, since we now
always create AllocSets with some space already allocated, so that
an OOM failure here for a non-shared hash table should be impossible
(with a hash table name of reasonable length anyway). And there
aren't enough shared hash tables to make a crash for one of those
probable. Nonetheless it's clearly not operating as designed, so
back-patch to v16 where 9c911ec06 came in.
Reported-by: Maksim Korotkov <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/219bdccd460510efaccf90b57e5e5ef2@postgrespro.ru
Backpatch-through: 16
Amit Kapila [Wed, 23 Apr 2025 05:22:36 +0000 (10:52 +0530)]
Fix an oversight in 3f28b2fcac.
Commit 3f28b2fcac tried to ensure that the replication origin shouldn't be
advanced in case of an ERROR in the apply worker, so that it can request
the same data again after restart. However, it is possible that an ERROR
was caught and handled by a (say PL/pgSQL) function, and the apply worker
continues to apply further changes, in which case, we shouldn't reset the
replication origin.
Reset the origin only when the apply worker exits after an ERROR.
Commit 3f28b2fcac added a new function, geterrlevel, which we removed in HEAD
as part of this commit but kept in the back branches to avoid breaking any
applications. A separate case can be made to have such a function even for
HEAD.
Reported-by: Shawn McCoy <[email protected]>
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CALsgZNCGARa2mcYNVTSj9uoPcJo-tPuWUGECReKpNgTpo31_Pw@mail.gmail.com
Michael Paquier [Wed, 23 Apr 2025 04:54:53 +0000 (13:54 +0900)]
Remove assertion based on pending_since in pgstat_report_stat()
This assertion, based on pending_since (a timestamp used to prevent stats
reports from being too frequent, or when a partial flush happens), is reached
when it is found that no data can be flushed but a previous call of
pgstat_report_stat() determined that some stats data has been found as
in need of a flush. So pending_since is set when some stats data is
pending (in non-force mode) or if report attempts are too frequent, and
reset to 0 once all stats have been flushed.
Since 5cbbe70a9cc6, WAL senders have begun to report their stats on a
periodic basis for IO stats in v16~ and backend stats on HEAD, creating
some friction with the concurrent pgstat_report_stat() calls that can
happen in the context of a WAL sender (shutdown callback doing a final
report or backend-related code paths). This problem is the cause of
spurious failures in the TAP tests.
In theory, this assertion can also be reached in v15, even if that's
very unlikely. For example, a process, say a background worker, could
do periodic and direct stats flushes with concurrent calls of
pgstat_report_stat() that could cause conflicting values of
pending_since. This can be done with WAL or SLRU stats flushes using
pgstat_flush_wal() or pgstat_slru_flush(). HEAD makes this situation
easier to happen with custom cumulative stats.
This commit removes the assertion altogether, per discussion, as it is
more useful to keep the state of things as they are for the WAL sender.
The assertion could use a special state based on for example
am_walsender, but I doubt that this would be meaningful in the long run
based on the other arguments raised while discussing this issue.
Reported-by: Tom Lane <[email protected]>
Reported-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/1489124.1744685908@sss.pgh.pa.us
Discussion: https://postgr.es/m/dwrkeszz6czvtkxzr5mqlciy652zau5qqnm3cp5f3p2po74ppk@omg4g3cc6dgq
Backpatch-through: 15
Michael Paquier [Tue, 22 Apr 2025 03:41:58 +0000 (12:41 +0900)]
doc: Mention naming convention used by injection points
All the injection points used in the tree have relied on an implied
rule: their names should be made of lower-case characters, with dashes
between the words used.
This commit adds a light mention about that in the docs, encouraging the
practice.
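As a rough illustration of the convention (not code from the tree), a name
check could look like the sketch below. The regular expression and the helper
name is_conventional_point_name are assumptions made for this example; the
docs only describe the rule informally, and allowing digits is a guess.

```python
import re

# Hypothetical pattern: lower-case words separated by single dashes.
# Digits are allowed here as an assumption; the stated rule only
# mentions lower-case characters and dashes.
_POINT_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_conventional_point_name(name: str) -> bool:
    """Return True if name follows the lower-case-with-dashes style."""
    return _POINT_NAME.fullmatch(name) is not None
```

A name mixing upper-case letters or underscores with dashes would be rejected
by this check, while a name such as "some-point-name" would pass.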
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Aleksander Alekseev <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
David Rowley [Tue, 22 Apr 2025 02:56:39 +0000 (14:56 +1200)]
Doc: reword text explaining the --maintenance-db option
The previous text was a little clumsy. Here we improve that.
Author: David Rowley <[email protected]>
Reported-by: Noboru Saito <[email protected]>
Reviewed-by: David G. Johnston <[email protected]>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 13
Michael Paquier [Tue, 22 Apr 2025 01:01:42 +0000 (10:01 +0900)]
Rename injection point for invalidation messages at end of transaction
This injection point was named "AtEOXact_Inval-with-transInvalInfo", not
respecting the implied naming convention that injection points should
use lower-case characters, with terms separated by dashes. All the
other points defined in the tree follow this style, so let's be more
consistent.
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Aleksander Alekseev <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
David Rowley [Mon, 21 Apr 2025 23:04:44 +0000 (11:04 +1200)]
Doc: fix incorrect punctuation
Author: Noboru Saito <[email protected]>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 17
Noah Misch [Sun, 20 Apr 2025 15:28:48 +0000 (08:28 -0700)]
Test restartpoints in archive recovery.
v14 commit 1f95181b44c843729caaa688f74babe9403b5850 and its v13
equivalent caused timing-dependent failures in archive recovery, at
restartpoints. The symptom was "invalid magic number 0000 in log
segment X, offset 0", "unexpected pageaddr X in log segment Y, offset 0"
[X < Y], or an assertion failure. Commit 3635a0a35aafd3bfa80b7a809bc6e91ccd36606a
and predecessors back-patched
v15 changes to fix that. This test reproduces the problem
probabilistically, typically in less than 1000 iterations of the test.
Hence, buildfarm and CI runs would have surfaced enough failures to get
attention within a day.
Reported-by: Arun Thirupathi <[email protected]>
Discussion: https://postgr.es/m/20250306193013[email protected]
Backpatch-through: 13
Noah Misch [Sun, 20 Apr 2025 15:28:48 +0000 (08:28 -0700)]
Avoid ERROR at ON COMMIT DELETE ROWS after relhassubclass=f.
Commit 7102070329d8147246d2791321f9915c3b5abf31 fixed a similar bug, but
it missed the case of database-wide ANALYZE ("use_own_xacts" mode).
Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed consequences
from silent discard of a pg_class stats (relpages et al.) update to
ERROR "tuple to be updated was already modified". Losing a relpages
update of an ON COMMIT DELETE ROWS table was negligible, but a
COMMIT-time error isn't negligible. Back-patch to v13 (all supported
versions).
Reported-by: Richard Guo <[email protected]>
Reported-by: Robins Tharakan <[email protected]>
Discussion: https://postgr.es/m/CAMbWs4-XwMKMKJ_GT=p3_-_=j9rQSEs1FbDFUnW9zHuKPsPNEQ@mail.gmail.com
Backpatch-through: 13
David Rowley [Sun, 20 Apr 2025 10:12:37 +0000 (22:12 +1200)]
Fix issue with ORDER BY / DISTINCT aggregates and FILTER
1349d2790 added support so that aggregate functions with an ORDER BY or
DISTINCT clause could make use of presorted inputs to avoid an implicit
sort within nodeAgg.c. That commit failed to consider that a FILTER
clause may exist that filters rows before the aggregate function
arguments are evaluated. That can be problematic if an aggregate
argument contains an expression which could error out during evaluation.
It's perfectly valid to want to have a FILTER clause which eliminates
such values, and with the pre-sorted path added in 1349d2790, it was
possible that the planner would produce a plan with a Sort node above
the Aggregate to perform the sort on the aggregate's arguments long before
the Aggregate node would filter out the non-matching values.
Here we fix this by inspecting ORDER BY / DISTINCT aggregate functions
which have a FILTER clause to see if the aggregate's arguments are
anything more complex than a Var or a Const. Evaluating these isn't
going to cause an error. If we find any non-Var, non-Const parameters
then the planner will now opt to perform the sort in the Aggregate node
for these aggregates, i.e. disable the presorted aggregate optimization.
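The hazard can be modelled outside the executor. The following Python sketch
is an illustration only, with made-up rows and a hypothetical argument
expression (100 // x); it contrasts evaluating the aggregate argument for
every row before filtering, which is what a Sort running ahead of the
Aggregate effectively does, with filtering first.

```python
rows = [4, 0, 5, 2]  # made-up input; 0 makes the argument error out


def arg(x: int) -> int:
    """Hypothetical aggregate argument that can fail during evaluation."""
    return 100 // x


def keep(x: int) -> bool:
    """Hypothetical FILTER condition that removes the dangerous row."""
    return x != 0


def presorted_then_filter(values):
    # Models a Sort on the argument running before the FILTER is applied:
    # arg() is evaluated for every row, including rows FILTER would drop.
    ordered = sorted(values, key=arg)
    return [arg(v) for v in ordered if keep(v)]


def filter_then_sort(values):
    # Models sorting inside the aggregate: FILTER runs first, so arg()
    # never sees the rows it would fail on.
    return sorted(arg(v) for v in values if keep(v))
```

With these rows, presorted_then_filter() raises ZeroDivisionError while
filter_then_sort() completes, which mirrors why the presorted path is now
skipped when the arguments are more complex than a Var or a Const.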
An alternative fix would have been to completely disallow the presorted
optimization for Aggrefs with any FILTER clause, but that wasn't done as
that could cause large performance regressions for queries that see
significant gains from 1349d2790 due to presorted results coming in from
an Index Scan.
Backpatch to 16, where 1349d2790 was introduced.
Author: David Rowley <[email protected]>
Reported-by: Kaimeh <[email protected]>
Diagnosed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CAK-%2BJz9J%3DQ06-M7cDJoPNeYbz5EZDqkjQbJnmRyQyzkbRGsYkA%40mail.gmail.com
Backpatch-through: 16
Tom Lane [Sat, 19 Apr 2025 20:37:42 +0000 (16:37 -0400)]
Be more wary of corrupt data in pageinspect's heap_page_items().
The original intent in heap_page_items() was to return nulls, not
throw an error or crash, if an item was sufficiently corrupt that
we couldn't safely extract data from it. However, commit d6061f83a
utterly missed that memo, and not only put in an un-length-checked
copy of the tuple's data section, but also managed to break the check
on sane nulls-bitmap length. Either mistake could possibly lead to
a SIGSEGV crash if the tuple is corrupt.
Bug: #18896
Reported-by: Dmitry Kovalenko <[email protected]>
Author: Dmitry Kovalenko <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/18896-add267b8e06663e3@postgresql.org
Backpatch-through: 13
Tatsuo Ishii [Fri, 18 Apr 2025 00:35:35 +0000 (09:35 +0900)]
Doc: fix missing comma at the end of a line.
Backpatch to 17, where the line was added.
Reported by Noboru Saito while he was working on translating the file
into Japanese.
Discussion: https://postgr.es/m/20250417.203047.1321297410457834775.ishii%40postgresql.org
Reported-by: Noboru Saito <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Backpatch-through: 17
Tom Lane [Wed, 16 Apr 2025 17:31:44 +0000 (13:31 -0400)]
Fix pg_dump --clean with partitioned indexes.
We'd try to drop the partitions of a partitioned index separately,
which is disallowed by the backend, leading to an error during
restore. While the error is harmless, it causes problems if you
try to use --single-transaction mode.
Fortunately, there seems to be no need to do a DROP at all, since the
partition will go away silently when we drop either the parent index
or the partition's table. So just make the DROP conditional on not
being a partition.
Reported-by: jian he <[email protected]>
Author: jian he <[email protected]>
Reviewed-by: Pavel Stehule <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com
Backpatch-through: 13
Tom Lane [Tue, 15 Apr 2025 16:08:34 +0000 (12:08 -0400)]
Fix failure for generated column with a not-null domain constraint.
If a GENERATED column is declared to have a domain data type where
the domain's constraints disallow null values, INSERT commands failed
because we built a targetlist that included coercing a null constant
to the domain's type. The failure occurred even when the generated
value would have been perfectly OK. This is adjacent to the issues
fixed in 0da39aa76, but we didn't notice for lack of testing a domain
with such a constraint.
We aren't going to use the result of the targetlist entry for the
generated column --- ExecComputeStoredGenerated will overwrite it.
So it's not really necessary that it have the exact datatype of
the generated column. This patch fixes the problem by changing
the targetlist entry to be a null Const of the domain's base type,
which should be sufficiently legal. (We do have to tweak
ExecCheckPlanOutput to accept the situation, though.)
This has been broken since we implemented generated columns.
However, this patch only applies easily as far back as v14, partly
because I (tgl) only carried 0da39aa76 back that far, but mostly
because v14 significantly refactored the handling of INSERT/UPDATE
targetlists. Given the lack of field complaints and the short
remaining support lifetime of v13, I judge the cost-benefit ratio
not good for devising a version that would work in v13.
Reported-by: jian he <[email protected]>
Author: jian he <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CACJufxG59tip2+9h=rEv-ykOFjt0cbsPVchhi0RTij8bABBA0Q@mail.gmail.com
Backpatch-through: 14
Fujii Masao [Tue, 15 Apr 2025 14:15:06 +0000 (23:15 +0900)]
doc: Fix missing whitespace in pg_restore documentation.
Previously, a space was missing between "<option>--exclude-schema</option>"
and "for" in the pg_restore documentation. This commit fixes the typo by
adding the missing whitespace.
Back-patch to v17 where the typo was added.
Author: Lele Gaifax <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 17
Daniel Gustafsson [Tue, 15 Apr 2025 13:27:08 +0000 (15:27 +0200)]
pg_combinebackup: Fix incorrect code documentation
The code comment for parse_oid accidentally used the wrong parameter
when referring to the location of the last backup. Also, while there,
improve sentence wording by removing a superfluous word.
Backpatch to v17 where pg_combinebackup was added.
Author: Amul Sul <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b95ecWgzcS4K3Dx0E_Yp-SLwK5JBasFgioKMSjhQLw9xvg@mail.gmail.com
Backpatch-through: 17
Tom Lane [Sat, 12 Apr 2025 16:27:46 +0000 (12:27 -0400)]
Fix GIN's shimTriConsistentFn to not corrupt its input.
Commit 0f21db36d made an assumption that GIN triConsistentFns
would not modify their input entryRes[] arrays. But in fact,
the "shim" triConsistentFn that we use for opclasses that don't
supply their own did exactly that, potentially leading to wrong
answers from a GIN index search. Through bad luck, none of the
test cases that we have for such opclasses exposed the bug.
One response to this could be that the assumption of consistency check
functions not modifying entryRes[] arrays is a bad one, but it still
seems reasonable to me. Notably, shimTriConsistentFn is itself
assuming that with respect to the underlying boolean consistentFn,
so it's sure being self-centered in supposing that it gets to do so.
Fortunately, it's quite simple to fix shimTriConsistentFn to restore
the entry-time state of entryRes[], so let's do that instead.
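The idea of restoring the entry-time state can be sketched in a few lines of
Python. This is a simplified model, not the C code: the constants, the
function name, and the exhaustive enumeration of MAYBE entries are assumptions
chosen for illustration.

```python
from itertools import product

GIN_FALSE, GIN_TRUE, GIN_MAYBE = 0, 1, 2


def shim_tri_consistent(bool_consistent, entry_res):
    """Derive a ternary answer from a boolean consistency function.

    Simplified model: try every TRUE/FALSE assignment of the MAYBE
    entries, then put the caller's array back the way it was.
    """
    maybe_idx = [i for i, v in enumerate(entry_res) if v == GIN_MAYBE]
    seen = set()
    try:
        for combo in product((GIN_FALSE, GIN_TRUE), repeat=len(maybe_idx)):
            for i, v in zip(maybe_idx, combo):
                entry_res[i] = v  # temporarily overwrite the input
            seen.add(bool(bool_consistent(entry_res)))
    finally:
        for i in maybe_idx:
            entry_res[i] = GIN_MAYBE  # restore the entry-time state
    if len(seen) == 2:
        return GIN_MAYBE
    return GIN_TRUE if seen.pop() else GIN_FALSE
```

Without the restore step in the finally block, a caller that reuses entry_res
after the call would see TRUE or FALSE where it had passed MAYBE, which is the
kind of input corruption described above.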
This issue doesn't affect any core GIN opclasses, since they all
supply their own triConsistentFns. It does affect contrib modules
btree_gin, hstore, and intarray.
Along the way, I (tgl) noticed that shimTriConsistentFn failed to
pick up on a "recheck" flag returned by its first call to the boolean
consistentFn. This may be only a latent problem, since it would be
unlikely for a consistentFn to set recheck for the all-false case
and not any other cases. (Indeed, none of our contrib modules do
that.) Nonetheless, it's formally wrong.
Reported-by: Vinod Sridharan <[email protected]>
Author: Vinod Sridharan <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com
Backpatch-through: 13
Michael Paquier [Fri, 11 Apr 2025 01:02:15 +0000 (10:02 +0900)]
Fix race with synchronous_standby_names at startup
synchronous_standby_names cannot be reloaded safely by backends, and the
checkpointer is in charge of updating a state in shared memory if the
GUC is enabled in WalSndCtl, to let the backends know if they should
wait or not for a given LSN. This provides a strict control on the
timing of the waiting queues if the GUC is enabled or disabled, then
reloaded. The checkpointer is also in charge of waking up the backends
that could be waiting for a LSN when the GUC is disabled.
This logic had a race condition at startup, where it would be possible
for backends to not wait for a LSN even if synchronous_standby_names is
enabled. This would cause visibility issues with transactions that we
should have been waiting for but were not. The problem lasts until the
checkpointer does its initial update of the shared memory state when it
loads synchronous_standby_names.
In order to take care of this problem, the shared memory state in
WalSndCtl is extended to detect if it has been initialized by the
checkpointer, and not only check if synchronous_standby_names is
defined. In WalSndCtlData, sync_standbys_defined is renamed to
sync_standbys_status, a bits8 able to know about two states:
- If the shared memory state has been initialized. This flag is set by
the checkpointer at startup once, and never removed.
- If synchronous_standby_names is known as defined in the shared memory
state. This is the same as the previous sync_standbys_defined in
WalSndCtl.
This method gives a way for backends to decide what they should do until
the shared memory area is initialized, and they now ultimately fall back
to a check on the GUC value in this case, which is the best thing that
can be done.
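The two-state flag and the fallback can be sketched as follows. This is a
Python model for illustration, not the C implementation; the flag names, their
bit values, and both helper functions are assumptions of this sketch.

```python
# Flag names and values are illustrative assumptions for this sketch.
SYNC_STANDBY_INIT = 1 << 0      # checkpointer has initialized the state
SYNC_STANDBY_DEFINED = 1 << 1   # synchronous_standby_names is defined


def checkpointer_update(guc_value: str) -> int:
    """Compute the status the checkpointer would publish."""
    status = SYNC_STANDBY_INIT
    if guc_value.strip():
        status |= SYNC_STANDBY_DEFINED
    return status


def backend_should_wait(status: int, local_guc_value: str) -> bool:
    """Decide whether a backend waits for synchronous replication."""
    if status & SYNC_STANDBY_INIT:
        # Shared state is authoritative once initialized.
        return bool(status & SYNC_STANDBY_DEFINED)
    # Not initialized yet: fall back to the backend's own GUC value.
    return bool(local_guc_value.strip())
```

Before the checkpointer's first update the status is 0, so the decision
follows the backend's own view of the GUC; afterwards the shared flags win.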
Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by
the checkpointer when this process starts, so the window is very narrow.
It is possible to enlarge the problematic window by making the
checkpointer wait at the beginning of SyncRepUpdateSyncStandbysDefined()
with a hardcoded sleep for example, and doing so has shown that a 2PC
visibility test is indeed failing. On sufficiently slow machines, this bug
would cause spurious failures.
In 17~, we have looked at the possibility of adding an injection point
to have a reproducible test, but as the problematic window happens at
early startup, we would need to invent a way to make an injection point
optionally persistent across restarts when attached, something that
would be fine for this case as it would involve the checkpointer. This
issue is quite old, and can be reproduced on all the stable branches.
Author: Melnikov Maksim <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru
Backpatch-through: 13
Tom Lane [Thu, 10 Apr 2025 18:49:10 +0000 (14:49 -0400)]
Doc: remove long-obsolete advice about generated constraint names.
It's been twenty years since we generated constraint names that
look like "$N". So this advice about double-quoting such names
is well past its sell-by date, and now it merely seems confusing.
Reported-by: Yaroslav Saburov <[email protected]>
Author: "David G. Johnston" <[email protected]>
Discussion: https://postgr.es/m/174393459040.678.17810152410419444783@wrigleys.postgresql.org
Backpatch-through: 13
Amit Kapila [Thu, 10 Apr 2025 07:27:10 +0000 (12:57 +0530)]
Fix data loss in logical replication.
Data loss can happen when DDLs like ALTER PUBLICATION ... ADD TABLE ...
or ALTER TYPE ... that don't take a strong lock on the table happen
concurrently with DMLs on the tables involved in the DDL. This happens
because logical decoding doesn't distribute invalidations to concurrent
transactions and those transactions use stale cache data to decode the
changes. The problem becomes bigger because we keep using the stale cache
even after those in-progress transactions are finished and skip the
changes required to be sent to the client.
This commit fixes the issue by distributing invalidation messages from
catalog-modifying transactions to all concurrent in-progress transactions.
This allows the necessary rebuild of the catalog cache when decoding new
changes after concurrent DDL.
We observed a performance regression primarily during frequent execution of
*publication DDL* statements that modify the published tables. The
regression is minor or nearly nonexistent for DDLs that do not affect the
published tables or occur infrequently, making this a worthwhile cost to
resolve a longstanding data loss issue.
An alternative approach considered was to take a strong lock on each
affected table during publication modification. However, this would only
address issues related to publication DDLs (but not the ALTER TYPE ...)
and require locking every relation in the database for publications
created as FOR ALL TABLES, which is impractical.
The bug exists in all supported branches, but we are backpatching till 14.
The fix for 13 requires somewhat bigger changes than this fix, so the fix
for that branch is still under discussion.
Reported-by: hubert depesz lubaczewski <[email protected]>
Reported-by: Tomas Vondra <[email protected]>
Author: Shlok Kyal <[email protected]>
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Zhijie Hou <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Tested-by: Benoit Lobréau <[email protected]>
Backpatch-through: 14
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com
Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
Noah Misch [Wed, 9 Apr 2025 14:23:39 +0000 (07:23 -0700)]
Fix test races between syscache-update-pruned.spec and autovacuum.
This spec fails ~3% of my Valgrind runs, and the spec has failed on Valgrind
buildfarm member skink at a similar rate. Two problems contributed to that:
- A competing buffer pin triggered VACUUM's lazy_scan_noprune() path, causing
"tuples missed: 1 dead from 1 pages not removed due to cleanup lock
contention". FREEZE fixes that.
- The spec ran lazy VACUUM immediately after VACUUM FULL. The spec implicitly
assumed lazy VACUUM prunes the one tuple that VACUUM FULL made dead. First
wait for old snapshots, making that assumption reliable.
This also adds two forms of defense in depth:
- Wait for snapshots using shared catalog pruning rules (VISHORIZON_SHARED).
This avoids the removable cutoff moving backward when an XID-bearing
autoanalyze process runs in another database. That may never happen in this
test, but it's cheap insurance.
- Use lazy VACUUM option DISABLE_PAGE_SKIPPING. Commit c2dc1a79767a0f947e1145f82eb65dfe4360d25f
did this for a related requirement
in other tests, but I suspect FREEZE is necessary and sufficient in all
these tests.
Back-patch to v17, where the test first appeared.
Reported-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/sv3taq4e6ea4qckimien3nxp3sz4b6cw6sfcy4nhwl52zpur4g@h6i6tohxmizu
Backpatch-through: 17
Amit Kapila [Tue, 8 Apr 2025 03:53:07 +0000 (09:23 +0530)]
Stabilize 035_standby_logical_decoding.pl.
Some tests try to invalidate logical slots on the standby server by
running VACUUM on the primary. The problem is that xl_running_xacts was
getting generated and replayed before the VACUUM command, leading to the
advancement of the active slot's catalog_xmin. Due to this, active slots
were not getting invalidated, leading to test failures.
We fix it by skipping the generation of xl_running_xacts for the required
tests with the help of injection points. As the required interface for
injection points was not present in back branches, we fixed the failing
tests in them by disallowing the slot to become active for the required
cases (where rows_removed conflict could be generated).
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Bertrand Drouvot <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/[email protected]
Bruce Momjian [Tue, 8 Apr 2025 01:33:41 +0000 (21:33 -0400)]
Fix PG 17 [NOT] NULL optimization bug for domains
A PG 17 optimization allowed columns with NOT NULL constraints to skip
table scans for IS NULL queries, and to skip IS NOT NULL checks for IS
NOT NULL queries. This didn't work for domain types, since domain types
don't follow the IS NULL/IS NOT NULL constraint logic. To fix, disable
this optimization for domains for PG 17+.
Reported-by: Jan Behrens
Diagnosed-by: Tom Lane
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 17
Michael Paquier [Mon, 7 Apr 2025 22:58:47 +0000 (07:58 +0900)]
Flush the IO statistics of active WAL senders more frequently
WAL senders do not flush their statistics until they exit, limiting the
monitoring possible for live processes. This is a problem when WAL
senders are running for a long time, like in streaming or logical
replication setups, because it is not possible to know the amount of IO
they generate while running.
This commit makes WAL senders more aggressive with their statistics
flush, using an interval of 1 second, with the flush timing calculated
based on the existing GetCurrentTimestamp() done before the sleeps done
to wait for some activity. Note that the sleep done for logical and
physical WAL senders happens in two different code paths, so the stats
flushes need to happen in these two places.
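The timing logic amounts to a rate-limited flush driven by a timestamp the
caller already has. The Python sketch below models that idea only; the class
name, the millisecond clock, and the callback are assumptions for
illustration and do not correspond to functions in the tree.

```python
FLUSH_INTERVAL_MS = 1000  # the 1-second interval mentioned above


class StatsFlusher:
    """Rate-limits flushes using a timestamp the caller already has."""

    def __init__(self, flush_fn, start_ms: int = 0):
        self._flush_fn = flush_fn
        self._last_flush_ms = start_ms

    def maybe_flush(self, now_ms: int) -> bool:
        """Flush if at least FLUSH_INTERVAL_MS passed since the last flush."""
        if now_ms - self._last_flush_ms < FLUSH_INTERVAL_MS:
            return False
        self._flush_fn()
        self._last_flush_ms = now_ms
        return True
```

Because the timestamp is passed in rather than fetched, a loop that already
reads the clock before sleeping pays no extra cost for the check.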
One test is added for the physical WAL sender case, and one for the
logical WAL sender case. This can be done in a stable fashion by
relying on the WAL generated by the TAP tests in combination with a
stats reset while a server is running, but only on HEAD as WAL data has
been added to pg_stat_io in a051e71e28a1.
This issue has existed since a9c70b46dbe and the introduction of pg_stat_io,
so backpatch down to v16.
Author: Bertrand Drouvot <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Reviewed-by: Xuneng Zhou <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16
Daniel Gustafsson [Sun, 6 Apr 2025 22:03:18 +0000 (00:03 +0200)]
doc: Clarify project naming
Clarify the project naming in the history section of the docs
to match the recent license preamble changes.
Backpatch to all supported versions.
Author: Dave Page <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CA+OCxozLzK2+Jc14XZyWXSp6L9Ot+3efwXUE35FJG=fsbib2EA@mail.gmail.com
Backpatch-through: 13
Jeff Davis [Sun, 6 Apr 2025 16:13:43 +0000 (09:13 -0700)]
Fix unintentional 'NULL' string literal in pg_upgrade.
Introduced in 2a083ab807.
Note: backport of commit 945126234b, which was missed at the time.
Discussion: https://postgr.es/m/e852442da35b4f31acc600ed98bbee0f12e65e0c[email protected]
Reviewed-by: Michael Paquier <[email protected]>
Backpatch-through: 16
Tom Lane [Sat, 5 Apr 2025 19:01:33 +0000 (15:01 -0400)]
Fix parse_cte.c's failure to examine sub-WITHs in DML statements.
makeDependencyGraphWalker thought that only SelectStmt nodes could
contain a WithClause. Which was true in our original implementation
of WITH, but astonishingly we missed updating this code when we added
the ability to attach WITH to INSERT/UPDATE/DELETE (and later MERGE).
Moreover, since it was coded to deliberately block recursion to a
WithClause, even updating raw_expression_tree_walker didn't save it.
The upshot of this was that we didn't see references to outer CTE
names appearing within an inner WITH, and would neither complain about
disallowed recursion nor account for such references when sorting CTEs
into a usable order. The lack of complaints about this is perhaps not
so surprising, because typical usage of WITH wouldn't hit either case.
Still, it's pretty broken; failing to detect recursion here leads to
assert failures or worse later on.
Fix by factoring out the processing of sub-WITHs into a new function
WalkInnerWith, and invoking that for all the statement types that
can have WITH.
Bug: #18878
Reported-by: Yu Liang <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/18878-a26fa5ab6be2f2cf@postgresql.org
Backpatch-through: 13
Tom Lane [Sat, 5 Apr 2025 16:13:35 +0000 (12:13 -0400)]
Avoid double transformation of json_array()'s subquery.
transformJsonArrayQueryConstructor() applied transformStmt() to
the same subquery tree twice. While this causes no issue in many
cases, there are some where it causes a coredump, thanks to the
parser's habit of scribbling on its input.
Fix by making a copy before the first transformation (compare 0f43083d1).
This is quite brute-force, but then so is the
whole business of transforming the input twice. Per discussion
in the bug thread, this implementation of json_array() parsing
should be replaced completely. But that will take some work
and will surely not be back-patchable, so for the moment let's
take the easy way out.
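Why copying first helps can be shown with a toy model. In the Python sketch
below, transform() is a hypothetical stand-in for a transformation that
modifies its input; none of these names come from the parser.

```python
import copy


def transform(tree: dict) -> dict:
    """Hypothetical transformation that scribbles on its input."""
    if tree.get("transformed"):
        raise RuntimeError("tree was already transformed")
    tree["transformed"] = True  # mutates the caller's tree
    return {"query": tree["sql"]}


def transform_twice_safely(tree: dict):
    # Copy before the first pass so the second pass sees pristine input.
    first = transform(copy.deepcopy(tree))
    second = transform(tree)
    return first, second
```

Calling transform() twice on the same dict fails on the second call, while
transform_twice_safely() succeeds at the cost of one deep copy.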
Oversight in 7081ac46a. Back-patch to v16 where that came in.
Bug: #18877
Reported-by: Yu Liang <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/18877-c3c3ad75845833bb@postgresql.org
Backpatch-through: 16
Tom Lane [Sat, 5 Apr 2025 00:11:48 +0000 (20:11 -0400)]
Repair misbehavior with duplicate entries in FK SET column lists.
Since v15 we've had an option to apply a foreign key constraint's
ON DELETE SET DEFAULT or SET NULL action to just some of the
referencing columns. There was not a check for duplicate entries in
the list of columns-to-set, though. That caused a potential memory
stomp in CreateConstraintEntry(), which incautiously assumed that
the list of columns-to-set couldn't be longer than the number of key
columns. Even after fixing that, the case doesn't work because you
get an error like "multiple assignments to same column" from the SQL
command that is generated to do the update.
We could either raise an error for duplicate columns or silently
suppress the dups, and after a bit of thought I chose to do the
latter. This is motivated by the fact that duplicates in the FK
column list are legal, so it's not real clear why duplicates
in the columns-to-set list shouldn't be. Of course there's no
need to actually set the column more than once.
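Silently suppressing the duplicates is an order-preserving de-duplication. A
minimal Python sketch of that step follows; the function name and the use of
column names rather than attribute numbers are assumptions for illustration.

```python
def dedup_set_columns(columns):
    """Drop repeated column names, keeping first-seen order."""
    seen = set()
    result = []
    for col in columns:
        if col not in seen:
            seen.add(col)
            result.append(col)
    return result
```

After this step the list can no longer be longer than the set of distinct
referencing columns, and no column is assigned twice in the generated update.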
I left in the fix in CreateConstraintEntry() too, just because
it didn't seem like such low-level code ought to be making
assumptions about what it's handed.
Bug: #18879
Reported-by: Yu Liang <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/18879-259fc59d072bd4d7@postgresql.org
Backpatch-through: 15
Heikki Linnakangas [Fri, 4 Apr 2025 10:49:00 +0000 (13:49 +0300)]
Relax assertion in finding correct GiST parent
Commit 28d3c2ddcf introduced an assertion that if the memorized
downlink location in the insertion stack isn't valid, the parent's
LSN should've changed too. Turns out that was too strict. In
gistFindCorrectParent(), if we walk right, we update the parent's
block number and clear its memorized 'downlinkoffnum'. That triggered
the assertion on next call to gistFindCorrectParent(), if the parent
needed to be split too. Relax the assertion, so that it's OK if
downlinkOffnum is InvalidOffsetNumber.
Backpatch to v13-, all supported versions. The assertion was added in
commit 28d3c2ddcf in v12.
Reported-by: Alexander Lakhin <[email protected]>
Reviewed-by: Tender Wang <[email protected]>
Discussion: https://www.postgresql.org/message-id/18396-03cac9beb2f7aac3@postgresql.org
Daniel Gustafsson [Fri, 4 Apr 2025 07:47:36 +0000 (09:47 +0200)]
doc: Clarify the system value for sslrootcert
The documentation for the special value "system" for sslrootcert could
be misinterpreted to mean the default operating system CA store, which
it may be, but it's defined to be the default CA store of the SSL lib
used.
Backpatch down to v16 where support for the system value was added.
Author: Daniel Gustafsson <[email protected]>
Reviewed-by: George MacKerron <[email protected]>
Discussion: https://postgr.es/m/B3CBBAA3-6EA3-4AB7-8619-4BBFAB93DDB4@yesql.se
Backpatch-through: 16
Fujii Masao [Fri, 4 Apr 2025 04:32:46 +0000 (13:32 +0900)]
Fix logical decoding test to correctly check slot removal on standby.
The regression test for logical decoding verifies whether a logical slot
is correctly dropped on a standby when its associated database is dropped.
However, the test mistakenly retrieved slot information from the primary
instead of the standby, causing incorrect behavior.
This commit fixes the issue by ensuring the test correctly checks the slot
on the standby.
Back-patch to all supported versions.
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/1fdfd020-a509-403c-bd8f-a04664aba148@oss.nttdata.com
Backpatch-through: 13
Fujii Masao [Fri, 4 Apr 2025 04:09:06 +0000 (13:09 +0900)]
Fix logical decoding regression tests to correctly check slot existence.
The regression tests for logical decoding verify whether a logical slot
exists or has been dropped. Previously, these tests attempted to
retrieve "slot_name" from the result of slot(), but since "slot_name" was
not included in the result, slot()->{'slot_name'} always returned undef,
leading to incorrect behavior.
This commit fixes the issue by checking the "plugin" field in the result
of slot() instead, ensuring the tests properly verify slot existence.
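The failure mode can be sketched in Python, with a dict standing in for the Perl hash returned by slot(). The field names other than "plugin" and the convention that a missing slot yields empty field values are assumptions made for this illustration.

```python
def slot_gone_buggy(slot_row):
    # "slot_name" is never part of the row, so this lookup always yields
    # None and the check reports "gone" even for a live slot.
    return slot_row.get("slot_name") is None


def slot_gone_fixed(slot_row):
    # "plugin" is part of the row; it is assumed to be empty only when
    # the slot does not exist.
    return not slot_row.get("plugin")


# Hypothetical rows for a live slot and a dropped slot.
LIVE_SLOT = {"plugin": "test_decoding", "slot_type": "logical"}
DROPPED_SLOT = {"plugin": "", "slot_type": ""}
```

The buggy check can never fail, which is why the tests passed regardless of whether the slot had been dropped.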
Back-patch to all supported versions.
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB149667EC4E738769CA80B7EA5F5AE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
Masahiko Sawada [Thu, 3 Apr 2025 17:30:02 +0000 (10:30 -0700)]
Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots that
were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
For v15 and earlier, this check is not required since slots can only
be invalidated due to WAL removal, and existing checks already handle
this issue.
Author: Shlok Kyal <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Reviewed-by: Zhijie Hou <[email protected]>
Reviewed-by: Peter Smith <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
Tom Lane [Wed, 2 Apr 2025 20:17:43 +0000 (16:17 -0400)]
Remove unnecessary type violation in tsvectorrecv().
compareentry() is declared to work on WordEntryIN structs, but
tsvectorrecv() is using it in two places to work on WordEntry
structs. This is almost okay, since WordEntry is the first
field of WordEntryIN. But on machines with 8-byte pointers,
WordEntryIN will have a larger alignment spec than WordEntry,
and it's at least theoretically possible that the compiler
could generate code that depends on the larger alignment.
Given the lack of field reports, this may be just a hypothetical bug
that upsets nothing except sanitizer tools. Or it may be real on
certain hardware but nobody's tried to use tsvectorrecv() on such
hardware. In any case we should fix it, and the fix is trivial:
just change compareentry() so that it works on WordEntry without any
mention of WordEntryIN. We can also get rid of the quite-useless
intermediate function WordEntryCMP.
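The alignment difference can be inspected with Python's ctypes, as a rough illustration only. The two struct layouts below are assumptions that mirror the description above (a 32-bit WordEntry as the first field of a WordEntryIN that also holds a pointer); they are not copied from the PostgreSQL headers.

```python
import ctypes


class WordEntry(ctypes.Structure):
    # Assumed layout: bit-fields packed into a single 32-bit word.
    _fields_ = [
        ("haspos", ctypes.c_uint32, 1),
        ("len", ctypes.c_uint32, 11),
        ("pos", ctypes.c_uint32, 20),
    ]


class WordEntryIN(ctypes.Structure):
    # Assumed layout: WordEntry first, then a pointer and a counter.
    _fields_ = [
        ("entry", WordEntry),
        ("pos", ctypes.c_void_p),
        ("poslen", ctypes.c_int),
    ]


def alignments():
    """Return (alignment of WordEntry, alignment of WordEntryIN) in bytes."""
    return ctypes.alignment(WordEntry), ctypes.alignment(WordEntryIN)
```

Where pointers need stricter alignment than a 32-bit integer, the second value is larger than the first, so code compiled for WordEntryIN may assume more about an address than a bare WordEntry array guarantees.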
Bug: #18875
Reported-by: Alexander Lakhin <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/18875-07a29c49c825a608@postgresql.org
Backpatch-through: 13
Andres Freund [Wed, 2 Apr 2025 18:25:17 +0000 (14:25 -0400)]
Remove HeapBitmapScan's skip_fetch optimization
The optimization does not take the removal of TIDs by a concurrent vacuum into
account. The concurrent vacuum can remove dead TIDs and make pages ALL_VISIBLE
while those dead TIDs are referenced in the bitmap. This can lead to a
skip_fetch scan returning too many tuples.
It likely would be possible to implement this optimization safely, but we
don't have the necessary infrastructure in place. Nor is it clear that it's
worth building that infrastructure, given how limited the skip_fetch
optimization is.
In the backbranches we just disable the optimization by always passing
need_tuples=true to table_beginscan_bm(). We can't perform API/ABI changes in
the backbranches and we want to make the change as minimal as possible.
Author: Matthias van de Meent <[email protected]>
Reported-By: Konstantin Knizhnik <[email protected]>
Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com
Backpatch-through: 13
Tom Lane [Wed, 2 Apr 2025 15:13:01 +0000 (11:13 -0400)]
Need to do CommandCounterIncrement after StoreAttrMissingVal.
Without this, an additional change to the same pg_attribute row
within the same command will fail. This is possible at least with
ALTER TABLE ADD COLUMN on a multiple-inheritance-pathway structure.
(Another potential hazard is that immediately-following operations
might not see the missingval.)
Introduced by 95f650674, which split the former coding that
used a single pg_attribute update to change both atthasdef and
atthasmissing/attmissingval into two updates, but missed that
this should entail two CommandCounterIncrements as well. Like
that fix, back-patch through v13.
Reported-by: Alexander Lakhin <[email protected]>
Author: Tender Wang <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/025a3ffa-5eff-4a88-97fb-8f583b015965@gmail.com
Backpatch-through: 13
Peter Eisentraut [Wed, 2 Apr 2025 12:34:24 +0000 (14:34 +0200)]
Fix code comment
The changes made in commit d2b4b4c2259 contained incorrect comments:
They said that certain forward declarations were necessary to "avoid
including pathnodes.h here", but the file is itself pathnodes.h! So
change the comment to just say it's a forward declaration in one case,
and in the other case we don't need the declaration at all because it
already appeared earlier in the file.
David Rowley [Wed, 2 Apr 2025 01:03:48 +0000 (14:03 +1300)]
Doc: add information about partition locking
The documentation around locking of partitions for the executor startup
phase of run-time partition pruning wasn't clear about which partitions
were being locked. Fix that.
Reviewed-by: Tender Wang <[email protected]>
Discussion: https://postgr.es/m/CAApHDvp738G75HfkKcfXaf3a8s%3D6mmtOLh46tMD0D2hAo1UCzA%40mail.gmail.com
Backpatch-through: 13
David Rowley [Tue, 1 Apr 2025 22:57:27 +0000 (11:57 +1300)]
Fix planner's failure to identify multiple hashable ScalarArrayOpExprs
50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify
IN and NOT IN clauses which have Const lists as right-hand arguments and
when an appropriate hash function is available for the data types, mark
the ScalarArrayOpExpr as hashable so the executor could execute it more
optimally by building and probing a hash table during expression
evaluation.
These commits both worked correctly when there was only a single
ScalarArrayOpExpr in the given expression being processed by the
planner, but when there were multiple, only the first was checked and any
subsequent ones were not identified, which resulted in less optimal
expression evaluation during query execution for all but the first found
ScalarArrayOpExpr.
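One way a tree walker can end up handling only the first match is sketched below in Python. The Node class and both function names are hypothetical, and whether the real planner code failed in exactly this way is an assumption; the sketch only shows the general shape of "stop after the first hit" versus "keep walking".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    hashable: bool = False


def mark_first_only(node):
    """Buggy shape: returning True aborts the walk after the first match."""
    if node.kind == "ScalarArrayOpExpr":
        node.hashable = True
        return True
    # any() short-circuits, so later siblings are never visited.
    return any(mark_first_only(child) for child in node.children)


def mark_all(node):
    """Fixed shape: mark a match, then continue into every child."""
    if node.kind == "ScalarArrayOpExpr":
        node.hashable = True
    for child in node.children:
        mark_all(child)


def count_hashable(node):
    """Count nodes in the tree that were marked hashable."""
    return int(node.hashable) + sum(count_hashable(c) for c in node.children)


def sample_tree():
    """An AND expression with two ScalarArrayOpExpr arguments."""
    return Node("BoolExpr", [Node("ScalarArrayOpExpr"), Node("ScalarArrayOpExpr")])
```

With the first walker only one of the two clauses in sample_tree() is marked; with the second, both are.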
Backpatch to 14, where 50e17ad28 was introduced.
Author: David Geier <[email protected]>
Discussion: https://postgr.es/m/29a76f51-97b0-4c07-87b7-ec8e3b5345c9@gmail.com
Backpatch-through: 14
Tom Lane [Tue, 1 Apr 2025 20:49:51 +0000 (16:49 -0400)]
Fix detection and handling of strchrnul() for macOS 15.4.
As of 15.4, macOS has strchrnul(), but access to it is blocked behind
a check for MACOSX_DEPLOYMENT_TARGET >= 15.4. But our does-it-link
configure check finds it, so we try to use it, and fail with the
present default deployment target (namely 15.0). This accounts for
today's buildfarm failures on indri and sifaka.
This is the identical problem that we faced some years ago when Apple
introduced preadv and pwritev in the same way. We solved that in
commit f014b1b9b by using AC_CHECK_DECLS instead of AC_CHECK_FUNCS
to check the functions' availability. So do the same now for
strchrnul(). Interestingly, we already had a workaround for
"the link check doesn't agree with <string.h>" cases with glibc,
which we no longer need since only the header declaration is being
checked.
Testing this revealed that the meson version of this check has never
worked, because it failed to use "-Werror=unguarded-availability-new".
(Apparently nobody's tried to build with meson on macOS versions that
lack preadv/pwritev as standard.) Adjust that while at it. Also,
we had never put support for "-Werror=unguarded-availability-new"
into v13, but we need that now.
Co-authored-by: Tom Lane <[email protected]>
Co-authored-by: Peter Eisentraut <[email protected]>
Discussion: https://postgr.es/m/385134.1743523038@sss.pgh.pa.us
Backpatch-through: 13
Dean Rasheed [Sat, 29 Mar 2025 09:50:14 +0000 (09:50 +0000)]
Fix MERGE with DO NOTHING actions into a partitioned table.
ExecInitPartitionInfo() duplicates much of the logic in
ExecInitMerge(), except that it failed to handle DO NOTHING
actions. This would cause an "unknown action in MERGE WHEN clause"
error if a MERGE with any DO NOTHING actions attempted to insert into
a partition not already initialised by ExecInitModifyTable().
Bug: #18871
Reported-by: Alexander Lakhin <[email protected]>
Author: Tender Wang <[email protected]>
Reviewed-by: Gurjeet Singh <[email protected]>
Discussion: https://postgr.es/m/18871-b44e3c96de3bd2e8%40postgresql.org
Backpatch-through: 15
Daniel Gustafsson [Thu, 27 Mar 2025 21:57:34 +0000 (22:57 +0100)]
Fix guc_malloc calls for consistency and OOM checks
check_createrole_self_grant and check_synchronized_standby_slots
were allocating memory on a LOG elevel without checking if the
allocation succeeded or not, which would have led to a segfault
on allocation failure.
On top of that, a number of callsites were using the ERROR level,
relying on erroring out rather than returning false to allow the
GUC machinery to handle it gracefully. Other callsites used WARNING
instead of LOG. While neither is wrong, this changes all
check_ functions to do it consistently with LOG.
init_custom_variable gets a promoted elevel to FATAL to keep
the guc_malloc error handling in line with the rest of the
error handling in that function, which already calls FATAL. If
we encounter an OOM in this callsite there is no graceful
handling to be had, better to error out hard.
Backpatch the fix to check_createrole_self_grant down to v16
and the fix to check_synchronized_standby_slots down to v17
where they were introduced.
Author: Daniel Gustafsson <[email protected]>
Reported-by: Nikita <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Bug: #18845
Discussion: https://postgr.es/m/18845-582c6e10247377ec@postgresql.org
Backpatch-through: 16
Tom Lane [Thu, 27 Mar 2025 17:20:23 +0000 (13:20 -0400)]
Prevent assertion failure in contrib/pg_freespacemap.
Applying pg_freespacemap() to a relation lacking storage (such as a
view) caused an assertion failure, although there was no ill effect
in non-assert builds. Add an error check for that case.
Bug: #18866
Reported-by: Robins Tharakan <[email protected]>
Author: Tender Wang <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Discussion: https://postgr.es/m/18866-d68926d0f1c72d44@postgresql.org
Backpatch-through: 13
Michael Paquier [Thu, 27 Mar 2025 01:20:45 +0000 (10:20 +0900)]
doc: Correct description of values used in FSM for indexes
The implementation of FSM for indexes is simpler than for heap: 0 is
used to record that a page is in use and (BLCKSZ - 1) that a page is free.
One comment in indexfsm.c and one description in the documentation of
pg_freespacemap were incorrect about that.
Author: Alex Friedman <[email protected]>
Discussion: https://postgr.es/m/71eef655-c192-453f-ac45-2772fec2cb04@gmail.com
Backpatch-through: 13
Tomas Vondra [Wed, 26 Mar 2025 15:50:13 +0000 (16:50 +0100)]
Keep the decompressed filter in brin_bloom_union
The brin_bloom_union() function combines two BRIN summaries, by merging
one filter into the other. With bloom, we have to decompress the filters
first, but the function failed to update the summary to store the merged
filter. As a consequence, the index may be missing some of the data, and
return false negatives.
This issue has existed since BRIN bloom indexes were introduced in Postgres
14, but at that point the union function was called only when two
sessions happened to summarize a range concurrently, which is rare. It
got much easier to hit in 17, as parallel builds use the union function
to merge summaries built by workers.
Fixed by storing a pointer to the decompressed filter, and freeing the
original one. Free the second filter too, if it was decompressed. The
freeing is not strictly necessary, because the union is called in
short-lived contexts, but it's tidy.
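The shape of the bug can be sketched in Python. Here zlib stands in for the compressed on-disk form of a filter, and the Summary class and function names are hypothetical. Unlike the real fix, which keeps a pointer to the decompressed filter, this sketch simply re-stores the merged bytes.

```python
import zlib


class Summary:
    """Holds one filter in compressed form (a stand-in for a BRIN summary)."""

    def __init__(self, bits):
        self.stored = zlib.compress(bytes(bits))

    def filter(self):
        # Decompression always produces a fresh working copy.
        return bytearray(zlib.decompress(self.stored))


def merge(dst_bits, src_bits):
    """OR the source filter into the destination working copy."""
    for i, bit in enumerate(src_bits):
        dst_bits[i] |= bit


def union_buggy(a, b):
    work = a.filter()
    merge(work, b.filter())
    # The merged copy is dropped here; a.stored still holds the old filter.


def union_fixed(a, b):
    work = a.filter()
    merge(work, b.filter())
    a.stored = zlib.compress(bytes(work))  # keep the merged result
```

With union_buggy() the bits contributed by the second summary are lost, which corresponds to the false negatives described above.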
Backpatch to 14, where BRIN bloom indexes were introduced.
Reported by Arseniy Mukhin, investigation and fix by me.
Reported-by: Arseniy Mukhin
Discussion: https://postgr.es/m/18855-1cf1c8bcc22150e6%40postgresql.org
Backpatch-through: 14
Richard Guo [Wed, 26 Mar 2025 08:46:51 +0000 (17:46 +0900)]
Fix integer-overflow problem in scram_SaltedPassword()
Setting the iteration count for SCRAM secret generation to INT_MAX
will cause an infinite loop in scram_SaltedPassword() due to integer
overflow, as the loop uses the "i <= iterations" comparison. To fix,
use "i < iterations" instead.
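The loop-bound problem can be modelled in Python by wrapping a counter to a fixed signed width. An 8-bit width is used so the model runs quickly, with 127 playing the role of INT_MAX. The start values (2 for the inclusive loop, 1 for the exclusive one) are assumptions chosen so both loops run the same number of times, and wraparound is only a model of what signed overflow typically does in practice.

```python
def wrap_signed(value, bits):
    """Wrap value into the range of a signed two's-complement integer."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value


def loop_steps(start, inclusive, iterations, bits=8, max_steps=1000):
    """Count loop-body executions; return None if max_steps is exceeded."""
    i = start
    steps = 0
    while (i <= iterations) if inclusive else (i < iterations):
        steps += 1
        if steps > max_steps:
            return None  # treated as "never terminates"
        i = wrap_signed(i + 1, bits)
    return steps
```

When iterations equals the largest representable value, "i <= iterations" is true for every possible i, so the inclusive loop cannot exit; the exclusive form stops once i reaches the limit.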
Back-patch to v16 where the user-settable GUC scram_iterations was
added.
Author: Kevin K Biju <[email protected]>
Reviewed-by: Richard Guo <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/CAM45KeEMm8hnxdTOxA98qhfZ9CzGDdgy3mxgJmy0c+2WwjA6Zg@mail.gmail.com
Tom Lane [Wed, 26 Mar 2025 00:03:56 +0000 (20:03 -0400)]
Fix order of -I switches for building pg_regress.o.
We need the -I switch for libpq_srcdir to come before any -I switches
injected by configure. Otherwise there is a risk of pulling in a
mismatched version of libpq-fe.h from someplace like
/usr/local/include, if the platform has another Postgres version
installed there. This evidently accounts for today's buildfarm
failures on "anaconda".
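Why the order matters can be shown with a small Python model of header lookup, in which the first directory in the search list that provides the header wins. The directory names, the file table, and the function name are all hypothetical.

```python
HEADER = "libpq-fe.h"

# Hypothetical contents of two include directories.
FILES_BY_DIR = {
    "/usr/local/include": {HEADER},        # from another installed version
    "src/interfaces/libpq": {HEADER},      # from the tree being built
}


def resolve_header(name, include_dirs, files_by_dir=FILES_BY_DIR):
    """Return the first directory in search order that provides name."""
    for directory in include_dirs:
        if name in files_by_dir.get(directory, ()):
            return directory
    return None
```

Listing the source-tree directory first makes the lookup pick the in-tree header even when a stale copy exists in a system location.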
In principle the -I switch for src/port/ is at similar hazard, and has
been for a very long time. But the only .h files we keep there are
pg_config_paths.h and pthread-win32.h, neither of which gets installed
on Unix-ish systems, so the odds of picking up a conflicting header
seem pretty small. That doubtless accounts for the lack of prior
reports.
Back-patch to v17 where pg_regress acquired a build dependency on
libpq-fe.h. We could go back further to fix the hazard for src/port/
in older branches, but it seems unlikely to be worth troubling over.
Reported-by: Nathan Bossart <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/Z-MhRzoc7t-nPUQG@nathan
Backpatch-through: 17
Alexander Korotkov [Tue, 25 Mar 2025 10:48:48 +0000 (12:48 +0200)]
postgres_fdw: Remove redundant check in semijoin_target_ok()
If a var belongs to the innerrel of the joinrel, it's not possible that
it belongs to the outerrel. This commit removes the redundant check from
the if-clause but keeps it as an assertion.
Discussion: https://postgr.es/m/flat/CAHewXN=8aW4hd_W71F7Ua4+_w0=bppuvvTEBFBF6G0NuSXLwUw@mail.gmail.com
Author: Tender Wang <[email protected]>
Reviewed-by: Alexander Pyhalov <[email protected]>
Backpatch-through: 17
Alexander Korotkov [Tue, 25 Mar 2025 03:49:47 +0000 (05:49 +0200)]
postgres_fdw: Avoid pulling up restrict infos from subqueries
Semi-joins below a left/right join are deparsed as subqueries. Thus,
we can't refer to the subqueries' vars from upper relations.
This commit avoids pulling up conditions from them.
Reported-by: Robins Tharakan <[email protected]>
Bug: #18852
Discussion: https://postgr.es/m/CAEP4nAzryLd3gwcUpFBAG9MWyDfMRX8ZjuyY2XXjyC_C6k%2B_Zw%40mail.gmail.com
Author: Alexander Pyhalov <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
Backpatch-through: 17
Heikki Linnakangas [Sun, 23 Mar 2025 18:41:16 +0000 (20:41 +0200)]
Fix rare assertion failure in standby, if primary is restarted
During hot standby, ExpireAllKnownAssignedTransactionIds() and
ExpireOldKnownAssignedTransactionIds() functions mark old transactions
as no-longer running, but they failed to update xactCompletionCount
and latestCompletedXid. AFAICS it would not lead to incorrect query
results, because those functions effectively turn in-progress
transactions into aborted transactions and an MVCC snapshot considers
both as "not visible". But it could surprise GetSnapshotDataReuse()
and trigger the "TransactionIdPrecedesOrEquals(TransactionXmin,
RecentXmin)" assertion in it, if the apparent xmin in a backend would
move backwards. We saw this happen when GetCatalogSnapshot() would
reuse an older catalog snapshot, when GetTransactionSnapshot() had
already advanced TransactionXmin.
The bug goes back all the way to commit 623a9ba79b in v14 that
introduced the snapshot reuse mechanism, but it started to happen more
frequently with commit 952365cded6 which removed a
GetTransactionSnapshot() call from backend startup. That made it more
likely for ExpireOldKnownAssignedTransactionIds() to be called between
GetCatalogSnapshot() and the first GetTransactionSnapshot() in a
backend.
Andres Freund first spotted this assertion failure on buildfarm member
'skink'. Reproduction and analysis by Tomas Vondra.
Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/oey246mcw43cy4qw2hqjmurbd62lfdpcuxyqiu7botx3typpax%40h7o7mfg5zmdj
Tom Lane [Fri, 21 Mar 2025 15:30:42 +0000 (11:30 -0400)]
Fix plpgsql's handling of simple expressions in scrollable cursors.
exec_save_simple_expr did not account for the possibility that
standard_planner would stick a Materialize node atop the plan
of even a simple Result, if CURSOR_OPT_SCROLL is set. This led
to an "unexpected plan node type" error.
This is a very old bug, but it'd only be reached by declaring a
cursor for a "SELECT simple-expression" query and explicitly
marking it scrollable, which is an odd thing to do. So the lack
of prior reports isn't too surprising.
Bug: #18859
Reported-by: Olleg Samoylov <[email protected]>
Author: Andrei Lepikhov <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/18859-0d5f28ac99a37059@postgresql.org
Backpatch-through: 13
Fujii Masao [Fri, 21 Mar 2025 03:56:39 +0000 (12:56 +0900)]
doc: Remove incorrect description about dropping replication slots.
pg_drop_replication_slot() can drop replication slots created on
a different database than the one where it is executed. This behavior
has been in place since PostgreSQL 9.4, when pg_drop_replication_slot()
was introduced.
However, commit ff539d mistakenly added the following incorrect
description in the documentation:
For logical slots, this must be called when connected to
the same database the slot was created on.
This commit removes that incorrect statement. A similar mistake was
also present in the documentation for the DROP_REPLICATION_SLOT
command, which has now been corrected as well.
Back-patch to all supported versions.
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB14966C6BE304B5BB2E58D4009F5DE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
Andres Freund [Wed, 19 Mar 2025 13:04:09 +0000 (09:04 -0400)]
meson: Flush stdout in testwrap
Otherwise the progress won't reliably be displayed during a test.
Reviewed-by: Noah Misch <[email protected]>
Discussion: https://postgr.es/m/kx6xu7suexal5vwsxpy7ybgkcznx6hgywbuhkr6qabcwxjqax2@i4pcpk75jvaa
Backpatch-through: 16
Masahiko Sawada [Tue, 18 Mar 2025 23:36:59 +0000 (16:36 -0700)]
Fix assertion failure in parallel vacuum with minimal maintenance_work_mem setting.
bbf668d66fbf lowered the minimum value of maintenance_work_mem to
64kB. However, in parallel vacuum cases, since the initial underlying
DSA size is 256kB, it attempts to perform a cycle of index vacuuming
and table vacuuming with an empty TID store, resulting in an assertion
failure.
This commit ensures that at least one page is processed before index
vacuuming and table vacuuming begin.
Backpatch to 17, where the minimum maintenance_work_mem value was
lowered.
Reviewed-by: David Rowley <[email protected]>
Discussion: https://postgr.es/m/CAD21AoCEAmbkkXSKbj4dB+5pJDRL4ZHxrCiLBgES_g_g8mVi1Q@mail.gmail.com
Backpatch-through: 17
Andres Freund [Tue, 18 Mar 2025 17:43:10 +0000 (13:43 -0400)]
smgr: Make SMgrRelation initialization safer against errors
In case the smgr_open callback failed, the ->pincount field would not be
initialized and the relation would not be put onto the unpinned_relns list.
This buglet was introduced in 21d9c3ee4ef7, in 17.
Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
Backpatch-through: 17
Alexander Korotkov [Sun, 16 Mar 2025 11:28:22 +0000 (13:28 +0200)]
reindexdb: Fix the index-level REINDEX with multiple jobs
47f99a407d introduced a parallel index-level REINDEX. The code was written
assuming that running run_reindex_command() with 'async == true' can schedule
a number of queries for a connection. That's not true, and the second query
sent using run_reindex_command() will wait for the completion of the previous
one.
This commit fixes that by putting REINDEX commands for the same table into a
single query.
Also, this commit removes the 'async' argument from run_reindex_command(),
as its only call always passes 'async == true'.
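The batching idea can be sketched in Python: group index names by their table and emit one multi-statement query string per table. The function name and input shape are hypothetical, and identifier quoting is omitted for brevity; this is not the code used by reindexdb.

```python
def build_reindex_queries(indexes):
    """indexes: iterable of (table, index) pairs; returns one SQL string per table."""
    by_table = {}
    for table, index in indexes:
        # dicts keep insertion order, so tables come out in first-seen order.
        by_table.setdefault(table, []).append(index)
    return [
        " ".join(f"REINDEX INDEX {index};" for index in table_indexes)
        for table_indexes in by_table.values()
    ]
```

Each resulting string can be sent as a single query on one connection, so indexes of the same table are never spread over several queued commands.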
Reported-by: Álvaro Herrera <[email protected]>
Discussion: https://postgr.es/m/202503071820.j25zn3lo4hvn%40alvherre.pgsql
Reviewed-by: Álvaro Herrera <[email protected]>
Backpatch-through: 17
Tom Lane [Thu, 13 Mar 2025 20:07:55 +0000 (16:07 -0400)]
Fix ARRAY_SUBLINK and ARRAY[] for int2vector and oidvector input.
If the given input_type yields valid results from both
get_element_type and get_array_type, initArrayResultAny believed the
former and treated the input as an array type. However this is
inconsistent with what get_promoted_array_type does, leading to
situations where the output of an ARRAY() subquery is labeled with
the wrong type: it's labeled as oidvector[] but is really a 2-D
array of OID. That at least results in strange output, and can
result in crashes if further processing such as unnest() is applied.
AFAIK this is only possible with the int2vector and oidvector
types, which are special-cased to be treated mostly as true arrays
even though they aren't quite.
Fix by switching the logic to match get_promoted_array_type by
testing get_array_type not get_element_type, and remove an Assert
thereby made pointless. (We need not introduce a symmetrical
check for get_element_type in the other if-branch, because
initArrayResultArr will check it.) This restores the behavior
that existed before bac27394a introduced initArrayResultAny:
the output really is int2vector[] or oidvector[].
Comparable confusion exists when an input of an ARRAY[] construct
is int2vector or oidvector: transformArrayExpr decides it's dealing
with a multidimensional array constructor, and we end up with
something that's a multidimensional OID array but is alleged to be
of type oidvector. I have not found a crashing case here, but it's
easy to demonstrate totally-wrong results. Adjust that code so
that what you get is an oidvector[] instead, for consistency with
ARRAY() subqueries. (This change also makes these types work like
domains-over-arrays in this context, which seems correct.)
Bug: #18840
Reported-by: yang lei <[email protected]>
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/18840-fbc9505f066e50d6@postgresql.org
Backpatch-through: 13