git.postgresql.org Git - postgresql-pgindent.git/log

Add info in WAL records in preparation for logical slot conflict handling

This commit only implements one prerequisite part for allowing logical
decoding. The commit message contains an explanation of the overall design,
which later commits will refer back to.

Overall design:

1. We want to enable logical decoding on standbys, but replay of WAL
from the primary might remove data that is needed by logical decoding,
causing error(s) on the standby. To prevent those errors, a new replication
conflict scenario needs to be addressed (as much as hot standby does).

2. Our chosen strategy for dealing with this type of replication slot
is to invalidate logical slots for which needed data has been removed.

3. To do this we need the latestRemovedXid for each change, just as we
do for physical replication conflicts, but we also need to know
whether any particular change was to data that logical replication
might access. That way, during WAL replay, we know when there is a risk of
conflict and, if so, if there is a conflict.

4. We can't rely on the standby's relcache entries for this purpose in
any way, because the startup process can't access catalog contents.

5. Therefore every WAL record that potentially removes data from the
index or heap must carry a flag indicating whether or not it is one
that might be accessed during logical decoding.

Why do we need this for logical decoding on standby?

First, let's forget about logical decoding on standby and recall that
on a primary database, any catalog rows that may be needed by a logical
decoding replication slot are not removed.

This is done thanks to the catalog_xmin associated with the logical
replication slot.

But, with logical decoding on standby, in the following cases:

- hot_standby_feedback is off
- hot_standby_feedback is on but there is no a physical slot between
  the primary and the standby. Then, hot_standby_feedback will work,
  but only while the connection is alive (for example a node restart
  would break it)

Then, the primary may delete system catalog rows that could be needed
by the logical decoding on the standby (as it does not know about the
catalog_xmin on the standby).

So, it’s mandatory to identify those rows and invalidate the slots
that may need them if any. Identifying those rows is the purpose of
this commit.

Implementation:

When a WAL replay on standby indicates that a catalog table tuple is
to be deleted by an xid that is greater than a logical slot's
catalog_xmin, then that means the slot's catalog_xmin conflicts with
the xid, and we need to handle the conflict. While subsequent commits
will do the actual conflict handling, this commit adds a new field
isCatalogRel in such WAL records (and a new bit set in the
xl_heap_visible flags field), that is true for catalog tables, so as to
arrange for conflict handling.

The affected WAL records are the ones that already contain the
snapshotConflictHorizon field, namely:

- gistxlogDelete
- gistxlogPageReuse
- xl_hash_vacuum_one_page
- xl_heap_prune
- xl_heap_freeze_page
- xl_heap_visible
- xl_btree_reuse_page
- xl_btree_delete
- spgxlogVacuumRedirect

Due to this new field being added, xl_hash_vacuum_one_page and
gistxlogDelete do now contain the offsets to be deleted as a
FLEXIBLE_ARRAY_MEMBER. This is needed to ensure correct alignment.
It's not needed on the others struct where isCatalogRel has
been added.

This commit just introduces the WAL format changes mentioned above. Handling
the actual conflicts will follow in future commits.

Bumps XLOG_PAGE_MAGIC as the several WAL records are changed.

Author: "Drouvot, Bertrand" <[email protected]>
Author: Andres Freund <[email protected]> (in an older version)
Author: Amit Khandekar <[email protected]>  (in an older version)
Reviewed-by: "Drouvot, Bertrand" <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Fabrízio de Royes Mello <[email protected]>
Reviewed-by: Melanie Plageman <[email protected]>

Fix copy-pasto in contrib/auth_delay/meson.build variable name.

Use PG_TEST_TIMEOUT_DEFAULT in 019_replslot_limit.pl.

Oversight in f2698ea02ca8a56f38935d2b300ac54936712558, which introduced
the variable.  This lowers some 1000s timeouts to the configurable
default of 180s, due to a lack of evidence for needing the longer
timeout.  The alternative was 6*PG_TEST_TIMEOUT_DEFAULT, which we can
adopt if the need arises.  Given the lack of observed trouble with these
timeouts, no back-patch.

Pass down table relation into more index relation functions

This is done in preparation for logical decoding on standby, which needs to
include whether visibility affecting WAL records are about a (user) catalog
table. Which is only known for the table, not the indexes.

It's also nice to be able to pass the heap relation to GlobalVisTestFor() in
vacuumRedirectAndPlaceholder().

Author: "Drouvot, Bertrand" <[email protected]>
Discussion: https://postgr.es/m/21b700c3-eecf-2e05-a699-f8c78dd31ec7@gmail.com

Assert only valid flag bits are passed to visibilitymap_set()

If visibilitymap_set() is called with flags containing a higher bit than
VISIBILITYMAP_ALL_FROZEN, the state of neighboring pages is affected. While
there was an assertion that *some* valid bits were set, it did not check
that *only* valid bits were. Change that.

Discussion: https://postgr.es/m/20230331043300 [email protected]

hio: Release extension lock before initializing page / pinning VM

PageInit() while holding the extension lock is unnecessary after 0d1fe9f74e3
started to use RBM_ZERO_AND_LOCK - nobody can look at the new page before we
release the page lock. PageInit() zeroes the page, which isn't that cheap, so
deferring it until after the extension lock is released seems like a good idea.

Doing visibilitymap_pin() while holding the extension lock, introduced in
7db0cd2145f2, looks like an accident. Due to the restrictions on
HEAP_INSERT_FROZEN it's unlikely to be a performance issue, but it still seems
better to move it out. We also are doing the visibilitymap_pin() while
holding the buffer lock, which will be fixed in a separate commit.

Reviewed-by: Heikki Linnakangas <[email protected]>
Discussion: http://postgr.es/m/419312fd-9255-078c-c3e3-f0525f911d7f@iki.fi

pg_dump: Use only LZ4 frame format for compression

After 0da243fed0 got committed, it was reported that in some cases the
compression ratio is rather poor - particularly for custom format with
narrow tables - due to writing the LZ4 header/footer for each row.

This commit switches to LZ4F (LZ4 frame format), eliminating most of the
overhead and greatly improving the compression ratio. This makes the
compressed size about the same for plain and custom formats (just like
for gzip, for example).

LZ4F is now used by both compression APIs, which allowed refactoring and
reusing more of the code. For consistency this also renames the LZ4File
struct to LZ4State, and a number of functions are now prefixed with
LZ4Stream_ (instead of LZ4File_).

Patch by Georgios Kokolatos, based on report and initial patch by Justin
Pryzby. Review and minor cleanups by me.

Author: Georgios Kokolatos, Justin Pryzby
Reported-by: Justin Pryzby
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/20230227044910.GO1653%40telsasoft.com

Doc: add Buffer Access Strategy to the glossary

It seems useful to add this to the glossary as there's discussion around
adding an option to VACUUM to disable and adjust the size of the buffer
access strategy that VACUUM uses.

Author: Melanie Plageman
Reviewed-by: Justin Pryzby, David Rowley
Discussion: https://postgr.es/m/ZBYDTrD1kyGg%2BHkS%40telsasoft.com

Add show_data option to pg_get_wal_block_info.

Allow users to opt out of returning FPI data and block data from
pg_get_wal_block_info as an optimization.  Testing has shown that this
can make function execution over twice as fast in some cases.

When pg_get_wal_block_info is called with "show_data := false", it
always returns NULL values for its block_data and block_fpi_data bytea
output parameters.  Nothing else changes.  In particular, the function
will still return the usual per-block summary of block data/FPI space
overhead.  Use of "show_data := false" is therefore feasible with all
queries that don't specifically require these raw binary strings.

Follow-up to recent work in commit 122376f0.  There still hasn't been a
stable release with the pg_get_wal_block_info function, so no bump in
the pg_walinspect extension version.

Per suggestion from Melanie Plageman.

Author: Peter Geoghegan <[email protected]>
Discussion: https://postgr.es/m/CAAKRu_bJvbcYBRj2cN6G2xV7B7-Ja+pjTO1nEnEhRR8OXYiABA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-Wzm9shOkEDM10_+qOZkRSQhKVxwBFiehH6EHWQQRd_rDPw@mail.gmail.com

SQL/JSON: support the IS JSON predicate

This patch introduces the SQL standard IS JSON predicate. It operates
on text and bytea values representing JSON, as well as on the json and
jsonb types. Each test has IS and IS NOT variants and supports a WITH
UNIQUE KEYS flag. The tests are:

IS JSON [VALUE]
IS JSON ARRAY
IS JSON OBJECT
IS JSON SCALAR

These should be self-explanatory.

The WITH UNIQUE KEYS flag makes these return false when duplicate keys
exist in any object within the value, not necessarily directly contained
in the outermost object.

Author: Nikita Glukhov <[email protected]>
Author: Teodor Sigaev <[email protected]>
Author: Oleg Bartunov <[email protected]>
Author: Alexander Korotkov <[email protected]>
Author: Amit Langote <[email protected]>
Author: Andrew Dunstan <[email protected]>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130 [email protected]
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org

Further tweaking of width_bucket() edge cases.

I realized that the third overflow case I posited in commit b0e9e4d76
actually should be handled in a different way: rather than tolerating
the idea that the quotient could round to 1, we should clamp so that
the output cannot be more than "count" when we know that the operand is
less than bound2. That being the case, we don't need an overflow-aware
increment in that code path, which leads me to revert the movement of
the pg_add_s32_overflow() call. (The diff in width_bucket_float8
might be easier to read by comparing against b0e9e4d76^.)

What's more, width_bucket_numeric also has this problem of the quotient
potentially rounding to 1, so add a clamp there too.

As before, I'm not quite convinced that a back-patch is warranted.

Discussion: https://postgr.es/m/391415.1680268470@sss.pgh.pa.us

Reject system columns as elements of foreign keys.

Up through v11 it was sensible to use the "oid" system column as
a foreign key column, but since that was removed there's no visible
usefulness in making any of the remaining system columns a foreign
key.  Moreover, since the TupleTableSlot rewrites in v12, such cases
actively fail because of implicit assumptions that only user columns
appear in foreign keys.  The lack of complaints about that seems
like good evidence that no one is trying to do it.  Hence, rather
than trying to repair those assumptions (of which there are at least
two, maybe more), let's just forbid the case up front.

Per this patch, a system column in either the referenced or
referencing side of a foreign key will draw this error; however,
putting one in the referenced side would have failed later anyway,
since we don't allow unique indexes to be made on system columns.

Per bug #17877 from Alexander Lakhin.  Back-patch to v12; the
case still appears to work in v11, so we shouldn't break it there.

Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org

Ensure acquire_inherited_sample_rows sets its output parameters.

The totalrows/totaldeadrows outputs were left uninitialized in cases
where we find no analyzable child tables of a partitioned table. This
could lead to setting the partitioned table's pg_class.reltuples value
to garbage. It's not clear that that would have any very bad effects
in practice, but fix it anyway because it's making valgrind unhappy.

Reported and diagnosed by Alexander Lakhin (bug #17880).

Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org

pg_regress: Emit TAP compliant output

This converts pg_regress output format to emit TAP compliant output
while keeping it as human readable as possible for use without TAP
test harnesses. As verbose harness related information isn't really
supported by TAP this also reduces the verbosity of pg_regress runs
which makes scrolling through log output in buildfarm/CI runs a bit
easier as well.

As the meson TAP parser conumes whitespace, the leading indentation
for differentiating parallel tests from sequential tests has been
changed to a single character prefix.

This patch has been around for an extended period of time, reviewers
listed below may have been involved in reviewing a version quite
different from the version in this commit. The original idea for
this patch was a hacking session with Jinbao Chen.

TAP format testing is also enabled in meson as of this.

Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Nikolay Shaplov <[email protected]>
Reviewed-by: Dagfinn Ilmari Mannsåker <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Discussion: https://postgr.es/m/BD4B107D-7E53-4794-ACBA-275BEB4327C9@yesql.se
Discussion: https://postgr.es/m/20220221164736 [email protected]

Move ExecEvalJsonConstructor new function to a more natural place

Commit 7081ac46ace8 put it at the end of the file, but that doesn't look
very nice.

No need to add FORMAT to the keyword precedence list

Commit 7081ac46ace8 put it there. Remove it.

Add XML ID attributes to create_publication.sgml.

This commit adds XML ID attributes to all varlistentries in
create_publication.sgml. This allows us to include links to refer to
publication options, making documents more readable.

Author: Kuroda Hayato
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/TYAPR01MB58668219FEA4EC231486A433F58E9@TYAPR01MB5866.jpnprd01.prod.outlook.com

Bump PGSTAT_FILE_FORMAT_ID, omitted in 8aaa04b32d7

I forgot to do so in the referenced commit. While the consequences of omitting
the version change are likely to be harmless (besides discarding stats, as a
PGSTAT_FILE_FORMAT_ID bump also does), it still seems worth doing.

Track shared buffer hits in pg_stat_io

Among other things, this should make it easier to calculate a useful cache hit
ratio by excluding buffer reads via buffer access strategies. As buffer access
strategies reuse buffers (and thus evict the prior buffer contents), it is
normal to see reads on repeated scans of the same data.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Bertrand Drouvot <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/CAAKRu_beMa9Hzih40%3DXPYqhDVz6tsgUGTrhZXRo%3Dunp%2Bszb%3DUA%40mail.gmail.com

Fix List memory issue in transformColumnDefinition

When calling generateSerialExtraStmts(), we would pass in the
constraint->options. In some cases, generateSerialExtraStmts() would
modify the referenced List to remove elements from it, but doing so is
invalid without assigning the list back to all variables that point to it.
In the particular reported problem case, the List became empty, in which
cases it became NIL, but the passed in constraint->options didn't get to
find out about that and was left pointing to free'd memory.

To fix this, just perform a list_copy() inside generateSerialExtraStmts().
We could just do a list_copy() just before we perform the delete from the
list, however, that seems less robust. Let's make sure the generated
CreateSeqStmt gets a completely different copy of the list to be safe.

Bug: #17879
Reported-by: Fei Changhong
Diagnosed-by: Fei Changhong
Discussion: https://postgr.es/m/17879-b7dfb5debee58ff5@postgresql.org
Backpatch-through: 11, all supported versions

Parallel Hash Full Join.

Full and right outer joins were not supported in the initial
implementation of Parallel Hash Join because of deadlock hazards (see
discussion).  Therefore FULL JOIN inhibited parallelism, as the other
join strategies can't do that in parallel either.

Add a new PHJ phase PHJ_BATCH_SCAN that scans for unmatched tuples on
the inner side of one batch's hash table.  For now, sidestep the
deadlock problem by terminating parallelism there.  The last process to
arrive at that phase emits the unmatched tuples, while others detach and
are free to go and work on other batches, if there are any, but
otherwise they finish the join early.

That unfairness is considered acceptable for now, because it's better
than no parallelism at all.  The build and probe phases are run in
parallel, and the new scan-for-unmatched phase, while serial, is usually
applied to the smaller of the two relations and is either limited by
some multiple of work_mem, or it's too big and is partitioned into
batches and then the situation is improved by batch-level parallelism.

Author: Melanie Plageman <[email protected]>
Author: Thomas Munro <[email protected]>
Reviewed-by: Thomas Munro <[email protected]>
Discussion: https://postgr.es/m/CA%2BhUKG%2BA6ftXPz4oe92%2Bx8Er%2BxpGZqto70-Q_ERwRaSyA%3DafNg%40mail.gmail.com

pg_stat_wal: Accumulate time as instr_time instead of microseconds

In instr_time.h it is stated that:

* When summing multiple measurements, it's recommended to leave the
* running sum in instr_time form (ie, use INSTR_TIME_ADD or
* INSTR_TIME_ACCUM_DIFF) and convert to a result format only at the end.

The reason for that is that converting to microseconds is not cheap, and can
loose precision. Therefore this commit changes 'PendingWalStats' to use
'instr_time' instead of 'PgStat_Counter' while accumulating 'wal_write_time'
and 'wal_sync_time'.

Author: Nazir Bilal Yavuz <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Kyotaro Horiguchi <[email protected]>
Reviewed-by: Melanie Plageman <[email protected]>
Discussion: https://postgr.es/m/1feedb83-7aa9-cb4b-5086-598349d3f555@gmail.com

Show record information in pg_get_wal_block_info.

Expand the output parameters in pg_walinspect's pg_get_wal_block_info
function to return additional information that was previously only
available from pg_walinspect's pg_get_wal_records_info function.  Some
of the details are attributed to individual block references, rather
than aggregated into whole-record values, since the function returns one
row per block reference per WAL record (unlike pg_get_wal_records_info,
which always returns one row per WAL record).

This structure is much easier to work with when writing queries that
track how individual blocks changed over time, or when attributing costs
to individual blocks (not WAL records) is useful.

This is the second time that pg_get_wal_block_info has been enhanced in
recent weeks.  Commit 9ecb134a expanded on the original version of the
function added in commit c31cf1c0 (where it first appeared under the
name pg_get_wal_fpi_info).  There still hasn't been a stable release
since commit c31cf1c0, so no bump in the pg_walinspect extension
version.

Author: Bharath Rupireddy <[email protected]>
Author: Peter Geoghegan <[email protected]>
Reviewed-By: Peter Geoghegan <[email protected]>
Reviewed-By: Kyotaro HORIGUCHI <[email protected]>
Discussion: https://postgr.es/m/CALj2ACVRK5=Z+2ZVsjgTTSkfEnQzCuwny7iigpG7g1btk4Ws2A@mail.gmail.com

Simplify transformJsonAggConstructor() API

There's no need for callers to pass aggregate names so that the function
can resolve them to OIDs, when callers can just pass aggregate OIDs
directly to begin with.

Fix inconsistencies and style issues in new SQL/JSON code

Reported by Alexander Lakhin.

Discussion: https://postgr.es/m/60483139-5c34-851d-baee-6c0d014e1710@gmail.com

Fix setrefs.c code for adjusting partPruneInfos

We were transferring partPruneInfos from PlannerInfo into PlannerGlobal
wrong, essentially relying on all of them being transferred, and
adjusting their list indexes based on that.  But apparently it's
possible that some of them are skipped, so that strategy leads to a
corrupted execution tree.  Instead, adjust each Append/MergeAppend's
partpruneinfo index as we copy from one list to the other, which seems
safer anyway.  This requires adjusting the RT offset of the RTE
referenced in each partPruneInfo ahead of actually adjusting the RTE
itself, which seems a bit too ad-hoc.

This problem was introduced by commit ec386948948c.  However, it may be
that we no longer require the change introduced there, so perhaps we
should revert both the present commit and that one.

Problem noticed by sqlsmith.

Author: Amit Langote <[email protected]
Discussion: https://postgr.es/m/CA+HiwqG6tbc2oadsbyyy24b2AL295XHQgyLRWghmA7u_SL1K8A@mail.gmail.com

Fix format code in fd.c debugging infrastructure

These were not sufficiently adjusted in 2d4f1ba6cfc.

bufmgr: Fix undefined behaviour with, unrealistically, large temp_buffers

Quoting Melanie:
> Since if buffer is INT_MAX, then the -(buffer + 1) version invokes
> undefined behavior while the -buffer - 1 version doesn't.

All other places were already using the correct version. I (Andres), copied
the code into more places in a patch. Melanie caught it in review, but to
prevent more people from copying the bad code, fix it. Even if it is a
theoretical issue.

We really ought to wrap these accesses in a helper function...

As this is a theoretical issue, don't backpatch.

Reported-by: Melanie Plageman <[email protected]>
Discussion: https://postgr.es/m/CAAKRu_aW2SX_LWtwHgfnqYpBrunMLfE9PD6-ioPpkh92XH0qpg@mail.gmail.com

Clean up role created in new subscription test.

This oversight broke repeated runs of "make installcheck".

Fix documentation build for c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6.

This documentation hunk was intended to be part of that commit,
but I goofed.

Add new predefined role pg_create_subscription.

This role can be granted to non-superusers to allow them to issue
CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE
permissions on the database in which the subscription is to be
created.

Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP,
now require only that the role performing the operation own the
subscription, or inherit the privileges of the owner. However, to
use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO,
you also need CREATE permission on the database. This is similar to
what we do for schemas. To change the owner of a schema, you must also
have permission to SET ROLE to the new owner, similar to what we do
for other object types.

Non-superusers are required to specify a password for authentication
and the remote side must use the password, similar to what is required
for postgres_fdw and dblink. A superuser who wants a non-superuser to
own a subscription that does not rely on password authentication may
set the new password_required=false property on that subscription. A
non-superuser may not set password_required=false and may not modify a
subscription that already has password_required=false.

This new password_required subscription property works much like the
eponymous postgres_fdw property. In both cases, the actual semantics
are that a password is not required if either (1) the property is set
to false or (2) the relevant user is the superuser.

Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger,
and Stephen Frost (but some of those people did not fully endorse
all of the decisions that the patch makes).

Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com

Avoid overflow in width_bucket_float8().

The original coding of this function paid little attention to the
possibility of overflow.  There were actually three different hazards:

1. The range from bound1 to bound2 could exceed DBL_MAX, which on
IEEE-compliant machines produces +Infinity in the subtraction.
At best we'd lose all precision in the result, and at worst
produce NaN due to dividing Inf/Inf.  The range can't exceed
twice DBL_MAX though, so we can fix this case by scaling all the
inputs by 0.5.

2. We computed count * (operand - bound1), which is also at risk of
float overflow, before dividing.  Safer is to do the division first,
producing a quotient that should be in [0,1), and even after allowing
for roundoff error can't be outside [0,1]; then multiplying by count
can't produce a result overflowing an int.  (width_bucket_numeric does
the multiplication first on the grounds that that improves accuracy of
its result, but I don't think that a similar argument can be made in
float arithmetic.)

3. If the division result does round to 1, and count is INT_MAX,
the final addition of 1 would overflow an int.  We took care
of that in the operand >= bound2 case but did not consider that
it could be possible in the main path.  Fix that by moving the
overflow-aware addition of 1 so it is done that way in all cases.

The fix for point 2 creates a possibility that values very close to
a bucket boundary will be rounded differently than they were before.
I'm not troubled by that for HEAD, but it is an argument against
putting this into the stable branches.  Given that the cases being
fixed here are fairly extreme and unlikely to be hit in normal use,
it seems best not to back-patch.

Mats Kindahl and Tom Lane

Discussion: https://postgr.es/m/17876-61f280d1601f978d@postgresql.org

Fix pointer cast for seed calculation on 32-bit systems

The fallback seed for when pg_strong_random cannot generate a high
quality seed mixes in the address of the conn object, but the cast
failed to take the word size into consideration. Fix by casting to
a uintptr_t instead. The seed calculation was added in 7f5b19817e.

The code as it stood generated the following warning on mamba and
lapwing in the buildfarm:

fe-connect.c: In function 'libpq_prng_init':
fe-connect.c:1048:11: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
1048 | rseed = ((uint64) conn) ^
| ^

Author: Hayato Kuroda <[email protected]>
Discussion: https://postgr.es/m/TYAPR01MB58665250EDCD551CCA9AD117F58E9@TYAPR01MB5866.jpnprd01.prod.outlook.com

Fix incorrect format placeholders

Refactor pgoutput_change().

Instead of mostly-duplicate code for different operation
(insert/update/delete) types, write a common code to compute old/new
tuples, and check the row filter.

Author: Hou Zhijie
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB5716194A47FFA8D91133687D94BF9@OS0PR01MB5716.jpnprd01.prod.outlook.com

Fix outdated comments regarding TupleTableSlots

The tts_flag is named TTS_FLAG_SHOULDFREE, so use that instead of
TTS_SHOULDFREE, which is the name of the macro that checks for that flag.

Additionally, 4da597edf got rid of the TupleTableSlot.tts_tuple field but
forgot to update a comment which referenced that field. Fix that.

Reported-by: Zhen Mingyang <[email protected]>
Reported-by: Richard Guo <[email protected]>
Discussion: https://postgr.es/m/1a96696c.9d3.187193989c3 [email protected]

Support connection load balancing in libpq

This adds support for load balancing connections with libpq using a
connection parameter: load_balance_hosts=<string>. When setting the
param to random, hosts and addresses will be connected to in random
order. This then results in load balancing across these addresses and
hosts when multiple clients or frequent connection setups are used.

The randomization employed performs two levels of shuffling:

  1. The given hosts are randomly shuffled, before resolving them
     one-by-one.
  2. Once a host its addresses get resolved, the returned addresses
     are shuffled, before trying to connect to them one-by-one.

Author: Jelte Fennema <[email protected]>
Reviewed-by: Aleksander Alekseev <[email protected]>
Reviewed-by: Michael Banck <[email protected]>
Reviewed-by: Andrey Borodin <[email protected]>
Discussion: https://postgr.es/m/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.

Copy and store addrinfo in libpq-owned private memory

This refactors libpq to copy addrinfos returned by getaddrinfo to
memory owned by libpq such that future improvements can alter for
example the order of entries.

As a nice side effect of this refactor the mechanism for iteration
over addresses in PQconnectPoll is now identical to its iteration
over hosts.

Author: Jelte Fennema <[email protected]>
Reviewed-by: Aleksander Alekseev <[email protected]>
Reviewed-by: Michael Banck <[email protected]>
Reviewed-by: Andrey Borodin <[email protected]>
Discussion: https://postgr.es/m/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.com

Fix dereference of dangling pointer in GiST index buffering build.

gistBuildCallback tried to fetch the size of an index tuple that
might have already been freed by gistProcessEmptyingQueue.
While this seems to usually be harmless in production builds,
in principle it could result in a SIGSEGV, or more likely a bogus
value for indtuplesSize leading to poor page-split decisions later
in the build.

The memory management here is confusing and could stand to be
refactored, but for the moment it seems to be enough to fetch
the tuple size sooner. AFAICT the indtuples[Size] totals aren't
used in between these places; even if they were, the updated
values shouldn't be any worse to use. So just move the
incrementing of the totals up.

It's not very clear why our valgrind-using buildfarm animals
haven't noticed this problem, because the relevant code path
does seem to be exercised according to the code coverage report.
I think the reason that we didn't fix this bug after the first
report is that I'd wanted to try to understand that better.
However, now that it's been re-discovered let's just be pragmatic
and fix it already.

Original report by Alexander Lakhin (bug #16329),
later rediscovered by Egor Chindyaskin (bug #17874).

Patch by Alexander Lakhin (commentary by Pavel Borisov and me).
Back-patch to all supported branches.

Discussion: https://postgr.es/m/16329-7a6aa9b6fa1118a1@postgresql.org
Discussion: https://postgr.es/m/17874-63ca6c7ce42d2103@postgresql.org

Add missing .gitignore entries.

Oversight in commit 7081ac46ace8c459966174400b53418683c9fe5c.

Remove empty function BufmgrCommit().

This function has been a no-op for over a decade. Even if bufmgr
regains a need to be called during commit, it seems unlikely that
the most appropriate call points would be precisely here, so it's not
doing us much good as a placeholder either. Now, removing it probably
doesn't save any noticeable number of cycles --- but the main call is
inside the commit critical section, and the less work done there the
better.

Matthias van de Meent

Discussion: https://postgr.es/m/CAEze2Wi1=tLKbxZnXzcD+8fYKyKqBtivVakLQC_mYBsP4Y8qVA@mail.gmail.com

SQL/JSON: add standard JSON constructor functions

This commit introduces the SQL/JSON standard-conforming constructors for
JSON types:

JSON_ARRAY()
JSON_ARRAYAGG()
JSON_OBJECT()
JSON_OBJECTAGG()

Most of the functionality was already present in PostgreSQL-specific
functions, but these include some new functionality such as the ability
to skip or include NULL values, and to allow duplicate keys or throw
error when they are found, as well as the standard specified syntax to
specify output type and format.

Author: Nikita Glukhov <[email protected]>
Author: Teodor Sigaev <[email protected]>
Author: Oleg Bartunov <[email protected]>
Author: Alexander Korotkov <[email protected]>
Author: Amit Langote <[email protected]>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130 [email protected]
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org

Fix some section numbers in information_schema.sql

Some of the section numbers that appeared multiple times were not
updated completely by previous changes d61d9aa750 and eb3a1376c9.

meson: Change default buildtype to debugoptimized

This matches the Autoconf default (-O2 + debug) better. The previous
default setting "release" used -O3, which resulted in different
compiler warnings. At least for now, we want to avoid such
divergence.

Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRBJD_Y-XcqwXSbWS24z%2B84FFX7ajhCan9ixc_m4bD63sA%40mail.gmail.com

Move definition of standard collations from initdb to pg_collation.dat

The standard collations "ucs_basic" and "unicode" were defined in
initdb, even though pg_collation.dat seems like the correct place for
them. It seems this was just forgotten during various reorganizations
of initdb and pg_collation.dat/.h over time.

Reviewed-by: Tom Lane <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/08b58ecd-0d50-9395-ed51-dc8294e3fd2b%40enterprisedb.com

Simplify useless 0L constants

In ancient times, these belonged to arguments or fields that were
actually of type long, but now they are not anymore, so this "L"
decoration is just confusing. (Some other 0L and other "L" constants
remain, where they are actually associated with a long type.)

Avoid syncing data twice for the 'publish_via_partition_root' option.

When there are multiple publications for a subscription and one of those
publishes via the parent table by using publish_via_partition_root and the
other one directly publishes the child table, we end up copying the same
data twice during initial synchronization. The reason for this was that we
get both the parent and child tables from the publisher and try to copy
the data for both of them.

This patch extends the function pg_get_publication_tables() to take a
publication list as its input parameter. This allows us to exclude a
partition table whose ancestor is published by the same publication list.

This problem does exist in back-branches but we decide to fix it there in
a separate commit if required. The fix for back-branches requires quite
complicated changes to fetch the required table information from the
publisher as we can't update the function pg_get_publication_tables() in
back-branches. We are not sure whether we want to deviate and complicate
the code in back-branches for this problem as there are no field reports
yet.

Author: Wang wei
Reviewed-by: Peter Smith, Jacob Champion, Kuroda Hayato, Vignesh C, Osumi Takamichi, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com

Add XML ID attributes to create_subscription.sgml.

Commit ecb696527c added an XML ID attribute to one varlistentry in
create_subscription.sgml. Following 78ee60ed84, this commit adds XML ID
attributes to all varlistentries in create_subscription.sgml.
Additionally, links are added to refer to the subscription options,
enhancing the readability of documents.

Author: Kuroda Hayato
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/TYAPR01MB58667AE04D291924671E2051F5879@TYAPR01MB5866.jpnprd01.prod.outlook.com

pg_dump: Fix gzip compression of empty data

The pg_dump Compressor API has three basic callbacks - Allocate, Write
and End.  The gzip implementation (since e9960732a) wrongly assumed the
Write function would always be called, and deferred the initialization
of the internal compression system until the first such call.  But when
there's no data to compress (e.g. for empty LO), this would result in
not finalizing the compression state (because it was not actually
initialized), producing invalid dump.

Fixed by initializing the internal compression system in the Allocate
call, whenever the caller provides the Write.  For decompression the
state is not needed, so we leave the private_data member unpopulated.

Introduces a pg_dump TAP test compressing an empty large object.

This also rearranges the functions to their original order, to make
diffs against older code simpler to understand.  Finally, replace an
unreachable pg_fatal() with a simple assert check.

Reported-by: Justin Pryzby
Author: Justin Pryzby, Georgios Kokolatos
Reviewed-by: Georgios Kokolatos, Tomas Vondra
https://postgr.es/m/20230228235834.GC30529%40telsasoft.com

Validate ICU locales.

For ICU collations, ensure that the locale's language exists in ICU,
and that the locale can be opened.

Basic validation helps avoid minor mistakes and misspellings, which
often fall back to the root locale instead of the intended
locale. It's even more important to avoid such mistakes in ICU
versions 54 and earlier, where the same (misspelled) locale string
could fall back to different locales depending on the environment.

Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9 [email protected]
Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b [email protected]
Reviewed-by: Peter Eisentraut

amcheck: In verify_heapam, allows tuples with xmin 0.

Commit e88754a1965c0f40a723e6e46d670cacda9e19bd caused that case
to be reported as corruption, but Peter Geoghegan pointed out that
it can legitimately happen in the case of a speculative insertion
that aborts, so we'd better not flag it as corruption after all.

Back-patch to v14, like the commit that introduced the issue.

Discussion: http://postgr.es/m/CAH2-WzmEabzcPTxSY-NXKH6Qt3FkAPYHGQSe2PtvGgj17ZQkCw@mail.gmail.com

Fix recent pg_walinspect fpi_length bug.

Commit 0276ae42dd taught pg_walinspect's pg_get_wal_record_info()
function to output NULLs rather than empty strings for its record
description and block_ref output parameters. However, it inadvertently
moved the function call that sets fpi_length until after it was already
set. As a result, pg_get_wal_record_info() always output spurious
fpi_length values of 0.

Fix by switching the order back (but keep the behavioral change).

Author: Bharath Rupireddy <[email protected]>
Discussion: https://postgr.es/m/CAH2-WzkJmgSYkt6-smQ+57SxSmov+EKqFZdSimFewosoL_JKoA@mail.gmail.com

Fix corner-case planner failure for MERGE.

MERGE planning could fail with "variable not found in subplan target
list" if the target table is partitioned and all its partitions are
excluded at plan time, or in the case where it has no partitions but
used to have some. This happened because distribute_row_identity_vars
thought it didn't need to make the target table's reltarget list
fully valid; but if we generate a join plan then that is required
because the dummy Result node's tlist will be made from the reltarget.

The same logic appears in distribute_row_identity_vars in v14,
but AFAICS the problem is unreachable in that branch for lack of
MERGE. In other updating statements, the target table is always
inner-joined to any other tables, so if the target is known dummy
then the whole plan reduces to dummy, so no join nodes are created.
So I'll refrain from back-patching this code change to v14 for now.

Per report from Alvaro Herrera.

Discussion: https://postgr.es/m/20230328112248 [email protected]

initdb: emit message when using default ICU locale.

Helpful to determine from test logs whether the locale came from the
environment or a command-line option.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut

initdb: replace check_icu_locale() with default_icu_locale().

The extra checks done in check_icu_locale() are not necessary. An
existing comment already pointed out that the checks would be done
during post-bootstrap initialization, when the locale is opened by the
backend. This was a mistake in commit 27b62377b4.

This commit creates a simpler function default_icu_locale() to just
return the locale of the default collator.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut

Fix error inconsistency in older ICU versions.

To support older ICU versions, we rely on
icu_set_collation_attributes() to do error checking that is handled
directly by ucol_open() in newer ICU versions. Commit 3b50275b12
introduced a slight inconsistency, where the error report includes the
fixed-up locale string, rather than the locale string passed to
pg_ucol_open().

Refactor slightly so that pg_ucol_open() handles the errors from both
ucol_open() and icu_set_collation_attributes(), making it easier to
see any differences between the error reports. It also makes
pg_ucol_open() responsible for closing the UCollator on error, which
seems like the right place.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut

Save a few bytes in pg_attribute

Change the columns attndims, attstattarget, and attinhcount from int32
to int16, and reorder a bit. This saves some space (currently 4
bytes) in pg_attribute and tuple descriptors, which translates into
small performance benefits and/or room for new columns in pg_attribute
needed by future features.

attndims and attinhcount are never realistically used with values
larger than int16. Just to be sure, add some overflow checks.
attstattarget is currently limited explicitly to 10000.

For consistency, pg_constraint.coninhcount is also changed like
attinhcount.

Discussion: https://www.postgresql.org/message-id/flat/d07ffc2b-e0e8-77f7-38fb-be921dff71af%40enterprisedb.com

pg_walinspect: Adjust memory context name.

Correct the name of the memory context used by the
pg_get_wal_block_info() SQL-callable function.

Oversight in commit 9ecb134a93.

Generate a few more functions of pgstatfuncs.c with macros

Two new macros are added with their respective functions switched to
use them. These are for functions with millisecond stats, with and
without "xact" in their names (for the stats that can be tracked within
a transaction).

While on it, prefix the macro for float8 on database entries with "_MS",
as it does a us->ms conversion, based on a suggestion from Andres
Freund.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/6e2efb4f-6fd0-807e-f6bf-94207db8183a@gmail.com

Reject attempts to alter composite types used in indexes.

find_composite_type_dependencies() ignored indexes, which is a poor
decision because an expression index could have a stored column of
a composite (or other container) type even when the underlying table
does not.  Teach it to detect such cases and error out.  We have to
work a bit harder than for other relations because the pg_depend entry
won't identify the specific index column of concern, but it's not much
new code.

This does not address bug #17872's original complaint that dropping
a column in such a type might lead to violations of the uniqueness
property that a unique index is supposed to ensure.  That seems of
much less concern to me because it won't lead to crashes.

Per bug #17872 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org

amcheck: Generalize one of the recently-added update chain checks.

Commit bbc1376b39627c6bddd8a0dc0a7dda24c91a97a0 checked that if
a redirected line pointer pointed to a tuple, the tuple should be
marked both HEAP_ONLY_TUPLE and HEAP_UPDATED. But Andres Freund
pointed out that *any* tuple that is marked HEAP_ONLY_TUPLE should
be marked HEAP_UPDATED, not just one that is the target of a
redirected line pointer. Do that instead.

To see why this is better, consider a redirect line pointer A
which points to a heap-only tuple B which points (via CTID)
to another heap-only tuple C. With the old code, we'd complain
if B was not marked HEAP_UPDATED, but with this change, we'll
complain if either B or C is not marked HEAP_UPDATED.

(Note that, with or without this commit, if either B or C were
not marked HEAP_ONLY_TUPLE, we would also complain about that.)

Discussion: http://postgr.es/m/CA%2BTgmobLypZx%3DcOH%2ByY1GZmCruaoucHm77A6y_-Bo%3Dh-_3H28g%40mail.gmail.com

amcheck: Tighten up validation of redirect line pointers.

Commit bbc1376b39627c6bddd8a0dc0a7dda24c91a97a0 added a new lp_valid[]
array which records whether or not a line pointer was thought to be
valid, but entries could sometimes get set to true in cases where that
wasn't actually safe. Fix that.

Suppose A is a redirect line pointer and B is the other line pointer
to which it points. The old code could mishandle this situation in a
couple of different ways. First, if B was unused, we'd complain about
corruption but still set lp_valid[A] = true, causing later code
to try to access the B as if it were pointing to a tuple. Second,
if B was dead, we wouldn't complain about corruption at all, which is
an oversight, and would also set lp_valid[A] = true, which would
again confuse later code. Fix all that.

In the case where B is a redirect, the old code was correct, but
refactor things a bit anyway so that all of these cases are handled
more symmetrically. Also add an Assert() and some comments.

Andres Freund and Robert Haas

Discussion: http://postgr.es/m/20230323172607.y3lejpntjnuis5vv%40awork3.anarazel.de

doc: fix Apple Silicon Homebrew prefix change documentation

Commit 4c8d65408 incorrectly stated that Homebrew has changed its
prefix for Apple M1 machines, but the prefix change applies to all
Apple Silicon based machines. Fix by writing Apple Silicon instead
of Apple M1.

Reported-by: Dagfinn Ilmari Mannsåker <[email protected]>
Discussion: https://postgr.es/m/[email protected]

doc: Fix XML_CATALOG_FILES env var for Apple M1 machines

Homebrew changed the prefix for Apple M1 based machines, so our
advice for XML_CATALOG_FILES needs to mention both. More info
on the Homebrew change can be found at:

https://github.com/Homebrew/brew/issues/9177

Author: Julien Rouhaud <[email protected]>
Discussion: https://postgr.es/m/20230327082441.h7pa2vqiobbyo7rd@jrouhaud

Make SCRAM iteration count configurable

Replace the hardcoded value with a GUC such that the iteration
count can be raised in order to increase protection against
brute-force attacks.  The hardcoded value for SCRAM iteration
count was defined to be 4096, which is taken from RFC 7677, so
set the default for the GUC to 4096 to match.  In RFC 7677 the
recommendation is at least 15000 iterations but 4096 is listed
as a SHOULD requirement given that it's estimated to yield a
0.5s processing time on a mobile handset of the time of RFC
writing (late 2015).

Raising the iteration count of SCRAM will make stored passwords
more resilient to brute-force attacks at a higher computational
cost during connection establishment.  Lowering the count will
reduce computational overhead during connections at the tradeoff
of reducing strength against brute-force attacks.

There are however platforms where even a modest iteration count
yields a too high computational overhead, with weaker password
encryption schemes chosen as a result.  In these situations,
SCRAM with a very low iteration count still gives benefits over
weaker schemes like md5, so we allow the iteration count to be
set to one at the low end.

The new GUC is intentionally generically named such that it can
be made to support future SCRAM standards should they emerge.
At that point the value can be made into key:value pairs with
an undefined key as a default which will be backwards compatible
with this.

Reviewed-by: Michael Paquier <[email protected]>
Reviewed-by: Jonathan S. Katz <[email protected]>
Discussion: https://postgr.es/m/F72E7BC7-189F-4B17-BF47-9735EB72C364@yesql.se

meson: Fix support for empty darwin sysroot

The -isysroot options should only be added if the sysroot resolved to
a nonempty string. This matches the behavior in src/template/darwin
(also documented in installation.sgml).

Discussion: https://www.postgresql.org/message-id/flat/60765bf0-5027-4b23-9f78-4a365d28823f%40enterprisedb.com

Improve a few things in pg_walinspect

This improves a few things in pg_walinspect:
- Return NULL rather than empty strings in pg_get_wal_records_info() for
the block references and the record description if there is no
information provided by the fallback. This point has been raised by
Peter Geoghegan.
- Add a check on XLogRecHasAnyBlockRefs() for pg_get_wal_block_info(),
to directly skip records that have no block references. This speeds up
the function a bit, depending on the number of records that have no
block references.

Author: Bharath Rupireddy
Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACWL9RG8sGJHinggRNBTxgRWJTSxCkB+cE6=t3Phh=Ey+A@mail.gmail.com

Generate pg_stat_get_xact*() functions for relations using macros

This change replaces seven functions definitions by macros.

This is the same idea as 8018ffb or 83a1a1b, taking advantage of the
variable rename done in 8089517 for relation entries.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/631e3084-c5d9-8463-7540-fcff4674caa5@gmail.com

Doc: clarify introduction to database roles.

Word-smith section 22.1 ("Database Roles") a little bit in hopes
of removing confusion about how the bootstrap superuser's name
is chosen.

While here, I couldn't help noticing that the claim that the bootstrap
superuser is the only initially-existing role has been a lie since
we started to invent predefined roles. We don't want too much detail
in this very introductory text, but it seems worth changing it to say
that it's the only initially-existing login-capable role.

Per documentation comment from Maja Zaloznik.

Discussion: https://postgr.es/m/167931662853.3349090.18217722739345182859@wrigleys.postgresql.org

Fix oversights in array manipulation.

The nested-arrays code path in ExecEvalArrayExpr() used palloc to
allocate the result array, whereas every other array-creating function
has used palloc0 since 18c0b4ecc.  This mostly works, but unused bits
past the end of the nulls bitmap may end up undefined.  That causes
valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could
cause planner misbehavior as cited in 18c0b4ecc.  There seems no very
good reason why we should strive to avoid palloc0 in just this one case,
so fix it the easy way with s/palloc/palloc0/.

While looking at that I noted that we also failed to check for overflow
of "nbytes" and "nitems" while summing the sizes of the sub-arrays,
potentially allowing a crash due to undersized output allocation.
For "nbytes", follow the policy used by other array-munging code of
checking for overflow after each addition.  (As elsewhere, the last
addition of the array's overhead space doesn't need an extra check,
since palloc itself will catch a value between 1Gb and 2Gb.)
For "nitems", there's no very good reason to sum the inputs at all,
since we can perfectly well use ArrayGetNItems' result instead of
ignoring it.

Per discussion of this bug, also remove redundant zeroing of the
nulls bitmap in array_set_element and array_set_slice.

Patch by Alexander Lakhin and myself, per bug #17858 from Alexander
Lakhin; thanks also to Richard Guo.  These bugs are a dozen years old,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org

Add SysCacheGetAttrNotNull for guaranteed not-null attrs

When extracting an attr from a cached tuple in the syscache with
SysCacheGetAttr the isnull parameter must be checked in case the
attr cannot be NULL. For cases when this is known beforehand, a
wrapper is introduced which perform the errorhandling internally
on behalf of the caller, invoking an elog in case of a NULL attr.

Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Reviewed-by: David Rowley <[email protected]>
Discussion: https://postgr.es/m/AD76405E-DB45-46B6-941F-17B1EB3A9076@yesql.se

Comment on expectations for AutoVacuumWorkItem handlers.

This might prevent a repeat of the brin_summarize_range() vulnerability
that commit a117cebd638dd02e5c2e791c25e43745f233111b fixed.

Fix CREATE INDEX progress reporting for multi-level partitioning.

The "partitions_total" and "partitions_done" fields were updated
as though the current level of partitioning was the only one.
In multi-level cases, not only could partitions_total change
over the course of the command, but partitions_done could go
backwards or exceed the currently-reported partitions_total.

Fix by setting partitions_total to the total number of direct
and indirect children once at command start, and then just
incrementing partitions_done at appropriate points.  Invent
a new progress monitoring function "pgstat_progress_incr_param"
to simplify doing the latter.  We can avoid adding cost for the
former when doing CREATE INDEX, because ProcessUtility already
enumerates the children and it's pretty easy to pass the count
down to DefineIndex.  In principle the same could be done in
ALTER TABLE, but that's structurally difficult; for now, just
eat the cost of an extra find_all_inheritors scan in that case.

Ilya Gladyshev and Justin Pryzby

Discussion: https://postgr.es/m/a15f904a70924ffa4ca25c3c744cff31e0e6e143 [email protected]

Fix abbreviated keys bug introduced in d87d548cd03.

Discussion: http://postgr.es/m/CAMkU=1z17XJatF-rMCY3Cjqcxer-Kyn57x6h3OSCpJ0LpAp0ig@mail.gmail.com
Reported-by: Jeff Janes

Doc: fix another "contents...exceed the available area" PDF warning.

New since yesterday :-(

Doc: fix examples for pg_input_error_info().

These were causing "contents ... exceed the available area"
warnings in PDF builds, and also didn't quite follow our markup
conventions for function examples. To fix the overwidth
problem, reduce the number of fields shown in one example,
and also insert &zwsp; to let the header line be broken in
a reasonable place.

Discussion: https://postgr.es/m/20230324194701 [email protected]

docs: Explain how to silence overly verbose messages by fop

Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/20230324194701 [email protected]

Invent GENERIC_PLAN option for EXPLAIN.

This provides a very simple way to see the generic plan for a
parameterized query. Without this, it's necessary to define
a prepared statement and temporarily change plan_cache_mode,
which is a bit tedious.

One thing that's a bit of a hack perhaps is that we disable
execution-time partition pruning when the GENERIC_PLAN option
is given. That's because the pruning code may attempt to
fetch the value of one of the parameters, which would fail.

Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg,
Michel Pelletier, Jim Jones, and myself

Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0 [email protected]

meson: Fix oversight in install-quiet

In e522049f239 I accidentally forgot to add meson_bin to the argument list for
install-quiet. That kind of works on some platforms because the executable is
just 'python', wich the path to meson in an argument. But on windows meson
might be installed as an executable.

Reported-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/b36dd6a4-748a-4737-54d5-dc8a50fdbe4b@dunslane.net

Avoid potential UCollator leak for older ICU versions.

ICU versions 53 and earlier rely on icu_set_collation_attributes() to
process the attributes in the locale string. Avoid leaking the
already-opened UCollator object if an error is encountered.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut

pg_locale.c: change ereport() to elog().

Discussion: https://postgr.es/m/73553013-3926-0f34-0fb8-f37909fe4902@enterprisedb.com
Reported-by: Peter Eisentraut

amcheck: Fix verify_heapam for tuples where xmin or xmax is 0.

In such cases, get_xid_status() doesn't set its output parameter (the
third argument), so we shouldn't fall through to code which will test
the value of that parameter. There are five existing calls to
get_xid_status(), three of which seem to already handle this case
properly.  This commit tries to fix the other two.

If we're checking xmin and find that it is invalid (i.e. 0) just
report that as corruption, similar to what's already done in the
three cases that seem correct. If we're checking xmax and find
that's invalid, that's fine: it just means that the tuple hasn't
been updated or deleted.

Thanks to Andres Freund and valgrind for finding this problem, and
also to Andres for having a look at the patch.  This bug seems to go
all the way back to where verify_heapam was first introduced, but
wasn't detected until recently, possibly because of the new test cases
added for update chain verification.  Back-patch to v14, where this
code showed up.

Discussion: http://postgr.es/m/CA+TgmoZAYzQZqyUparXy_ks3OEOfLD9-bEXt8N-2tS1qghX9gQ@mail.gmail.com

Fix typo in header comment

Commit 4c04be9b0 accidentally left off the _id portion of the function
name in the header comment.

Author: Junwang Zhao <[email protected]>
Discussion: https://postgr.es/m/CAEG8a3LP+ytnAXSzR=yiEaQrde+iCybMHsuPn9n=UN3puV_1tw@mail.gmail.com

Fix incorrect format placeholders

The fields of NLSVERSIONINFOEX are of type DWORD, which is unsigned
long, so the results of the computations being printed are also of
type unsigned long.

meson: docs: add texinfo target

libpq: Add sslcertmode option to control client certificates

The sslcertmode option controls whether the server is allowed and/or
required to request a certificate from the client.  There are three
modes:
- "allow" is the default and follows the current behavior, where a
configured client certificate is sent if the server requests one
(via one of its default locations or sslcert).  With the current
implementation, will happen whenever TLS is negotiated.
- "disable" causes the client to refuse to send a client certificate
even if sslcert is configured or if a client certificate is available in
one of its default locations.
- "require" causes the client to fail if a client certificate is never
sent and the server opens a connection anyway.  This doesn't add any
additional security, since there is no guarantee that the server is
validating the certificate correctly, but it may helpful to troubleshoot
more complicated TLS setups.

sslcertmode=require requires SSL_CTX_set_cert_cb(), available since
OpenSSL 1.0.2.  Note that LibreSSL does not include it.

Using a connection parameter different than require_auth has come up as
the simplest design because certificate authentication does not rely
directly on any of the AUTH_REQ_* codes, and one may want to require a
certificate to be sent in combination of a given authentication method,
like SCRAM-SHA-256.

TAP tests are added in src/test/ssl/, some of them relying on sslinfo to
check if a certificate has been set.  These are compatible across all
the versions of OpenSSL supported on HEAD (currently down to 1.0.1).

Author: Jacob Champion
Reviewed-by: Aleksander Alekseev, Peter Eisentraut, David G. Johnston,
Michael Paquier
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3 [email protected]

meson: add install-{quiet, world} targets

To define our own install target, we need dependencies on the i18n targets,
which we did not collect so far.

Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com

meson: add install-{docs,doc-html,doc-man} targets

Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com

meson: make install_test_files more generic, rename to install_files

Now it supports installing directories and directory contents as well. This
will be used in a subsequent patch to install documentation.

Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com

Doc: Improve description of the "batch_size" option for postgres_fdw.

Document that the actual number of rows postgres_fdw inserts at once in
the COPY case is determined in a similar way to the INSERT case, but it
has a restriction that does not apply to the INSERT case.

Follow-up for commit 97da48246.

Reviewed-by: Daniel Gustafsson and Tatsuo Ishii
Discussion: https://postgr.es/m/CAPmGK14NMXDMW4qK9kHUzudN9t71uvrMKPna02X6zwgQJ6E1_g%40mail.gmail.com

Rewrite error message related to sslmode in libpq

The same error message will be used for a different option, to be
introduced in a separate patch. Reshaping the error message as done
here saves in translation.

Extracted from a larger patch by the same author.

Author: Jacob Champion
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3 [email protected]

Rename fields in pgstat structures for functions and relations

This commit renames the members of a few pgstat structures related to
functions and relations, by respectively removing their prefix "f_" and
"t_". The statistics for functions and relations and handled in their
own file, and pgstatfuncs.c associates each field in a structure
variable named based on the object type handled, so no information is
lost with this rename.

This will help with some of the refactoring aimed for pgstatfuncs.c, as
this makes more consistent the field names with the SQL functions
retrieving them.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Melanie Plageman
Discussion: https://postgr.es/m/9142f62a-a422-145c-bde0-b5bc498a4ada@gmail.com

Implement find_my_exec()'s path normalization using realpath(3).

Replace the symlink-chasing logic in find_my_exec with realpath(3),
which has been required by POSIX since SUSv2.  (Windows lacks
realpath(), but there we can use _fullpath() which is functionally
equivalent.)  The main benefit of this is that -- on all modern
platforms at least -- realpath() avoids the chdir() shenanigans
we used to perform while interpreting symlinks.  That had various
corner-case failure modes so it's good to get rid of it.

There is still ongoing discussion about whether we could skip the
replacement of symlinks in some cases, but that's really matter
for a separate patch.  Meanwhile I want to push this before we get
too close to feature freeze, so that we can find out if there are
showstopper portability issues.

Discussion: https://postgr.es/m/797232.1662075573@sss.pgh.pa.us

doc: fix another case of missing productname markup

As a follow-up commit to 0f85db92b9, this adds <productname> markup
to another case of "PostgreSQL".

Author: Hayato Kuroda <[email protected]>
Discussion: https://postgr.es/m/TYAPR01MB58667A7C8317E267467CC599F5869@TYAPR01MB5866.jpnprd01.prod.outlook.com

Count updates that move row to a new page.

Add pgstat counter to track row updates that result in the successor
version going to a new heap page, leaving behind an original version
whose t_ctid points to the new version. The current count is shown by
the n_tup_newpage_upd column of each of the pg_stat_*_tables views.

The new n_tup_newpage_upd column complements the existing n_tup_hot_upd
and n_tup_upd columns. Tables that have high n_tup_newpage_upd values
(relative to n_tup_upd) are good candidates for tuning heap fillfactor.

Corey Huinker, with small tweaks by me.

Author: Corey Huinker <[email protected]>
Reviewed-By: Peter Geoghegan <[email protected]>
Reviewed-By: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/CADkLM=ded21M9iZ36hHm-vj2rE2d=zcKpUQMds__Xm2pxLfHKA@mail.gmail.com

Handle the "und" locale in ICU versions 54 and older.

The "und" locale is an alternative spelling of the root locale, but it
was not recognized until ICU 55. To maintain common behavior across
all supported ICU versions, check for "und" and replace with "root"
before opening.

Previously, the lack of support for "und" was dangerous, because
versions 54 and older fall back to the environment when a locale is
not found. If the user specified "und" for the language (which is
expected and documented), it could not only resolve to the wrong
collator, but it could unexpectedly change (which could lead to
corrupt indexes).

This effectively reverts commit d72900bded, which worked around the
problem for the built-in "unicode" collation, and is no longer
necessary.

Discussion: https://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73 [email protected]
Discussion: https://postgr.es/m/0c6fa66f2753217d2a40480a96bd2ccf023536a1 [email protected]
Reviewed-by: Peter Eisentraut

amcheck: Fix a few bugs in new update chain validation.

We shouldn't set successor[whatever] to an offset number that is less
than FirstOffsetNumber or more than maxoff. We already avoided that
for redirects, but not for CTID links. Allowing bad offset numbers
into the successor[] array causes core dumps.

We shouldn't use HeapTupleHeaderIsHotUpdated() because it checks
stuff other than the status of the infomask2 bit HEAP_HOT_UPDATED.
We only care about the status of that bit, not the other stuff
that HeapTupleHeaderIsHotUpdated() checks. This mistake can cause
verify_heapam() to report corruption when none is present.

The first hunk of this patch was written by me. The other two were
written by Andres Freund. This could probably do with more review
before commit, but I'd like to try to get the buildfarm green again
sooner rather than later.

Discussion: http://postgr.es/m/20230322204552 [email protected]

Add missing "-I." flag when building pg_bsd_indent.

This is evidently not required by most compilers, but buildfarm
member fairywren is unhappy without it. It looks like the meson
infrastructure has this right already.

Minor comment improvements for compress_lz4

Author: Tomas Vondra
Reviewed-by: Georgios Kokolatos, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com

Unify buffer sizes in pg_dump compression API

Prior to the introduction of the compression API in e9960732a9, pg_dump
would use the ZLIB_IN_SIZE/ZLIB_OUT_SIZE to size input/output buffers.
Commit 0da243fed0 introduced similar constants for LZ4, but while gzip
defined both buffers to be 4kB, LZ4 used 4kB and 16kB without any clear
reasoning why that's desirable.

Furthermore, parts of the code unaware of which compression is used
(e.g. pg_backup_directory.c) continued to use ZLIB_OUT_SIZE directly.

Simplify by replacing the various constants with DEFAULT_IO_BUFFER_SIZE,
set to 4kB. The compression implementations still have an option to use
a custom value, but considering 4kB was fine for 20+ years, I find that
unlikely (and we'd probably just increase the default buffer size).

Author: Georgios Kokolatos
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com