Tomas Vondra [Sun, 11 Jun 2017 13:30:59 +0000 (15:30 +0200)]
Randomize the choice of the initial ROUNDROBIN node
With the ROUNDROBIN node, the initial node was always set to the first node
in the list. That works fine when inserting many rows at once (e.g. with
INSERT SELECT), but with single-row inserts this puts all data on the
first node, resulting in an unbalanced distribution.
This randomizes the choice of the initial node, so that with single-row
inserts the ROUNDROBIN behaves a bit like RANDOM distribution.
This also removes an unnecessary srand() call from RelationBuildLocator(),
located after a call to rand().
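The change can be sketched as follows. This is an illustrative stand-in for the locator code; the struct and function names are hypothetical, not the XL source:

```c
#include <stdlib.h>

/* Hypothetical sketch of the fix: instead of always starting at node 0,
 * pick a random initial position so that single-row inserts spread across
 * the nodes. */
typedef struct RoundRobinLocator
{
    int nodeCount;   /* number of datanodes in the list */
    int current;     /* index of the next node to receive a row */
} RoundRobinLocator;

void
roundrobin_init(RoundRobinLocator *loc, int nodeCount)
{
    loc->nodeCount = nodeCount;
    /* the fix: randomize the starting node instead of using index 0 */
    loc->current = rand() % nodeCount;
}

int
roundrobin_next(RoundRobinLocator *loc)
{
    int node = loc->current;

    loc->current = (loc->current + 1) % loc->nodeCount;
    return node;
}
```

With multi-row inserts the behavior is unchanged (nodes still rotate in order), but each new single-row insert starts from a random node.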
Tomas Vondra [Fri, 16 Jun 2017 20:59:48 +0000 (22:59 +0200)]
Prevent dropping distribution keys for MODULO
Due to a coding issue in IsDistColumnForRelId() it was possible to drop
columns that were used as modulo distribution keys. A simple example
demonstrates the behavior:
CREATE TABLE t (a INT, b INT, c INT) DISTRIBUTE BY MODULO (a);
ALTER TABLE t DROP COLUMN a;
test=# \d+ t
                    Table "public.t"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 b      | integer |           | plain   |              |
 c      | integer |           | plain   |              |
Distribute By: MODULO(........pg.dropped.1........)
Location Nodes: ALL DATANODES
With this commit, the ALTER TABLE command fails as expected:
ERROR: Distribution column cannot be dropped
The commit also simplifies the coding a bit, and removes several functions
that were no longer needed (and unused outside locator.c).
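In sketch form, the check needs to treat MODULO like HASH. The types and function below are hypothetical stand-ins for IsDistColumnForRelId(), not the actual locator.c code:

```c
/* Hypothetical sketch: a column may only be dropped if it is not the
 * distribution key. The bug was, in effect, that the MODULO strategy was
 * not recognized here; the fix treats HASH and MODULO alike. */
typedef enum LocatorType
{
    LOCATOR_HASH,
    LOCATOR_MODULO,
    LOCATOR_ROUNDROBIN
} LocatorType;

typedef struct RelLocator
{
    LocatorType type;
    int         distAttnum;  /* attribute number of the distribution key */
} RelLocator;

int
is_dist_column(const RelLocator *loc, int attnum)
{
    if (loc->type == LOCATOR_HASH || loc->type == LOCATOR_MODULO)
        return loc->distAttnum == attnum;
    return 0;   /* ROUNDROBIN has no key column; any column may be dropped */
}
```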
Tomas Vondra [Sun, 11 Jun 2017 16:00:37 +0000 (18:00 +0200)]
Add support for MODULO distribution on BIGINT
Until now the BIGINT data type was not supported by MODULO distribution, and
attempts to create such tables failed. This patch removes the limitation.
The compute_modulo() function originally used an optimized algorithm
from http://www.graphics.stanford.edu/~seander/bithacks.html (namely the
one described in section "Compute modulus division by (1 << s) - 1 in
parallel without a division operator") to compute the modulo. But that
algorithm version only supported 32-bit values, and so would require
changes to support 64-bit values. Instead, I've decided to simply drop
that code and use the plain % operator, which should translate to an IDIV
instruction.
Judging by benchmarks (MODULO on INTEGER column), switching to plain
modulo (%) might result in about 1% slowdown, but it might easily be
just noise caused by different binary layout due to code changes. In
fact, the simplified algorithm is much less noisy in this respect.
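A minimal sketch of the simplified approach; the function name is illustrative, not the actual compute_modulo() signature:

```c
#include <stdint.h>

/* Plain % works uniformly for 32-bit and 64-bit keys, unlike the previous
 * bit-twiddling version, which was limited to 32-bit values. */
int
compute_modulo64(uint64_t value, int numNodes)
{
    return (int) (value % (uint64_t) numNodes);
}
```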
Tomas Vondra [Sat, 17 Jun 2017 19:09:21 +0000 (21:09 +0200)]
Rename plpgsql_1.out to plpgsql.out to match upstream
There's no plpgsql_1.out upstream, only plpgsql.out. So make it
consistent, which means the file will also receive upstream changes.
Tomas Vondra [Sat, 17 Jun 2017 18:49:15 +0000 (20:49 +0200)]
Resolve most regression failures in plpgsql suite
The fixed differences had mostly trivial causes:
1) The expected output was missing for several tests, most likely due to
the initial resolution of a merge conflict (done at a time when it was
not possible to run the tests, making verification impossible). This
includes blocks
labeled as
- Test handling of expanded arrays
- Test for proper handling of cast-expression caching
and a few more smaller ones.
2) Change in spelling of messages, e.g. from
CONTEXT: PL/pgSQL function footest() line 5 at EXECUTE statement
to
CONTEXT: PL/pgSQL function footest() line 5 at EXECUTE
3) Change in displaying context for notices, warnings and errors, which
was reworked by commit 0426f349effb6bde2061f3398a71db7180c97dd9. Since
that commit we only show the context for errors by default.
Pavan Deolasee [Thu, 15 Jun 2017 06:04:06 +0000 (11:34 +0530)]
Use thread-specific storage for computing a snapshot.
We mustn't use a global variable since concurrent GTM threads might try to
compute a snapshot at the same time and may overwrite the information before a
thread can send the complete snapshot to the client. Chi Gao
<[email protected]> reported that this can cause an infinite wait on the
client side, because the client expects N bytes of data but only receives
(N - x) bytes, and keeps waiting for the remaining x bytes which the GTM
never sends.
While we don't have a report, it's obvious that it can also go wrong in the
other direction.
We fix this by using a thread-specific storage which ensures that the snapshot
cannot be changed while it's being sent to the client.
Report, analysis and fix by Chi Gao <[email protected]>. Some minor
editorialisation by me.
Backpatched to XL9_5_STABLE
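The idea can be sketched with C11 thread-local storage; the type and accessor below are illustrative, not the actual GTM code:

```c
/* Sketch of the fix: each GTM worker thread gets its own snapshot buffer,
 * so a concurrent thread cannot overwrite it while it is being sent. */
#define MAX_SNAPSHOT_XIDS 1024

typedef struct SnapshotData
{
    int           xcnt;                     /* number of active XIDs */
    unsigned long xids[MAX_SNAPSHOT_XIDS];  /* the XIDs themselves */
} SnapshotData;

/* previously a single shared global; now one instance per thread */
static _Thread_local SnapshotData thread_snapshot;

SnapshotData *
GetThreadSnapshot(void)
{
    return &thread_snapshot;
}
```

Each thread now sees a stable pointer to its own buffer, so the snapshot cannot change between being computed and being fully written to the client socket.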
Tomas Vondra [Sat, 10 Jun 2017 17:40:50 +0000 (19:40 +0200)]
Restrict subplan nodes even for equal distributions
Many regression tests were failing because the expected plan contains
Remote Subquery Scan on all (datanode_1)
but we were producing
Remote Subquery Scan on any (datanode_1,datanode_2)
Both those plans are in fact valid (at least on replicated tables).
The difference is that in the first case the restriction was computed
in adjust_subplan_distribution() while in the second case this did not
happen (and instead will happen at execution time).
The restriction is not applied because adjust_subplan_distribution()
contains this condition
    if (subd && !equal(subd, pathd))
    {
        ... restrict execution nodes ...
    }
In Postgres-XL 9.6 the two distributions happen to be equal in some
cases where that was not the case in Postgres-XL 9.5. It's not
entirely clear why this happens (it seems to be another consequence
of the upper-planner pathification), but the consequence is that the
restriction code is skipped.
Removing the equal() call from the condition fixes all the regression
failures caused by plans switching between any/all restrictions. In
fact, this fixes all remaining regression failures in five regression
suites: create_view, subselect, aggregates, rangefuncs, xc_having.
In the future, we will probably pathify adjust_subplan_distribution(),
i.e. we'll probably get rid of it entirely and compute the restriction
either when constructing the path or possibly defer it until execution
time. Before the upper-planner pathification this was not possible.
Tomas Vondra [Sat, 10 Jun 2017 13:19:13 +0000 (15:19 +0200)]
Disable aggregate paths with extra COMBINE phase
The planner was generating aggregate paths with an additional COMBINE
step pushed to the remote side, like this:
                         QUERY PLAN
---------------------------------------------------------
 Finalize Aggregate
   ->  Remote Subquery Scan on all (datanode1,datanode2)
         ->  Partial Aggregate (Combine)
               ->  Gather
                     ->  Partial Aggregate
                           ->  Parallel Seq Scan on public.t
This was done with the goal of reducing the amount of data transmitted
over the network, and the amount of work to be done on the coordinator.
Unfortunately, the upstream code seems not quite ready for such plans,
leading to failures like this
ERROR: variable not found in subplan target list
for large amounts of data and high max_parallel_workers_per_gather.
Those plans would still be quite beneficial, improving the scalability
of Postgres-XL clusters in analytics. But we can reintroduce them once
the targetlist issue gets fixed.
Tomas Vondra [Sat, 10 Jun 2017 12:38:28 +0000 (14:38 +0200)]
Remove bogus markup from a program listing in docs
Tomas Vondra [Fri, 9 Jun 2017 15:31:22 +0000 (17:31 +0200)]
Fix minor issues in the tpcb-like pgbench script
The tpcb-like built-in script in pgbench contained two simple bugs.
It was still using the old \setrandom command to generate the delta
value, instead of the new
\set delta random(-5000, 5000)
This is mostly an omission in commit
32d57848458595a487d251b37c2872d86de439ef.
There was also a missing semicolon at the end of one of the commands,
causing cryptic syntax errors.
Tomas Vondra [Thu, 8 Jun 2017 16:21:46 +0000 (18:21 +0200)]
Fix warnings about uninitialized vars in pg_dump.c
Three XL-specific variables in getTables() were initialized and used in
two independent if blocks looking like this:
    if (fout->isPostgresXL)
    {
        i_pgxclocatortype = PQfnumber(res, "pgxclocatortype");
        i_pgxcattnum = PQfnumber(res, "pgxcattnum");
        i_pgxc_node_names = PQfnumber(res, "pgxc_node_names");
    }

    if (fout->isPostgresXL)
    {
        ... use the variables ...
    }
This however confuses the compiler (gcc 5.3.1), which complains that the
variables may be used uninitialized. The fix is simple: just make the
initialization unconditional. If there are no such columns, PQfnumber()
will return -1, but we won't use the value anyway.
Tomas Vondra [Thu, 8 Jun 2017 15:55:47 +0000 (17:55 +0200)]
Fix compiler warnings due to unused variables
Removes a few variables that were either entirely unused, or just set
and never read again.
Tomas Vondra [Thu, 8 Jun 2017 15:18:01 +0000 (17:18 +0200)]
Add remote subquery step to recurse_set_operations
During the initial phase of resolving 9.6 merge conflicts in the planner,
we switched back to clean upstream code for some files (including
prepunion.c). Then we reintroduced the XL-specific bits with the necessary
tweaks, caused particularly by the upper-planner pathification.
We had however forgotten about this bit in recurse_set_operations, so this
commit fixes that by adding the redistribution again.
This fixes failures in collate, copyselect and union regression suites.
Patch by senhu <[email protected]>, review and commit by me.
Pavan Deolasee [Mon, 5 Jun 2017 03:59:24 +0000 (09:29 +0530)]
Switch connections after processing PGXLRemoteFetchSize rows
Fast-query-shipping consumes all rows produced by one datanode (connection)
before moving to the next connection. This leads to suboptimal performance
when the datanodes can't produce tuples at the desired pace. Instead, we
switch between connections after every PGXLRemoteFetchSize
(pgx_remote_fetch_size) rows are fetched. This gives a datanode a chance to
produce more tuples while
the coordinator consumes tuples already produced and sent across by another
datanode.
This seems to improve performance significantly for FQS-ed queries that
return a large number of rows from more than one datanode.
Report by Pilar de Teodoro <[email protected]>, initial analysis and
performance tests by Tomas Vondra, further analysis and patch by me.
Backpatched to XL9_5_STABLE.
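The interleaving pattern can be sketched as a pure function mapping a row's overall position to the connection that serves it (illustrative, not the XL executor code):

```c
/* With nconns connections and a switch after every fetch_size rows, the
 * i-th row overall comes from connection (i / fetch_size) % nconns,
 * rather than from connection 0 until it is fully drained. */
int
conn_for_row(long row, long fetch_size, int nconns)
{
    return (int) ((row / fetch_size) % nconns);
}
```

So with two datanodes and a fetch size of 1000, rows 0-999 come from the first connection, rows 1000-1999 from the second, then back to the first, letting each datanode refill its buffer while the other is being drained.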
Tomas Vondra [Sat, 3 Jun 2017 22:47:07 +0000 (00:47 +0200)]
Hide list of nodes in EXPLAIN (NODES off, FORMAT json)
EXPLAIN with json format was ignoring the NODES option, showing the list
of nodes every time. This commit fixes that.
Tomas Vondra [Sat, 3 Jun 2017 14:17:00 +0000 (16:17 +0200)]
Make pgbench script names (name prefixes) unique
PostgreSQL 9.6 picks the built-in scripts by prefix, as long as a single
script is matched. This is done even when not using the '-b' option as
in that case pgbench simply uses 'tpcb-like' (and 'pgbench -N' defaults
to 'simple-update').
XL adds two custom scripts that also use the 'bid' column in WHERE
conditions, not just the 'aid' one. Those scripts were however named
'tpcb-like-bid' and 'simple-update-bid' which makes the upstream names
non-unique (when used as prefixes).
This fixes it by simply not keeping the upstream scripts, and replacing
them with the XL version. It might have been possible to massage the
builtin_script[] array to keep both, but it doesn't seem worth it.
Tomas Vondra [Sat, 3 Jun 2017 14:04:12 +0000 (16:04 +0200)]
Fix built-in pgbench scripts to use 9.6 commands
PostgreSQL 9.6 changed some of the pgbench internal commands, e.g.
\setrandom VARIABLE FROM TO
changed to
\set VARIABLE random(FROM, TO)
and there are also some related changes. The merge only accepted changes
to the upstream scripts, not the two XL-specific ones (tpcb-like-bid and
simple-update-bid). Those are fixed by this commit.
Tomas Vondra [Thu, 1 Jun 2017 20:32:49 +0000 (22:32 +0200)]
Add missing include (pgxcnode.h) into parallel.c
Commit 9b12e275cd0b added an InitMultinodeExecutor() call to
ParallelWorkerMain(), but did not include the header with the prototype of
that function. So fix that.
Pavan Deolasee [Thu, 18 May 2017 11:13:38 +0000 (16:43 +0530)]
Do not reset global_session on RESET ALL
This avoids resetting global session information when DISCARD/RESET ALL is
executed. This can have bad effects, especially as seen from the 'guc' test
case where we fail to handle temp tables correctly. So we mark the
global_session GUC with the GUC_NO_RESET_ALL flag and instead issue an
explicit RESET global_session when the connection is cleaned up.
Pavan Deolasee [Thu, 18 May 2017 04:52:31 +0000 (10:22 +0530)]
Ensure that multi-node executor is initialised correctly.
Since parallel background worker threads may need to communicate with remote
nodes, we should initialise the multi-node executor when the parallel
worker starts its work. This was discovered while running queries with
force_parallel_mode set to on.
Pavan Deolasee [Fri, 12 May 2017 12:06:23 +0000 (17:36 +0530)]
Fix EXPLAIN ANALYZE SELECT INTO
EXPLAIN ANALYZE SELECT INTO was missing the treatment that we give to a regular
SELECT INTO or CREATE TABLE AS SELECT. This patch fixes that such that even
when EXPLAIN ANALYZE is used, we first create the target table and then insert
the selected rows.
The EXPLAIN ANALYZE will only show the plan for the final transformed INSERT
INTO statement. This is not very useful right now, because EXPLAIN ANALYZE
doesn't show anything below the Remote Subquery Scan, but that's a separate
issue and will be fixed in a separate patch.
The regression test's expected output is updated accordingly.
This will be backpatched to XL9_5_STABLE.
Pavan Deolasee [Fri, 12 May 2017 06:20:44 +0000 (11:50 +0530)]
Accept regression diffs in copydml test case
We don't support triggers yet and hence we accept error messages regarding the
lack of support. Since triggers are not created, the corresponding NOTICEs
raised from the triggers are also not printed.
The core COPY DML functionality is covered by other cases in the test case,
hence nothing additional is required.
Pavan Deolasee [Tue, 9 May 2017 07:11:56 +0000 (12:41 +0530)]
Allow COPY (INSERT RETURNING), but block COPY (SELECT INTO)
Since SELECT INTO is transformed into a CREATE TABLE AS SELECT, which is
then further transformed into CREATE TABLE + INSERT INTO by XL, we must do
the check for SELECT INTO a bit differently in XL. This patch does that,
and as a result also now allows COPY (INSERT RETURNING) correctly.
Backpatched to XL9_5_STABLE
Pavan Deolasee [Mon, 8 May 2017 08:18:41 +0000 (13:48 +0530)]
Use correct namespace while inserting rows via CTAS
We transform CREATE TABLE AS SELECT into a CREATE TABLE, followed by INSERT
INTO. But the generated INSERT INTO statement was not qualifying the table
name with the schema, unless the original query used qualified names. This
results in incorrect behaviour when tables are created in implicit schemas,
such as "temporary" schemas. In passing, also fix some places where we
should be
quoting identifiers correctly.
Report, a test case and some initial analysis by Tomas Vondra. Patch and
further test cases by me.
Backpatched to XL9_5_STABLE
Tomas Vondra [Mon, 8 May 2017 01:16:22 +0000 (03:16 +0200)]
Fix pg_dump getTables() queries broken by 69fba376
Commit 69fba376, cherry-picked from the XL9_5_STABLE branch, broke two
of the queries in getTables(). The mistake seems fairly simple: the format
string was added to one appendPQExpBuffer() call, and the values to the
next one.
Corrected, and gcc no longer complains about format and type mismatch.
Tomas Vondra [Sun, 7 May 2017 23:08:53 +0000 (01:08 +0200)]
Accept distributed plans in groupingsets test suite
Grouping sets were not supported by Postgres-XL so far, so all plans
are from upstream and thus missing the Remote Subquery nodes. The
changes look reasonable and produce correct results, so accept them.
Note the plans only push down the scans, not the actual grouping set
evaluation, which is still evaluated on the coordinator. That's because
PostgreSQL does not support parallel execution for grouping set paths.
If that gets added in the future, we can add that to XL quite easily.
Tomas Vondra [Sun, 7 May 2017 15:25:59 +0000 (17:25 +0200)]
Resolve failures in the gin regression test suite
The failures were due to gin_clean_pending_list() being executed on the
coordinator only. As no data is stored on the coordinator, the pending
list is always empty, so the function can't clean anything and just
returns 0. The condition (r > 10) was therefore always false.
Resolved by using EXECUTE DIRECT to run the function on both datanodes.
It would be nice if we could do this automatically for such maintenance
functions, but we don't have that capability at this point. So explicit
EXECUTE DIRECT seems like the right solution.
Tomas Vondra [Sun, 7 May 2017 14:11:40 +0000 (16:11 +0200)]
Resolve failures in foreign_key regression tests
The test case is a new one from upstream, and the failure to distribute
the DELETE (after the ON DELETE rule is created) is already present in
XL 9.5, so accept that as expected behavior.
I've however slightly expanded the test case to also check that after
dropping the rule, the DELETE succeeds. That is expected, as both columns
are distribution keys of the two tables.
Tomas Vondra [Sat, 6 May 2017 23:55:41 +0000 (01:55 +0200)]
Resolve failures in truncate regression test suite
The failures were caused by triggers and RESTART IDENTITY, two features
not yet supported in Postgres-XL. Instead of adding the error messages
to the expected output (which is what XL9_5_STABLE does), I have removed
the relevant blocks.
The reason is that the tests did nothing useful. The basic TRUNCATE
functionality is tested by the preceding tests, and there are other
test suites verifying that triggers are indeed still unsupported.
The RESTART IDENTITY is a bit of an exception, as it was only tested
here, so to ensure we keep returning a 'not supported' error I've added
a simple test into xl_limitations test suite.
Tomas Vondra [Sat, 6 May 2017 21:47:28 +0000 (23:47 +0200)]
Resolve failures in json and jsonb regression tests
The failures were caused by failed distribution key updates, which also
caused differences in subsequent query results. Instead of making the
table replicated as usual, I've decided to use DISTRIBUTE RANDOMLY in
this case, because we don't have that exercised in the tests yet.
Tomas Vondra [Fri, 5 May 2017 18:41:58 +0000 (20:41 +0200)]
Update expected output issues in misc test suite
The expected output included results for queries that are commented out
in the input script. Fix that by removing the query results and replacing
them with just the commands.
Tomas Vondra [Fri, 5 May 2017 18:35:04 +0000 (20:35 +0200)]
Remove unnecessary output variants for misc tests
The misc_1 and misc_2 output variants seem to be obsolete (not present
upstream), and apparently unused, as they are missing output for the
last block testing multi-statement commands.
Simply remove the variants, and only keep misc.source.
Pavan Deolasee [Wed, 3 May 2017 08:35:17 +0000 (14:05 +0530)]
Add a user configurable parameter to control the number of rows fetched from
the remote side during RemoteSubplan execution.
This allows us to experiment with different sizes more easily. Playing
with the fetch size also exposed a couple of problems, fixed in this same
commit.
1. We were incorrectly forgetting a connection response combiner while
suspending a portal, leading to errors later when we try to buffer the results
because the connection must be used for other queries.
2. The remote cursor name was not getting set properly, causing datanodes
to complain about non-existent cursors.
Pavan Deolasee [Tue, 2 May 2017 09:04:11 +0000 (14:34 +0530)]
Ensure that we don't try to allocate connection in/out buffers over
MaxAllocSize.
We take this opportunity to rearrange the code to avoid duplication in
handling the in and out buffers. Also, add some other checks to ensure
that we don't overrun the limits.
Report, investigation and draft patch by Krzysztof Nienartowicz.
Pavan Deolasee [Fri, 5 May 2017 04:57:15 +0000 (10:27 +0530)]
Update expected output for guc test case missed while cherry-picking
9954d3510a85918fa2c99c20be2ab1d6d32a584b
The test case still fails, but for other outstanding issues in the merge.
Pavan Deolasee [Fri, 5 May 2017 04:50:05 +0000 (10:20 +0530)]
Adjust expected output for tsearch which got missed during cherry-picking
of 6f7506edc369
There was a merge conflict because the alternate expected output file is
removed in the master branch. So fix it manually now.
Tomas Vondra [Thu, 19 Jan 2017 11:45:15 +0000 (12:45 +0100)]
fix buffer overflow in gtm_serialize_pgxcnodeinfo()
Due to gtm_get_pgxcnodeinfo_size() not considering the 'max_sessions'
field, gtm_serialize_pgxcnodeinfo() was writing ~4B beyond the end
of the allocated buffer. In most cases that did not overwrite any
important data, but sometimes it corrupted malloc metadata, as
reported on the mailing list by Rami Sergey.
23:1325909760:2017-01-16 12:29:56.522 MSK -DEBUG:
gtm_get_pgxcnodeinfo_size: s_len=87, s_datalen=91
LOCATION: ProcessPGXCNodeList, register_gtm.c:391
*** Error in `/usr/local/pgsql/bin/gtm': free(): invalid next size
(fast): 0x00007fc448004c90 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fc44f0f47e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7fc44f0fce0a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7fc44f10098c]
Fixed by adding 'max_sessions' to gtm_get_pgxcnodeinfo_size().
Report by Rami Sergey, fix by me.
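The invariant the fix restores is that the size function accounts for every field the serializer writes. A toy sketch of that pairing (the struct and names are illustrative, not the GTM source):

```c
#include <stdint.h>
#include <string.h>

typedef struct NodeInfo
{
    uint32_t port;
    uint32_t status;
    uint32_t max_sessions;   /* the field the size function had missed */
} NodeInfo;

/* must match nodeinfo_serialize() field for field, or the writer runs
 * past the end of the allocated buffer */
size_t
nodeinfo_size(const NodeInfo *n)
{
    return sizeof(n->port) + sizeof(n->status) + sizeof(n->max_sessions);
}

size_t
nodeinfo_serialize(const NodeInfo *n, char *buf)
{
    size_t off = 0;

    memcpy(buf + off, &n->port, sizeof(n->port));
    off += sizeof(n->port);
    memcpy(buf + off, &n->status, sizeof(n->status));
    off += sizeof(n->status);
    memcpy(buf + off, &n->max_sessions, sizeof(n->max_sessions));
    off += sizeof(n->max_sessions);
    return off;
}
```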
Pavan Deolasee [Tue, 25 Apr 2017 08:05:07 +0000 (13:35 +0530)]
Fix a typo in gtmPxyExtraConfig default value
Pavan Deolasee [Tue, 25 Apr 2017 05:34:47 +0000 (11:04 +0530)]
Ensure that GTM master is added first before allowing addition of other
components.
This fixes a pgxc_ctl crash reported by Tomas, but it also makes sense because
GTM details must be added to configuration files of other components while
adding them.
In passing, also fix problems when calling "clean/stop all" on a
non-existent GTM master.
Tomas Vondra [Sat, 19 Nov 2016 10:33:49 +0000 (11:33 +0100)]
remove pgxcpath.c and RemoteQueryPath, which are both unused
The pgxcpath.c file was not even built, so it's obviously dead code.
As it was the only place referencing RemoteQueryPath, remove that
structure too.
Tomas Vondra [Sat, 19 Nov 2016 10:23:08 +0000 (11:23 +0100)]
remove optimizer/pgxcplan.h, mostly a duplicate of pgxc/planner.h
The header defined pretty much the same structures as pgxc/planner.h,
but often with subtle differences (missing fields etc.). Furthermore,
most of the function prototypes were not actually implemented.
Turns out, the header was only included in pgxcship.c, which only
used a single function from the header. So just remove the file and
make the function static within pgxcship.c.
Pavan Deolasee [Wed, 19 Apr 2017 14:49:32 +0000 (20:19 +0530)]
Fix non-deterministic behaviour of sequence test case added in f8cbb7dc.
Pavan Deolasee [Tue, 18 Apr 2017 12:05:53 +0000 (17:35 +0530)]
Use already created EState while explaining FQS-ed query.
This ensures that we get details about supplied parameters correctly.
Pavan Deolasee [Tue, 18 Apr 2017 06:57:06 +0000 (12:27 +0530)]
Allow pg_dump to dump from PostgreSQL databases.
The current implementation could only dump from an XL database, which is
not ideal since we then need to use pg_dump/pg_dumpall from a PG
installation to dump a PG database. We now check whether the remote server
is running XL or PG and then accordingly skip the XL-specific dumps.
Pavan Deolasee [Wed, 12 Apr 2017 14:07:37 +0000 (19:37 +0530)]
Handle temp sequences so that duplicate values are not produced.
We used to keep the temporary sequences on the local node and generate sequence
values locally. But when nextval is pushed down to the datanodes, each node
might end up producing the same value and thus causing duplicates. Instead,
we now handle the temporary sequences on the GTM too. But instead of
schema-qualifying sequence names, we use the coordinator name and
coordinator PID to uniquely identify the sequence.
Report by Tomas Vondra and fixes by me.
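The naming scheme can be sketched as follows; the function and the separator are hypothetical, as the commit only says the coordinator name and PID are used to make the GTM-side name unique:

```c
#include <stdio.h>
#include <string.h>

/* Build a GTM-side identifier for a temporary sequence from the
 * coordinator name, the backend PID and the local sequence name, so two
 * backends creating a temp sequence "tmpseq" cannot collide. */
void
temp_seq_gtm_name(char *dst, size_t dstlen,
                  const char *coord_name, int coord_pid,
                  const char *seqname)
{
    snprintf(dst, dstlen, "%s.%d.%s", coord_name, coord_pid, seqname);
}
```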
Pavan Deolasee [Wed, 12 Apr 2017 11:53:21 +0000 (17:23 +0530)]
Add a 5s sleep in the create_index test case to give the cluster monitor a
chance to advance OldestXmin, so that vacuum can correctly detect
all-visible pages.
Pavan Deolasee [Tue, 11 Apr 2017 05:55:48 +0000 (11:25 +0530)]
Adjust tsearch test case and expected output based on testing feedback.
We don't really need to add ORDER BY since the results don't come from any
distributed table. This allows us to have consistent output from the test.
Pavan Deolasee [Fri, 7 Apr 2017 05:21:33 +0000 (10:51 +0530)]
Reduce verbosity of client disconnection messages.
Per complaint from Michael Misiewicz that it may fill up the GTM/GTM-proxy log
files.
Pavan Deolasee [Thu, 6 Apr 2017 07:29:39 +0000 (12:59 +0530)]
Remove obsolete code which was preventing constraints from being modified
at the coordinator.
It was causing regression failures, and making this change fixes those
failures without causing any new ones. So it seems like a good change. We
will revisit if something gets reported because of the change.
Pavan Deolasee [Thu, 6 Apr 2017 05:34:14 +0000 (11:04 +0530)]
Support an additional syntax ANALYZE (COORDINATOR) to allow users to rebuild
coordinator side statistics without running ANALYZE again on the datanodes.
When ANALYZE (COORDINATOR) is run, we don't update planner statistics on
the datanodes, but simply gather the existing statistics and update the
coordinator-side view of the global stats. The command only updates
statistics on the current coordinator; to update stats on all coordinators,
the command must be executed on each coordinator separately.
Pavan Deolasee [Mon, 3 Apr 2017 17:06:19 +0000 (22:36 +0530)]
Try to validate the combiner only when a RESPONSE_COPY is received while
running the COPY protocol.
We sometimes see an 'M' message (command ID received from the remote node)
from the datanode, and that breaks the combiner validation. So we only do
the validation if we received a RESPONSE_COPY message during the COPY IN
protocol.
Pavan Deolasee [Fri, 24 Mar 2017 08:10:02 +0000 (13:40 +0530)]
Ensure that the config parameters specified in the gtmPxyExtraConfig file
take precedence over the defaults.
As reported quite a while back by Tobias Oberstein, this bug remained
unaddressed for too long. This commit should fix it and ensure that the
values supplied via gtmPxyExtraConfig are honored correctly.
Pavan Deolasee [Fri, 10 Mar 2017 09:07:20 +0000 (14:37 +0530)]
Do not add a newline ('\n') between rows while running a BINARY COPY protocol.
PostgreSQL's BINARY COPY protocol does not expect a newline character between
rows. But coordinator was adding this while running the protocol between
coordinator and the datanodes. While Postgres-XL had some provision to deal
with this on the datanode side, it seemed either insufficient or buggy as
evident during tests with pglogical. So get rid of that and make the protocol
same as vanilla PG.
Pavan Deolasee [Thu, 9 Mar 2017 08:03:54 +0000 (13:33 +0530)]
Extend CommitTS properly, filling in any holes, just like ExtendCLOG.
Postgres-XL may have gaps in assigned transaction IDs, especially because not
all XIDs may be activated on a given node. So we must ensure that CommitTS is
extended properly when we see a new XID. This is the same as for the CLOG,
but we never properly added support for CommitTS.
Pavan Deolasee [Mon, 20 Feb 2017 07:44:19 +0000 (13:14 +0530)]
Handle sequence's transactional behaviour on GTM
Previously we were tracking changes to sequences on the coordinator side and
applying those changes at transaction commit/rollback time. While this worked
ok for most cases, there were issues such as what happens if a sequence is
dropped and then recreated in the same transaction. Since the DROP is not
executed until the transaction commit time, the subsequent CREATE would fail on
the GTM.
We now track sequences renamed/dropped/created on the GTM side and do a cleanup
on transaction commit/rollback. For example, if a sequence is renamed but the
transaction is later aborted, the sequence will be renamed back to its original
name. Similarly, if a sequence is dropped and the transaction aborts, the
sequence will be re-instated.
Pavan Deolasee [Tue, 31 Jan 2017 17:29:10 +0000 (22:59 +0530)]
Handle some corner cases around empty strings in SET commands.
There are some tricky situations where a SET command may use an empty
string ('') as a value. This led to various problems, since the value is
converted into a zero-length string or even a \"\" by the GUC processor,
depending on whether it appears in a quoted list GUC or a normal GUC.
Sending the value to the remote node in any of these formats is guaranteed
to break things. So for now, add some band-aids to deal with these special
cases.
Per report from Vivek Shukla ([email protected])
Pavan Deolasee [Fri, 20 Jan 2017 09:45:43 +0000 (15:15 +0530)]
A very naive way to deal with the fact that RemoteQuery (which EXECUTE DIRECT
internally uses) cannot support cursor fetch.
A FOR ... EXECUTE query inside plpgsql uses the cursor mechanism to fetch a
few tuples at a time. But that fails badly when the query is an EXECUTE
DIRECT statement, because we internally use RemoteQuery to execute it.
Instead, just fetch 10000 tuples at a time and complain if RemoteQuery
returns more than 10000 rows. Maybe that's enough for the scenario we're
trying to address, where a global view can be created using EXECUTE DIRECT.
Pavan Deolasee [Tue, 17 Jan 2017 05:53:37 +0000 (11:23 +0530)]
Handle multi-command queries correctly inside SQL as well as plpgsql functions.
Postgres-XL sends down utility statements to the remote nodes as plain query
strings. When there are multiple commands in a query string, separated by ';',
we were incorrectly sending down the entire query string again and again while
handling each command. This can lead to unpleasant as well as incorrect
behaviour. This was earlier handled for execution via psql, but this patch
fixes it for SPI and other places such as extension creation and SQL function
handling.
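The splitting the commit relies on can be sketched as follows; this toy version ignores quoting and comments, and the function name is illustrative:

```c
#include <string.h>

/* Count the individual commands in a multi-statement query string: the
 * units that should be shipped to remote nodes one by one, rather than
 * resending the whole string for each command. Empty ("null") commands
 * such as ";;" are skipped. */
int
count_commands(const char *querystring)
{
    int ncommands = 0;
    const char *p = querystring;

    while (*p)
    {
        const char *semi = strchr(p, ';');
        const char *end = semi ? semi : p + strlen(p);
        const char *q = p;

        /* does this segment contain anything besides whitespace? */
        while (q < end && (*q == ' ' || *q == '\t' || *q == '\n'))
            q++;
        if (q < end)
            ncommands++;

        p = semi ? semi + 1 : end;
    }
    return ncommands;
}
```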
Pavan Deolasee [Thu, 5 Jan 2017 14:04:36 +0000 (19:34 +0530)]
Ensure variable values are quoted when necessary while sending down SET
commands to the remote nodes
Earlier we'd special-cased a few GUCs, such as those using memory or time
units or transaction isolation levels. But clearly that wasn't enough, as
we noticed with "application_name" recently. So fix this problem in a more
comprehensive manner.
Added a few more test cases to cover these scenarios.
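A sketch of the kind of decision involved; the function is hypothetical, and the actual fix lives in the code that serializes SET commands for remote nodes:

```c
#include <ctype.h>

/* Decide whether a GUC value must be quoted before being embedded in a
 * SET command sent to a remote node. An empty string in particular must
 * be sent as '', or the resulting command is malformed. */
int
needs_quotes(const char *val)
{
    const char *p;

    if (*val == '\0')
        return 1;               /* empty value must become '' */
    for (p = val; *p; p++)
    {
        if (!isalnum((unsigned char) *p) && *p != '_' && *p != '.')
            return 1;           /* spaces, quotes, commas, ... */
    }
    return 0;
}
```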
Pavan Deolasee [Wed, 21 Dec 2016 17:24:58 +0000 (22:54 +0530)]
Send shared invalidation messages to other backends upon completion of a
command.
Since multiple backends may cooperate in a distributed transaction, it's
important that when one of the backends makes changes to catalogs which
require cache invalidation, the other cooperating backends see those
changes immediately instead of at end-of-transaction.
Also ensure that invalidation messages are accepted even when a relation lock
was already held by the backend, if it's running in a distributed transaction.
Pavan Deolasee [Tue, 20 Dec 2016 05:35:08 +0000 (11:05 +0530)]
Handle multi-command SQL strings correctly even when there are 'null' SQL
commands.
Reported by Krzysztof Nienartowicz
Pavan Deolasee [Mon, 19 Dec 2016 05:21:12 +0000 (10:51 +0530)]
Correctly identify MSG_BARRIER as a proxy command
Pavan Deolasee [Tue, 29 Nov 2016 08:59:36 +0000 (14:29 +0530)]
Fix a non-deterministic regression test and expected output
Pavan Deolasee [Tue, 29 Nov 2016 05:29:23 +0000 (10:59 +0530)]
Create destination data directory before trying to install GTM sample files
Report and patch by Owais Khan
Pavan Deolasee [Mon, 21 Nov 2016 06:44:28 +0000 (12:14 +0530)]
Do not silently skip FK constraints if loose_constraints is ON.
While the git history and comments do not tell us much, it seems this was
done to avoid creating FK constraints which may not be correctly enforced
when loose_constraints is ON (which allows users to define globally
non-enforceable constraints). But instead of wholesale skipping of FKs, it
seems we should rather let the users determine which constraints can be
enforced and allow them to define them.
Pavan Deolasee [Mon, 21 Nov 2016 06:29:53 +0000 (11:59 +0530)]
Do not emit a WARNING about a missing relation while reading a target entry,
since it may refer to a view that is not created on the datanode
Tomas Vondra [Thu, 4 May 2017 20:32:13 +0000 (22:32 +0200)]
Remove cost estimates from the final EXPLAIN in join test suite
While this was not causing any failures at the moment, regression
tests should not rely on cost estimates being constant. So change
the last EXPLAIN in join test suite to use COSTS OFF.
Tomas Vondra [Thu, 4 May 2017 20:09:56 +0000 (22:09 +0200)]
Resolve failure due to changing cost in xl_join regression suite
The regression failures were caused by slightly fluctuating cost
estimates. In general it's not a good idea to rely on costs being
constant in regression tests, so use EXPLAIN (COSTS OFF) instead.
This stabilizes the output, and resolves the regression failure.
Note: The cost estimates probably change due to stale oldestXmin
reported by one of the nodes, so that analyze does not see some
of the recent changes. That is an independent issue, however, and
it does not really change that relying on cost estimates being
constant is not a good practice.
Tomas Vondra [Thu, 4 May 2017 18:10:00 +0000 (20:10 +0200)]
Accept upstream plan change in the subselect regression suite
This is an expected upstream plan change, where REL9_6_STABLE
switched from Hash Join to Hash Semi Join, and removed one of
the Hash Aggregate nodes.
Tomas Vondra [Thu, 4 May 2017 17:41:53 +0000 (19:41 +0200)]
Resolve failures in without_oid test suite by switching to CTAS
The tests used CREATE TABLE ... AS EXECUTE to populate two of the
tables, but this is unsupported in Postgres-XL and so resulted in
test failures. Instead switch to CREATE TABLE ... AS SELECT.
Tomas Vondra [Thu, 4 May 2017 16:22:36 +0000 (18:22 +0200)]
Accept plan changes in the aggregates regression suite
The expected plans were different for a number of simple reasons:
- missing list of data nodes in EXPLAIN output
- new upstream plan, missing Remote Subquery node
- MinMax queries using upstream plans after the merge
The first two issues were fixed by simply adding the missing pieces.
The third difference was resolved by reverting to XL 9.5 plans, with
two minor tweaks. Firstly, upstream switched from HashAggregate
to Unique path for DISTINCT. Secondly, XL 9.6 uses the partial
aggregate infrastructure, so the plans use Partial / Finalize
Aggregate nodes.
Tomas Vondra [Thu, 4 May 2017 16:18:43 +0000 (18:18 +0200)]
Accept upstream plan changes in the join regression suite
Two plans changed between REL9_5_STABLE and REL9_6_STABLE in the join
regression suite, so accept matching changes in XL (add Remote nodes).
Tomas Vondra [Sun, 23 Apr 2017 22:45:30 +0000 (00:45 +0200)]
Accept plan changes due to showing 'Sort Key' for Remote Subquery
Commit 80bd858ca4b added Sort Key to the EXPLAIN output for remote
queries that merge sorted data from remote nodes. Accept the trivial
plan changes caused by this.
Tomas Vondra [Sun, 23 Apr 2017 22:23:11 +0000 (00:23 +0200)]
Accept plan changes caused by delaying creation of subplan tlist
Upstream commit c44d0138350490 delayed creation of the
subplan tlist until after create_plan(), which may change the order
of entries in target lists.
This commit accepts changes obviously caused by this, and nothing
else - if the plan changed shape in other ways, it's left alone.
Tomas Vondra [Sun, 23 Apr 2017 16:46:01 +0000 (18:46 +0200)]
Resolve failures in the prepared_xacts regression test suite
This simply adopts the expected results from Postgres-XL 9.5,
instead of the results (apparently) inherited from upstream.
Tomas Vondra [Sun, 23 Apr 2017 14:25:37 +0000 (16:25 +0200)]
Resolve failures in guc regression suite related to SAVEPOINTs
The test suite checked GUCs behavior when combined with SAVEPOINTs,
but subtransactions are unsupported in Postgres-XL, so this caused
'current transaction is aborted' failures.
Simply remove parts of the test checking how GUCs interact with
subtransactions. Removing the SAVEPOINTs and updating the expected
results would be useless, because that exactly matches already
existing tests in the suite.
There remain several failures in the guc test suite, but those are
independent of SAVEPOINTs, and are more about prepared transactions
vs. access to temporary objects.
Tomas Vondra [Sun, 23 Apr 2017 00:15:40 +0000 (02:15 +0200)]
Resolve failures in the cluster regression suite
Distributing tables in this particular test suite does not make
much sense. Firstly, the tables would need to be more complex
to work around 'could not plan distributed update' errors. But
more importantly, we would have to use ORDER BY to make the
query results stable, which defeats the intent, i.e. testing
that the table is sorted on disk. So just use replicated tables.
Also fix name of the user used in the regression tests.
Tomas Vondra [Sat, 22 Apr 2017 22:21:20 +0000 (00:21 +0200)]
Resolve failures in create_am and amutils regression suites.
The problem was that CREATE ACCESS METHOD was not properly pushed
to data nodes, causing failures in subsequent commands referring
to the access method. The plan changes in create_am seem trivial,
generally just adding "Remote Subquery" to the single-node plan.
Tomas Vondra [Fri, 21 Apr 2017 22:11:35 +0000 (00:11 +0200)]
Resolve failures in tablesample regression test suite
Expected query results are adopted from XL 9.5, and seem to be
quite stable. Removed part of expected results which comes from
upstream, but from a later commit.
Tomas Vondra [Fri, 21 Apr 2017 21:28:02 +0000 (23:28 +0200)]
Add the missing REFRESH MATERIALIZED VIEW to matview test suite
This adds back the missing refreshes, which were removed due to
the lack of support for a CONCURRENTLY mode. This adds back plain
refreshes (without the CONCURRENTLY keyword), which at least makes
the results the same as in upstream.
Tomas Vondra [Fri, 21 Apr 2017 20:56:14 +0000 (22:56 +0200)]
Resolve regression failures in matview test suite
All the failures are changes in query plans, switching from 2-phase
Hash Aggregate to 1-phase Group Aggregate. This seems to be happening
for two main reasons:
(1) Due to the upper planner pathification we now build additional
paths and cost them correctly. I've reviewed the plans and
the Group Aggregate paths seem cheaper than Hash Aggregate.
(2) The amount of data in the suite is rather small, so the 2-phase
aggregation seems not worth it.
Tomas Vondra [Thu, 20 Apr 2017 23:52:19 +0000 (01:52 +0200)]
Resolve trivial differences in privileges test suite
Most of the differences were due to ordering (sometimes in query but
not in the expected output) or spaces at the end of the line.
Tomas Vondra [Thu, 20 Apr 2017 23:33:39 +0000 (01:33 +0200)]
Resolve trivial failures in update regression suite
The failures were ordering-related, so simply fix the queries and
expected results.
Tomas Vondra [Thu, 20 Apr 2017 23:16:48 +0000 (01:16 +0200)]
Fix order of CREATE TABLE + INSERT for the testh table in the join suite
The original order (INSERT before CREATE TABLE) was obviously broken,
so correct it and also fix the expected query result.
Tomas Vondra [Thu, 20 Apr 2017 23:14:49 +0000 (01:14 +0200)]
Correct expected output for a query in join test suite
The expected output is copied from upstream. The current results
seem duplicated, for some reason.
Tomas Vondra [Thu, 20 Apr 2017 22:41:13 +0000 (00:41 +0200)]
Resolve most failures in join regression tests
This resolves most failures in 'join' regression suite. The expected
results were produced by running the new join.sql script on XL 9.5,
with a few tweaks of the explain plans (partial aggregates etc.)
Tomas Vondra [Thu, 20 Apr 2017 19:10:47 +0000 (21:10 +0200)]
Resolve most failures in rowsecurity test suite.
This resolves most failures in the rowsecurity test suite, caused
mostly by new tests from the upstream (thus the plans were missing
RemoteSubquery steps).
Tomas Vondra [Thu, 20 Apr 2017 19:05:25 +0000 (21:05 +0200)]
Propagate the distribution up when (root->distribution==NULL).
Without this patch the planner was constructing paths like this:
-> Material
-> SubquerySubPath
-> RemoteSubPath
-> BitmapHeapScan
which then led to failures to plan some of the UPDATE and DELETE
queries in updatable_views. This fixes it, and makes the planner
build the same paths as on XL 9.5 (without the RemoteSubPath,
discarding the distribution info).
This resolved all the 'could not plan distributed UPDATE/DELETE'
in updatable_views, but not universally. So either this still is
not entirely correct, or there's another independent bug.
Tomas Vondra [Sun, 19 Mar 2017 22:54:19 +0000 (23:54 +0100)]
Add missing block to 'portals' expected output
The test included block testing non-backwards-scan-capable plans
with scroll cursors, but the output was missing from the expected
file.
Tomas Vondra [Sun, 19 Mar 2017 22:50:58 +0000 (23:50 +0100)]
Remove duplicate output from 'join' test suite
The expected output contained one block twice by mistake, causing
failures when running the test suite.
Tomas Vondra [Sun, 19 Mar 2017 00:56:08 +0000 (01:56 +0100)]
Resolve failures in rules tests by adding T_CteScan to execRemote.c
The switch in determine_param_types() was missing T_CteScan case,
so some queries were failing with:
ERROR: unrecognized node type: 119
This fixes it, and it also fixes the expected output for a few other
queries in the same test suite.
Tomas Vondra [Sun, 19 Mar 2017 00:17:23 +0000 (01:17 +0100)]
Resolve failures in polymorphism suite by accepting NOTICEs
The query with sql_if()/bleat() calls is expected to produce NOTICE
messages, so just add them to the expected output.
Tomas Vondra [Sun, 19 Mar 2017 00:07:05 +0000 (01:07 +0100)]
Resolve failures in the xc_FQS regression suite
Most of the changes were simply due to plan changes introduced in
PostgreSQL 9.6, typically a SELECT DISTINCT switching from hash or
group Aggregate to Unique. The plans were verified by running the
same query on PostgreSQL 9.6, which should produce about the same
plan (except for the Remote Subquery nodes, of course).
Also fixes the result ordering for three queries by adding an
explicit ORDER BY clause.
Tomas Vondra [Sat, 18 Mar 2017 22:51:50 +0000 (23:51 +0100)]
Accept minor plan change in xl_join due to different costing
The cost of the RemoteSubplan node changed, most likely due to moving
the Sort pushdown into pathnode.c (instead of createplan.c).
Tomas Vondra [Sat, 18 Mar 2017 20:25:59 +0000 (21:25 +0100)]
Remove 'unrecognized node type' errors from expected output
The expected output for two regression suites ("with" and "rules")
included error messages like this:
ERROR: unrecognized node type: %d
which is clearly a mistake, and just hides an error in the code.
Tomas Vondra [Sat, 18 Mar 2017 20:04:25 +0000 (21:04 +0100)]
Push recursive paths to remote nodes when constructing the paths
Until now, the pushdown happened in make_recursive_union(), but that
apparently fails to update subroots (and possibly other fields), causing
segfaults in SS_finalize_plan().
Instead move the push-down logic to generate_recursion_path(), which
fixes the issue and is consistent with the overall pathification of
the planner.
With this change XL 9.6 now generates the same plans as XL 9.5.
There remains a lot to be desired, though. Firstly, it fails for queries
with nested recursive unions, because it produces a plan like this:
-> Remote Subquery Scan on all (dn1)
-> Recursive Union
-> Seq Scan on department
-> CTE Scan on x
CTE x
-> Remote Subquery Scan on all (dn1)
-> Recursive Union
-> Seq Scan on department
-> Append
-> WorkTable Scan on q
-> WorkTable Scan on x
This however fails with 'ERROR: unrecognized node type: 120' error,
because determine_param_types() in execRemote.c does not expect
T_WorkTableScan.
After resolving that issue, it however starts failing with a segfault, as
WorkTableScan nodes try to access the top recursive union node, which is
however executed in a different process, and possibly even on a different
data node (note the "Remote Subquery").
Those issues are however not resolved by this commit, and are left for
the future.
Tomas Vondra [Sat, 18 Mar 2017 19:47:32 +0000 (20:47 +0100)]
Iterate over the whole simple_rel_array in WITH RECURSIVE check
The loop in subquery_planner() looked roughly like this:
for (i = 1; i < root->simple_rel_array_size - 1; i++)
which fails to check the last element in the array, due to a trivial
off-by-one error. Instead, the loop should be:
for (i = 1; i < root->simple_rel_array_size; i++)
Tomas Vondra [Sat, 18 Mar 2017 19:27:08 +0000 (20:27 +0100)]
Produce the same plans on replicated tables as XL 9.5
Due to a small difference in adjust_path_distribution(), 9.6 updated
the root distribution (to path distribution) whenever it was NULL,
unlike XL 9.5.
Due to this, adjust_subplan_distribution() skipped the block that
restricts the execution on replicated tables to a single node
if (subd && !equal(subd, pathd))
{
if (IsLocatorReplicated(subd->distributionType) &&
bms_num_members(subd->restrictNodes) != 1)
{
...
}
}
which changed the generated plans from
QUERY PLAN
---------------------------------------------------------------------------
Remote Subquery Scan on any
-> Insert on base_tbl b
-> Result
SubPlan 1
-> Remote Subquery Scan on all
-> Index Only Scan using ref_tbl_pkey on ref_tbl r
Index Cond: (a = b.a)
(7 rows)
to
QUERY PLAN
---------------------------------------------------------------------------
Remote Subquery Scan on any
-> Insert on base_tbl b
-> Result
SubPlan 1
-> Remote Subquery Scan on any
-> Index Only Scan using ref_tbl_pkey on ref_tbl r
Index Cond: (a = b.a)
(7 rows)
While this plan change is mostly harmless (it merely postpones the node
selection from planning to execution time), it generated a lot of
regression test failures.
Tomas Vondra [Sun, 5 Mar 2017 00:26:11 +0000 (01:26 +0100)]
Fix failures in 'largeobject' tests due to changed role name
The role name changed from 'regresslo' to 'regress_lo_user' and one
of the source files was still using the old one.
Tomas Vondra [Sat, 4 Mar 2017 23:48:07 +0000 (00:48 +0100)]
Fix failures in 'brin' regression tests
The failures seemed to be caused simply by different parameter
values, but it turned out the tests are actually broken and don't
really work as intended, for two reasons.
Firstly, the script does EXPLAIN on multiple queries and checks if
there's Bitmap Heap Scan etc. but it was only looking at the very
beginning of the result. But XL puts Remote Fast Query Execution
on top of the plan, so these checks were failing (but the WARNING
output was accepted, so no one knew).
Secondly, one of the commands populating the test table was using
subquery with ORDER BY, which is not supported in XL. So the query
was failing, and the results were a bit different from upstream.
So this commit fixes two things:
- Changes the pattern matching to look for the expected operation
anywhere in the plan. Considering that the queries are very simple
(single scan with a single condition), this is unlikely to trigger
false positives or negatives.
- Changes how the 'brintest' table is populated, replacing the
single subquery (with ORDER BY) with a few commands producing the
same data set as on upstream.