Skip to content

Commit c69d09a

Browse files
author
Ole John Aske
committed
Bug#34348706 AUTOTEST: testLimits -n DropSignalFragments T1 fails'
As the TAB_SCANREF-error reply is received asynchronously wrt other client processing, a NdbScanOperation error, as well as a NdbTransaction error can arrive, and be set, at any time after a NdbTransaction::execute() has been sent. The testLimits test case in subject execute a scan request where a failure was expected to be returned, but the test pass required that the error did not become visible until after a nextResult-wait. However, there were no such guarantees in the error return protocol. In particular after the patch for bug 33752914 'Allow receiveThread to become more active', it became more likely that a REF-error could race back to the client before it reached a 'nextResult' where it had to wait. (Note that it is still not a regression caused by that receiveThread-patch) This patch makes the behaviour more deterministic by not setting the NdbTransaction error code when a TAB_SCANREF-error is received. The REF-error code is only set and kept in the NdbOperation::theError object. Subsequent calls to ::nextResult() will check 'theError.code' after the PollGuard has been aquired, when it could be propogated to the NdbTransaction object as well. Note that holding the PollGuard also provide a protection against new REF-errors appearing while theError is being checked & propogated. Patch also fixed ::ordered_send_scan_wait_for_all() where we failed to setErrorCode() when having the PollGuard and theError.code was set. (Was propably not required prior to this patch as a TAB_SCANREF used to setErrorCode() immediately) Test case is enhanced such that the reported failure in subject will be hit without this patch - also fixed as similar testcase having the same issue. Note that the documentation of our getErrorCode() methods are very fuzzy in this area as well - that need to improve. Change-Id: If990c42b061644485a7003664a5e31113d498dac (cherry picked from commit c61ca293e08241c09d912be1cb4470c1c1005277)
1 parent 15c2735 commit c69d09a

File tree

3 files changed

+45
-13
lines changed

3 files changed

+45
-13
lines changed

storage/ndb/src/ndbapi/NdbScanOperation.cpp

Lines changed: 18 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
/*
2-
Copyright (c) 2003, 2021, Oracle and/or its affiliates.
2+
Copyright (c) 2003, 2022, Oracle and/or its affiliates.
33
44
This program is free software; you can redistribute it and/or modify
55
it under the terms of the GNU General Public License, version 2.0,
@@ -67,6 +67,16 @@ NdbScanOperation::~NdbScanOperation()
6767
assert(m_scan_buffer==NULL);
6868
}
6969

70+
/*****************************************************************************
71+
* setErrorCode(int aErrorCode)
72+
*
73+
* Use setErrorCode() with errors so that they are mapped to an operation,
74+
* and propogated to the 'transConnection' as well, unless they are
75+
* encountered in the asynchronous signal handling part of the scan
76+
* processing code, in which case the error should be set on
77+
* 'theError' member variable only.
78+
*
79+
*****************************************************************************/
7080
void
7181
NdbScanOperation::setErrorCode(int aErrorCode) const
7282
{
@@ -1891,12 +1901,10 @@ NdbScanOperation::nextResultNdbRecord(const char * & out_row,
18911901
* after getting return value 1 (meaning end of scan) or
18921902
* -1 (for error).
18931903
*
1894-
* Or there seems to be a bug in ndbapi that put operation
1895-
* in error between calls.
1896-
*
1897-
* Or an error have been received.
1904+
* Or an SCAN_TABREF-error have been received into the operation
1905+
* (asynchronously) between calls.
18981906
*
1899-
* In any case, keep and propagate error and fail.
1907+
* In any case, keep and propagate as NdbTransaction error and fail.
19001908
*/
19011909
if (theError.code != Err_scanAlreadyComplete)
19021910
setErrorCode(theError.code);
@@ -3880,7 +3888,11 @@ NdbIndexScanOperation::ordered_send_scan_wait_for_all(bool forceSend)
38803888

38813889
PollGuard poll_guard(* impl);
38823890
if(theError.code)
3891+
{
3892+
if (theError.code != Err_scanAlreadyComplete)
3893+
setErrorCode(theError.code);
38833894
return -1;
3895+
}
38843896

38853897
Uint32 seq= theNdbCon->theNodeSequence;
38863898
Uint32 nodeId= theNdbCon->theDBnode;

storage/ndb/src/ndbapi/NdbTransactionScan.cpp

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
/*
2-
Copyright (c) 2003, 2021, Oracle and/or its affiliates.
2+
Copyright (c) 2003, 2022, Oracle and/or its affiliates.
33
44
This program is free software; you can redistribute it and/or modify
55
it under the terms of the GNU General Public License, version 2.0,
@@ -36,7 +36,16 @@
3636
* int receiveSCAN_TABREF(NdbApiSignal* aSignal)
3737
*
3838
* This means the scan could not be started, set status(s) to indicate
39-
* the failure
39+
* the failure. Note that scan requests are asynchronous, i.e. we do not
40+
* wait for the CONF or REF to be returned. Which also imples that a REF-error
41+
* could be received into a scan operation while the client is in the midts
42+
* of doing other work, e.g. handling results from other operations in the
43+
* same transaction.
44+
*
45+
* To avoid transaction errors appearing 'out of the blue', such asynch
46+
* errors are set only on the operation when they are received. Only when
47+
* processing the scan results with ::nextResult(), operational errors are
48+
* propogated to the transaction level.
4049
*
4150
****************************************************************************/
4251
int
@@ -46,7 +55,8 @@ NdbTransaction::receiveSCAN_TABREF(const NdbApiSignal* aSignal){
4655
if (checkState_TransId(&ref->transId1)) {
4756
if (theScanningOp) {
4857
theScanningOp->execCLOSE_SCAN_REP();
49-
theScanningOp->setErrorCode(ref->errorCode);
58+
// Do not ::setErrorCode() yet! - See comment in header
59+
theScanningOp->theError.code = ref->errorCode;
5060
if(!ref->closeNeeded){
5161
return 0;
5262
}

storage/ndb/test/ndbapi/testLimits.cpp

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
/* Copyright (c) 2008, 2021, Oracle and/or its affiliates.
1+
/* Copyright (c) 2008, 2022, Oracle and/or its affiliates.
22
33
This program is free software; you can redistribute it and/or modify
44
it under the terms of the GNU General Public License, version 2.0,
@@ -931,14 +931,19 @@ int testSegmentedSectionScan(NDBT_Context* ctx, NDBT_Step* step){
931931

932932
CHECKEQUAL(0, scan->setInterpretedCode(&prog));
933933

934-
/* Api doesn't seem to wait for result of scan request */
935934
CHECKEQUAL(0, trans->execute(NdbTransaction::NoCommit));
936935

936+
// Scan errors arrive asynchronously into the ScanOperation.
937+
// However, errors should not become visible on the Transaction object
938+
// until after the nextResult-wait.
939+
CHECKEQUAL(0, trans->getNdbError().code);
940+
NdbSleep_MilliSleep(10); // Not even after a long sleep.
937941
CHECKEQUAL(0, trans->getNdbError().code);
938942

939943
CHECKEQUAL(-1, scan->nextResult());
940944

941945
CHECKEQUAL(217, scan->getNdbError().code);
946+
CHECKEQUAL(217, trans->getNdbError().code);
942947

943948
trans->close();
944949

@@ -1037,15 +1042,20 @@ int testDropSignalFragments(NDBT_Context* ctx, NDBT_Step* step){
10371042

10381043
CHECKEQUAL(0, scan->setInterpretedCode(&prog));
10391044

1040-
/* Api doesn't seem to wait for result of scan request */
10411045
CHECKEQUAL(0, trans->execute(NdbTransaction::NoCommit));
1042-
1046+
1047+
// Scan errors arrive asynchronously into the ScanOperation.
1048+
// However, they should not become visible on the Transaction object
1049+
// until after the nextResult-wait.
1050+
CHECKEQUAL(0, trans->getNdbError().code);
1051+
NdbSleep_MilliSleep(10); // Not even after a long sleep.
10431052
CHECKEQUAL(0, trans->getNdbError().code);
10441053

10451054
CHECKEQUAL(-1, scan->nextResult());
10461055

10471056
int expectedResult= subcase.expectedRc;
10481057
CHECKEQUAL(expectedResult, scan->getNdbError().code);
1058+
CHECKEQUAL(expectedResult, trans->getNdbError().code);
10491059

10501060
scan->close();
10511061

0 commit comments

Comments
 (0)