ORC-445: [C++] Code improvements in RLEV2Util. #346

fangzheng · 2018-12-04T07:21:29Z

No description provided.

xndai · 2018-12-04T17:34:50Z

c++/src/RLEV2Util.hh

-    } else {
-      return 64;
-    }
+    return ClosestFixedBitsMap[n];


what if n > 64?

@xndai Hi Xiening, thanks for the catch. When computing the patch width, the input n can be larger than 64. I have added a boundary check to this function and to getClosestAlignedFixedBits().
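
To make the out-of-range concern concrete, here is a minimal sketch of a table lookup guarded by a boundary check. It assumes the check clamps any n above 64 to 64, as the removed else branch did; the actual committed check is not shown in this thread. The names closestFixedBitsBranchy, closestFixedBitsTable, and closestFixedBits are hypothetical stand-ins, and the table is filled from a branch-based reference instead of being written out as ClosestFixedBitsMap presumably is.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Branch-based reference: round n up to the nearest bit width that the
// 5-bit RLEv2 width code can represent (1..24, 26, 28, 30, 32, 40, 48, 56, 64).
inline uint32_t closestFixedBitsBranchy(uint32_t n) {
  if (n == 0) return 1;
  if (n <= 24) return n;
  if (n <= 26) return 26;
  if (n <= 28) return 28;
  if (n <= 30) return 30;
  if (n <= 32) return 32;
  if (n <= 40) return 40;
  if (n <= 48) return 48;
  if (n <= 56) return 56;
  return 64;
}

// Hypothetical stand-in for a 65-entry lookup table covering n = 0..64.
inline const std::array<uint32_t, 65>& closestFixedBitsTable() {
  static const std::array<uint32_t, 65> table =
      []() -> std::array<uint32_t, 65> {
        std::array<uint32_t, 65> t{};
        for (std::size_t n = 0; n < t.size(); ++n) {
          t[n] = closestFixedBitsBranchy(static_cast<uint32_t>(n));
        }
        return t;
      }();
  return table;
}

// Table lookup with a boundary check: indices above 64 never touch the
// array and are clamped to 64 (an assumption, matching the old else branch).
inline uint32_t closestFixedBits(uint32_t n) {
  return n <= 64 ? closestFixedBitsTable()[n] : 64;
}

}  // namespace sketch
```

Without the check, a call such as closestFixedBits(100) would read past the end of the 65-entry array, which is the case the review comment asks about.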

wgtmac · 2018-12-04T19:04:12Z

Can you please provide some perf data to confirm the refactoring is effective?

wgtmac · 2018-12-05T00:46:33Z

c++/src/RLEV2Util.hh

@@ -23,85 +23,32 @@

 namespace orc {
  extern const uint32_t FBSToBitWidthMap[FixedBitSizes::SIZE];
+  extern const uint32_t ClosestFixedBitsMap[65];
+  extern const uint32_t ClosestAlignedFixedBitsMap[65];
+  extern const uint32_t BitWidthToFBSMap[65];

  inline uint32_t decodeBitWidth(uint32_t n) {
    return FBSToBitWidthMap[n];


make sure this won't be out of bound.

Hi Gang, I thought this over and don't think we need to add a boundary check here, for two reasons:

In the current RLEv2 encoder and decoder code, every caller of decodeBitWidth() passes in an integer that has only its lowest 5 bits set, so the input n is always less than FixedBitSizes::SIZE (32).

Since this function decodes the 5-bit length code, it would be a programming error for a caller to pass in any value >= 32. In that case, returning 64 (as the original implementation does) does not help.
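
The argument above can be sketched as follows. This assumes the header layout described in the ORC specification for the DIRECT encoding, where the 5-bit width code sits in bits 1 to 5 of the first header byte; the names kCodeToWidth, decodeBitWidthSketch, and widthFromHeaderByte are hypothetical. The assert documents the precondition instead of silently returning 64.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical stand-in for FBSToBitWidthMap: 32 entries, one per 5-bit code.
static const uint8_t kCodeToWidth[32] = {
    1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11, 12, 13, 14, 15, 16,
    17, 18, 19, 20, 21, 22, 23, 24, 26, 28, 30, 32, 40, 48, 56, 64};

// Decode a 5-bit width code. Passing code >= 32 is a caller bug, so it is
// flagged by an assert in debug builds and not range-checked otherwise.
inline uint32_t decodeBitWidthSketch(uint32_t code) {
  assert(code < 32);
  return kCodeToWidth[code];
}

// Callers mask the header byte down to 5 bits before decoding, so the
// index can never reach 32 (layout assumed from the ORC specification).
inline uint32_t widthFromHeaderByte(uint8_t firstByte) {
  return decodeBitWidthSketch((firstByte >> 1) & 0x1f);
}

}  // namespace sketch
```

Because the mask 0x1f bounds the index at the call site, any header byte maps to a valid table entry; for instance, the header byte 0x5e yields code 15 and a width of 16 bits.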

fangzheng · 2018-12-05T18:37:41Z

@wgtmac Hi Gang, thanks for the comments. I'll add performance measurements later this week.


fangzheng · 2018-12-05T23:23:52Z

@wgtmac Hi Gang,
I put together a test program and some measurements here: https://github.com/fangzheng/RLEV2Util_performance

The new functions are 20 to 40% faster than the original ones.

I've also committed some new changes:

Change the lookup arrays from uint32_t to uint8_t to reduce their size and get better cache behavior.
Add a unit test, TestRLEV2Util.cc, to verify that the new implementation produces the same results as the original one.
Simplify the code in encodeBitWidth() to avoid calling getClosestFixedBits().

Please let me know if you have further suggestions. Thanks.
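
The three changes listed above can be illustrated together. The following is a sketch under stated assumptions, not the contents of TestRLEV2Util.cc, which are not shown in this thread: kBitWidthToCode is a hypothetical uint8_t table in the role that BitWidthToFBSMap presumably plays, encodeBitWidthSketch maps a width straight to its 5-bit code without a separate rounding step, and countMismatches compares it against a branch-based reference over a range of inputs.

```cpp
#include <cstdint>

namespace sketch {

// Branch-based reference encoder: bit width -> 5-bit code.
inline uint32_t encodeBitWidthBranchy(uint32_t n) {
  if (n <= 1) return 0;
  if (n <= 24) return n - 1;
  if (n <= 26) return 24;
  if (n <= 28) return 25;
  if (n <= 30) return 26;
  if (n <= 32) return 27;
  if (n <= 40) return 28;
  if (n <= 48) return 29;
  if (n <= 56) return 30;
  return 31;
}

// Hypothetical uint8_t table: 65 bytes instead of 260 for uint32_t entries.
static const uint8_t kBitWidthToCode[65] = {
    0,  0,  1,  2,  3,  4,  5,  6,  7,  8,   // n = 0..9
    9,  10, 11, 12, 13, 14, 15, 16, 17, 18,  // n = 10..19
    19, 20, 21, 22, 23, 24, 24, 25, 25, 26,  // n = 20..29
    26, 27, 27, 28, 28, 28, 28, 28, 28, 28,  // n = 30..39
    28, 29, 29, 29, 29, 29, 29, 29, 29, 30,  // n = 40..49
    30, 30, 30, 30, 30, 30, 30, 31, 31, 31,  // n = 50..59
    31, 31, 31, 31, 31                       // n = 60..64
};

// Single lookup: rounding up and encoding are folded into one table.
// Widths above 64 are clamped to the code for 64 (an assumption).
inline uint32_t encodeBitWidthSketch(uint32_t n) {
  return n <= 64 ? kBitWidthToCode[n] : 31;
}

// Exhaustive equivalence check over 0..200, which covers the table and
// some out-of-range inputs. Returns the number of disagreeing inputs.
inline int countMismatches() {
  int mismatches = 0;
  for (uint32_t n = 0; n <= 200; ++n) {
    if (encodeBitWidthSketch(n) != encodeBitWidthBranchy(n)) ++mismatches;
  }
  return mismatches;
}

}  // namespace sketch
```

Since the input domain is tiny, an exhaustive comparison like this is cheap and leaves no untested table entry, which is what makes a hand-written table safe to swap in for the branch chain.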

wgtmac

+1 LGTM. Thanks @fangzheng for the detailed perf data!

fangzheng added 2 commits December 3, 2018 23:18

ORC-445: [C++] Code improvements in RLEV2Util.

2d70d83

ORC-445: [C++] Minor fix.

cab6217

xndai reviewed Dec 4, 2018


ORC-445: [C++] Add boundary check.

7076e1a

wgtmac reviewed Dec 5, 2018


ORC-445: 1) reduce lookup array size; 2) simplify code and add boundary check; 3) add unit test.

8609e8b

wgtmac approved these changes Dec 6, 2018


wgtmac merged commit 9faf7f5 into apache:master Dec 10, 2018
