Skip to content

[InstSimplify] Fold getelementptr inbounds null, idx -> null #130742

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 17, 2025

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Mar 11, 2025

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Mar 12, 2025

Stage2 build is broken by this patch :(
Reproducer extracted from SymbolTableListTraits::getListOwner: https://godbolt.org/z/jM13qvGas

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Mar 12, 2025

Stage2 build is broken by this patch :( Reproducer extracted from SymbolTableListTraits::getListOwner: https://godbolt.org/z/jM13qvGas

Fixed by 66f53e3.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Mar 12, 2025

Stage2 build is broken by this patch :( Reproducer extracted from SymbolTableListTraits::getListOwner: https://godbolt.org/z/jM13qvGas

Another case: https://github.com/nodejs/node/blob/a2a53cb728ec5f776ac16879ce0f480a8e838320/src/util-inl.h#L134-L137

@dtcxzyw dtcxzyw requested a review from efriedma-quic March 12, 2025 08:38
@dtcxzyw dtcxzyw marked this pull request as ready for review March 12, 2025 12:21
@dtcxzyw dtcxzyw requested a review from nikic as a code owner March 12, 2025 12:21
@llvmbot
Copy link
Member

llvmbot commented Mar 12, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

Proof: https://alive2.llvm.org/ce/z/5ZkPx-


Full diff: https://github.com/llvm/llvm-project/pull/130742.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+7)
  • (modified) llvm/test/Transforms/InstCombine/getelementptr.ll (+35-4)
  • (modified) llvm/test/Transforms/InstCombine/store.ll (+1-2)
  • (modified) llvm/test/Transforms/InstCombine/sub-gep.ll (+2-2)
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 4c14dcfb4d75f..dcb17ee8d7b43 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -2910,6 +2910,13 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
                           SQ.getWithInstruction(&GEP)))
     return replaceInstUsesWith(GEP, V);
 
+  // getelementptr inbounds null, idx -> null
+  if (auto *BaseC = dyn_cast<Constant>(PtrOp))
+    if (GEP.isInBounds() && BaseC->isNullValue() &&
+        !NullPointerIsDefined(GEP.getFunction(),
+                              GEPType->getPointerAddressSpace()))
+      return replaceInstUsesWith(GEP, Constant::getNullValue(GEPType));
+
   // For vector geps, use the generic demanded vector support.
   // Skip if GEP return type is scalable. The number of elements is unknown at
   // compile-time.
diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll b/llvm/test/Transforms/InstCombine/getelementptr.ll
index ec03d9a2dae2b..c1bd6806eae86 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1328,8 +1328,7 @@ define ptr @PR45084_extra_use(i1 %cond, ptr %p) {
 
 define ptr @gep_null_inbounds(i64 %idx) {
 ; CHECK-LABEL: @gep_null_inbounds(
-; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, ptr null, i64 [[IDX:%.*]]
-; CHECK-NEXT:    ret ptr [[GEP]]
+; CHECK-NEXT:    ret ptr null
 ;
   %gep = getelementptr inbounds i8, ptr null, i64 %idx
   ret ptr %gep
@@ -1355,8 +1354,7 @@ define ptr @gep_null_defined(i64 %idx) null_pointer_is_valid {
 
 define ptr @gep_null_inbounds_different_type(i64 %idx1, i64 %idx2) {
 ; CHECK-LABEL: @gep_null_inbounds_different_type(
-; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds [0 x i8], ptr null, i64 0, i64 [[IDX2:%.*]]
-; CHECK-NEXT:    ret ptr [[GEP]]
+; CHECK-NEXT:    ret ptr null
 ;
   %gep = getelementptr inbounds [0 x i8], ptr null, i64 %idx1, i64 %idx2
   ret ptr %gep
@@ -2019,5 +2017,38 @@ define ptr @gep_merge_nusw_const(ptr %p, i64 %idx, i64 %idx2) {
   ret ptr %gep
 }
 
+define <2 x ptr> @gep_inbounds_null_vec(i64 %idx) {
+; CHECK-LABEL: @gep_inbounds_null_vec(
+; CHECK-NEXT:    ret <2 x ptr> zeroinitializer
+;
+  %p = getelementptr inbounds i8, <2 x ptr> zeroinitializer, i64 %idx
+  ret <2 x ptr> %p
+}
+
+define <2 x ptr> @gep_inbounds_null_vec_broadcast(<2 x i64> %idx) {
+; CHECK-LABEL: @gep_inbounds_null_vec_broadcast(
+; CHECK-NEXT:    ret <2 x ptr> zeroinitializer
+;
+  %p = getelementptr inbounds i8, ptr null, <2 x i64> %idx
+  ret <2 x ptr> %p
+}
+
+define ptr @gep_noinbounds_null(i64 %idx) {
+; CHECK-LABEL: @gep_noinbounds_null(
+; CHECK-NEXT:    [[P:%.*]] = getelementptr i8, ptr null, i64 [[IDX:%.*]]
+; CHECK-NEXT:    ret ptr [[P]]
+;
+  %p = getelementptr i8, ptr null, i64 %idx
+  ret ptr %p
+}
+
+define ptr @gep_inbounds_null_null_is_valid(i64 %idx) null_pointer_is_valid {
+; CHECK-LABEL: @gep_inbounds_null_null_is_valid(
+; CHECK-NEXT:    [[P:%.*]] = getelementptr inbounds i8, ptr null, i64 [[IDX:%.*]]
+; CHECK-NEXT:    ret ptr [[P]]
+;
+  %p = getelementptr inbounds i8, ptr null, i64 %idx
+  ret ptr %p
+}
 
 !0 = !{!"branch_weights", i32 2, i32 10}
diff --git a/llvm/test/Transforms/InstCombine/store.ll b/llvm/test/Transforms/InstCombine/store.ll
index daa40da1828b5..48c63c6f24c72 100644
--- a/llvm/test/Transforms/InstCombine/store.ll
+++ b/llvm/test/Transforms/InstCombine/store.ll
@@ -49,8 +49,7 @@ define void @test2(ptr %P) {
 
 define void @store_at_gep_off_null_inbounds(i64 %offset) {
 ; CHECK-LABEL: @store_at_gep_off_null_inbounds(
-; CHECK-NEXT:    [[PTR:%.*]] = getelementptr inbounds i32, ptr null, i64 [[OFFSET:%.*]]
-; CHECK-NEXT:    store i32 poison, ptr [[PTR]], align 4
+; CHECK-NEXT:    store i32 poison, ptr null, align 4
 ; CHECK-NEXT:    ret void
 ;
   %ptr = getelementptr inbounds i32, ptr null, i64 %offset
diff --git a/llvm/test/Transforms/InstCombine/sub-gep.ll b/llvm/test/Transforms/InstCombine/sub-gep.ll
index 3f8728d3a4381..c86a1a37bd7ad 100644
--- a/llvm/test/Transforms/InstCombine/sub-gep.ll
+++ b/llvm/test/Transforms/InstCombine/sub-gep.ll
@@ -741,7 +741,7 @@ define i64 @nullptrtoint_scalable_c() {
 ; CHECK-NEXT:    ret i64 [[PTR_IDX]]
 ;
 entry:
-  %ptr = getelementptr inbounds <vscale x 4 x i32>, ptr null, i64 8
+  %ptr = getelementptr nusw <vscale x 4 x i32>, ptr null, i64 8
   %ret = ptrtoint ptr %ptr to i64
   ret i64 %ret
 }
@@ -755,7 +755,7 @@ define i64 @nullptrtoint_scalable_x(i64 %x) {
 ; CHECK-NEXT:    ret i64 [[PTR_IDX]]
 ;
 entry:
-  %ptr = getelementptr inbounds <vscale x 4 x i32>, ptr null, i64 %x
+  %ptr = getelementptr nusw <vscale x 4 x i32>, ptr null, i64 %x
   %ret = ptrtoint ptr %ptr to i64
   ret i64 %ret
 }

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the clang changes have landed.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 11, 2025

Failed Tests

Clang
Clang.CodeGenOpenCL/spir32_target.cl
Clang.CodeGenOpenCL/spir64_target.cl
Clang.CodeGenOpenCL/spirv_target.cl
LLVM
LLVM.CodeGen/Hexagon/64bit_tstbit.ll
LLVM.CodeGen/Hexagon/always-ext.ll
LLVM.CodeGen/Hexagon/autohvx/isel-concat-multiple.ll
LLVM.CodeGen/Hexagon/autohvx/isel-q-legalization-loop.ll
LLVM.CodeGen/Hexagon/autohvx/vector-align-terminator.ll
LLVM.CodeGen/Hexagon/autohvx/vector-align-use-in-different-block.ll
LLVM.CodeGen/Hexagon/reg-scavengebug-2.ll
LLVM.CodeGen/Hexagon/swp-const-tc1.ll
LLVM.CodeGen/PowerPC/loop-instr-form-non-inc.ll
LLVM.CodeGen/PowerPC/p10-spill-crgt.ll
LLVM.CodeGen/PowerPC/pr43527.ll
LLVM.CodeGen/PowerPC/sms-cpy-1.ll
LLVM.CodeGen/PowerPC/sms-phi.ll
LLVM.Transforms/LoopUnroll/PowerPC/p10-respect-unroll-pragma.ll

@dtcxzyw dtcxzyw changed the title [InstCombine] Fold getelementptr inbounds null, idx -> null [InstSimplify] Fold getelementptr inbounds null, idx -> null Apr 15, 2025
dtcxzyw added a commit that referenced this pull request Apr 17, 2025
dtcxzyw added a commit that referenced this pull request Apr 17, 2025
@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 30, 2025

If this is causing a lot of issues can we possibly revert it while the fixes make their way through review, and then re-merge it with the fixes?

Sorry, we cannot add a workaround for the SDL2 case. The base pointer is not a constant null, so that clang doesn't know if it is safe to set inbounds flag. The best solution is to replace the pointer addition with an integer addition.

@nikic
Copy link
Contributor

nikic commented Apr 30, 2025

If this is causing a lot of issues can we possibly revert it while the fixes make their way through review, and then re-merge it with the fixes?

@efriedma-quic @nikic I am ok to just add a ubsan check in clang 21. Then we can reland this patch in llvm 22/23.

I'm probably missing something, but don't we already have a ubsan check for this for a long time? https://clang.godbolt.org/z/vGr7qYYjr

@jyknight
Copy link
Member

Is there a real-world optimization benefit derived from this fold? Or is it just added because it's an obviously-correct simplification?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 30, 2025

If this is causing a lot of issues can we possibly revert it while the fixes make their way through review, and then re-merge it with the fixes?

@efriedma-quic @nikic I am ok to just add a ubsan check in clang 21. Then we can reland this patch in llvm 22/23.

I'm probably missing something, but don't we already have a ubsan check for this for a long time? https://clang.godbolt.org/z/vGr7qYYjr

We have a check for pointer arithmetic, but not for member access.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 30, 2025

Is there a real-world optimization benefit derived from this fold? Or is it just added because it's an obviously-correct simplification?

See #130734 (comment) and https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2204/files. This patch is useful to eliminate unreachable code introduced by passes like JumpThreading.

@nikic
Copy link
Contributor

nikic commented Apr 30, 2025

If this is causing a lot of issues can we possibly revert it while the fixes make their way through review, and then re-merge it with the fixes?

@efriedma-quic @nikic I am ok to just add a ubsan check in clang 21. Then we can reland this patch in llvm 22/23.

I'm probably missing something, but don't we already have a ubsan check for this for a long time? https://clang.godbolt.org/z/vGr7qYYjr

We have a check for pointer arithmetic, but not for member access.

Right, but are any of the problematic cases (beyond the ones we already catch directly in clang) related to member access?

@efriedma-quic
Copy link
Collaborator

xalancbmk is a member access issue. The glibc issue and the sdl2 issue are pointer arithmetic. I don't think I've seen any other issues reported.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 30, 2025

xalancbmk is a member access issue. The glibc issue and the sdl2 issue are pointer arithmetic. I don't think I've seen any other issues reported.

Right. I only checked the member access issue before drafting this patch (hwloc/slurm/luajit/llvm). They have been covered by #130734 and #130952.

@DavidTruby
Copy link
Member

I didn't realise the recent xalancbmk failures we've been seeing are the same issue here (I hadn't got around to looking into that yet!)
Given the variety of failures I'd like to push a bit harder for a revert until fixes can be found for these failures

@dtcxzyw
Copy link
Member Author

dtcxzyw commented May 1, 2025

We've found that this also breaks SDL2, specifically somewhere in SDL_render_gles2.c. Unfortunately I haven't been able to narrow down the failure yet, I didn't see any obvious use of calculating offsets from nullptr or anything like that. The fix in #137851 didn't work for it.

FYI the fix has been upstreamed: libsdl-org/SDL@14deef9

@tblah
Copy link
Contributor

tblah commented May 1, 2025

I also think this should be reverted until there is a fix. It will take years until some distributions stop shipping broken versions of glibc.

@nikic
Copy link
Contributor

nikic commented May 1, 2025

I'm ok with reverting this until the __PTR_ALIGN workarounds in clang have landed. We should reapply directly after that though, to ensure more of these issues get caught as soon as possible.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented May 1, 2025

I am preparing for a revert patch.

dtcxzyw added a commit to dtcxzyw/llvm-project that referenced this pull request May 1, 2025
efriedma-quic pushed a commit that referenced this pull request May 1, 2025
#130742)" (#138168)

Revert #130742 for now to avoid breaking glibc failures until the
workaround patches are landed.
dschuff added a commit to dschuff/emscripten that referenced this pull request May 1, 2025
This works around undefined behavior that is now being exploited by
clang.
See llvm/llvm-project#130742
dschuff added a commit to dschuff/emscripten that referenced this pull request May 1, 2025
This works around undefined behavior that is now being exploited by
clang.
See llvm/llvm-project#130742
dschuff added a commit to emscripten-core/emscripten that referenced this pull request May 2, 2025
This works around undefined behavior that is now being exploited by
clang.
See llvm/llvm-project#130742
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
llvm#130742)" (llvm#138168)

Revert llvm#130742 for now to avoid breaking glibc failures until the
workaround patches are landed.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
llvm#130742)" (llvm#138168)

Revert llvm#130742 for now to avoid breaking glibc failures until the
workaround patches are landed.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
llvm#130742)" (llvm#138168)

Revert llvm#130742 for now to avoid breaking glibc failures until the
workaround patches are landed.
delcypher added a commit to delcypher/apple-llvm-project that referenced this pull request May 6, 2025
This commit (and its corresponding merge commit) broke several codegen
tests.

```
commit 4b59b95
Merge: c5ca745 c37b254
Author: git apple-llvm automerger <am@git-apple-llvm>
Date:   Fri May 2 23:22:29 2025 -0700

    Merge commit 'c37b2549ff01' from llvm.org/main into next

commit c37b254
error: cannot run gpg: No such file or directory
Author: Yingwei Zheng <[email protected]>
Date:   Fri May 2 05:21:59 2025 +0800

    Revert "[InstSimplify] Fold getelementptr inbounds null, idx -> null (llvm#130742)" (llvm#138168)

    Revert llvm#130742 for now to avoid breaking glibc failures until the
    workaround patches are landed.
```

This commit re-generates the failing codegen tests.

rdar://150697626
delcypher added a commit to swiftlang/llvm-project that referenced this pull request May 6, 2025
This commit (and its corresponding merge commit) broke several codegen
tests.

```
commit 4b59b95
Merge: c5ca745 c37b254
Author: git apple-llvm automerger <am@git-apple-llvm>
Date:   Fri May 2 23:22:29 2025 -0700

    Merge commit 'c37b2549ff01' from llvm.org/main into next

commit c37b254
error: cannot run gpg: No such file or directory
Author: Yingwei Zheng <[email protected]>
Date:   Fri May 2 05:21:59 2025 +0800

    Revert "[InstSimplify] Fold getelementptr inbounds null, idx -> null (llvm#130742)" (llvm#138168)

    Revert llvm#130742 for now to avoid breaking glibc failures until the
    workaround patches are landed.
```

This commit re-generates the failing codegen tests.

rdar://150697626
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
llvm#130742)" (llvm#138168)

Revert llvm#130742 for now to avoid breaking glibc failures until the
workaround patches are landed.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
llvm#130742)" (llvm#138168)

Revert llvm#130742 for now to avoid breaking glibc failures until the
workaround patches are landed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.