Skip to content

[VPlan] Fix miscompile after PR #142433. #147398

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 8, 2025
Merged

Conversation

jyknight
Copy link
Member

@jyknight jyknight commented Jul 7, 2025

This fixes a bug introduced by aa24029, "[VPlan] Unroll VPReplicateRecipe by VF", which cloned a VPReplicateRecipe without transferring the flags from the original.

That can cause incorrect nsw/nuw flags to be emitted on the new instructions, which may result in miscompiles.

It turns out there were no test-cases in the repo which end up hitting the situation where the recipe requires instruction clones to have different flags from the underlying instruction. The existing tests covered the flags being correct when the replacement instruction is a vectorized version of the initial instruction, but not when it required clones. A new test is added covering this.

Commit aa24029, "[VPlan] Unroll VPReplicateRecipe by VF" cloned a VPReplicateRecipe without transferring the flags from the original. This caused incorrect flags to be emitted on the new instructions, which resulted later passes introducing miscompiles.
@llvmbot
Copy link
Member

llvmbot commented Jul 7, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-vectorizers

Author: James Y Knight (jyknight)

Changes

This fixes aa24029, "[VPlan] Unroll VPReplicateRecipe by VF", which cloned a VPReplicateRecipe without transferring the flags from the original.

That can cause incorrect nsw/nuw flags to be emitted on the new instructions, which may result in miscompiles.

It turns out there are currently zero test-cases in the repo which end up hitting the case where the recipe requires the clones to have different flags from the underlying instruction when cloning here, so I've also added a new test case minimized from the production code that was broken by this. Probably more extensive tests are needed in this area.


Full diff: https://github.com/llvm/llvm-project/pull/147398.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp (+1)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
index 2dd43c092ff7a..b89cd21595efd 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
@@ -486,6 +486,7 @@ static VPReplicateRecipe *cloneForLane(VPlan &Plan, VPBuilder &Builder,
   auto *New =
       new VPReplicateRecipe(RepR->getUnderlyingInstr(), NewOps,
                             /*IsSingleScalar=*/true, /*Mask=*/nullptr, *RepR);
+  New->transferFlags(*RepR);
   New->insertBefore(RepR);
   return New;
 }

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-linux-gnu"

define i64 @test(ptr %arg2, i64 %dim) #0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add the test to llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll which already has coverage for dropping poison generating flags?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon reviewing this test file, I found that my test case is almost the same as drop_vector_nuw_nsw -- except that I'm building with AVX instead of AVX512. I've replaced my test with a copy of that test, but built using AVX.

Copy link
Member Author

@jyknight jyknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL, thanks!

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-linux-gnu"

define i64 @test(ptr %arg2, i64 %dim) #0 {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon reviewing this test file, I found that my test case is almost the same as drop_vector_nuw_nsw -- except that I'm building with AVX instead of AVX512. I've replaced my test with a copy of that test, but built using AVX.

@jyknight
Copy link
Member Author

jyknight commented Jul 8, 2025

Given that the comments are only on the test, and this is fixing a regression, I'm going to go ahead and merge after having resolved the comments as far as I can tell. If you have any additional comments, I'll happily fix with a follow-up.

@jyknight jyknight merged commit 093afed into llvm:main Jul 8, 2025
9 checks passed
@jyknight jyknight deleted the fix-vpreplicate branch July 8, 2025 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants