[RISCV] Use QC.INSBI for OR with immediate when ORI isn't possible #147349
When the immediate to the ORI is a ShiftedMask_32 that does not fit in signed 12 bits, we can use the QC.INSBI instruction instead. We do not do this for cases where the ORI can be replaced with a BSETI, since these can be compressed when the Xqcibm extension (which QC.INSBI is a part of) is enabled.
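The decision described above can be sketched outside of LLVM as follows. This is a minimal illustration, not the patch itself: `isShiftedMask32`, `fitsInSigned12`, the two bit-counting helpers, `OrLowering`, `OrChoice`, and `chooseOrLowering` are hypothetical stand-ins written for this example (the patch uses LLVM's `isShiftedMask_32`, `isInt<12>`, `countl_zero`, and `countr_zero`), and the two `bool` parameters stand in for the subtarget feature queries.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: true if V is one non-empty run of contiguous 1 bits.
constexpr bool isShiftedMask32(uint32_t V) {
  if (V == 0)
    return false;
  const uint32_t Filled = (V - 1) | V; // fill the zeros below the lowest set bit
  return ((Filled + 1) & Filled) == 0; // what remains must be a low-bit mask
}

// Hypothetical stand-in for a signed 12-bit immediate range check.
constexpr bool fitsInSigned12(int32_t V) { return V >= -2048 && V <= 2047; }

constexpr unsigned countTrailingZeros32(uint32_t V) {
  unsigned N = 0;
  while (N < 32 && ((V >> N) & 1u) == 0)
    ++N;
  return N;
}

constexpr unsigned countLeadingZeros32(uint32_t V) {
  unsigned N = 0;
  while (N < 32 && ((V << N) & 0x80000000u) == 0)
    ++N;
  return N;
}

// Illustrative result type; the names are assumptions made for this sketch.
enum class OrLowering { Default, LeftForBSETI, QC_INSBI };

struct OrChoice {
  OrLowering Kind;
  unsigned Width; // number of bits to set (meaningful for QC_INSBI)
  unsigned Shift; // position of the lowest set bit
};

constexpr OrChoice chooseOrLowering(int32_t C1, bool HasXqcibm, bool HasZbs) {
  const uint32_t Bits = static_cast<uint32_t>(C1);
  // Not a contiguous mask, or already encodable as an ORI immediate:
  // leave the node to the existing selection logic.
  if (!HasXqcibm || !isShiftedMask32(Bits) || fitsInSigned12(C1))
    return {OrLowering::Default, 0, 0};

  const unsigned Leading = countLeadingZeros32(Bits);
  const unsigned Trailing = countTrailingZeros32(Bits);

  // A single set bit with Zbs available is left alone so BSETI can be chosen.
  if (Leading + Trailing == 31 && HasZbs)
    return {OrLowering::LeftForBSETI, 1, Trailing};

  return {OrLowering::QC_INSBI, 32 - Leading - Trailing, Trailing};
}
```

For the constant 61440 (0xF000) with Xqcibm enabled, this sketch yields a width of 4 and a shift of 12, which matches the `qc.insbi a0, -1, 4, 12` line expected by the new test.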
@llvm/pr-subscribers-backend-risc-v
Author: Sudharsan Veeravalli (svs-quic)
Co-authored by: Albert Yosher
Full diff: https://github.com/llvm/llvm-project/pull/147349.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 6298c7d5e9ef5..195e264582673 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -1298,7 +1298,42 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
ReplaceNode(Node, SRAI);
return;
}
- case ISD::OR:
+ case ISD::OR: {
+ if (Subtarget->hasVendorXqcibm()) {
+ auto *N1C = dyn_cast<ConstantSDNode>(Node->getOperand(1));
+ if (!N1C)
+ break;
+
+ int32_t C1 = N1C->getSExtValue();
+ // If C1 is a shifted mask (but can't be formed as an ORI),
+ // use a bitfield insert of -1.
+ // Transform (or x, C1)
+ // -> (qc.insbi x, width, shift)
+ if (isShiftedMask_32(C1) && !isInt<12>(C1)) {
+ const unsigned Leading = llvm::countl_zero((uint32_t)C1);
+ const unsigned Trailing = llvm::countr_zero((uint32_t)C1);
+
+ // If Zbs is enabled and it is a single bit set we can use BSETI which
+ // can be compressed to C_BSETI when Xqcibm is enabled.
+ if ((Leading + Trailing == 31) && Subtarget->hasStdExtZbs())
+ break;
+
+ const unsigned Width = 32 - Leading - Trailing;
+ SmallVector<SDValue, 4> Ops = {
+ CurDAG->getSignedTargetConstant(-1, DL, VT),
+ CurDAG->getTargetConstant(Width, DL, VT),
+ CurDAG->getTargetConstant(Trailing, DL, VT)};
+ SDNode *BitIns = CurDAG->getMachineNode(RISCV::QC_INSBI, DL, VT, Ops);
+ ReplaceNode(Node, BitIns);
+ return;
+ }
+ }
+
+ if (tryShrinkShlLogicImm(Node))
+ return;
+
+ break;
+ }
case ISD::XOR:
if (tryShrinkShlLogicImm(Node))
return;
diff --git a/llvm/test/CodeGen/RISCV/xqcibm-cto-clo-brev.ll b/llvm/test/CodeGen/RISCV/xqcibm-cto-clo-brev.ll
index 691c5bec7fb51..f227fa9aa423d 100644
--- a/llvm/test/CodeGen/RISCV/xqcibm-cto-clo-brev.ll
+++ b/llvm/test/CodeGen/RISCV/xqcibm-cto-clo-brev.ll
@@ -105,8 +105,7 @@ define i16 @test_cttz_i16(i16 %a) nounwind {
;
; RV32ZBBXQCIBM-LABEL: test_cttz_i16:
; RV32ZBBXQCIBM: # %bb.0:
-; RV32ZBBXQCIBM-NEXT: lui a1, 16
-; RV32ZBBXQCIBM-NEXT: orn a0, a1, a0
+; RV32ZBBXQCIBM-NEXT: qc.insbi a0, -1, 1, 16
; RV32ZBBXQCIBM-NEXT: ctz a0, a0
; RV32ZBBXQCIBM-NEXT: ret
%1 = xor i16 %a, -1
diff --git a/llvm/test/CodeGen/RISCV/xqcibm-insert.ll b/llvm/test/CodeGen/RISCV/xqcibm-insert.ll
new file mode 100644
index 0000000000000..6b7f9ae856625
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/xqcibm-insert.ll
@@ -0,0 +1,88 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefixes=RV32I
+; RUN: llc -mtriple=riscv32 -mattr=+experimental-xqcibm -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefixes=RV32IXQCIBM
+; RUN: llc -mtriple=riscv32 -mattr=+experimental-xqcibm,+zbs -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefixes=RV32IXQCIBMZBS
+
+
+define i32 @test_ori(i32 %a) nounwind {
+; RV32I-LABEL: test_ori:
+; RV32I: # %bb.0:
+; RV32I-NEXT: ori a0, a0, 511
+; RV32I-NEXT: ret
+;
+; RV32IXQCIBM-LABEL: test_ori:
+; RV32IXQCIBM: # %bb.0:
+; RV32IXQCIBM-NEXT: ori a0, a0, 511
+; RV32IXQCIBM-NEXT: ret
+;
+; RV32IXQCIBMZBS-LABEL: test_ori:
+; RV32IXQCIBMZBS: # %bb.0:
+; RV32IXQCIBMZBS-NEXT: ori a0, a0, 511
+; RV32IXQCIBMZBS-NEXT: ret
+ %or = or i32 %a, 511
+ ret i32 %or
+}
+
+define i32 @test_insbi_mask(i32 %a) nounwind {
+; RV32I-LABEL: test_insbi_mask:
+; RV32I: # %bb.0:
+; RV32I-NEXT: lui a1, 16
+; RV32I-NEXT: addi a1, a1, -1
+; RV32I-NEXT: or a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV32IXQCIBM-LABEL: test_insbi_mask:
+; RV32IXQCIBM: # %bb.0:
+; RV32IXQCIBM-NEXT: qc.insbi a0, -1, 16, 0
+; RV32IXQCIBM-NEXT: ret
+;
+; RV32IXQCIBMZBS-LABEL: test_insbi_mask:
+; RV32IXQCIBMZBS: # %bb.0:
+; RV32IXQCIBMZBS-NEXT: qc.insbi a0, -1, 16, 0
+; RV32IXQCIBMZBS-NEXT: ret
+ %or = or i32 %a, 65535
+ ret i32 %or
+}
+
+define i32 @test_insbi_shifted_mask(i32 %a) nounwind {
+; RV32I-LABEL: test_insbi_shifted_mask:
+; RV32I: # %bb.0:
+; RV32I-NEXT: lui a1, 15
+; RV32I-NEXT: or a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV32IXQCIBM-LABEL: test_insbi_shifted_mask:
+; RV32IXQCIBM: # %bb.0:
+; RV32IXQCIBM-NEXT: qc.insbi a0, -1, 4, 12
+; RV32IXQCIBM-NEXT: ret
+;
+; RV32IXQCIBMZBS-LABEL: test_insbi_shifted_mask:
+; RV32IXQCIBMZBS: # %bb.0:
+; RV32IXQCIBMZBS-NEXT: qc.insbi a0, -1, 4, 12
+; RV32IXQCIBMZBS-NEXT: ret
+ %or = or i32 %a, 61440
+ ret i32 %or
+}
+
+define i32 @test_single_bit_set(i32 %a) nounwind {
+; RV32I-LABEL: test_single_bit_set:
+; RV32I: # %bb.0:
+; RV32I-NEXT: lui a1, 1
+; RV32I-NEXT: or a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV32IXQCIBM-LABEL: test_single_bit_set:
+; RV32IXQCIBM: # %bb.0:
+; RV32IXQCIBM-NEXT: qc.insbi a0, -1, 1, 12
+; RV32IXQCIBM-NEXT: ret
+;
+; RV32IXQCIBMZBS-LABEL: test_single_bit_set:
+; RV32IXQCIBMZBS: # %bb.0:
+; RV32IXQCIBMZBS-NEXT: bseti a0, a0, 12
+; RV32IXQCIBMZBS-NEXT: ret
+ %or = or i32 %a, 4096
+ ret i32 %or
+}
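To see why the expected `qc.insbi` operands in these tests are equivalent to the original `or`, the following sketch models a bitfield insert in plain C++. It assumes, based on the patch comment "use a bitfield insert of -1", that the instruction overwrites `Width` bits of the destination starting at bit `Shift` with the low bits of the immediate; `modelInsertBits` and `checkPatchExamples` are hypothetical names for this illustration, not LLVM or ISA definitions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of a bitfield insert: replace Width bits of Rd, starting at
// bit Shift, with the low Width bits of Imm. Requires Shift < 32 and
// Width + Shift <= 32.
constexpr uint32_t modelInsertBits(uint32_t Rd, int32_t Imm, unsigned Width,
                                   unsigned Shift) {
  const uint32_t LowMask = Width >= 32 ? 0xFFFFFFFFu : ((1u << Width) - 1u);
  const uint32_t FieldMask = LowMask << Shift;
  const uint32_t Field = (static_cast<uint32_t>(Imm) << Shift) & FieldMask;
  return (Rd & ~FieldMask) | Field;
}

// Checks the (width, shift) pairs from the test expectations against the
// OR constants used in the corresponding IR.
inline void checkPatchExamples(uint32_t A) {
  assert(modelInsertBits(A, -1, 16, 0) == (A | 65535u)); // test_insbi_mask
  assert(modelInsertBits(A, -1, 4, 12) == (A | 61440u)); // test_insbi_shifted_mask
  assert(modelInsertBits(A, -1, 1, 12) == (A | 4096u));  // test_single_bit_set
  assert(modelInsertBits(A, -1, 1, 16) == (A | 65536u)); // test_cttz_i16
}
```

Inserting all ones into a field forces every bit of that field to 1 and leaves the rest of the register untouched, which is exactly what OR with a contiguous mask does. That is why a single instruction can replace the `lui`/`addi`/`or` sequences in the RV32I output.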
LGTM
Title doesn't really make sense. If the immediate doesn't fit in signed 12 bits, it can't use ORI. Maybe "Use QC.INSBI for Or with immediate when ORI isn't possible"?
Thanks, I have changed the title accordingly.
LGTM
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/22865