Skip to content

Conversation

DharuniRAcharya
Copy link
Contributor

@DharuniRAcharya DharuniRAcharya commented Oct 9, 2025

This patch adds LLVM infrastructure to support pretty printing of the intrinsic arguments.
The motivation is to improve the readability of LLVM intrinsics and facilitate easy
modifications and debugging of LLVM IR.

This feature adds a property ArgInfo<ArgIndex, "argName", "functionName"> to the
intrinsic arguments to print self-explanatory inline comments for the arguments.

The addition of pretty print support can provide a simple, low-overhead feature that
enhances the usability of LLVM intrinsics without disrupting existing workflows.

Link to the RFC, where this feature was discussed:
https://discourse.llvm.org/t/rfc-pretty-printing-immediate-arguments-in-llvm-intrinsics/88536

@llvmbot
Copy link
Member

llvmbot commented Oct 9, 2025

@llvm/pr-subscribers-backend-nvptx

@llvm/pr-subscribers-llvm-ir

Author: Dharuni R Acharya (DharuniRAcharya)

Changes

This patch adds LLVM infrastructure to support pretty printing of immediate arguments of the intrinsics. The motivation is to increase the readability of LLVM intrinsics and facilitate easy modifications and debugging of LLVM IR. This adds a property PrettyPrintImmArg&lt;ArgIndex, "functionName"&gt; to the immediate arguments of the intrinsic that enables printing self-explanatory inline comment for the immediate arguments. The addition of pretty print support can provide a simple, low-overhead feature that enhances usability of LLVM intrinsics without disrupting existing workflows.


Patch is 40.68 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/162629.diff

13 Files Affected:

  • (modified) llvm/include/llvm/IR/Function.h (+2-1)
  • (modified) llvm/include/llvm/IR/Intrinsics.h (+7)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+7)
  • (modified) llvm/include/llvm/IR/IntrinsicsNVVM.td (+15-1)
  • (modified) llvm/include/llvm/IR/Module.h (+2-1)
  • (modified) llvm/include/llvm/IR/NVVMIntrinsicUtils.h (+63)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+46-12)
  • (modified) llvm/lib/IR/Intrinsics.cpp (+11)
  • (modified) llvm/test/CodeGen/NVPTX/tcgen05-mma.ll (+32)
  • (modified) llvm/tools/llvm-dis/llvm-dis.cpp (+6-1)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.cpp (+21)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.h (+5)
  • (modified) llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp (+69)
diff --git a/llvm/include/llvm/IR/Function.h b/llvm/include/llvm/IR/Function.h
index d3497716ca844..ee303618515fb 100644
--- a/llvm/include/llvm/IR/Function.h
+++ b/llvm/include/llvm/IR/Function.h
@@ -930,7 +930,8 @@ class LLVM_ABI Function : public GlobalObject, public ilist_node<Function> {
   /// AssemblyAnnotationWriter.
   void print(raw_ostream &OS, AssemblyAnnotationWriter *AAW = nullptr,
              bool ShouldPreserveUseListOrder = false,
-             bool IsForDebug = false) const;
+             bool IsForDebug = false, 
+             bool PrettyPrintIntrinsicArgs = false) const;
 
   /// viewCFG - This function is meant for use from the debugger.  You can just
   /// say 'call F->viewCFG()' and a ghostview window should pop up from the
diff --git a/llvm/include/llvm/IR/Intrinsics.h b/llvm/include/llvm/IR/Intrinsics.h
index 9577d0141f168..b197bf5340b15 100644
--- a/llvm/include/llvm/IR/Intrinsics.h
+++ b/llvm/include/llvm/IR/Intrinsics.h
@@ -30,6 +30,8 @@ class LLVMContext;
 class Module;
 class AttributeList;
 class AttributeSet;
+class raw_ostream;
+class Constant;
 
 /// This namespace contains an enum with a value for every intrinsic/builtin
 /// function known by LLVM. The enum values are returned by
@@ -81,6 +83,9 @@ namespace Intrinsic {
   /// Returns true if the intrinsic can be overloaded.
   LLVM_ABI bool isOverloaded(ID id);
 
+  /// Returns true if the intrinsic has pretty printed immediate arguments.
+  LLVM_ABI bool hasPrettyPrintedArgs(ID id);
+
   /// isTargetIntrinsic - Returns true if IID is an intrinsic specific to a
   /// certain target. If it is a generic intrinsic false is returned.
   LLVM_ABI bool isTargetIntrinsic(ID IID);
@@ -284,6 +289,8 @@ namespace Intrinsic {
   /// N.
   LLVM_ABI Intrinsic::ID getDeinterleaveIntrinsicID(unsigned Factor);
 
+  LLVM_ABI void printImmArg(ID IID, unsigned ArgIdx, raw_ostream &OS, const Constant *ImmArgVal);
+
   } // namespace Intrinsic
 
   } // namespace llvm
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 96da698538314..f56a7603f4963 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -87,6 +87,13 @@ class NoUndef<AttrIndex idx> : IntrinsicProperty {
   int ArgNo = idx.Value;
 }
 
+// PrettyPrintImmArg - The specified immediate argument has a custom pretty-print
+// function for diagnostic output.
+class PrettyPrintImmArg<AttrIndex idx, string funcname> : IntrinsicProperty {
+  int ArgNo = idx.Value;
+  string FunctionName = funcname;
+}
+
 // NonNull - The return value or specified argument is not null.
 class NonNull<AttrIndex idx> : IntrinsicProperty {
   int ArgNo = idx.Value;
diff --git a/llvm/include/llvm/IR/IntrinsicsNVVM.td b/llvm/include/llvm/IR/IntrinsicsNVVM.td
index 3af1750ffcf3f..6a132de3413bb 100644
--- a/llvm/include/llvm/IR/IntrinsicsNVVM.td
+++ b/llvm/include/llvm/IR/IntrinsicsNVVM.td
@@ -2871,7 +2871,14 @@ foreach sp = [0, 1] in {
         defvar nargs = !size(args);
         defvar scale_d_imm = ArgIndex<!sub(nargs, 1)>;
         defvar scale_d_imm_range = [ImmArg<scale_d_imm>, Range<scale_d_imm, 0, 16>];
-        defvar intrinsic_properties = !listconcat(
+
+        // Check if this is the specific llvm.nvvm.tcgen05.mma.tensor intrinsic
+        defvar is_target_intrinsic = !and(!eq(sp, 0), 
+                                          !eq(space, "tensor"), 
+                                          !eq(scale_d, 0), 
+                                          !eq(ashift, 0));
+
+        defvar base_properties = !listconcat(
           mma.common_intr_props,
           !if(!eq(scale_d, 1), scale_d_imm_range, []),
           [Range<ArgIndex<nargs>, 0, !if(!eq(scale_d, 1), 2, 4)>, // kind
@@ -2881,6 +2888,13 @@ foreach sp = [0, 1] in {
           ]
         );
 
+        defvar intrinsic_properties = !if(is_target_intrinsic, 
+          !listconcat(base_properties,
+            [PrettyPrintImmArg<ArgIndex<nargs>, "printTcgen05MMAKind">,
+             PrettyPrintImmArg<ArgIndex<!add(nargs, 1)>, "printCTAGroupKind">,
+             PrettyPrintImmArg<ArgIndex<!add(nargs, 2)>, "printTcgen05CollectorUsageOp">]),
+          base_properties);
+
         def mma.record:
               DefaultAttrsIntrinsicFlags<[], args, flags, intrinsic_properties,
                 mma.intr>;
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h
index a99937a90cbb7..fe5c1308a190a 100644
--- a/llvm/include/llvm/IR/Module.h
+++ b/llvm/include/llvm/IR/Module.h
@@ -866,7 +866,8 @@ class LLVM_ABI Module {
   /// the assembly.
   void print(raw_ostream &OS, AssemblyAnnotationWriter *AAW,
              bool ShouldPreserveUseListOrder = false,
-             bool IsForDebug = false) const;
+             bool IsForDebug = false, 
+             bool PrettyPrintIntrinsicArgs = false) const;
 
   /// Dump the module to stderr (for debugging).
   void dump() const;
diff --git a/llvm/include/llvm/IR/NVVMIntrinsicUtils.h b/llvm/include/llvm/IR/NVVMIntrinsicUtils.h
index d55100e5e709d..2ce1461f7f3c2 100644
--- a/llvm/include/llvm/IR/NVVMIntrinsicUtils.h
+++ b/llvm/include/llvm/IR/NVVMIntrinsicUtils.h
@@ -18,6 +18,9 @@
 #include <stdint.h>
 
 #include "llvm/ADT/APFloat.h"
+#include "llvm/ADT/APInt.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/Support/raw_ostream.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/IntrinsicsNVPTX.h"
 
@@ -659,6 +662,66 @@ inline APFloat::roundingMode GetFMARoundingMode(Intrinsic::ID IntrinsicID) {
   llvm_unreachable("Invalid FP instrinsic rounding mode for NVVM fma");
 }
 
+inline void printTcgen05MMAKind(raw_ostream &OS, const Constant *ImmArgVal) {
+  if (const ConstantInt *CI = dyn_cast<ConstantInt>(ImmArgVal)) {
+    uint64_t Val = CI->getZExtValue();
+    switch (static_cast<Tcgen05MMAKind>(Val)) {
+    case Tcgen05MMAKind::F16:
+      OS << "kind::f16";
+      return;
+    case Tcgen05MMAKind::TF32:
+      OS << "kind::tf32";
+      return;
+    case Tcgen05MMAKind::F8F6F4:
+      OS << "kind::f8f6f4";
+      return;
+    case Tcgen05MMAKind::I8:
+      OS << "kind::i8";
+      return;
+    }
+  }
+  llvm_unreachable("printTcgen05MMAKind called with invalid value for immediate argument");
+}
+
+inline void printTcgen05CollectorUsageOp(raw_ostream &OS, const Constant *ImmArgVal) {
+  if (const ConstantInt *CI = dyn_cast<ConstantInt>(ImmArgVal)) {
+    uint64_t Val = CI->getZExtValue();
+    switch (static_cast<Tcgen05CollectorUsageOp>(Val)) {
+    case Tcgen05CollectorUsageOp::DISCARD:
+      OS << "collector::a::discard";
+      return;
+    case Tcgen05CollectorUsageOp::LASTUSE:
+      OS << "collector::a::lastuse";
+      return;
+    case Tcgen05CollectorUsageOp::FILL:
+      OS << "collector::a::fill";
+      return;
+    case Tcgen05CollectorUsageOp::USE:
+      OS << "collector::a::use";
+      return;
+    }
+  }
+  llvm_unreachable("printTcgen05CollectorUsageOp called with invalid value for immediate argument");
+}
+
+inline void printCTAGroupKind(raw_ostream &OS, const Constant *ImmArgVal) {
+  if (const ConstantInt *CI = dyn_cast<ConstantInt>(ImmArgVal)) {
+    uint64_t Val = CI->getZExtValue();
+    switch (static_cast<CTAGroupKind>(Val)) {
+    case CTAGroupKind::CG_NONE: 
+      OS << "cta_group::0";
+      return;
+    case CTAGroupKind::CG_1: 
+      OS << "cta_group::1";
+      return;
+    case CTAGroupKind::CG_2: 
+      OS << "cta_group::2";
+      return;
+    }
+  }
+  llvm_unreachable("printCTAGroupKind called with invalid value for immediate argument");
+}
+
 } // namespace nvvm
 } // namespace llvm
 #endif // LLVM_IR_NVVMINTRINSICUTILS_H
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index ae086bcd3902d..c859cf980ef66 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -53,6 +53,7 @@
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
@@ -2831,6 +2832,7 @@ class AssemblyWriter {
   SetVector<const Comdat *> Comdats;
   bool IsForDebug;
   bool ShouldPreserveUseListOrder;
+  bool PrettyPrintIntrinsicArgs;
   UseListOrderMap UseListOrders;
   SmallVector<StringRef, 8> MDNames;
   /// Synchronization scope names registered with LLVMContext.
@@ -2841,7 +2843,8 @@ class AssemblyWriter {
   /// Construct an AssemblyWriter with an external SlotTracker
   AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac, const Module *M,
                  AssemblyAnnotationWriter *AAW, bool IsForDebug,
-                 bool ShouldPreserveUseListOrder = false);
+                 bool ShouldPreserveUseListOrder = false,
+                 bool PrettyPrintIntrinsicArgs = false);
 
   AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
                  const ModuleSummaryIndex *Index, bool IsForDebug);
@@ -2930,10 +2933,12 @@ class AssemblyWriter {
 
 AssemblyWriter::AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
                                const Module *M, AssemblyAnnotationWriter *AAW,
-                               bool IsForDebug, bool ShouldPreserveUseListOrder)
+                               bool IsForDebug, bool ShouldPreserveUseListOrder,
+                               bool PrettyPrintIntrinsicArgs)
     : Out(o), TheModule(M), Machine(Mac), TypePrinter(M), AnnotationWriter(AAW),
       IsForDebug(IsForDebug),
-      ShouldPreserveUseListOrder(ShouldPreserveUseListOrder) {
+      ShouldPreserveUseListOrder(ShouldPreserveUseListOrder),
+      PrettyPrintIntrinsicArgs(PrettyPrintIntrinsicArgs) {
   if (!TheModule)
     return;
   for (const GlobalObject &GO : TheModule->global_objects())
@@ -2944,7 +2949,8 @@ AssemblyWriter::AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
 AssemblyWriter::AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
                                const ModuleSummaryIndex *Index, bool IsForDebug)
     : Out(o), TheIndex(Index), Machine(Mac), TypePrinter(/*Module=*/nullptr),
-      IsForDebug(IsForDebug), ShouldPreserveUseListOrder(false) {}
+      IsForDebug(IsForDebug), ShouldPreserveUseListOrder(false),
+      PrettyPrintIntrinsicArgs(false) {}
 
 void AssemblyWriter::writeOperand(const Value *Operand, bool PrintType) {
   if (!Operand) {
@@ -4561,12 +4567,38 @@ void AssemblyWriter::printInstruction(const Instruction &I) {
     Out << ' ';
     writeOperand(Operand, false);
     Out << '(';
+    bool HasPrettyPrintedArgs = PrettyPrintIntrinsicArgs && isa<IntrinsicInst>(CI) && 
+                              Intrinsic::hasPrettyPrintedArgs(CI->getIntrinsicID());
+
     ListSeparator LS;
-    for (unsigned op = 0, Eop = CI->arg_size(); op < Eop; ++op) {
-      Out << LS;
-      writeParamOperand(CI->getArgOperand(op), PAL.getParamAttrs(op));
-    }
+    if (HasPrettyPrintedArgs) {
+      Function *CalledFunc = CI->getCalledFunction();
+      auto PrintArgComment = [&](unsigned ArgNo) {
+        if (!CalledFunc->hasParamAttribute(ArgNo, Attribute::ImmArg))
+          return;
+        const Constant *ConstArg = dyn_cast<Constant>(CI->getArgOperand(ArgNo));
+        if (!ConstArg)
+          return;
+        std::string ArgComment;
+        raw_string_ostream ArgCommentStream(ArgComment);
+        Intrinsic::ID IID = CalledFunc->getIntrinsicID();
+        Intrinsic::printImmArg(IID, ArgNo, ArgCommentStream, ConstArg);
+        if (ArgComment.empty())
+          return;
+        Out << "/* " << ArgComment << " */ ";
+      };
 
+      for (unsigned ArgNo = 0, NumArgs = CI->arg_size(); ArgNo < NumArgs; ++ArgNo) {
+        Out << LS;
+        PrintArgComment(ArgNo);
+        writeParamOperand(CI->getArgOperand(ArgNo), PAL.getParamAttrs(ArgNo));
+      }
+    } else {
+      for (unsigned op = 0, Eop = CI->arg_size(); op < Eop; ++op) {
+        Out << LS;
+        writeParamOperand(CI->getArgOperand(op), PAL.getParamAttrs(op));
+      }
+    } 
     // Emit an ellipsis if this is a musttail call in a vararg function.  This
     // is only to aid readability, musttail calls forward varargs by default.
     if (CI->isMustTailCall() && CI->getParent() &&
@@ -4991,12 +5023,14 @@ void AssemblyWriter::printUseLists(const Function *F) {
 
 void Function::print(raw_ostream &ROS, AssemblyAnnotationWriter *AAW,
                      bool ShouldPreserveUseListOrder,
-                     bool IsForDebug) const {
+                     bool IsForDebug,
+                     bool PrettyPrintIntrinsicArgs) const {
   SlotTracker SlotTable(this->getParent());
   formatted_raw_ostream OS(ROS);
   AssemblyWriter W(OS, SlotTable, this->getParent(), AAW,
                    IsForDebug,
-                   ShouldPreserveUseListOrder);
+                   ShouldPreserveUseListOrder,
+                   PrettyPrintIntrinsicArgs);
   W.printFunction(this);
 }
 
@@ -5012,11 +5046,11 @@ void BasicBlock::print(raw_ostream &ROS, AssemblyAnnotationWriter *AAW,
 }
 
 void Module::print(raw_ostream &ROS, AssemblyAnnotationWriter *AAW,
-                   bool ShouldPreserveUseListOrder, bool IsForDebug) const {
+                   bool ShouldPreserveUseListOrder, bool IsForDebug, bool PrettyPrintIntrinsicArgs) const {
   SlotTracker SlotTable(this);
   formatted_raw_ostream OS(ROS);
   AssemblyWriter W(OS, SlotTable, this, AAW, IsForDebug,
-                   ShouldPreserveUseListOrder);
+                   ShouldPreserveUseListOrder, PrettyPrintIntrinsicArgs);
   W.printModule(this);
 }
 
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index 6797a100ff732..4159de3e630c8 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -23,6 +23,7 @@
 #include "llvm/IR/IntrinsicsLoongArch.h"
 #include "llvm/IR/IntrinsicsMips.h"
 #include "llvm/IR/IntrinsicsNVPTX.h"
+#include "llvm/IR/NVVMIntrinsicUtils.h"
 #include "llvm/IR/IntrinsicsPowerPC.h"
 #include "llvm/IR/IntrinsicsR600.h"
 #include "llvm/IR/IntrinsicsRISCV.h"
@@ -601,6 +602,12 @@ bool Intrinsic::isOverloaded(ID id) {
 #undef GET_INTRINSIC_OVERLOAD_TABLE
 }
 
+bool Intrinsic::hasPrettyPrintedArgs(ID id) {
+#define GET_INTRINSIC_PRETTY_PRINT_TABLE
+#include "llvm/IR/IntrinsicImpl.inc"
+#undef GET_INTRINSIC_PRETTY_PRINT_TABLE
+}
+
 /// Table of per-target intrinsic name tables.
 #define GET_INTRINSIC_TARGET_DATA
 #include "llvm/IR/IntrinsicImpl.inc"
@@ -1129,3 +1136,7 @@ Intrinsic::ID Intrinsic::getDeinterleaveIntrinsicID(unsigned Factor) {
   assert(Factor >= 2 && Factor <= 8 && "Unexpected factor");
   return InterleaveIntrinsics[Factor - 2].Deinterleave;
 }
+
+#define GET_INTRINSIC_PRETTY_PRINT_ARGUMENTS
+#include "llvm/IR/IntrinsicImpl.inc"
+#undef GET_INTRINSIC_PRETTY_PRINT_ARGUMENTS
\ No newline at end of file
diff --git a/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll b/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll
index 711e566df5034..ea3a5db82684f 100644
--- a/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll
+++ b/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll
@@ -34,13 +34,17 @@ define void @tcgen05_mma_fp16_cta1(ptr addrspace(6) %dtmem, ptr addrspace(6) %at
 ; CHECK-NEXT:    tcgen05.mma.cta_group::1.kind::f16.collector::a::use [%r1], [%r3], %rd2, %r2, %p1;
 ; CHECK-NEXT:    ret;
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 0)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::discard */ i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor.ashift(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 1)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::lastuse */ i32 1)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 1)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 2)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::fill */ i32 2)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 2)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 3)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::use */ i32 3)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 3)
   ret void
 }
@@ -113,13 +117,17 @@ define void @tcgen05_mma_tf32_cta1(ptr addrspace(6) %dtmem, ptr addrspace(6) %at
 ; CHECK-NEXT:    tcgen05.mma.cta_group::1.kind::tf32.collector::a::use [%r1], [%r3], %rd2, %r2, %p1;
 ; CHECK-NEXT:    ret;
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 0)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::discard */ i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor.ashift(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 1)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::lastuse */ i32 1)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 1)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 2)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::fill */ i32 2)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 2)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 3)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::use */ i32 3)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 3)
   ret void
 }
@@ -192,13 +200,17 @@ define void @tcgen05_mma_f8f6f4_cta1(ptr addrspace(6) %dtmem, ptr addrspace(6) %
 ; CHECK-NEXT:    tcgen05.mma.cta_group::1.kind::f8f6f4.collector::a::use [%r1], [%r3], %rd2, %r2, %p1;
 ; CHECK-NEXT:    ret;
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 2, i32 1, i32 0)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f8f6f4 */ i32 2, /* cta_group::1 */ i32 1, /* collector::a::discard */ i32 0)
   call vo...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Oct 9, 2025

@llvm/pr-subscribers-tablegen

Author: Dharuni R Acharya (DharuniRAcharya)

Changes

This patch adds LLVM infrastructure to support pretty printing of immediate arguments of the intrinsics. The motivation is to increase the readability of LLVM intrinsics and facilitate easy modifications and debugging of LLVM IR. This adds a property PrettyPrintImmArg&lt;ArgIndex, "functionName"&gt; to the immediate arguments of the intrinsic that enables printing self-explanatory inline comment for the immediate arguments. The addition of pretty print support can provide a simple, low-overhead feature that enhances usability of LLVM intrinsics without disrupting existing workflows.


Patch is 40.68 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/162629.diff

13 Files Affected:

  • (modified) llvm/include/llvm/IR/Function.h (+2-1)
  • (modified) llvm/include/llvm/IR/Intrinsics.h (+7)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+7)
  • (modified) llvm/include/llvm/IR/IntrinsicsNVVM.td (+15-1)
  • (modified) llvm/include/llvm/IR/Module.h (+2-1)
  • (modified) llvm/include/llvm/IR/NVVMIntrinsicUtils.h (+63)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+46-12)
  • (modified) llvm/lib/IR/Intrinsics.cpp (+11)
  • (modified) llvm/test/CodeGen/NVPTX/tcgen05-mma.ll (+32)
  • (modified) llvm/tools/llvm-dis/llvm-dis.cpp (+6-1)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.cpp (+21)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.h (+5)
  • (modified) llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp (+69)
diff --git a/llvm/include/llvm/IR/Function.h b/llvm/include/llvm/IR/Function.h
index d3497716ca844..ee303618515fb 100644
--- a/llvm/include/llvm/IR/Function.h
+++ b/llvm/include/llvm/IR/Function.h
@@ -930,7 +930,8 @@ class LLVM_ABI Function : public GlobalObject, public ilist_node<Function> {
   /// AssemblyAnnotationWriter.
   void print(raw_ostream &OS, AssemblyAnnotationWriter *AAW = nullptr,
              bool ShouldPreserveUseListOrder = false,
-             bool IsForDebug = false) const;
+             bool IsForDebug = false, 
+             bool PrettyPrintIntrinsicArgs = false) const;
 
   /// viewCFG - This function is meant for use from the debugger.  You can just
   /// say 'call F->viewCFG()' and a ghostview window should pop up from the
diff --git a/llvm/include/llvm/IR/Intrinsics.h b/llvm/include/llvm/IR/Intrinsics.h
index 9577d0141f168..b197bf5340b15 100644
--- a/llvm/include/llvm/IR/Intrinsics.h
+++ b/llvm/include/llvm/IR/Intrinsics.h
@@ -30,6 +30,8 @@ class LLVMContext;
 class Module;
 class AttributeList;
 class AttributeSet;
+class raw_ostream;
+class Constant;
 
 /// This namespace contains an enum with a value for every intrinsic/builtin
 /// function known by LLVM. The enum values are returned by
@@ -81,6 +83,9 @@ namespace Intrinsic {
   /// Returns true if the intrinsic can be overloaded.
   LLVM_ABI bool isOverloaded(ID id);
 
+  /// Returns true if the intrinsic has pretty printed immediate arguments.
+  LLVM_ABI bool hasPrettyPrintedArgs(ID id);
+
   /// isTargetIntrinsic - Returns true if IID is an intrinsic specific to a
   /// certain target. If it is a generic intrinsic false is returned.
   LLVM_ABI bool isTargetIntrinsic(ID IID);
@@ -284,6 +289,8 @@ namespace Intrinsic {
   /// N.
   LLVM_ABI Intrinsic::ID getDeinterleaveIntrinsicID(unsigned Factor);
 
+  LLVM_ABI void printImmArg(ID IID, unsigned ArgIdx, raw_ostream &OS, const Constant *ImmArgVal);
+
   } // namespace Intrinsic
 
   } // namespace llvm
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 96da698538314..f56a7603f4963 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -87,6 +87,13 @@ class NoUndef<AttrIndex idx> : IntrinsicProperty {
   int ArgNo = idx.Value;
 }
 
+// PrettyPrintImmArg - The specified immediate argument has a custom pretty-print
+// function for diagnostic output.
+class PrettyPrintImmArg<AttrIndex idx, string funcname> : IntrinsicProperty {
+  int ArgNo = idx.Value;
+  string FunctionName = funcname;
+}
+
 // NonNull - The return value or specified argument is not null.
 class NonNull<AttrIndex idx> : IntrinsicProperty {
   int ArgNo = idx.Value;
diff --git a/llvm/include/llvm/IR/IntrinsicsNVVM.td b/llvm/include/llvm/IR/IntrinsicsNVVM.td
index 3af1750ffcf3f..6a132de3413bb 100644
--- a/llvm/include/llvm/IR/IntrinsicsNVVM.td
+++ b/llvm/include/llvm/IR/IntrinsicsNVVM.td
@@ -2871,7 +2871,14 @@ foreach sp = [0, 1] in {
         defvar nargs = !size(args);
         defvar scale_d_imm = ArgIndex<!sub(nargs, 1)>;
         defvar scale_d_imm_range = [ImmArg<scale_d_imm>, Range<scale_d_imm, 0, 16>];
-        defvar intrinsic_properties = !listconcat(
+
+        // Check if this is the specific llvm.nvvm.tcgen05.mma.tensor intrinsic
+        defvar is_target_intrinsic = !and(!eq(sp, 0), 
+                                          !eq(space, "tensor"), 
+                                          !eq(scale_d, 0), 
+                                          !eq(ashift, 0));
+
+        defvar base_properties = !listconcat(
           mma.common_intr_props,
           !if(!eq(scale_d, 1), scale_d_imm_range, []),
           [Range<ArgIndex<nargs>, 0, !if(!eq(scale_d, 1), 2, 4)>, // kind
@@ -2881,6 +2888,13 @@ foreach sp = [0, 1] in {
           ]
         );
 
+        defvar intrinsic_properties = !if(is_target_intrinsic, 
+          !listconcat(base_properties,
+            [PrettyPrintImmArg<ArgIndex<nargs>, "printTcgen05MMAKind">,
+             PrettyPrintImmArg<ArgIndex<!add(nargs, 1)>, "printCTAGroupKind">,
+             PrettyPrintImmArg<ArgIndex<!add(nargs, 2)>, "printTcgen05CollectorUsageOp">]),
+          base_properties);
+
         def mma.record:
               DefaultAttrsIntrinsicFlags<[], args, flags, intrinsic_properties,
                 mma.intr>;
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h
index a99937a90cbb7..fe5c1308a190a 100644
--- a/llvm/include/llvm/IR/Module.h
+++ b/llvm/include/llvm/IR/Module.h
@@ -866,7 +866,8 @@ class LLVM_ABI Module {
   /// the assembly.
   void print(raw_ostream &OS, AssemblyAnnotationWriter *AAW,
              bool ShouldPreserveUseListOrder = false,
-             bool IsForDebug = false) const;
+             bool IsForDebug = false, 
+             bool PrettyPrintIntrinsicArgs = false) const;
 
   /// Dump the module to stderr (for debugging).
   void dump() const;
diff --git a/llvm/include/llvm/IR/NVVMIntrinsicUtils.h b/llvm/include/llvm/IR/NVVMIntrinsicUtils.h
index d55100e5e709d..2ce1461f7f3c2 100644
--- a/llvm/include/llvm/IR/NVVMIntrinsicUtils.h
+++ b/llvm/include/llvm/IR/NVVMIntrinsicUtils.h
@@ -18,6 +18,9 @@
 #include <stdint.h>
 
 #include "llvm/ADT/APFloat.h"
+#include "llvm/ADT/APInt.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/Support/raw_ostream.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/IntrinsicsNVPTX.h"
 
@@ -659,6 +662,66 @@ inline APFloat::roundingMode GetFMARoundingMode(Intrinsic::ID IntrinsicID) {
   llvm_unreachable("Invalid FP instrinsic rounding mode for NVVM fma");
 }
 
+inline void printTcgen05MMAKind(raw_ostream &OS, const Constant *ImmArgVal) {
+  if (const ConstantInt *CI = dyn_cast<ConstantInt>(ImmArgVal)) {
+    uint64_t Val = CI->getZExtValue();
+    switch (static_cast<Tcgen05MMAKind>(Val)) {
+    case Tcgen05MMAKind::F16:
+      OS << "kind::f16";
+      return;
+    case Tcgen05MMAKind::TF32:
+      OS << "kind::tf32";
+      return;
+    case Tcgen05MMAKind::F8F6F4:
+      OS << "kind::f8f6f4";
+      return;
+    case Tcgen05MMAKind::I8:
+      OS << "kind::i8";
+      return;
+    }
+  }
+  llvm_unreachable("printTcgen05MMAKind called with invalid value for immediate argument");
+}
+
+inline void printTcgen05CollectorUsageOp(raw_ostream &OS, const Constant *ImmArgVal) {
+  if (const ConstantInt *CI = dyn_cast<ConstantInt>(ImmArgVal)) {
+    uint64_t Val = CI->getZExtValue();
+    switch (static_cast<Tcgen05CollectorUsageOp>(Val)) {
+    case Tcgen05CollectorUsageOp::DISCARD:
+      OS << "collector::a::discard";
+      return;
+    case Tcgen05CollectorUsageOp::LASTUSE:
+      OS << "collector::a::lastuse";
+      return;
+    case Tcgen05CollectorUsageOp::FILL:
+      OS << "collector::a::fill";
+      return;
+    case Tcgen05CollectorUsageOp::USE:
+      OS << "collector::a::use";
+      return;
+    }
+  }
+  llvm_unreachable("printTcgen05CollectorUsageOp called with invalid value for immediate argument");
+}
+
+inline void printCTAGroupKind(raw_ostream &OS, const Constant *ImmArgVal) {
+  if (const ConstantInt *CI = dyn_cast<ConstantInt>(ImmArgVal)) {
+    uint64_t Val = CI->getZExtValue();
+    switch (static_cast<CTAGroupKind>(Val)) {
+    case CTAGroupKind::CG_NONE: 
+      OS << "cta_group::0";
+      return;
+    case CTAGroupKind::CG_1: 
+      OS << "cta_group::1";
+      return;
+    case CTAGroupKind::CG_2: 
+      OS << "cta_group::2";
+      return;
+    }
+  }
+  llvm_unreachable("printCTAGroupKind called with invalid value for immediate argument");
+}
+
 } // namespace nvvm
 } // namespace llvm
 #endif // LLVM_IR_NVVMINTRINSICUTILS_H
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index ae086bcd3902d..c859cf980ef66 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -53,6 +53,7 @@
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
@@ -2831,6 +2832,7 @@ class AssemblyWriter {
   SetVector<const Comdat *> Comdats;
   bool IsForDebug;
   bool ShouldPreserveUseListOrder;
+  bool PrettyPrintIntrinsicArgs;
   UseListOrderMap UseListOrders;
   SmallVector<StringRef, 8> MDNames;
   /// Synchronization scope names registered with LLVMContext.
@@ -2841,7 +2843,8 @@ class AssemblyWriter {
   /// Construct an AssemblyWriter with an external SlotTracker
   AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac, const Module *M,
                  AssemblyAnnotationWriter *AAW, bool IsForDebug,
-                 bool ShouldPreserveUseListOrder = false);
+                 bool ShouldPreserveUseListOrder = false,
+                 bool PrettyPrintIntrinsicArgs = false);
 
   AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
                  const ModuleSummaryIndex *Index, bool IsForDebug);
@@ -2930,10 +2933,12 @@ class AssemblyWriter {
 
 AssemblyWriter::AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
                                const Module *M, AssemblyAnnotationWriter *AAW,
-                               bool IsForDebug, bool ShouldPreserveUseListOrder)
+                               bool IsForDebug, bool ShouldPreserveUseListOrder,
+                               bool PrettyPrintIntrinsicArgs)
     : Out(o), TheModule(M), Machine(Mac), TypePrinter(M), AnnotationWriter(AAW),
       IsForDebug(IsForDebug),
-      ShouldPreserveUseListOrder(ShouldPreserveUseListOrder) {
+      ShouldPreserveUseListOrder(ShouldPreserveUseListOrder),
+      PrettyPrintIntrinsicArgs(PrettyPrintIntrinsicArgs) {
   if (!TheModule)
     return;
   for (const GlobalObject &GO : TheModule->global_objects())
@@ -2944,7 +2949,8 @@ AssemblyWriter::AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
 AssemblyWriter::AssemblyWriter(formatted_raw_ostream &o, SlotTracker &Mac,
                                const ModuleSummaryIndex *Index, bool IsForDebug)
     : Out(o), TheIndex(Index), Machine(Mac), TypePrinter(/*Module=*/nullptr),
-      IsForDebug(IsForDebug), ShouldPreserveUseListOrder(false) {}
+      IsForDebug(IsForDebug), ShouldPreserveUseListOrder(false),
+      PrettyPrintIntrinsicArgs(false) {}
 
 void AssemblyWriter::writeOperand(const Value *Operand, bool PrintType) {
   if (!Operand) {
@@ -4561,12 +4567,38 @@ void AssemblyWriter::printInstruction(const Instruction &I) {
     Out << ' ';
     writeOperand(Operand, false);
     Out << '(';
+    bool HasPrettyPrintedArgs = PrettyPrintIntrinsicArgs && isa<IntrinsicInst>(CI) && 
+                              Intrinsic::hasPrettyPrintedArgs(CI->getIntrinsicID());
+
     ListSeparator LS;
-    for (unsigned op = 0, Eop = CI->arg_size(); op < Eop; ++op) {
-      Out << LS;
-      writeParamOperand(CI->getArgOperand(op), PAL.getParamAttrs(op));
-    }
+    if (HasPrettyPrintedArgs) {
+      Function *CalledFunc = CI->getCalledFunction();
+      auto PrintArgComment = [&](unsigned ArgNo) {
+        if (!CalledFunc->hasParamAttribute(ArgNo, Attribute::ImmArg))
+          return;
+        const Constant *ConstArg = dyn_cast<Constant>(CI->getArgOperand(ArgNo));
+        if (!ConstArg)
+          return;
+        std::string ArgComment;
+        raw_string_ostream ArgCommentStream(ArgComment);
+        Intrinsic::ID IID = CalledFunc->getIntrinsicID();
+        Intrinsic::printImmArg(IID, ArgNo, ArgCommentStream, ConstArg);
+        if (ArgComment.empty())
+          return;
+        Out << "/* " << ArgComment << " */ ";
+      };
 
+      for (unsigned ArgNo = 0, NumArgs = CI->arg_size(); ArgNo < NumArgs; ++ArgNo) {
+        Out << LS;
+        PrintArgComment(ArgNo);
+        writeParamOperand(CI->getArgOperand(ArgNo), PAL.getParamAttrs(ArgNo));
+      }
+    } else {
+      for (unsigned op = 0, Eop = CI->arg_size(); op < Eop; ++op) {
+        Out << LS;
+        writeParamOperand(CI->getArgOperand(op), PAL.getParamAttrs(op));
+      }
+    } 
     // Emit an ellipsis if this is a musttail call in a vararg function.  This
     // is only to aid readability, musttail calls forward varargs by default.
     if (CI->isMustTailCall() && CI->getParent() &&
@@ -4991,12 +5023,14 @@ void AssemblyWriter::printUseLists(const Function *F) {
 
 void Function::print(raw_ostream &ROS, AssemblyAnnotationWriter *AAW,
                      bool ShouldPreserveUseListOrder,
-                     bool IsForDebug) const {
+                     bool IsForDebug,
+                     bool PrettyPrintIntrinsicArgs) const {
   SlotTracker SlotTable(this->getParent());
   formatted_raw_ostream OS(ROS);
   AssemblyWriter W(OS, SlotTable, this->getParent(), AAW,
                    IsForDebug,
-                   ShouldPreserveUseListOrder);
+                   ShouldPreserveUseListOrder,
+                   PrettyPrintIntrinsicArgs);
   W.printFunction(this);
 }
 
@@ -5012,11 +5046,11 @@ void BasicBlock::print(raw_ostream &ROS, AssemblyAnnotationWriter *AAW,
 }
 
 void Module::print(raw_ostream &ROS, AssemblyAnnotationWriter *AAW,
-                   bool ShouldPreserveUseListOrder, bool IsForDebug) const {
+                   bool ShouldPreserveUseListOrder, bool IsForDebug, bool PrettyPrintIntrinsicArgs) const {
   SlotTracker SlotTable(this);
   formatted_raw_ostream OS(ROS);
   AssemblyWriter W(OS, SlotTable, this, AAW, IsForDebug,
-                   ShouldPreserveUseListOrder);
+                   ShouldPreserveUseListOrder, PrettyPrintIntrinsicArgs);
   W.printModule(this);
 }
 
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index 6797a100ff732..4159de3e630c8 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -23,6 +23,7 @@
 #include "llvm/IR/IntrinsicsLoongArch.h"
 #include "llvm/IR/IntrinsicsMips.h"
 #include "llvm/IR/IntrinsicsNVPTX.h"
+#include "llvm/IR/NVVMIntrinsicUtils.h"
 #include "llvm/IR/IntrinsicsPowerPC.h"
 #include "llvm/IR/IntrinsicsR600.h"
 #include "llvm/IR/IntrinsicsRISCV.h"
@@ -601,6 +602,12 @@ bool Intrinsic::isOverloaded(ID id) {
 #undef GET_INTRINSIC_OVERLOAD_TABLE
 }
 
+bool Intrinsic::hasPrettyPrintedArgs(ID id) {
+#define GET_INTRINSIC_PRETTY_PRINT_TABLE
+#include "llvm/IR/IntrinsicImpl.inc"
+#undef GET_INTRINSIC_PRETTY_PRINT_TABLE
+}
+
 /// Table of per-target intrinsic name tables.
 #define GET_INTRINSIC_TARGET_DATA
 #include "llvm/IR/IntrinsicImpl.inc"
@@ -1129,3 +1136,7 @@ Intrinsic::ID Intrinsic::getDeinterleaveIntrinsicID(unsigned Factor) {
   assert(Factor >= 2 && Factor <= 8 && "Unexpected factor");
   return InterleaveIntrinsics[Factor - 2].Deinterleave;
 }
+
+#define GET_INTRINSIC_PRETTY_PRINT_ARGUMENTS
+#include "llvm/IR/IntrinsicImpl.inc"
+#undef GET_INTRINSIC_PRETTY_PRINT_ARGUMENTS
\ No newline at end of file
diff --git a/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll b/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll
index 711e566df5034..ea3a5db82684f 100644
--- a/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll
+++ b/llvm/test/CodeGen/NVPTX/tcgen05-mma.ll
@@ -34,13 +34,17 @@ define void @tcgen05_mma_fp16_cta1(ptr addrspace(6) %dtmem, ptr addrspace(6) %at
 ; CHECK-NEXT:    tcgen05.mma.cta_group::1.kind::f16.collector::a::use [%r1], [%r3], %rd2, %r2, %p1;
 ; CHECK-NEXT:    ret;
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 0)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::discard */ i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor.ashift(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 1)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::lastuse */ i32 1)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 1)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 2)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::fill */ i32 2)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 2)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 3)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f16 */ i32 0, /* cta_group::1 */ i32 1, /* collector::a::use */ i32 3)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 0, i32 1, i32 3)
   ret void
 }
@@ -113,13 +117,17 @@ define void @tcgen05_mma_tf32_cta1(ptr addrspace(6) %dtmem, ptr addrspace(6) %at
 ; CHECK-NEXT:    tcgen05.mma.cta_group::1.kind::tf32.collector::a::use [%r1], [%r3], %rd2, %r2, %p1;
 ; CHECK-NEXT:    ret;
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 0)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::discard */ i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.tensor.ashift(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 0)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 1)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::lastuse */ i32 1)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 1)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 2)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::fill */ i32 2)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 2)
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 3)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::tf32 */ i32 1, /* cta_group::1 */ i32 1, /* collector::a::use */ i32 3)
   call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 1, i32 1, i32 3)
   ret void
 }
@@ -192,13 +200,17 @@ define void @tcgen05_mma_f8f6f4_cta1(ptr addrspace(6) %dtmem, ptr addrspace(6) %
 ; CHECK-NEXT:    tcgen05.mma.cta_group::1.kind::f8f6f4.collector::a::use [%r1], [%r3], %rd2, %r2, %p1;
 ; CHECK-NEXT:    ret;
   call void @llvm.nvvm.tcgen05.mma.shared(ptr addrspace(6) %dtmem, i64 %ashared, i64 %b, i32 %idesc, i1 %enable_inp_d, i32 2, i32 1, i32 0)
+  ; call void @llvm.nvvm.tcgen05.mma.tensor(ptr addrspace(6) %dtmem, ptr addrspace(6) %atensor, i64 %b, i32 %idesc, i1 %enable_inp_d, /* kind::f8f6f4 */ i32 2, /* cta_group::1 */ i32 1, /* collector::a::discard */ i32 0)
   call vo...
[truncated]

Copy link

github-actions bot commented Oct 9, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@DharuniRAcharya DharuniRAcharya force-pushed the dev/dharunira/pretty_printing_immargs branch from c505654 to c4caeb3 Compare October 9, 2025 10:24
@DharuniRAcharya DharuniRAcharya changed the title [NVVM] Pretty Printing Immediate Arguments in LLVM Intrinsics [LLVM] Pretty Printing Immediate Arguments in LLVM Intrinsics Oct 9, 2025
@DharuniRAcharya DharuniRAcharya changed the title [LLVM] Pretty Printing Immediate Arguments in LLVM Intrinsics [LLVM-Tablegen] Pretty Printing Immediate Arguments in LLVM Intrinsics Oct 9, 2025
@DharuniRAcharya DharuniRAcharya force-pushed the dev/dharunira/pretty_printing_immargs branch 8 times, most recently from 053ee98 to 203c7e3 Compare October 13, 2025 10:45
@DharuniRAcharya DharuniRAcharya changed the title [LLVM-Tablegen] Pretty Printing Immediate Arguments in LLVM Intrinsics [LLVM-Tablegen] Pretty Printing Arguments in LLVM Intrinsics Oct 13, 2025
@DharuniRAcharya DharuniRAcharya force-pushed the dev/dharunira/pretty_printing_immargs branch 2 times, most recently from 0ec7236 to 2df4ad3 Compare October 13, 2025 12:58
@DharuniRAcharya DharuniRAcharya marked this pull request as ready for review October 15, 2025 08:30
@DharuniRAcharya DharuniRAcharya force-pushed the dev/dharunira/pretty_printing_immargs branch 2 times, most recently from 30d2f0b to 4726e73 Compare October 15, 2025 08:55
@durga4github durga4github requested a review from nikic October 15, 2025 09:04
? PreserveAssemblyUseListOrder
: ShouldPreserveUseListOrder) {
: ShouldPreserveUseListOrder),
PrettyPrintIntrinsicArgs(PrintFormattedIntrinsics) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cl::opt objects are thread-unsafe, so they cannot be correctly used for LLVM-based jit compilers in common case. However, this feature is useful to debug such compilers.

Could you please add an extra optional constructor parameter to allow users enabling the pretty printing independently in multithreaded environments?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should enable this option by default -- would that be sufficient for your purposes?

(I'm not entirely sure this functionality needs to be guarded by an option at all, but at least enabling it by default should be reasonable...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with both options: either enabled by default or a thread-safe knob.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabled this feature by default in the latest revision. So, intrinsics with the ArgInfo property will have their arguments pretty printed as specified.

Copy link
Member

@mshockwave mshockwave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a TableGen LIT test under test/TableGen?

@DharuniRAcharya DharuniRAcharya force-pushed the dev/dharunira/pretty_printing_immargs branch from 4726e73 to fadac9b Compare October 16, 2025 09:26
@DharuniRAcharya
Copy link
Contributor Author

Could you also add a TableGen LIT test under test/TableGen?

Added in the latest revision!

@DharuniRAcharya DharuniRAcharya force-pushed the dev/dharunira/pretty_printing_immargs branch from fadac9b to 16f0420 Compare October 16, 2025 09:53
This patch adds LLVM infrastructure to support pretty printing arguments of the intrinsics. The motivation is to increase the readability of LLVM intrinsics and facilitate easy modifications and debugging of LLVM IR.
This adds a property ArgInfo<ArgIndex, "argName", "functionName"> to the intrinsic arguments that enables printing self-explanatory inline comment for the arguments.
The addition of pretty print support can provide a simple, low-overhead feature that enhances usability of LLVM intrinsics without disrupting existing workflows.
Link to the RFC:
https://discourse.llvm.org/t/rfc-pretty-printing-immediate-arguments-in-llvm-intrinsics/88536

Signed-off-by: Dharuni R Acharya <[email protected]>
@DharuniRAcharya DharuniRAcharya force-pushed the dev/dharunira/pretty_printing_immargs branch from 16f0420 to ffd5991 Compare October 16, 2025 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants