[mlir][LLVM] handle ArrayAttr for constant array of structs #139724

jeanPerier · 2025-05-13T12:59:40Z

While LLVM IR dialect has a way to represent arbitrary LLVM constant array of structs via an insert chain, it is in practice very expensive for the compilation time as soon as the array is bigger than a couple hundred elements. This is because generating and later folding such insert chain is really not cheap.

This is an issue for flang because it is very easy to generate rather big array of struct constant initializer in Fortran, and unlike C++, dynamic initialization of globals is not a feature of the language. Initializers must be static.

For instance, here are the compile time I measuring for the following program changing N size:

! test.F90
module m
  type t
    integer :: i = 42
    real :: x = 1.0
  end type
  type(t) :: some_global(N)
end module

/usr/bin/time flang -c -DN=1000 test.F90 
0.08user 0.07system 0:00.11elapsed 140%CPU (0avgtext+0avgdata 83968maxresident)k
8inputs+40outputs (13major+4840minor)pagefaults 0swaps

/usr/bin/time  flang -c -DN=10000 test.F90 
1.40user 0.08system 0:01.44elapsed 102%CPU (0avgtext+0avgdata 89088maxresident)k
8inputs+184outputs (13major+8764minor)pagefaults 0swaps

/usr/bin/time  flang -c -DN=100000 test.F90 
137.79user 0.22system 2:18.00elapsed 100%CPU (0avgtext+0avgdata 145540maxresident)k
8inputs+1584outputs (10major+82461minor)pagefaults 0swap

In the last case, more than 99.99% of the time is spend folding the insert chain in ModuleTranslation.cpp
With this patch (and updating flang to generate an ArrayAttr instead of an insert chain), the last case with 100000 elements takes 0.15s to compile (~1000x compilation speed up :)).

This is not a silver bullet because there are cases where an insert chain will still currently be needed, like when the initial values contain symbol reference, but this is not very common for my use case.

llvmbot · 2025-05-13T13:00:15Z

@llvm/pr-subscribers-mlir

Author: None (jeanPerier)

Changes

While LLVM IR dialect has a way to represent arbitrary LLVM constant array of structs via an insert chain, it is in practice very expensive for the compilation time as soon as the array is bigger than a couple hundred elements. This is because generating and later folding such insert chain is really not cheap.

This is an issue for flang because it is very easy to generate rather big array of struct constant initializer in Fortran, and unlike C++, dynamic initialization of globals is not a feature of the language. Initializers must be static.

For instance, here are the compile time I measuring for the following program changing N size:

! test.F90
module m
  type t
    integer :: i = 42
    real :: x = 1.0
  end type
  type(t) :: some_global(N)
end module

/usr/bin/time flang -c -DN=1000 test.F90 
0.08user 0.07system 0:00.11elapsed 140%CPU (0avgtext+0avgdata 83968maxresident)k
8inputs+40outputs (13major+4840minor)pagefaults 0swaps

/usr/bin/time  flang -c -DN=10000 test.F90 
1.40user 0.08system 0:01.44elapsed 102%CPU (0avgtext+0avgdata 89088maxresident)k
8inputs+184outputs (13major+8764minor)pagefaults 0swaps

/usr/bin/time  flang -c -DN=100000 test.F90 
137.79user 0.22system 2:18.00elapsed 100%CPU (0avgtext+0avgdata 145540maxresident)k
8inputs+1584outputs (10major+82461minor)pagefaults 0swap

In the last case, more than 99.99% of the time is spend folding the insert chain in ModuleTranslation.cpp
With this patch (and updating flang to generate an ArrayAttr instead of an insert chain), the last case with 100000 elements takes 0.15s to compile (~1000x compilation speed up :)).

This is not a silver bullet because there are cases where an insert chain will still currently be needed, like when the initial values contain symbol reference, but this is not very common for my use case.

Full diff: https://github.com/llvm/llvm-project/pull/139724.diff

5 Files Affected:

(modified) mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp (+44-9)
(modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+27)
(modified) mlir/test/Dialect/LLVMIR/invalid.mlir (+22)
(modified) mlir/test/Target/LLVMIR/llvmir-invalid.mlir (-5)
(modified) mlir/test/Target/LLVMIR/llvmir.mlir (+17)

diff --git a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
index c757f3ceb90e3..1868995e3f5ed 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
@@ -3221,15 +3221,50 @@ LogicalResult LLVM::ConstantOp::verify() {
     if (!isa<VectorType, LLVM::LLVMArrayType>(getType()))
       return emitOpError() << "expected vector or array type";
     // The number of elements of the attribute and the type must match.
-    int64_t attrNumElements;
-    if (auto elementsAttr = dyn_cast<ElementsAttr>(getValue()))
-      attrNumElements = elementsAttr.getNumElements();
-    else
-      attrNumElements = cast<ArrayAttr>(getValue()).size();
-    if (getNumElements(getType()) != attrNumElements)
-      return emitOpError()
-             << "type and attribute have a different number of elements: "
-             << getNumElements(getType()) << " vs. " << attrNumElements;
+    if (auto elementsAttr = dyn_cast<ElementsAttr>(getValue())) {
+      int64_t attrNumElements = elementsAttr.getNumElements();
+      if (getNumElements(getType()) != attrNumElements)
+        return emitOpError()
+               << "type and attribute have a different number of elements: "
+               << getNumElements(getType()) << " vs. " << attrNumElements;
+    } else {
+      // When the attribute is an ArrayAttr, check that its nesting matches the
+      // corresponding ArrayType or VectorType nesting.
+      Type dimType = getType();
+      Attribute dimVal = getValue();
+      int dim = 0;
+      while (true) {
+        int64_t dimSize =
+            llvm::TypeSwitch<Type, int64_t>(dimType)
+                .Case<VectorType, LLVMArrayType>([&dimType](auto t) -> int64_t {
+                  dimType = t.getElementType();
+                  return t.getNumElements();
+                })
+                .Default([](auto) -> int64_t { return -1; });
+        if (dimSize < 0)
+          break;
+        auto arrayAttr = dyn_cast<ArrayAttr>(dimVal);
+        if (!arrayAttr)
+          return emitOpError()
+                 << "array attribute nesting must match array type nesting";
+        if (dimSize != static_cast<int64_t>(arrayAttr.size()))
+          return emitOpError()
+                 << "array attribute size does not match array type size in "
+                    "dimension "
+                 << dim << ": " << arrayAttr.size() << " vs. " << dimSize;
+        if (arrayAttr.size() == 0)
+          break;
+        dimVal = arrayAttr.getValue()[0];
+        ++dim;
+      }
+      if (auto structType = dyn_cast<LLVMStructType>(dimType)) {
+        auto arrayAttr = dyn_cast<ArrayAttr>(dimVal);
+        if (!arrayAttr || arrayAttr.size() != structType.getBody().size())
+          return emitOpError()
+                 << "nested attribute must be an array attribute with the same "
+                    "number of elements as the struct type";
+      }
+    }
   } else {
     return emitOpError()
            << "only supports integer, float, string or elements attributes";
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index 1168b9f339904..1d4509ccb044e 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -713,6 +713,33 @@ llvm::Constant *mlir::LLVM::detail::getLLVMConstant(
         ArrayRef<char>{stringAttr.getValue().data(),
                        stringAttr.getValue().size()});
   }
+
+  // Handle arrays of structs that cannot be represented as DenseElementsAttr
+  // in MLIR.
+  if (auto arrayAttr = dyn_cast<ArrayAttr>(attr)) {
+    if (auto *arrayTy = dyn_cast<llvm::ArrayType>(llvmType)) {
+      llvm::Type *elementType = arrayTy->getElementType();
+      Attribute previousElementAttr;
+      llvm::Constant *elementCst = nullptr;
+      SmallVector<llvm::Constant *> constants;
+      constants.reserve(arrayTy->getNumElements());
+      for (auto elementAttr : arrayAttr) {
+        // Arrays with a single value or with repeating values are quite common.
+        // short-circuit the translation when the element value is the same as
+        // the previous one.
+        if (!previousElementAttr || previousElementAttr != elementAttr) {
+          previousElementAttr = elementAttr;
+          elementCst =
+              getLLVMConstant(elementType, elementAttr, loc, moduleTranslation);
+          if (!elementCst)
+            return nullptr;
+        }
+        constants.push_back(elementCst);
+      }
+      return llvm::ConstantArray::get(arrayTy, constants);
+    }
+  }
+
   emitError(loc, "unsupported constant value");
   return nullptr;
 }
diff --git a/mlir/test/Dialect/LLVMIR/invalid.mlir b/mlir/test/Dialect/LLVMIR/invalid.mlir
index f9ea066a63624..4c82e586b8a3c 100644
--- a/mlir/test/Dialect/LLVMIR/invalid.mlir
+++ b/mlir/test/Dialect/LLVMIR/invalid.mlir
@@ -1850,3 +1850,25 @@ llvm.func @gep_inbounds_flag_usage(%ptr: !llvm.ptr, %idx: i64) {
   llvm.getelementptr inbounds_flag %ptr[%idx, 0, %idx] : (!llvm.ptr, i64, i64) -> !llvm.ptr, !llvm.struct<(array<10 x f32>)>
   llvm.return
 }
+
+// -----
+
+llvm.mlir.global @x1() : !llvm.array<2x!llvm.struct<(i32, f32)>> {
+  // expected-error@+1{{'llvm.mlir.constant' op array attribute size does not match array type size in dimension 0: 1 vs. 2}}
+  %0 = llvm.mlir.constant([[42 : i32, 1.000000e+00 : f32]]) : !llvm.array<2x!llvm.struct<(i32, f32)>>
+  llvm.return %0 : !llvm.array<2x!llvm.struct<(i32, f32)>>
+}
+
+// -----
+llvm.mlir.global @x2() : !llvm.array<1x!llvm.array<1x!llvm.array<1x!llvm.struct<(i32)>>>> {
+  // expected-error@+1{{'llvm.mlir.constant' op array attribute nesting must match array type nesting}}
+  %0 = llvm.mlir.constant([[1 : i32]]) : !llvm.array<1x!llvm.array<1x!llvm.array<1x!llvm.struct<(i32)>>>>
+  llvm.return %0 : !llvm.array<1x!llvm.array<1x!llvm.array<1x!llvm.struct<(i32)>>>>
+}
+
+// -----
+llvm.mlir.global @x3() : !llvm.array<1x!llvm.struct<(i32, f32)>> {
+  // expected-error@+1{{'llvm.mlir.constant' op nested attribute must be an array attribute with the same number of elements as the struct type}}
+  %0 = llvm.mlir.constant([[1 : i32]]) : !llvm.array<1x!llvm.struct<(i32, f32)>>
+  llvm.return %0 : !llvm.array<1x!llvm.struct<(i32, f32)>>
+}
diff --git a/mlir/test/Target/LLVMIR/llvmir-invalid.mlir b/mlir/test/Target/LLVMIR/llvmir-invalid.mlir
index 90c0f5ac55cb1..24a7b42557278 100644
--- a/mlir/test/Target/LLVMIR/llvmir-invalid.mlir
+++ b/mlir/test/Target/LLVMIR/llvmir-invalid.mlir
@@ -79,11 +79,6 @@ llvm.func @incompatible_integer_type_for_float_attr() -> i32 {
 
 // -----
 
-// expected-error @below{{unsupported constant value}}
-llvm.mlir.global internal constant @test([2.5, 7.4]) : !llvm.array<2 x f64>
-
-// -----
-
 // expected-error @below{{LLVM attribute 'readonly' does not expect a value}}
 llvm.func @passthrough_unexpected_value() attributes {passthrough = [["readonly", "42"]]}
 
diff --git a/mlir/test/Target/LLVMIR/llvmir.mlir b/mlir/test/Target/LLVMIR/llvmir.mlir
index 4ef68fa83a70d..242a151116fb3 100644
--- a/mlir/test/Target/LLVMIR/llvmir.mlir
+++ b/mlir/test/Target/LLVMIR/llvmir.mlir
@@ -3000,3 +3000,20 @@ llvm.func internal @i(%arg0: i32) attributes {dso_local} {
   llvm.call @testfn3(%arg0) : (i32 {llvm.alignstack = 8 : i64}) -> ()
   llvm.return
 }
+
+// -----
+
+// CHECK: @test_array_attr_1 = internal constant [2 x double] [double 2.500000e+00, double 7.400000e+00]
+llvm.mlir.global internal constant @test_array_attr_1([2.5, 7.4]) : !llvm.array<2 x f64>
+
+// CHECK: @test_array_attr_2 = global [2 x { i32, float }] [{ i32, float } { i32 42, float 1.000000e+00 }, { i32, float } { i32 42, float 1.000000e+00 }]
+llvm.mlir.global @test_array_attr_2() : !llvm.array<2x!llvm.struct<(i32, f32)>> {
+  %0 = llvm.mlir.constant([[42 : i32, 1.000000e+00 : f32],[42 : i32, 1.000000e+00 : f32]]) : !llvm.array<2x!llvm.struct<(i32, f32)>>
+  llvm.return %0 : !llvm.array<2x!llvm.struct<(i32, f32)>>
+}
+
+// CHECK: @test_array_attr_3 = global [2 x [3 x { i32, float }]{{.*}}[3 x { i32, float }] [{ i32, float } { i32 1, float 1.000000e+00 }, { i32, float } { i32 2, float 1.000000e+00 }, { i32, float } { i32 3, float 1.000000e+00 }], [3 x { i32, float }] [{ i32, float } { i32 4, float 1.000000e+00 }, { i32, float } { i32 5, float 1.000000e+00 }, { i32, float } { i32 6, float 1.000000e+00 }
+llvm.mlir.global @test_array_attr_3() : !llvm.array<2x!llvm.array<3x!llvm.struct<(i32, f32)>>> {
+  %0 = llvm.mlir.constant([[[1 : i32, 1.000000e+00 : f32],[2 : i32, 1.000000e+00 : f32],[3 : i32, 1.000000e+00 : f32]],[[4 : i32, 1.000000e+00 : f32],[5 : i32, 1.000000e+00 : f32],[6 : i32, 1.000000e+00 : f32]]]) : !llvm.array<2x!llvm.array<3x!llvm.struct<(i32, f32)>>>
+  llvm.return %0 : !llvm.array<2x!llvm.array<3x!llvm.struct<(i32, f32)>>>
+}

llvmbot · 2025-05-13T13:00:16Z

@llvm/pr-subscribers-mlir-llvm

Author: None (jeanPerier)

Changes

While LLVM IR dialect has a way to represent arbitrary LLVM constant array of structs via an insert chain, it is in practice very expensive for the compilation time as soon as the array is bigger than a couple hundred elements. This is because generating and later folding such insert chain is really not cheap.

This is an issue for flang because it is very easy to generate rather big array of struct constant initializer in Fortran, and unlike C++, dynamic initialization of globals is not a feature of the language. Initializers must be static.

For instance, here are the compile time I measuring for the following program changing N size:

! test.F90
module m
  type t
    integer :: i = 42
    real :: x = 1.0
  end type
  type(t) :: some_global(N)
end module

/usr/bin/time flang -c -DN=1000 test.F90 
0.08user 0.07system 0:00.11elapsed 140%CPU (0avgtext+0avgdata 83968maxresident)k
8inputs+40outputs (13major+4840minor)pagefaults 0swaps

/usr/bin/time  flang -c -DN=10000 test.F90 
1.40user 0.08system 0:01.44elapsed 102%CPU (0avgtext+0avgdata 89088maxresident)k
8inputs+184outputs (13major+8764minor)pagefaults 0swaps

/usr/bin/time  flang -c -DN=100000 test.F90 
137.79user 0.22system 2:18.00elapsed 100%CPU (0avgtext+0avgdata 145540maxresident)k
8inputs+1584outputs (10major+82461minor)pagefaults 0swap

In the last case, more than 99.99% of the time is spend folding the insert chain in ModuleTranslation.cpp
With this patch (and updating flang to generate an ArrayAttr instead of an insert chain), the last case with 100000 elements takes 0.15s to compile (~1000x compilation speed up :)).

This is not a silver bullet because there are cases where an insert chain will still currently be needed, like when the initial values contain symbol reference, but this is not very common for my use case.

Full diff: https://github.com/llvm/llvm-project/pull/139724.diff

5 Files Affected:

(modified) mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp (+44-9)
(modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+27)
(modified) mlir/test/Dialect/LLVMIR/invalid.mlir (+22)
(modified) mlir/test/Target/LLVMIR/llvmir-invalid.mlir (-5)
(modified) mlir/test/Target/LLVMIR/llvmir.mlir (+17)

diff --git a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
index c757f3ceb90e3..1868995e3f5ed 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
@@ -3221,15 +3221,50 @@ LogicalResult LLVM::ConstantOp::verify() {
     if (!isa<VectorType, LLVM::LLVMArrayType>(getType()))
       return emitOpError() << "expected vector or array type";
     // The number of elements of the attribute and the type must match.
-    int64_t attrNumElements;
-    if (auto elementsAttr = dyn_cast<ElementsAttr>(getValue()))
-      attrNumElements = elementsAttr.getNumElements();
-    else
-      attrNumElements = cast<ArrayAttr>(getValue()).size();
-    if (getNumElements(getType()) != attrNumElements)
-      return emitOpError()
-             << "type and attribute have a different number of elements: "
-             << getNumElements(getType()) << " vs. " << attrNumElements;
+    if (auto elementsAttr = dyn_cast<ElementsAttr>(getValue())) {
+      int64_t attrNumElements = elementsAttr.getNumElements();
+      if (getNumElements(getType()) != attrNumElements)
+        return emitOpError()
+               << "type and attribute have a different number of elements: "
+               << getNumElements(getType()) << " vs. " << attrNumElements;
+    } else {
+      // When the attribute is an ArrayAttr, check that its nesting matches the
+      // corresponding ArrayType or VectorType nesting.
+      Type dimType = getType();
+      Attribute dimVal = getValue();
+      int dim = 0;
+      while (true) {
+        int64_t dimSize =
+            llvm::TypeSwitch<Type, int64_t>(dimType)
+                .Case<VectorType, LLVMArrayType>([&dimType](auto t) -> int64_t {
+                  dimType = t.getElementType();
+                  return t.getNumElements();
+                })
+                .Default([](auto) -> int64_t { return -1; });
+        if (dimSize < 0)
+          break;
+        auto arrayAttr = dyn_cast<ArrayAttr>(dimVal);
+        if (!arrayAttr)
+          return emitOpError()
+                 << "array attribute nesting must match array type nesting";
+        if (dimSize != static_cast<int64_t>(arrayAttr.size()))
+          return emitOpError()
+                 << "array attribute size does not match array type size in "
+                    "dimension "
+                 << dim << ": " << arrayAttr.size() << " vs. " << dimSize;
+        if (arrayAttr.size() == 0)
+          break;
+        dimVal = arrayAttr.getValue()[0];
+        ++dim;
+      }
+      if (auto structType = dyn_cast<LLVMStructType>(dimType)) {
+        auto arrayAttr = dyn_cast<ArrayAttr>(dimVal);
+        if (!arrayAttr || arrayAttr.size() != structType.getBody().size())
+          return emitOpError()
+                 << "nested attribute must be an array attribute with the same "
+                    "number of elements as the struct type";
+      }
+    }
   } else {
     return emitOpError()
            << "only supports integer, float, string or elements attributes";
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index 1168b9f339904..1d4509ccb044e 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -713,6 +713,33 @@ llvm::Constant *mlir::LLVM::detail::getLLVMConstant(
         ArrayRef<char>{stringAttr.getValue().data(),
                        stringAttr.getValue().size()});
   }
+
+  // Handle arrays of structs that cannot be represented as DenseElementsAttr
+  // in MLIR.
+  if (auto arrayAttr = dyn_cast<ArrayAttr>(attr)) {
+    if (auto *arrayTy = dyn_cast<llvm::ArrayType>(llvmType)) {
+      llvm::Type *elementType = arrayTy->getElementType();
+      Attribute previousElementAttr;
+      llvm::Constant *elementCst = nullptr;
+      SmallVector<llvm::Constant *> constants;
+      constants.reserve(arrayTy->getNumElements());
+      for (auto elementAttr : arrayAttr) {
+        // Arrays with a single value or with repeating values are quite common.
+        // short-circuit the translation when the element value is the same as
+        // the previous one.
+        if (!previousElementAttr || previousElementAttr != elementAttr) {
+          previousElementAttr = elementAttr;
+          elementCst =
+              getLLVMConstant(elementType, elementAttr, loc, moduleTranslation);
+          if (!elementCst)
+            return nullptr;
+        }
+        constants.push_back(elementCst);
+      }
+      return llvm::ConstantArray::get(arrayTy, constants);
+    }
+  }
+
   emitError(loc, "unsupported constant value");
   return nullptr;
 }
diff --git a/mlir/test/Dialect/LLVMIR/invalid.mlir b/mlir/test/Dialect/LLVMIR/invalid.mlir
index f9ea066a63624..4c82e586b8a3c 100644
--- a/mlir/test/Dialect/LLVMIR/invalid.mlir
+++ b/mlir/test/Dialect/LLVMIR/invalid.mlir
@@ -1850,3 +1850,25 @@ llvm.func @gep_inbounds_flag_usage(%ptr: !llvm.ptr, %idx: i64) {
   llvm.getelementptr inbounds_flag %ptr[%idx, 0, %idx] : (!llvm.ptr, i64, i64) -> !llvm.ptr, !llvm.struct<(array<10 x f32>)>
   llvm.return
 }
+
+// -----
+
+llvm.mlir.global @x1() : !llvm.array<2x!llvm.struct<(i32, f32)>> {
+  // expected-error@+1{{'llvm.mlir.constant' op array attribute size does not match array type size in dimension 0: 1 vs. 2}}
+  %0 = llvm.mlir.constant([[42 : i32, 1.000000e+00 : f32]]) : !llvm.array<2x!llvm.struct<(i32, f32)>>
+  llvm.return %0 : !llvm.array<2x!llvm.struct<(i32, f32)>>
+}
+
+// -----
+llvm.mlir.global @x2() : !llvm.array<1x!llvm.array<1x!llvm.array<1x!llvm.struct<(i32)>>>> {
+  // expected-error@+1{{'llvm.mlir.constant' op array attribute nesting must match array type nesting}}
+  %0 = llvm.mlir.constant([[1 : i32]]) : !llvm.array<1x!llvm.array<1x!llvm.array<1x!llvm.struct<(i32)>>>>
+  llvm.return %0 : !llvm.array<1x!llvm.array<1x!llvm.array<1x!llvm.struct<(i32)>>>>
+}
+
+// -----
+llvm.mlir.global @x3() : !llvm.array<1x!llvm.struct<(i32, f32)>> {
+  // expected-error@+1{{'llvm.mlir.constant' op nested attribute must be an array attribute with the same number of elements as the struct type}}
+  %0 = llvm.mlir.constant([[1 : i32]]) : !llvm.array<1x!llvm.struct<(i32, f32)>>
+  llvm.return %0 : !llvm.array<1x!llvm.struct<(i32, f32)>>
+}
diff --git a/mlir/test/Target/LLVMIR/llvmir-invalid.mlir b/mlir/test/Target/LLVMIR/llvmir-invalid.mlir
index 90c0f5ac55cb1..24a7b42557278 100644
--- a/mlir/test/Target/LLVMIR/llvmir-invalid.mlir
+++ b/mlir/test/Target/LLVMIR/llvmir-invalid.mlir
@@ -79,11 +79,6 @@ llvm.func @incompatible_integer_type_for_float_attr() -> i32 {
 
 // -----
 
-// expected-error @below{{unsupported constant value}}
-llvm.mlir.global internal constant @test([2.5, 7.4]) : !llvm.array<2 x f64>
-
-// -----
-
 // expected-error @below{{LLVM attribute 'readonly' does not expect a value}}
 llvm.func @passthrough_unexpected_value() attributes {passthrough = [["readonly", "42"]]}
 
diff --git a/mlir/test/Target/LLVMIR/llvmir.mlir b/mlir/test/Target/LLVMIR/llvmir.mlir
index 4ef68fa83a70d..242a151116fb3 100644
--- a/mlir/test/Target/LLVMIR/llvmir.mlir
+++ b/mlir/test/Target/LLVMIR/llvmir.mlir
@@ -3000,3 +3000,20 @@ llvm.func internal @i(%arg0: i32) attributes {dso_local} {
   llvm.call @testfn3(%arg0) : (i32 {llvm.alignstack = 8 : i64}) -> ()
   llvm.return
 }
+
+// -----
+
+// CHECK: @test_array_attr_1 = internal constant [2 x double] [double 2.500000e+00, double 7.400000e+00]
+llvm.mlir.global internal constant @test_array_attr_1([2.5, 7.4]) : !llvm.array<2 x f64>
+
+// CHECK: @test_array_attr_2 = global [2 x { i32, float }] [{ i32, float } { i32 42, float 1.000000e+00 }, { i32, float } { i32 42, float 1.000000e+00 }]
+llvm.mlir.global @test_array_attr_2() : !llvm.array<2x!llvm.struct<(i32, f32)>> {
+  %0 = llvm.mlir.constant([[42 : i32, 1.000000e+00 : f32],[42 : i32, 1.000000e+00 : f32]]) : !llvm.array<2x!llvm.struct<(i32, f32)>>
+  llvm.return %0 : !llvm.array<2x!llvm.struct<(i32, f32)>>
+}
+
+// CHECK: @test_array_attr_3 = global [2 x [3 x { i32, float }]{{.*}}[3 x { i32, float }] [{ i32, float } { i32 1, float 1.000000e+00 }, { i32, float } { i32 2, float 1.000000e+00 }, { i32, float } { i32 3, float 1.000000e+00 }], [3 x { i32, float }] [{ i32, float } { i32 4, float 1.000000e+00 }, { i32, float } { i32 5, float 1.000000e+00 }, { i32, float } { i32 6, float 1.000000e+00 }
+llvm.mlir.global @test_array_attr_3() : !llvm.array<2x!llvm.array<3x!llvm.struct<(i32, f32)>>> {
+  %0 = llvm.mlir.constant([[[1 : i32, 1.000000e+00 : f32],[2 : i32, 1.000000e+00 : f32],[3 : i32, 1.000000e+00 : f32]],[[4 : i32, 1.000000e+00 : f32],[5 : i32, 1.000000e+00 : f32],[6 : i32, 1.000000e+00 : f32]]]) : !llvm.array<2x!llvm.array<3x!llvm.struct<(i32, f32)>>>
+  llvm.return %0 : !llvm.array<2x!llvm.array<3x!llvm.struct<(i32, f32)>>>
+}

mlir/test/Dialect/LLVMIR/invalid.mlir

mlir/test/Target/LLVMIR/llvmir.mlir

gysit

There have been multiple gradual additions to constant operation in the past and I wonder if it would be worth to try to come up with some alternative modeling for such large constants.

Do you need the full flexibility of this PR or are you mainly interested in a solution of arrays of structs? If it is the later, I would favor making the PR more restrictive to only support what we need.

Also note that the doc string of the constant operation well need to be update if we extend the supported element types.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

gysit · 2025-05-13T13:26:39Z

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

+                 << dim << ": " << arrayAttr.size() << " vs. " << dimSize;
+        if (arrayAttr.size() == 0)
+          break;
+        dimVal = arrayAttr.getValue()[0];


My understanding is we want to support, potentially large, arrays with different element types? In the fortran examples all types are the same though?

In any case just testing the first element of the array seems a bit arbitrary? Assuming the array is large we may not want to verify at all? But maybe some solution with not checking previously verified attributes may be fine?

My understanding is we want to support, potentially large, arrays with different element types?

No, the arrays element all have the same types, but the element type is a struct type which itself contains different element types (which can be any of int, floats, pointers, struct, arrays).

I do not think there is an attribute other than the ArrayAttr that allow representing such array of values.
The issue is that verifying that all the ArrayAttr have the same "layout" (attribute kind, and type/shape if relevant) will be complex and expensive for big arrays, so the choice here is to make a sanity check of one element.

New verifier helper does verify all array elements, with a set to avoid rechecking the same attributes again and again. I tested on my flang use case, and the extra cost was reasonable (~+1% compilation time on big array constants).

gysit · 2025-05-13T13:34:09Z

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

+               << "type and attribute have a different number of elements: "
+               << getNumElements(getType()) << " vs. " << attrNumElements;
+    } else {
+      // When the attribute is an ArrayAttr, check that its nesting matches the


I suggest to factor out some helper function(s) to reduce the complexity of the verify function (I know that it was already complex before...). Maybe the check for arrays/vectors could be part of a separate function? Additionally, the test, if a scalar value is supported, could possibly be factored out as well? It seems that the struct test below is similar to the struct test at the beginning of the verify function?

Refactored the code to use a helper function for the array of struct case.

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp

jeanPerier · 2025-05-14T09:46:18Z

Do you need the full flexibility of this PR or are you mainly interested in a solution of arrays of structs? If it is the later, I would favor making the PR more restrictive to only support what we need.

Thanks for the review! Yes that is my use case (the array potentially being nested 2 or more times: [1000 x [50 x %STRUCT]] ), the struct type containing any element from int, float, pointers, struct, or arrays (potentially of struct themselves).

Here is an example of a more complex struct type for which I want to use attributes instead of insert chains when creating initializers for array of that type: %STRUCT = type { { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } }.

By more restrictive, do you mean rejecting the ArrayAttr on an mlir::Constant for an array whose element is not a struct type?

I did not do it because although the ModuleTranslation did not support it, the verifier was already accepting those, so I wondered if someone had a use case for that purely in the LLVM MLIR dialect. I am ok with rejecting it if you think that is not needed.

I am open to alternatives if you see a better path for my goal.

gysit · 2025-05-14T11:00:19Z

By more restrictive, do you mean rejecting the ArrayAttr on an mlir::Constant for an array whose element is not a struct type?

Yes that was my though. I believe for floats / integers etc we use a ElementsAttr or even a SplatElementsAttr to ensure the types are consistent. ArrayAttr should really be our last resort since consistency is hard to verify.

I did not do it because although the ModuleTranslation did not support it, the verifier was already accepting those, so I wondered if someone had a use case for that purely in the LLVM MLIR dialect.

I assume this is a gap in the verifier but maybe there are users nevertheless.

I am open to alternatives if you see a better path for my goal.

I think highest priority is probably trying to make the verifier more readable.

Second priority would be having something like a SplatLLVMStructElementsAttr / DenseLLVMStructElementsAttr. That would offload a lot of the verification work to these attributes and would make the verifier much nicer. Also a splat attribute would be much more efficient if all elements are initialized to the same value.

Unfortunately, I am not sure how easy it is to add such attributes that extend the existing built-in splat / dense array attributes. At least, there seem to be some base classes that probably could be extended (such as ElementsAttrBase). However, properly supporting nesting for arrays and structs etc is probably quite difficult. So for sure consider this optional.

@Dinistro @matthias-springer do you have an idea if it would be feasible / make sense to strive for something like Splat/Dense LLVMStruct/LLVMArray ElementsAttr?

jeanPerier · 2025-05-14T14:25:40Z

I think highest priority is probably trying to make the verifier more readable.

I refactored the verifier code to check array of struct attribute in a helper, which allow recursive checking of sub-dimension making things clearer.

I restricted the verifier to reject ArrayAttr for array of simple element types, and removed the handling of vector type from my code since it does not make sense to have vectors of structs.

I also added support and tests for undef/zero attributes appearing in some dimension or struct element field since I do need this in my flang use case.

gysit

Thanks for addressing the comments.

LGTM module nits.

I think the verifier is more readable now and the compromise in terms of verification quality seem ok. At a first glance, it also looks like the LLVM IR import should not be able to produce code that does not pass the new verifier (let's see if this statement survives the test of time...).

Let's give other reviewers some time tomorrow to follow-up should there be concerns.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp

gysit · 2025-05-14T20:03:52Z

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

+  // Forbid usages of ArrayAttr for simple array types that should use
+  // DenseElementsAttr instead. Note that there would be a use case for such
+  // array types when one element value is obtained via a ptr-to-int conversion
+  // from a symbol and cannot be represented in a DenseElementsAttr, but no MLIR


I believe the import from LLVM IR will generate operations to model this. I guess importing some large IR file will tell us if the verifier is now too restrictive (@bcardosolopes FYI). But let's give it a shot and then relax the verifier again if needed.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

gysit · 2025-05-14T20:21:08Z

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

+    return op.emitOpError() << "for array with an array attribute must have a "
+                               "struct element type";
+
+  // Shallow verification that leaf attributes are appropriate as struct initial


Shallow bcs we are not verifying the members of the first struct we hit. I guess that is an ok compromise.

Co-authored-by: Tobias Gysi <[email protected]>

github-actions · 2025-05-15T12:22:45Z

✅ With the latest revision this PR passed the C/C++ code formatter.

…140268) This patch relies on #140235 and #139724 to speed-up compilations of files with derived type array global with initial value. Currently, such derived type global init was lowered to an llvm.mlir.insertvalue chain in the LLVM IR dialect because there was no way to represent such value via attributes. This chain was later folded in LLVM dialect to LLVM IR using LLVM IR (not dialect) folding. This insert chain generation and folding is very expensive for big arrays. For instance, this patch brings down the compilation of FM_lib fmsave.f95 from 50s to 0.5s.

) While LLVM IR dialect has a way to represent arbitrary LLVM constant array of structs via an insert chain, it is in practice very expensive for the compilation time as soon as the array is bigger than a couple hundred elements. This is because generating and later folding such insert chain is really not cheap. This patch allows representing array of struct constants via ArrayAttr in the LLVM dialect.

…lvm#140268) This patch relies on llvm#140235 and llvm#139724 to speed-up compilations of files with derived type array global with initial value. Currently, such derived type global init was lowered to an llvm.mlir.insertvalue chain in the LLVM IR dialect because there was no way to represent such value via attributes. This chain was later folded in LLVM dialect to LLVM IR using LLVM IR (not dialect) folding. This insert chain generation and folding is very expensive for big arrays. For instance, this patch brings down the compilation of FM_lib fmsave.f95 from 50s to 0.5s.

) While LLVM IR dialect has a way to represent arbitrary LLVM constant array of structs via an insert chain, it is in practice very expensive for the compilation time as soon as the array is bigger than a couple hundred elements. This is because generating and later folding such insert chain is really not cheap. This patch allows representing array of struct constants via ArrayAttr in the LLVM dialect.

…lvm#140268) This patch relies on llvm#140235 and llvm#139724 to speed-up compilations of files with derived type array global with initial value. Currently, such derived type global init was lowered to an llvm.mlir.insertvalue chain in the LLVM IR dialect because there was no way to represent such value via attributes. This chain was later folded in LLVM dialect to LLVM IR using LLVM IR (not dialect) folding. This insert chain generation and folding is very expensive for big arrays. For instance, this patch brings down the compilation of FM_lib fmsave.f95 from 50s to 0.5s.

[mlir][LLVM] handle ArrayAttr for constant array of structs

9b5f1d6

jeanPerier requested review from matthias-springer, gysit and Dinistro May 13, 2025 12:59

llvmbot added mlir:llvm mlir labels May 13, 2025

Dinistro reviewed May 13, 2025

View reviewed changes

mlir/test/Dialect/LLVMIR/invalid.mlir Outdated Show resolved Hide resolved

mlir/test/Dialect/LLVMIR/invalid.mlir Outdated Show resolved Hide resolved

mlir/test/Target/LLVMIR/llvmir.mlir Outdated Show resolved Hide resolved

gysit reviewed May 13, 2025

View reviewed changes

jeanPerier added 4 commits May 14, 2025 07:06

support ZeroAttr and UndefAttr translation to llvm::Constant

4c9f182

address test formatting comments

63b199b

refactor and restrict array of struct constant verification

c04ff07

update llvm.constant doc

d0861fd

gysit approved these changes May 14, 2025

View reviewed changes

Apply suggestions from code review

c6c49b6

Co-authored-by: Tobias Gysi <[email protected]>

format

5773350

jeanPerier mentioned this pull request May 16, 2025

[flang] translate derived type array init to attribute if possible #140268

Merged

jeanPerier merged commit 80816e7 into llvm:main May 20, 2025
11 checks passed

jeanPerier deleted the llvm_struct_init branch May 20, 2025 08:45

[mlir][LLVM] handle ArrayAttr for constant array of structs #139724

[mlir][LLVM] handle ArrayAttr for constant array of structs #139724

Uh oh!

Conversation

jeanPerier commented May 13, 2025

Uh oh!

llvmbot commented May 13, 2025

Uh oh!

llvmbot commented May 13, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gysit May 13, 2025

Choose a reason for hiding this comment

Uh oh!

jeanPerier May 14, 2025

Choose a reason for hiding this comment

Uh oh!

jeanPerier May 14, 2025

Choose a reason for hiding this comment

Uh oh!

gysit May 13, 2025

Choose a reason for hiding this comment

Uh oh!

jeanPerier May 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jeanPerier commented May 14, 2025

Uh oh!

gysit commented May 14, 2025

Uh oh!

jeanPerier commented May 14, 2025

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

gysit May 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gysit May 14, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented May 15, 2025 •

edited

Loading