-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[Flang] Turn on alias analysis for locally allocated objects #139682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Flang] Turn on alias analysis for locally allocated objects #139682
Conversation
Previously, a bug in the MemCptOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects for Fortran code. However, the bug has now been fixed ( llvm#129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis.
@llvm/pr-subscribers-flang-fir-hlfir Author: Dominik Adamski (DominikAdamski) ChangesPreviously, a bug in the MemCptOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects for Fortran code. However, the bug has now been fixed ( #129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis. Patch is 24.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139682.diff 5 Files Affected:
diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
index 66b4b84998801..5cfbdc33285f9 100644
--- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
+++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
@@ -43,13 +43,10 @@ static llvm::cl::opt<bool>
static llvm::cl::opt<bool>
enableDirect("direct-tbaa", llvm::cl::init(true), llvm::cl::Hidden,
llvm::cl::desc("Add TBAA tags to direct variables"));
-// This is **known unsafe** (misscompare in spec2017/wrf_r). It should
-// not be enabled by default.
-// The code is kept so that these may be tried with new benchmarks to see if
-// this is worth fixing in the future.
-static llvm::cl::opt<bool> enableLocalAllocs(
- "local-alloc-tbaa", llvm::cl::init(false), llvm::cl::Hidden,
- llvm::cl::desc("Add TBAA tags to local allocations. UNSAFE."));
+static llvm::cl::opt<bool>
+ enableLocalAllocs("local-alloc-tbaa", llvm::cl::init(true),
+ llvm::cl::Hidden,
+ llvm::cl::desc("Add TBAA tags to local allocations."));
namespace {
diff --git a/flang/test/Fir/tbaa-codegen2.fir b/flang/test/Fir/tbaa-codegen2.fir
index 8f8b6a29129e7..e4bfa9087ec75 100644
--- a/flang/test/Fir/tbaa-codegen2.fir
+++ b/flang/test/Fir/tbaa-codegen2.fir
@@ -100,7 +100,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ
// [...]
// CHECK: %[[VAL50:.*]] = getelementptr i32, ptr %{{.*}}, i64 %{{.*}}
// store to the temporary:
-// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[DATA_ACCESS_TAG:.*]]
+// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[TMP_DATA_ACCESS_TAG:.*]]
// [...]
// CHECK: [[BOX_ACCESS_TAG]] = !{![[BOX_ACCESS_TYPE:.*]], ![[BOX_ACCESS_TYPE]], i64 0}
@@ -111,4 +111,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ
// CHECK: ![[A_ACCESS_TYPE]] = !{!"dummy arg data/_QFfuncEa", ![[ARG_ACCESS_TYPE:.*]], i64 0}
// CHECK: ![[ARG_ACCESS_TYPE]] = !{!"dummy arg data", ![[DATA_ACCESS_TYPE:.*]], i64 0}
// CHECK: ![[DATA_ACCESS_TYPE]] = !{!"any data access", ![[ANY_ACCESS_TYPE]], i64 0}
-// CHECK: ![[DATA_ACCESS_TAG]] = !{![[DATA_ACCESS_TYPE]], ![[DATA_ACCESS_TYPE]], i64 0}
+// CHECK: ![[TMP_DATA_ACCESS_TAG]] = !{![[TMP_DATA_ACCESS_TYPE:.*]], ![[TMP_DATA_ACCESS_TYPE]], i64 0}
+// CHECK: ![[TMP_DATA_ACCESS_TYPE]] = !{!"allocated data/", ![[TMP_ACCESS_TYPE:.*]], i64 0}
+// CHECK: ![[TMP_ACCESS_TYPE]] = !{!"allocated data", ![[TARGET_ACCESS_TAG:.*]], i64 0}
+// CHECK: ![[TARGET_ACCESS_TAG]] = !{!"target data", ![[DATA_ACCESS_TYPE]], i64 0}
diff --git a/flang/test/Transforms/tbaa-with-dummy-scope2.fir b/flang/test/Transforms/tbaa-with-dummy-scope2.fir
index c8f419fbee652..249471de458d3 100644
--- a/flang/test/Transforms/tbaa-with-dummy-scope2.fir
+++ b/flang/test/Transforms/tbaa-with-dummy-scope2.fir
@@ -43,12 +43,15 @@ func.func @_QPtest1() attributes {noinline} {
// CHECK: #[[$ATTR_0:.+]] = #llvm.tbaa_root<id = "Flang function root _QPtest1">
// CHECK: #[[$ATTR_1:.+]] = #llvm.tbaa_type_desc<id = "any access", members = {<#[[$ATTR_0]], 0>}>
// CHECK: #[[$ATTR_2:.+]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[$ATTR_1]], 0>}>
+// CHECK: #[[$TARGETDATA:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[$ATTR_2]], 0>}>
// CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data", members = {<#[[$ATTR_2]], 0>}>
-// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[$ATTR_2]], 0>}>
+// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc<id = "allocated data", members = {<#[[$TARGETDATA]], 0>}>
// CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data/_QFtest1FinnerEx", members = {<#[[$ATTR_3]], 0>}>
-// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[TARGETDATA]], 0>}>
+// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[$TARGETDATA]], 0>}>
// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_5]], access_type = #[[$ATTR_5]], offset = 0>
+// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc<id = "allocated data/_QFtest1FinnerEy", members = {<#[[$LOCAL_ATTR_0]], 0>}>
// CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmEglob", members = {<#[[$ATTR_4]], 0>}>
+// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag<base_type = #[[$LOCAL_ATTR_1]], access_type = #[[$LOCAL_ATTR_1]], offset = 0>
// CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_6]], access_type = #[[$ATTR_6]], offset = 0>
// CHECK-LABEL: func.func @_QPtest1() attributes {noinline} {
// CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest1FinnerEy"}
@@ -57,8 +60,8 @@ func.func @_QPtest1() attributes {noinline} {
// CHECK: %[[VAL_5:.*]] = fir.dummy_scope : !fir.dscope
// CHECK: %[[VAL_6:.*]] = fir.declare %[[VAL_4]] dummy_scope %[[VAL_5]] {uniq_name = "_QFtest1FinnerEx"} : (!fir.ref<i32>, !fir.dscope) -> !fir.ref<i32>
// CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest1FinnerEy"} : (!fir.ref<i32>) -> !fir.ref<i32>
-// CHECK: fir.store %{{.*}} to %[[VAL_7]] : !fir.ref<i32>
-// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] : !fir.ref<i32>
+// CHECK: fir.store %{{.*}} to %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref<i32>
+// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref<i32>
// CHECK: fir.store %[[VAL_8]] to %[[VAL_6]] {tbaa = [#[[$ATTR_7]]]} : !fir.ref<i32>
// CHECK: fir.store %{{.*}} to %[[VAL_4]] {tbaa = [#[[$ATTR_8]]]} : !fir.ref<i32>
@@ -87,12 +90,16 @@ func.func @_QPtest2() attributes {noinline} {
// CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc<id = "any access", members = {<#[[$ATTR_1]], 0>}>
// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[$ATTR_2]], 0>}>
// CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[$ATTR_3]], 0>}>
+// CHECK: #[[$TARGETDATA_0:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[$ATTR_4]], 0>}>
// CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data", members = {<#[[$ATTR_4]], 0>}>
-// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[$ATTR_5]], 0>}>
+// CHECK: #[[$TARGETDATA_1:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[$ATTR_5]], 0>}>
+// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc<id = "allocated data", members = {<#[[$TARGETDATA_0]], 0>}>
// CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data/_QFtest2FinnerEx", members = {<#[[$ATTR_6]], 0>}>
-// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[TARGETDATA]], 0>}>
+// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[$TARGETDATA_1]], 0>}>
// CHECK: #[[$ATTR_10:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_8]], access_type = #[[$ATTR_8]], offset = 0>
+// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc<id = "allocated data/_QFtest2FinnerEy", members = {<#[[$LOCAL_ATTR_0]], 0>}>
// CHECK: #[[$ATTR_9:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmEglob", members = {<#[[$ATTR_7]], 0>}>
+// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag<base_type = #[[$LOCAL_ATTR_1]], access_type = #[[$LOCAL_ATTR_1]], offset = 0>
// CHECK: #[[$ATTR_11:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_9]], access_type = #[[$ATTR_9]], offset = 0>
// CHECK-LABEL: func.func @_QPtest2() attributes {noinline} {
// CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest2FinnerEy"}
@@ -102,7 +109,7 @@ func.func @_QPtest2() attributes {noinline} {
// CHECK: %[[VAL_6:.*]] = fir.dummy_scope : !fir.dscope
// CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_5]] dummy_scope %[[VAL_6]] {uniq_name = "_QFtest2FinnerEx"} : (!fir.ref<i32>, !fir.dscope) -> !fir.ref<i32>
// CHECK: %[[VAL_8:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest2FinnerEy"} : (!fir.ref<i32>) -> !fir.ref<i32>
-// CHECK: fir.store %{{.*}} to %[[VAL_8]] : !fir.ref<i32>
-// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] : !fir.ref<i32>
+// CHECK: fir.store %{{.*}} to %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref<i32>
+// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref<i32>
// CHECK: fir.store %[[VAL_9]] to %[[VAL_7]] {tbaa = [#[[$ATTR_10]]]} : !fir.ref<i32>
// CHECK: fir.store %{{.*}} to %[[VAL_5]] {tbaa = [#[[$ATTR_11]]]} : !fir.ref<i32>
diff --git a/flang/test/Transforms/tbaa2.fir b/flang/test/Transforms/tbaa2.fir
index 4678a1cd4a686..1429d0b420766 100644
--- a/flang/test/Transforms/tbaa2.fir
+++ b/flang/test/Transforms/tbaa2.fir
@@ -50,6 +50,7 @@
// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[ANY_DATA]], 0>}>
// CHECK: #[[ANY_ARG:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data", members = {<#[[ANY_DATA]], 0>}>
// CHECK: #[[ANY_GLBL:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[TARGETDATA]], 0>}>
+// CHECK: #[[ANY_LOCAL:.+]] = #llvm.tbaa_type_desc<id = "allocated data", members = {<#[[TARGETDATA]], 0>}>
// CHECK: #[[ARG_LOW:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data/_QMmodFcalleeElow", members = {<#[[ANY_ARG]], 0>}>
// CHECK: #[[ANY_DIRECT:.+]] = #llvm.tbaa_type_desc<id = "direct data", members = {<#[[TARGETDATA]], 0>}>
// CHECK: #[[ARG_Z:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data/_QMmodFcalleeEz", members = {<#[[ANY_ARG]], 0>}>
@@ -61,21 +62,31 @@
// CHECK: #[[GLBL_ZSTART:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmodEzstart", members = {<#[[ANY_GLBL]], 0>}>
// CHECK: #[[GLBL_ZSTOP:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmodEzstop", members = {<#[[ANY_GLBL]], 0>}>
+// CHECK: #[[LOCAL1_ALLOC:.+]] = #llvm.tbaa_type_desc<id = "allocated data/_QMmodFcalleeEk", members = {<#[[ANY_LOCAL]], 0>}>
// CHECK: #[[GLBL_YSTART:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmodEystart", members = {<#[[ANY_GLBL]], 0>}>
// CHECK: #[[GLBL_YSTOP:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmodEystop", members = {<#[[ANY_GLBL]], 0>}>
+// CHECK: #[[LOCAL2_ALLOC:.+]] = #llvm.tbaa_type_desc<id = "allocated data/_QMmodFcalleeEj", members = {<#[[ANY_LOCAL]], 0>}>
// CHECK: #[[GLBL_XSTART:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmodExstart", members = {<#[[ANY_GLBL]], 0>}>
+// CHECK: #[[LOCAL3_ALLOC:.+]] = #llvm.tbaa_type_desc<id = "allocated data/_QMmodFcalleeEi", members = {<#[[ANY_LOCAL]], 0>}>
+// CHECK: #[[LOCAL4_ALLOC:.+]] = #llvm.tbaa_type_desc<id = "allocated data/_QMmodFcalleeEdxold", members = {<#[[ANY_LOCAL]], 0>}>
// CHECK: #[[DIRECT_A:.+]] = #llvm.tbaa_type_desc<id = "direct data/_QMmodEa", members = {<#[[ANY_DIRECT]], 0>}>
// CHECK: #[[DIRECT_B:.+]] = #llvm.tbaa_type_desc<id = "direct data/_QMmodEb", members = {<#[[ANY_DIRECT]], 0>}>
// CHECK: #[[GLBL_DYINV:.+]] = #llvm.tbaa_type_desc<id = "global data/_QMmodEdyinv", members = {<#[[ANY_GLBL]], 0>}>
+// CHECK: #[[LOCAL5_ALLOC:.+]] = #llvm.tbaa_type_desc<id = "allocated data/_QMmodFcalleeEdzinv", members = {<#[[ANY_LOCAL]], 0>}>
// CHECK: #[[GLBL_ZSTART_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[GLBL_ZSTART]], access_type = #[[GLBL_ZSTART]], offset = 0>
// CHECK: #[[GLBL_ZSTOP_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[GLBL_ZSTOP]], access_type = #[[GLBL_ZSTOP]], offset = 0>
+// CHECK: #[[LOCAL1_ALLOC_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[LOCAL1_ALLOC]], access_type = #[[LOCAL1_ALLOC]], offset = 0>
// CHECK: #[[GLBL_YSTART_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[GLBL_YSTART]], access_type = #[[GLBL_YSTART]], offset = 0>
// CHECK: #[[GLBL_YSTOP_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[GLBL_YSTOP]], access_type = #[[GLBL_YSTOP]], offset = 0>
+// CHECK: #[[LOCAL2_ALLOC_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[LOCAL2_ALLOC]], access_type = #[[LOCAL2_ALLOC]], offset = 0>
// CHECK: #[[GLBL_XSTART_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[GLBL_XSTART]], access_type = #[[GLBL_XSTART]], offset = 0>
+// CHECK: #[[LOCAL3_ALLOC_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[LOCAL3_ALLOC]], access_type = #[[LOCAL3_ALLOC]], offset = 0>
+// CHECK: #[[LOCAL4_ALLOC_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[LOCAL4_ALLOC]], access_type = #[[LOCAL4_ALLOC]], offset = 0>
// CHECK: #[[DIRECT_A_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[DIRECT_A]], access_type = #[[DIRECT_A]], offset = 0>
// CHECK: #[[DIRECT_B_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[DIRECT_B]], access_type = #[[DIRECT_B]], offset = 0>
// CHECK: #[[GLBL_DYINV_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[GLBL_DYINV]], access_type = #[[GLBL_DYINV]], offset = 0>
+// CHECK: #[[LOCAL5_ALLOC_TAG:.+]] = #llvm.tbaa_tag<base_type = #[[LOCAL5_ALLOC]], access_type = #[[LOCAL5_ALLOC]], offset = 0>
func.func @_QMmodPcallee(%arg0: !fir.box<!fir.array<?x?x?xf32>> {fir.bindc_name = "z"}, %arg1: !fir.box<!fir.array<?x?x?xf32>> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?xf32>>>> {fir.bindc_name = "low"}) {
%c2 = arith.constant 2 : index
@@ -277,7 +288,7 @@
// CHECK: %[[VAL_44:.*]] = fir.convert %[[VAL_43]] : (i32) -> index
// CHECK: %[[VAL_45:.*]] = fir.convert %[[VAL_42]] : (index) -> i32
// CHECK: %[[VAL_46:.*]]:2 = fir.do_loop %[[VAL_47:.*]] = %[[VAL_42]] to %[[VAL_44]] step %[[VAL_5]] iter_args(%[[VAL_48:.*]] = %[[VAL_45]]) -> (index, i32) {
-// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] : !fir.ref<i32>
+// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_49:.*]] = fir.load %[[VAL_18]] {tbaa = [#[[GLBL_YSTART_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_50:.*]] = arith.addi %[[VAL_49]], %[[VAL_6]] : i32
// CHECK: %[[VAL_51:.*]] = fir.convert %[[VAL_50]] : (i32) -> index
@@ -285,24 +296,20 @@
// CHECK: %[[VAL_53:.*]] = fir.convert %[[VAL_52]] : (i32) -> index
// CHECK: %[[VAL_54:.*]] = fir.convert %[[VAL_51]] : (index) -> i32
// CHECK: %[[VAL_55:.*]]:2 = fir.do_loop %[[VAL_56:.*]] = %[[VAL_51]] to %[[VAL_53]] step %[[VAL_5]] iter_args(%[[VAL_57:.*]] = %[[VAL_54]]) -> (index, i32) {
-// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] : !fir.ref<i32>
+// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_58:.*]] = fir.load %[[VAL_16]] {tbaa = [#[[GLBL_XSTART_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_59:.*]] = arith.addi %[[VAL_58]], %[[VAL_6]] : i32
// CHECK: %[[VAL_60:.*]] = fir.convert %[[VAL_59]] : (i32) -> index
// CHECK: %[[VAL_61:.*]] = fir.convert %[[VAL_60]] : (index) -> i32
// CHECK: %[[VAL_62:.*]]:2 = fir.do_loop %[[VAL_63:.*]] = %[[VAL_60]] to %[[VAL_4]] step %[[VAL_5]] iter_args(%[[VAL_64:.*]] = %[[VAL_61]]) -> (index, i32) {
-// TODO: local allocation assumed to always alias
-// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] : !fir.ref<i32>
+// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref<i32>
// load from box tagged in CodeGen
// CHECK: %[[VAL_65:.*]] = fir.load %[[VAL_35]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?xf32>>>>
-// TODO: local allocation assumed to always alias
-// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] : !fir.ref<i32>
+// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_67:.*]] = fir.convert %[[VAL_66]] : (i32) -> i64
-// TODO: local allocation assumed to always alias
-// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] : !fir.ref<i32>
+// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_69:.*]] = fir.convert %[[VAL_68]] : (i32) -> i64
-// TODO: local allocation assumed to always alias
-// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] : !fir.ref<i32>
+// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_71:.*]] = fir.convert %[[VAL_70]] : (i32) -> i64
// CHECK: %[[VAL_72:.*]] = fir.box_addr %[[VAL_65]] : (!fir.box<!fir.heap<!fir.array<?x?x?xf32>>>) -> !fir.heap<!fir.array<?x?x?xf32>>
// CHECK: %[[VAL_73:.*]]:3 = fir.box_dims %[[VAL_65]], %[[VAL_4]] : (!fir.box<!fir.heap<!fir.array<?x?x?xf32>>>, index) -> (index, index, index)
@@ -311,11 +318,10 @@
// CHECK: %[[VAL_76:.*]] = fir.shape_shift %[[VAL_73]]#0, %[[VAL_73]]#1, %[[VAL_74]]#0, %[[VAL_74]]#1, %[[VAL_75]]#0, %[[VAL_75]]#1 : (index, index, index, index, index, index) -> !fir.shapeshift<3>
// CHECK: %[[VAL_77:.*]] = fir.array_coor %[[VAL_72]](%[[VAL_76]]) %[[VAL_67]], %[[VAL_69]], %[[VAL_71]] : (!fir.heap<!fir.array<?x?x?xf32>>, !fir.shapeshift<3>, i64, i64, i64) -> !fir.ref<f32>
// CHECK: %[[VAL_78:.*]] = fir.load %[[VAL_77]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref<f32>
-// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] : !fir.ref<f32>
+// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] {tbaa = [#[[LOCAL4_ALLOC_TAG]]]} : !fir.ref<f32>
// load from box tagged in CodeGen
// CHECK: %[[VAL_79:.*]] = fir.load %[[VAL_8]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
-// TODO: local allocation assumed to always alias
-// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] : !fir.ref<i32>
+// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_81:.*]] = fir.convert %[[VAL_80]] : (i32) -> i64
// CHECK: %[[VAL_82:.*]] = fir.box_addr %[[VAL_79]] : (!fir.box<!fir.heap<!fir.array<?xf32>>>) -> !fir.heap<!fir.array<?xf32>>
// CHECK: %[[VAL_83:.*]]:3 = fir.box_dims %[[VAL_79]], %[[VAL_4]] : (!fir.box<!fir.heap<!fir.array<?xf32>>>, index) -> (index, index, index)
@@ -324,11 +330,9 @@
// CHECK: %[[VAL_86:.*]] = fir.load %[[VAL_85]] {tbaa = [#[[DIRECT_A_TAG]]]} : !fir.ref<f32>
// load from box
// CHECK: %[[VAL_87:.*]] = fir.load %[[VAL_35]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?xf32>>>>
-// load from local allocation
-// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] : !fir.ref<i32>
+// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_89:.*]] = fir.convert %[[VAL_88]] : (i32) -> i64
-// load from local allocation
-// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] : !fir.ref<i32>
+// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref<i32>
// CHECK: %[[VAL_91:.*]] = fir.convert %[[VAL_90]] : (i32) -> i64
// CHECK: %[[VAL_92:.*]] = fir.box_addr %[[VAL_87]] : (!fir.box<!fir.heap<!fir.array<?x?x?xf32>>>) -> !fir.heap<!fir.array<?x?x?xf32>>
// CHECK: %[[VAL_93:.*]]:3 = fir.box_dims %[[VAL_87]], %[[VAL_4]] : (!fir.box<!fir.heap<!fir.array<?x?x?xf32>>>, index) -> (index, index, index)
@@ -363,8 +367,7 @@
// CHECK: %[[VAL_121:.*]] = fir.load %[[VAL_120]] {tbaa = [#[[ARG_Y_TAG]]]} : !fir.ref<f32>
// CHECK: %[[VAL_122:.*]] = arith.subf %[[VAL_119]], %[[VAL_121]] fastmath<contract> : f32
// CHECK: %[[VAL_123:.*]] = fir.no_reassoc %[[VAL_122]] : f32
-// load from local allocation
-// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] : !fir.ref<f32>
+// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] {tbaa = [#[[LOCAL5_ALLOC_TAG]]]} : !fir.ref<f32>
// CHECK: %[[VAL_125:.*]] = arith.mulf %[[VAL_123]], %[[VAL_124]] fastmath<contract> : f32
// CHECK: %[[VAL_126:.*]] = arith.addf %[[VAL_115]], %[[VAL_125]] fastmath<contract> : f32
// CHECK: %[[VAL_127:.*]] = fir.no_reassoc %[[VAL_126]] : f32
@@ -373,30 +376,24 @@
// CHECK: fir.store %[[VAL_129]] to %[[VAL_97]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref...
[truncated]
|
FWIW I tried it out and this improves the performance of the atmosphere kernel from https://github.com/E3SM-Project/codesign-kernels by well over 30%. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM. I tried SPEC 2017 ref and speed, plus some HPC applications and didn't see any incorrect results.
Thank you for fixing this longstanding bug Dominik. Please wait for approval from Slava before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the resolving the bug and enabling it by default!
Previously, a bug in the MemCptOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects for Fortran code.
However, the bug has now been fixed ( #129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis.