-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[BOLT] Drop converting return profile to call cont #129477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
aaupov
merged 6 commits into
main
from
users/aaupov/spr/bolt-drop-converting-return-profile-to-call-cont
May 7, 2025
Merged
[BOLT] Drop converting return profile to call cont #129477
aaupov
merged 6 commits into
main
from
users/aaupov/spr/bolt-drop-converting-return-profile-to-call-cont
May 7, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Created using spr 1.3.4
@llvm/pr-subscribers-bolt Author: Amir Ayupov (aaupov) ChangesThe workaround was not implemented for BAT case, and it is no longer Test Plan: updated callcont-fallthru.s Full diff: https://github.com/llvm/llvm-project/pull/129477.diff 3 Files Affected:
diff --git a/bolt/include/bolt/Profile/DataAggregator.h b/bolt/include/bolt/Profile/DataAggregator.h
index 56eb463fc98fc..bfb9a97eb9dc7 100644
--- a/bolt/include/bolt/Profile/DataAggregator.h
+++ b/bolt/include/bolt/Profile/DataAggregator.h
@@ -197,10 +197,6 @@ class DataAggregator : public DataReader {
BoltAddressTranslation *BAT{nullptr};
- /// Whether pre-aggregated profile needs to convert branch profile into call
- /// to continuation fallthrough profile.
- bool NeedsConvertRetProfileToCallCont{false};
-
/// Update function execution profile with a recorded trace.
/// A trace is region of code executed between two LBR entries supplied in
/// execution order.
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index d20626bd5062f..8a0e54020f56b 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -720,23 +720,6 @@ bool DataAggregator::doBranch(uint64_t From, uint64_t To, uint64_t Count,
: isReturn(Func.disassembleInstructionAtOffset(Offset));
};
- // Returns whether \p Offset in \p Func may be a call continuation excluding
- // entry points and landing pads.
- auto checkCallCont = [&](const BinaryFunction &Func, const uint64_t Offset) {
- // No call continuation at a function start.
- if (!Offset)
- return false;
-
- // FIXME: support BAT case where the function might be in empty state
- // (split fragments declared non-simple).
- if (!Func.hasCFG())
- return false;
-
- // The offset should not be an entry point or a landing pad.
- const BinaryBasicBlock *ContBB = Func.getBasicBlockAtOffset(Offset);
- return ContBB && !ContBB->isEntryPoint() && !ContBB->isLandingPad();
- };
-
// Mutates \p Addr to an offset into the containing function, performing BAT
// offset translation and parent lookup.
//
@@ -749,35 +732,26 @@ bool DataAggregator::doBranch(uint64_t From, uint64_t To, uint64_t Count,
Addr -= Func->getAddress();
- bool IsRetOrCallCont =
- IsFrom ? checkReturn(*Func, Addr) : checkCallCont(*Func, Addr);
+ bool IsRet = IsFrom && checkReturn(*Func, Addr);
if (BAT)
Addr = BAT->translate(Func->getAddress(), Addr, IsFrom);
BinaryFunction *ParentFunc = getBATParentFunction(*Func);
if (!ParentFunc)
- return std::pair{Func, IsRetOrCallCont};
+ return std::pair{Func, IsRet};
if (IsFrom)
NumColdSamples += Count;
- return std::pair{ParentFunc, IsRetOrCallCont};
+ return std::pair{ParentFunc, IsRet};
};
- uint64_t ToOrig = To;
auto [FromFunc, IsReturn] = handleAddress(From, /*IsFrom*/ true);
- auto [ToFunc, IsCallCont] = handleAddress(To, /*IsFrom*/ false);
+ auto [ToFunc, _] = handleAddress(To, /*IsFrom*/ false);
if (!FromFunc && !ToFunc)
return false;
- // Record call to continuation trace.
- if (NeedsConvertRetProfileToCallCont && FromFunc != ToFunc &&
- (IsReturn || IsCallCont)) {
- LBREntry First{ToOrig - 1, ToOrig - 1, false};
- LBREntry Second{ToOrig, ToOrig, false};
- return doTrace(First, Second, Count);
- }
// Ignore returns.
if (IsReturn)
return true;
@@ -1229,13 +1203,10 @@ std::error_code DataAggregator::parseAggregatedLBREntry() {
// return profile into call to continuation fall-through.
auto Type = AggregatedLBREntry::BRANCH;
if (TypeOrErr.get() == "B") {
- NeedsConvertRetProfileToCallCont = true;
Type = AggregatedLBREntry::BRANCH;
} else if (TypeOrErr.get() == "F") {
- NeedsConvertRetProfileToCallCont = true;
Type = AggregatedLBREntry::FT;
} else if (TypeOrErr.get() == "f") {
- NeedsConvertRetProfileToCallCont = true;
Type = AggregatedLBREntry::FT_EXTERNAL_ORIGIN;
} else if (TypeOrErr.get() == "T") {
// Trace is expanded into B and [Ff]
diff --git a/bolt/test/X86/callcont-fallthru.s b/bolt/test/X86/callcont-fallthru.s
index 95cb4c5fc2df4..50eac3bcd73eb 100644
--- a/bolt/test/X86/callcont-fallthru.s
+++ b/bolt/test/X86/callcont-fallthru.s
@@ -4,30 +4,10 @@
# RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
## Link against a DSO to ensure PLT entries.
# RUN: %clangxx %cxxflags %s %t.so -o %t -Wl,-q -nostdlib
-# RUN: link_fdata %s %t %t.pa1 PREAGG1
-# RUN: link_fdata %s %t %t.pa2 PREAGG2
-# RUN: link_fdata %s %t %t.pa3 PREAGG3
# RUN: link_fdata %s %t %t.pat PREAGGT1
# RUN: link_fdata %s %t %t.pat2 PREAGGT2
-## Check normal case: fallthrough is not LP or secondary entry.
# RUN: llvm-strip --strip-unneeded %t -o %t.strip
-# RUN: llvm-objcopy --remove-section=.eh_frame %t.strip %t.noeh
-# RUN: llvm-bolt %t.strip --pa -p %t.pa1 -o %t.out \
-# RUN: --print-cfg --print-only=main | FileCheck %s
-
-## Check that getFallthroughsInTrace correctly handles a trace starting at plt
-## call continuation
-# RUN: llvm-bolt %t.strip --pa -p %t.pa2 -o %t.out2 \
-# RUN: --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK2
-
-## Check that we don't treat secondary entry points as call continuation sites.
-# RUN: llvm-bolt %t --pa -p %t.pa3 -o %t.out \
-# RUN: --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK3
-
-## Check fallthrough to a landing pad case.
-# RUN: llvm-bolt %t.strip --pa -p %t.pa3 -o %t.out \
-# RUN: --print-cfg --print-only=main | FileCheck %s --check-prefix=CHECK3
## Check pre-aggregated traces attach call continuation fallthrough count
# RUN: llvm-bolt %t.noeh --pa -p %t.pat -o %t.out \
@@ -67,7 +47,6 @@ main:
movq %rsi, -0x10(%rbp)
callq puts@PLT
## Target is an external-origin call continuation
-# PREAGG1: B X:0 #Ltmp1# 2 0
# PREAGGT1: T X:0 #Ltmp1# #Ltmp4_br# 2
# CHECK: callq puts@PLT
# CHECK-NEXT: count: 2
@@ -87,18 +66,15 @@ Ltmp4_br:
movl $0xa, -0x18(%rbp)
callq foo
## Target is a binary-local call continuation
-# PREAGG1: B #Lfoo_ret# #Ltmp3# 1 0
# PREAGGT1: T #Lfoo_ret# #Ltmp3# #Ltmp3_br# 1
# CHECK: callq foo
# CHECK-NEXT: count: 1
## PLT call continuation fallthrough spanning the call
-# PREAGG2: F #Ltmp1# #Ltmp3_br# 3
# CHECK2: callq foo
# CHECK2-NEXT: count: 3
## Target is a secondary entry point (unstripped) or a landing pad (stripped)
-# PREAGG3: B X:0 #Ltmp3# 2 0
# PREAGGT2: T X:0 #Ltmp3# #Ltmp3_br# 2
# CHECK3: callq foo
# CHECK3-NEXT: count: 0
|
Created using spr 1.3.4
Created using spr 1.3.4
maksfb
approved these changes
May 7, 2025
GeorgeARM
pushed a commit
to GeorgeARM/llvm-project
that referenced
this pull request
May 7, 2025
The workaround was not implemented for BAT case, and it is no longer needed with pre-aggregated traces, alternatively, the effect can be achieved with `infer-fall-throughs` with old pre-aggregated format (branches + ranges). Test Plan: updated callcont-fallthru.s
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The workaround was not implemented for BAT case, and it is no longer
needed with pre-aggregated traces, alternatively, the effect can be
achieved with
infer-fall-throughs
with old pre-aggregated format(branches + ranges).
Test Plan: updated callcont-fallthru.s