-
Notifications
You must be signed in to change notification settings - Fork 786
Always inline trivial calls that always shrink #7669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
A "trivial call" is a function that just calls another function. (func $foo ... (call $target ...)) Currently we inline these functions always only when not optimizing for code size. When optimizing for code size, these functions can always be inlined when 1. The arguments to `$target` are all function argument locals. 2. The locals are not used more than once 3. The locals are used in the same order they appear in the function arguments. When these hold, inlining `$foo` never increases code size as it doesn't cause introducing more locals at call sites. Improve `FunctionInfo` type and `FunctionInfoScanner` to annotate functions with "trivial call" information that also contains whether inlining shrinks code size. If a function shrinks when inlined always inline it even with `-Os`. Otherwise inline it as before, i.e. when not optimizing for code size.
;; CHECK-NEXT: (local.get $17) | ||
;; CHECK-NEXT: ) | ||
;; CHECK-NEXT: ) | ||
;; CHECK-NEXT: ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little bit verbose. Should we run a simple pass to clean this up? (eliminate blocks and locals)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have --inlining-optimizing
for that, and that is what the normal pass pipeline uses. --inlining
without optimizing is mainly useful for debugging and tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. --inlining-optimizing
simplifies the wat in this test quite a bit. Should I use it in these tests, or do we prefer testing smaller units of code with just --inlining
?
src/passes/Inlining.cpp
Outdated
// Whether a function just calls another function with only `local.get`s as | ||
// arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Whether a function just calls another function with only `local.get`s as | |
// arguments. | |
// Whether a function just calls another function with only trivial | |
// (`local.get` or constant) arguments. |
I think handling constants is important too, and the old code did so (by the heuristic that not having children - that's not precise, but it's probably good enough?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that increase code size when you have more than one call site, by duplicating the constants at the call sites?
Where did the old code do this? I'm trying
(module
(type $0 (func (param i32 i32 i32)))
(type $1 (func))
(type $2 (func))
(import "env" "foo" (func $imported-foo (type $0) (param i32 i32 i32)))
(func $call-foo (type $1)
(call $imported-foo
(i32.const 1)
(i32.const 2)
(i32.const 3)))
(func $main (type $2)
(call $call-foo)
(call $call-foo)
(call $call-foo)))
with wasm-opt -all --inlining -S -o - < test.wat
but it doesn't inline call-foo
. (trying with main
branch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see the old code where we check the size now. However I'm unable to make wasm-opt inline function calls with const arguments, using the main
branch. I'll investigate.
I think the point about binary size increase still applies though..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured it out, it actually works the same way as before. I updated the comments to clarify, and added a test. PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unresolving as I don't follow. Are you saying we did not inline constants before, that is, the old comment about us doing so was wrong? If so, where was the bug, as we measured size, and the size of a const is 1, same as local.get
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old code that made us inline const arguments is still there and running as before, on lines 268-271 on the new file. I added a test to check const argument inlining. It works the same way before and after this PR.
The way it works in new code is that if we know that the code definitely shrinks when inlined, we have an early return. If we don't know that, then the old code that checks the size still runs as before and marks the function as MayNotShrink
. In which case we only inline with -O3
as before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds good. In that case, the comments on new lines 77-83 can be updated to reflect that not only local.gets
are allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and also where this comment is, 70-71.
} | ||
// Trivial calls are already handled. Inline if | ||
// 1. The function doesn't have calls, and | ||
// 2. The function doesn't have loops, or we allow inlining with loops. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworded the old comment block here but it just repeats the single line of code below, so I think we can also just drop this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your new comment is useful. Let's keep it.
@kripken It looks like this revealed an existing undefined behavior in this line: binaryen/src/passes/OptimizeInstructions.cpp Line 3612 in 8470f1b
Apparently Lines 267 to 270 in 8470f1b
Is this a blocker? It looks like an existing issue that |
Currently a "trivial call" is a function that just calls another function.
Where the arguments to
$target
are all flat instructions likelocal.get
, orconsts.
Currently we inline these functions always only when not optimizing for code
size.
When optimizing for code size, these functions can always be inlined when
$target
are all function argument locals.$foo
's signature.When these hold, inlining
$foo
never increases code size as it doesn't causeintroducing more locals (or
drop
s etc.) at the call sites.$foo
above when these hold looks like this:Update
FunctionInfo
type andFunctionInfoScanner
to annotate functions withmore detailed "trivial call" information that also contains whether inlining
shrinks code size.
If a function shrinks when inlined always inline it even with
-Os
.Otherwise inline it as before, i.e. when not optimizing for code size.