Skip to content

Remove prefer-inline for non-trivial Matrix functions #347

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 16, 2025
Merged

Conversation

kevmoo
Copy link
Collaborator

@kevmoo kevmoo commented May 13, 2025

No description provided.

@kevmoo kevmoo requested a review from rakudrama May 13, 2025 23:20
@kevmoo
Copy link
Collaborator Author

kevmoo commented May 13, 2025

CC @osa1 @mkustermann @mraleph

@coveralls
Copy link

coveralls commented May 13, 2025

Coverage Status

coverage: 26.388%. remained the same
when pulling 62f5681 on drop_inlining
into 0279cb8 on master.

osa1
osa1 previously requested changes May 14, 2025
Copy link
Member

@osa1 osa1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? These were added deliberately, see discussions in the PR that added them.

Edit: I saw the other ping. It's OK as long as we benchmark and measure. See my PR descriptions.

(Btw, my PR description should've been the commit message as I asked here #345 (comment) so that anyone who looks at the changes in their editor can see why they were made)

Copy link

@mkustermann mkustermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's important is that the old methods that do type checks e.g. translate, scale ... are inlined because often the type check code will then disappear and it will call the typed method.

What may be good (as mentioned in comments) to ensure if a public API functions delgates to another and passes constants, that those constants get folded away.

Apart from that, I agree that we don't have to force-inling the big implementation methods.

So generally lgtm

@kevmoo
Copy link
Collaborator Author

kevmoo commented May 14, 2025

See what I did here. I think this is better @mkustermann

@kevmoo
Copy link
Collaborator Author

kevmoo commented May 14, 2025

dart benchmark/matrix_bench.dart
MatrixMultiply(RunTime): 40.000315999447004 us.
SIMDMatrixMultiply(RunTime): 6.961272596818508 us.
VectorTransform(RunTime): 8.16805603992995 us.
SIMDVectorTransform(RunTime): 8.74327 us.
setViewMatrix(RunTime): 29.379677721933668 us.
aabb2Transform(RunTime): 61.48002828941299 us.
aabb2Rotate(RunTime): 44.13239467605324 us.
aabb3Transform(RunTime): 85.96119324181626 us.
aabb3Rotate(RunTime): 68.89850272545367 us.
Matrix3.determinant(RunTime): 2.73402125 us.
Matrix3.transform(Vector3)(RunTime): 194.45868089233753 us.
Matrix3.transform(Vector2)(RunTime): 188.2590231555264 us.
Matrix3.transposeMultiply(RunTime): 20.01067615658363 us.
Matrix4.translateByDoubleGeneric(RunTime): 5.354185 us.
Matrix4.translateByVector3Generic(RunTime): 5.305775 us.
Matrix4.translateByVector4Generic(RunTime): 6.551848758340293 us.
Matrix4.translateByDouble(RunTime): 5.4222275 us.
Matrix4.translateByVector3(RunTime): 5.401225 us.
Matrix4.translateByVector4(RunTime): 6.754482552414257 us.

➜  github/vector_math.dart git:(master) git checkout drop_inlining
Switched to branch 'drop_inlining'
Your branch is up to date with 'origin/drop_inlining'.

➜  github/vector_math.dart git:(drop_inlining) dart benchmark/matrix_bench.dart
MatrixMultiply(RunTime): 39.72650247862066 us.
SIMDMatrixMultiply(RunTime): 6.967740572583712 us.
VectorTransform(RunTime): 8.178169777287778 us.
SIMDVectorTransform(RunTime): 8.847343 us.
setViewMatrix(RunTime): 28.488177070937574 us.
aabb2Transform(RunTime): 61.502875065342394 us.
aabb2Rotate(RunTime): 44.00050023924486 us.
aabb3Transform(RunTime): 85.97221272229822 us.
aabb3Rotate(RunTime): 68.8850263716693 us.
Matrix3.determinant(RunTime): 2.73960125 us.
Matrix3.transform(Vector3)(RunTime): 194.14122211445198 us.
Matrix3.transform(Vector2)(RunTime): 188.34937494125387 us.
Matrix3.transposeMultiply(RunTime): 19.794754545897725 us.
Matrix4.translateByDoubleGeneric(RunTime): 5.305345 us.
Matrix4.translateByVector3Generic(RunTime): 5.306645 us.
Matrix4.translateByVector4Generic(RunTime): 6.560779738245589 us.
Matrix4.translateByDouble(RunTime): 5.3227675 us.
Matrix4.translateByVector3(RunTime): 5.3535375 us.
Matrix4.translateByVector4(RunTime): 6.567939472136188 us.

On the VM at least things are about the same, if not a bit faster with the new bits.

@@ -20,7 +20,7 @@ class Quad {
/// The third point of the quad.
Vector3 get point2 => _point2;

/// The third point of the quad.
/// The fourth point of the quad.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a missed typing thing here...

@pragma('vm:prefer-inline')
@pragma('dart2js:prefer-inline')
void translateByDouble(double tx, double ty, double tz, double tw) {
void translateByDouble(double tx, double ty, double tz, [double tw = 1.0]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional parameters compliate calling conventions, the function now needs to check every time whether tw was called or not. It's a good idea to avoid these in performance sensitive code, though I have no evidence that it matters in this particular case. We should benchmark and measure.

You can observe the difference by compiling to Wasm before and after and checking the code for this function. Make sure to call it with and without the optional parameter otherwise TFA will optimize it into a version that always takes tw or never takes it.

Alternatively you can inline it always, in which case you get optimized code with no "is tw passed" checks, as in each call site you'll either pass it or not, and generate the code based on that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I drop em? Please give me code 😄

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't I give you the code already in first round of review?

This usage of optional parameter isn't going to improve things - most likely the compiler will just make the call site implicitly pass 1.0 where before your changes this was done explicitly.

So it suffers from the same problem I outlined before, that even though a parameter is 1.0 constant, we don't inline, so we do unnecessary extra work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to revert this change then.

I'm confusing about the private function then.

@osa1
Copy link
Member

osa1 commented May 14, 2025

On the VM at least things are about the same, if not a bit faster with the new bits.

JIT mode doesn't matter AFAIK as we deploy in this mode. We care about AOT compiled, Wasm, and JS.

@kevmoo
Copy link
Collaborator Author

kevmoo commented May 14, 2025

On the VM at least things are about the same, if not a bit faster with the new bits.

JIT mode doesn't matter AFAIK as we deploy in this mode. We care about AOT compiled, Wasm, and JS.

We REALLY need a way to run benchmarks across all of our platforms easily...

@kevmoo kevmoo dismissed osa1’s stale review May 16, 2025 21:18

Addressed w/ discussion from Martin

Copy link

Package publishing

Package Version Status Publish tag (post-merge)
package:vector_math 2.2.0 ready to publish v2.2.0

Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.

@kevmoo kevmoo removed the request for review from rakudrama May 16, 2025 21:32
@kevmoo kevmoo merged commit 13f185f into master May 16, 2025
6 checks passed
@kevmoo kevmoo deleted the drop_inlining branch May 16, 2025 21:33
copybara-service bot pushed a commit to dart-lang/sdk that referenced this pull request May 17, 2025
Revisions updated by `dart tools/rev_sdk_deps.dart`.

dartdoc (https://github.com/dart-lang/dartdoc/compare/95f4208..e38f392):
  e38f3921  2025-05-16  Sigurd Meldgaard  Remove analyzer_use_new_elements from analysis_options.yaml (dart-lang/dartdoc#4050)

protobuf (https://github.com/dart-lang/protobuf/compare/9bd149b..b7753f6):
  b7753f6  2025-05-16  Devon Carew  rev package:protobuf version (google/protobuf.dart#994)
  b5d20ff  2025-05-16  Ömer Sinan Ağacan  Update protobuf changelog with `#981` (google/protobuf.dart#995)

test (https://github.com/dart-lang/test/compare/55d1f9e..b9c59ea):
  b9c59ea0  2025-05-13  Liam Appelbe  Set a debug name for test isolates (dart-lang/test#2494)
  a1e295b4  2025-05-13  Liam Appelbe  Fix CI
  3c3878af  2025-05-13  Liam Appelbe  Include the test URI in the debug name
  90e64ec2  2025-05-13  Nate Bosch  Bump version
  d67c897b  2025-05-13  Liam Appelbe  Merge branch 'master' into isolate_debug_name
  e6d4877e  2025-05-12  Jacob MacDonald  release test packages (dart-lang/test#2495)
  4097e1be  2025-05-07  Liam Appelbe  revert workflow debugging
  7800c010  2025-05-07  Liam Appelbe  fmt
  455483b5  2025-05-07  Liam Appelbe  changelog
  c9b5b6fa  2025-05-07  Liam Appelbe  fmt
  39c4b31d  2025-05-07  Liam Appelbe  Set a debug name for test isolates

vector_math (https://github.com/google/vector_math.dart/compare/0279cb8..13f185f):
  13f185f  2025-05-16  Kevin Moore  Remove prefer-inline for non-trivial Matrix functions (google/vector_math.dart#347)

webdev (https://github.com/dart-lang/webdev/compare/1ea8462..5dbb30e):
  5dbb30eb  2025-05-12  Jessy Yameogo  Support hot reload over websocket (dart-lang/webdev#2616)

Change-Id: I85b001857d0864bd50390d82aa142938a4c530d4
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/428927
Commit-Queue: Devon Carew <[email protected]>
Reviewed-by: Kevin Moore <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants