Skip to content

8287788: Implement a better allocator for downcalls #23142

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

mernst-github
Copy link
Contributor

@mernst-github mernst-github commented Jan 15, 2025

Certain signatures for foreign function calls (e.g. HVA return by value) require allocation of an intermediate buffer to adapt the FFM's to the native stub's calling convention. In the current implementation, this buffer is malloced and freed on every FFM invocation, a non-negligible overhead.

Sample stack trace:

   java.lang.Thread.State: RUNNABLE
	at jdk.internal.misc.Unsafe.allocateMemory0(java.base@25-ea/Native Method)
...
	at jdk.internal.foreign.abi.SharedUtils.newBoundedArena(java.base@25-ea/SharedUtils.java:386)
	at jdk.internal.foreign.abi.DowncallStub/0x000001f001084c00.invoke(java.base@25-ea/Unknown Source)
...
	at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@25-ea/Invokers$Holder)

To alleviate this, this PR implements a per carrier-thread stacked allocator.

Performance (MBA M3):

Before:
Benchmark                    Mode  Cnt   Score   Error  Units
CallOverheadByValue.byPtr    avgt   10   3.333 ? 0.152  ns/op
CallOverheadByValue.byValue  avgt   10  33.892 ? 0.034  ns/op

After:
Benchmark                    Mode  Cnt  Score   Error  Units
CallOverheadByValue.byPtr    avgt   30  3.311 ? 0.034  ns/op
CallOverheadByValue.byValue  avgt   30  6.143 ? 0.053  ns/op

-prof gc also shows that the new call path is fully scalar-replaced vs 160 byte/call before.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8287788: Implement a better allocator for downcalls (Enhancement - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23142/head:pull/23142
$ git checkout pull/23142

Update a local copy of the PR:
$ git checkout pull/23142
$ git pull https://git.openjdk.org/jdk.git pull/23142/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23142

View PR using the GUI difftool:
$ git pr show -t 23142

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23142.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper bridgekeeper bot added the oca Needs verification of OCA signatory status label Jan 15, 2025
@bridgekeeper
Copy link

bridgekeeper bot commented Jan 15, 2025

Hi @mernst-github, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user mernst-github" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

@openjdk
Copy link

openjdk bot commented Jan 15, 2025

@mernst-github This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8287788: Implement a better allocator for downcalls

Reviewed-by: jvernee

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 119 new commits pushed to the master branch:

  • afcc2b0: 8348562: ZGC: segmentation fault due to missing node type check in barrier elision analysis
  • 175e58b: 8332980: [IR Framework] Add option to measure IR rule processing time
  • b8c68c0: 8348207: Linux PPC64 PCH build broken after JDK-8347909
  • 70eec90: 8338303: Linux ppc64le with toolchain clang - detection failure in early JVM startup
  • a1fd5f4: 8348554: Enhance Linux kernel version checks
  • 002679a: 8347065: Add missing @SPEC tags for Java Security Standard Algorithm Names
  • 99002e4: 8318098: Update jfr tests to replace keyword jfr with vm.flagless
  • 5431668: 8348212: Need to add warn() step to JavacTaskImpl after JDK-8344148
  • 1d2eb2f: 8299504: Resolve uses and provides at run time if the service is optional and missing
  • f446cef: 8343962: [REDO] Move getChars to DecimalDigits
  • ... and 109 more: https://git.openjdk.org/jdk/compare/8460072f9ddcec5d1f86e3c4de3d1457771b805c...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@JornVernee, @mcimadamore) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk
Copy link

openjdk bot commented Jan 15, 2025

@mernst-github The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk
Copy link

openjdk bot commented Jan 16, 2025

@mernst-github this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout mernst/cache-segments
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jan 16, 2025
# Conflicts:
#	src/java.base/share/classes/jdk/internal/foreign/abi/SharedUtils.java
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Jan 16, 2025
@mernst-github
Copy link
Contributor Author

/label remove core-libs
/label add panama

@openjdk
Copy link

openjdk bot commented Jan 16, 2025

@mernst-github
The core-libs label was successfully removed.

@mernst-github mernst-github marked this pull request as draft January 16, 2025 09:29
@openjdk
Copy link

openjdk bot commented Jan 16, 2025

@mernst-github
The label panama is not a valid label.
These labels are valid:

  • graal
  • serviceability
  • hotspot
  • hotspot-compiler
  • ide-support
  • kulla
  • i18n
  • shenandoah
  • jdk
  • javadoc
  • security
  • hotspot-runtime
  • jmx
  • build
  • nio
  • client
  • core-libs
  • compiler
  • net
  • hotspot-gc
  • hotspot-jfr

@mernst-github
Copy link
Contributor Author

/label core-libs

@openjdk
Copy link

openjdk bot commented Jan 16, 2025

@mernst-github
The core-libs label was successfully added.

- 2 element cache to support upcall
- avoid shared session

With this, escape analysis kicks in and the "BoundedArena" seems to get scalar-replaced. The call becomes allocation-free.

Introducing a confined session per call kills this and costs ~50%.
Should really just protect the CTL handling.
alloc/free should happen outside (even if it practically doesn't matter)
no need to recreate these
simplifies code and no longer needs @ForceInlining
Careful massaging to get scalar replacement. Triggers fastdebug assertion, though.
@JornVernee
Copy link
Member

The TestBufferStack test is timing out in our CI on fastdebug builds when running with -Duse.JTREG_TEST_THREAD_FACTORY=Virtual -XX:-VerifyContinuations, I'm not sure what the issue is, as there's basically no output from the test, except the timeout message, and the warning about illegal native access. We need to fix that before integrating though. I'll see if I can uncover some more info.

@mernst-github
Copy link
Contributor Author

test is timing out

I think it's the stress test, the starting Thread sleeps and never gets rescheduled because it's starved out by the others. I can repro. Need to strategically place yields or interrupt the competitors.

@JornVernee
Copy link
Member

JornVernee commented Jan 23, 2025

Need to strategically place yields or interrupt the competitors.

Maybe the number of iterations could be capped?

@JornVernee
Copy link
Member

Thanks for the fix, I'll submit another CI job

Copy link
Member

@JornVernee JornVernee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests came back green. I think this is good to go.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 27, 2025
@mernst-github
Copy link
Contributor Author

Thanks, looks ready to me, too. This is my first contribution - as far as I understand, any further concerns can still be voiced between '/integrate' and '/sponsor', so I'll go ahead 😱

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jan 27, 2025
@openjdk
Copy link

openjdk bot commented Jan 27, 2025

@mernst-github
Your change (at version c314d6a) is now ready to be sponsored by a Committer.

@JornVernee
Copy link
Member

/sponsor

@openjdk
Copy link

openjdk bot commented Jan 27, 2025

Going to push as commit 8cc1304.
Since your change was applied there have been 127 commits pushed to the master branch:

  • 039e73f: 8346736: Java Security Standard Algorithm Names spec should include key algorithm names
  • aba60a9: 8189441: Define algorithm names for keys derived from KeyAgreement
  • 03106eb: 8344119: CUPSPrinter does not respect PostScript printer definition specification in case of reading ImageableArea values from PPD files
  • ad01dfb: 8346920: Serial: Support allocation in old generation when heap is almost full
  • 1d8ccb8: 8342465: Improve API documentation for java.lang.classfile
  • 7d6055a: 8348429: Update cross-compilation devkits to Fedora 41/gcc 13.2
  • f1e0797: 8348586: Optionally silence make warnings about non-control variables
  • ffeb9b5: 8342807: Update links in java.base to use https://
  • afcc2b0: 8348562: ZGC: segmentation fault due to missing node type check in barrier elision analysis
  • 175e58b: 8332980: [IR Framework] Add option to measure IR rule processing time
  • ... and 117 more: https://git.openjdk.org/jdk/compare/8460072f9ddcec5d1f86e3c4de3d1457771b805c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 27, 2025
@openjdk openjdk bot closed this Jan 27, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jan 27, 2025
@openjdk
Copy link

openjdk bot commented Jan 27, 2025

@JornVernee @mernst-github Pushed as commit 8cc1304.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@mcimadamore
Copy link
Contributor

Thanks a lot for the fine work!!

@JornVernee
Copy link
Member

Sorry to say, but the implementation seems to have a bug that is causing occasional heap corruption, which is being caught by mac's malloc guards. Since this is failing in tier 1, and the issue seems like it will take some time to investigate and fix, I'm backing out the change for now.

@JornVernee
Copy link
Member

As to what's causing the issue, I wonder if the stack buffer is being freed while it is still in use, somehow. We reinterpret the segment from the shared arena, maybe the frame should keep that arena alive until it is closed itself (with a Reference.reachabilityFence). Looking at the code, the shared per-thread arena should be strongly reachable though, so I'm not sure if that is a plausible explanation...

@mernst-github
Copy link
Contributor Author

Thanks for the backout. That's java/foreign/TestBufferStack.java, let me try if I can force anything to repro.

@mernst-github
Copy link
Contributor Author

mernst-github commented Feb 1, 2025

Tried for a few hours to repro with various approaches, to no avail.

@mernst-github mernst-github deleted the mernst/cache-segments branch February 2, 2025 18:29
@JornVernee
Copy link
Member

JornVernee commented Feb 3, 2025

@mernst-github

It should be possible to reproduce on either an x64 or AArch64 mac machine using:

make test TEST=test/jdk/java/foreign/TestBufferStack.java JTREG="REPEAT_COUNT=1000"

It will take a while to run/fail. The failure seems to be detected by the malloc implementation on mac. The process fails with exit code 134 without any output, but the core file should be usable.

@JornVernee
Copy link
Member

P.S. I have not tried this yet, but some of the malloc debugging flags available on mac may help trigger this faster: https://developer.apple.com/library/archive/documentation/Performance/Conceptual/ManagingMemory/Articles/MallocDebug.html

@JornVernee
Copy link
Member

@mernst-github We have investigated this issue further, and have narrowed it down to an issue with malloc/free. See: https://bugs.openjdk.org/browse/JDK-8350455

At this point, I think it's safe to try and re-integrate this patch, but the stress test should be separated out into a separate jtreg test, and then added to the problem list, with a reference to the above issue.

Note that we already have an issue to redo this patch: https://bugs.openjdk.org/browse/JDK-8349146 Is that something you'd be interested in taking on?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs [email protected] integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

3 participants