8360000: RISC-V: implement aot #26101

Open: Hamlin-Li wants to merge 15 commits into base: master

Conversation

@Hamlin-Li Hamlin-Li commented Jul 2, 2025

Hi,
Could you help review this patch?

https://openjdk.org/jeps/483 introduced AOT class loading and linking, and subsequent PRs added related features, such as preserving adapters (c2i/i2c) and runtime blobs in the AOT code cache.

RISC-V should support these features and resolve the related issues.

Test

jtreg

Ran tier1/2/3 and hotspot_cds tests; no new failures found compared to master JDK.

Performance

perf command used to run a minimal Hello World Java program:

  • (perf stat -r 100 ${JAVA_HOME}/bin/java -XX:AOTCache=$AOT_CACHE -cp $CLASS_PATH Hello > /dev/null) 2>&1 | grep elapsed

perf data:

  • (with patch): 0.181730 +- 0.000296 seconds time elapsed ( +- 0.16% )
  • (without patch): 0.196627 +- 0.000227 seconds time elapsed ( +- 0.12% )

Thanks


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26101/head:pull/26101
$ git checkout pull/26101

Update a local copy of the PR:
$ git checkout pull/26101
$ git pull https://git.openjdk.org/jdk.git pull/26101/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 26101

View PR using the GUI difftool:
$ git pr show -t 26101

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26101.diff

Using Webrev

Link to Webrev Comment

bridgekeeper bot commented Jul 2, 2025

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Jul 2, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk bot commented Jul 2, 2025

⚠️ @Hamlin-Li This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

openjdk bot added the rfr (Pull request is ready for review) label Jul 2, 2025

openjdk bot commented Jul 2, 2025

@Hamlin-Li The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge bot commented Jul 2, 2025

Webrevs

Member

@RealFYang RealFYang left a comment

Hi, Thanks for working on enabling this feature.

illegal_instruction(Assembler::csr::time);
- emit_int64((uintptr_t)msg);
+ emit_int64((uintptr_t)str);
Member

I noticed that our aarch64 counterpart puts the address of msg into r0 in #24740.

  // load msg into r0 so we can access it from the signal handler
  // ExternalAddress enables saving and restoring via the code cache
  lea(c_rarg0, ExternalAddress((address) str));

And fetches the address from r0 in PosixSignals::pd_hotspot_signal_handler:

        // A pointer to the message will have been placed in `r0`
        const char *detail_msg = (const char *)(uc->uc_mcontext.regs[0]);

I am not sure, but I guess this is needed for AOT to work correctly? Maybe we should do something similar.

Author

My tests show that the current way of passing the stop message to the signal handler works at both dump time and use time, so both approaches seem to work at the moment.
But for consistency with other platforms, I'll change it too.

Contributor

@adinn adinn Jul 7, 2025

You do need to make this change consistent with how it was done for AArch64. The old compiler code embedded the address of the string into the instruction stream immediately after the illegal instruction, without a relocation. The signal handler used the faulting PC to locate the pointer. That won't work if the code is saved and reloaded, because the string may not be at the same address. The new code marks the lea instruction as relocatable and adds the string address to the AOT external strings table. This allows the target of the lea to be relocated when the code is reloaded from the AOT cache.

Member

@RealFYang RealFYang left a comment

Thanks for the update. Several minor comments remain after a closer look.

- movptr(t1, entry_point, offset, t0);
- jalr(t1, offset);
+ movptr(t1, RuntimeAddress(entry_point), t0);
+ jalr(t1);
Member

This movptr + jalr sequence could be further simplified into rt_call(entry_point), which could help save one add instruction.

@@ -4882,7 +4927,7 @@ void MacroAssembler::get_thread(Register thread) {
RegSet::range(x28, x31) + ra - thread;
push_reg(saved_regs, sp);

- mv(t1, CAST_FROM_FN_PTR(address, Thread::current));
+ movptr(t1, ExternalAddress(CAST_FROM_FN_PTR(address, Thread::current)));
Member

@RealFYang RealFYang Jul 7, 2025

It will be more consistent with other places in riscv code where we move an ExternalAddress into a register if we do: la(t1, ExternalAddress(CAST_FROM_FN_PTR(address, Thread::current))).

@@ -62,6 +63,11 @@ UncommonTrapBlob* OptoRuntime::generate_uncommon_trap_blob() {
ResourceMark rm;
Member

It seems more reasonable to move this ResourceMark rm; closer to its user, CodeBuffer buffer(name, 2048, 1024);.
The same applies to the other changes in this file and in src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp.

Contributor

adinn commented Jul 7, 2025

@Hamlin-Li @RealFYang I think it might be better to discuss this on the leyden-dev mailing list before trying to implement the changes needed to match what has been done for AArch64 and x86_64. One good reason for caution is that the Leyden premain project is planning to add further code save/restore capabilities to mainline that have already been prototyped in the Leyden premain branch. So, if you enable AOT code cache initialization for RISC-V, you will need to be ready to provide all the other parts of the implementation when they arrive.

It might be safer to implement what is needed in premain (or in a downstream clone) after discussing both what is needed and why it is needed with the Leyden devs. It would also help if you were to use the testing and benchmark programs that the project is using to check that the aot code cache is working correctly and actually boosting performance.

@Hamlin-Li
Author

@adinn Thank you for the suggestion. I'll take a look at the premain branch in Leyden.

Labels
hotspot
rfr (Pull request is ready for review)
3 participants