Skip to content

8350209: Preserve adapters in AOT cache #24740

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 17 commits into from

Conversation

vnkozlov
Copy link
Contributor

@vnkozlov vnkozlov commented Apr 17, 2025

JEP 483 preserves class information in AOT cache which helps Java startup performance.

We should also preserve adapters (i2c, c2i) to further improve performance of class linking where adapters are generated.

Short running Java application can see several percents improvement. I got 6% improvement when ran HelloWorld.java on Linux-x64 Ice Lake CPU (2.5Ghz):

(perf stat -r 100 java -XX:AOTCache=app.aotcache -cp hello.jar HelloWorld > /dev/null) 2>&1 | grep elapsed
         0.0299401 +- 0.0000504 seconds time elapsed  ( +-  0.17% )

(perf stat -r 100 java -XX:AOTCache=app.aotcache -XX:+UnlockDiagnosticVMOptions -XX:-AOTAdapterCaching -cp hello.jar HelloWorld > /dev/null) 2>&1 | grep elapsed
         0.0318654 +- 0.0000535 seconds time elapsed  ( +-  0.17% )

New diagnostic flags are introduced (use -XX:+UnlockDiagnosticVMOptions to unlock them):

-XX:+AOTAdapterCaching  - Enable or disable saving and restoring i2c2i adapters
-XX:AOTCodeMaxSize=10*M - buffer size in bytes for AOT code caching
-XX:+AbortVMOnAOTCodeFailure - Abort VM on the first occurrence of AOT code caching failure

By default AOTAdapterCaching is false and enabled ergonomically when -XX:AOTCache is specified.
This flag is ignored when AOTCache is not specified.

To use AOT adapters follow process described in JEP 483:

java -XX:AOTMode=record -XX:AOTConfiguration=app.aotconf -cp app.jar App
java -XX:AOTMode=create -XX:AOTConfiguration=app.aotconf -XX:AOTCache=app.aot -cp app.jar
java -XX:AOTCache=app.aot -cp app.jar App

There are several new UL flag combinations to trace the AOT code caching process:

-Xlog:aot+codecache+init -Xlog:aot+codecache+exit -Xlog:aot+codecache+stubs

@ashu-mehra is main author of changes. He implemented adapters caching.
I did main framework (AOTCodeCache class) for saving and loading AOT code.

Tested tier1-6,10, which includes tests with AOTClassLinking enabled. Also Xcomp,stress and JCK.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8350209: Preserve adapters in AOT cache (Enhancement - P3)

Reviewers

Contributors

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24740/head:pull/24740
$ git checkout pull/24740

Update a local copy of the PR:
$ git checkout pull/24740
$ git pull https://git.openjdk.org/jdk.git pull/24740/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24740

View PR using the GUI difftool:
$ git pr show -t 24740

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24740.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 17, 2025

👋 Welcome back kvn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 17, 2025

@vnkozlov This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8350209: Preserve adapters in AOT cache

Co-authored-by: Ashutosh Mehra <[email protected]>
Reviewed-by: vlivanov, asmehra, ihse, iklam

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 14 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Apr 17, 2025

@vnkozlov The following labels will be automatically applied to this pull request:

  • build
  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@vnkozlov
Copy link
Contributor Author

/contributor add @ashu-mehra

@openjdk
Copy link

openjdk bot commented Apr 17, 2025

@vnkozlov
Contributor Ashutosh Mehra <[email protected]> successfully added.

@vnkozlov vnkozlov marked this pull request as ready for review April 17, 2025 20:32
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 17, 2025
@mlbridge
Copy link

mlbridge bot commented Apr 17, 2025

@ashu-mehra
Copy link
Contributor

@vnkozlov I forgot to remove AOTCodeSection from code/aotCodeCache.hpp when I refactored the APIs. This struct is no longer used. Can you please remove it.

@vnkozlov
Copy link
Contributor Author

@vnkozlov I forgot to remove AOTCodeSection from code/aotCodeCache.hpp when I refactored the APIs. This struct is no longer used. Can you please remove it.

Done.

@vnkozlov
Copy link
Contributor Author

/cc hotspot-runtime
/cc hotspot-compiler

@openjdk
Copy link

openjdk bot commented Apr 21, 2025

@vnkozlov
The hotspot-runtime label was successfully added.

@vnkozlov
Copy link
Contributor Author

/label remove hotspot

@openjdk
Copy link

openjdk bot commented Apr 21, 2025

@vnkozlov
The hotspot-compiler label was successfully added.

@openjdk
Copy link

openjdk bot commented Apr 21, 2025

@vnkozlov
The hotspot label was successfully removed.

Comment on lines 868 to 869
bool CDSConfig::is_dumping_aot_code_enabled() {
return _is_dumping_aot_code_enabled;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other functions in CDSConfig don't have the _enabled() suffix. E.g., is_dumping_method_handles(). It doesn't mean we are doing method handle right now, but rather we have the ability to do so.

Since we rarely ask "am I in the middle of dumping X right now", I think adding _enabled() will be redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

@vnkozlov
Copy link
Contributor Author

Added missing changes for Aarch64 to always generate far jumps in AOT code. generate_i2c2i_adapters() use far_jump().
Note, for x64 ForceUnreachable flag is set to true in AOTCodeCache::initialize(). Unfortunately Aarch64 does not use this flag, so I need to add check to each query functions.

}

#ifndef PRODUCT
void AdapterHandlerLibrary::print_adapter_handler_info(AdapterHandlerEntry* handler, AdapterBlob* adapter_blob) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: pass tty explicitly as outputStream*

void AdapterHandlerLibrary::print_adapter_handler_info_on(outptutStream* st, AdapterHandlerEntry* handler, AdapterBlob* adapter_blob) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -1287,6 +1289,24 @@ class trampoline_stub_Relocation : public Relocation {

void pack_data_to(CodeSection * dest) override;
void unpack_data() override;
#if defined(AARCH64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate to see AArch64-specific code in shared code.

But I don't see anything besidespd_destination() and pd_set_destination() declarations. Where're their bodies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I copied it from leyden/premain but it is used only for trampoline jump relocation for calls in nmethods on Aarch64. They are not present in adapters. I remove this code.

/*
* AOT Code Cache collects code from Code Cache and corresponding metadata
* during application training run.
* In following "production" runs this code and data can me loaded into
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/me/be/

@@ -188,6 +204,8 @@ class CodeBlob {
nmethod* as_nmethod_or_null() const { return is_nmethod() ? (nmethod*) this : nullptr; }
nmethod* as_nmethod() const { assert(is_nmethod(), "must be nmethod"); return (nmethod*) this; }
CodeBlob* as_codeblob() const { return (CodeBlob*) this; }
AdapterBlob* as_adapter_blob() const { assert(is_adapter_blob(), "must be adapter blob"); return (AdapterBlob*) this; }
ExceptionBlob* as_exception_blob() const { assert(is_exception_stub(), "must be exception stub"); return (ExceptionBlob*) this; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExceptionBlob is C2-specific, but as_exception_blob() is unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

if (AbortVMOnAOTCodeFailure) {
vm_exit_during_initialization("Unable to use AOT Code Cache.", nullptr);
}
log_info(aot, codecache, init)("Unable to use AOT Code Cache.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be a warning instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

tty->print_cr("Unable to create AOT Code Cache.");
vm_abort(false);
}
log_info(aot, codecache, exit)("Unable to create AOT Code Cache.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (log_warning?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove exit_vm_on_ from these 2 methods to avoid confusion. And will add comment.

AOTAdapterCaching = false;
}

static void exit_vm_on_store_failure() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming (exit_vm_on_load_failure and exit_vm_on_store_failure) still look confusing to me. By default, they disable AOTAdapterCaching and issue a message, but the name strongly suggests that execution halts there:

if (!open_cache(is_dumping, is_using)) {
    if (is_using) {
      exit_vm_on_load_failure();
    } else {
      exit_vm_on_store_failure();
    }
    return;
  }

entry_offset[1] = handler->get_c2i_entry() - i2c_entry;
entry_offset[2] = handler->get_c2i_unverified_entry() - i2c_entry;
entry_offset[3] = handler->get_c2i_no_clinit_check_entry() - i2c_entry;
AOTCodeCache::store_code_blob(*adapter_blob, AOTCodeEntry::Adapter, id, name, AdapterHandlerEntry::ENTRIES_COUNT, entry_offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What the intended behavior here when AOTCodeCache::store_code_blob fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If something happened when we tried to store blob (adapter in this case) into buffer (for example, if reserved for AOT code buffer is too small) the next will be called:

    set_failed();
    exit_vm_on_store_failure();

So we either abort VM based on the flag or issue warning (as you suggested) and continue execution.

set_failed() will prevent following attempts to cache blobs and prevent final dump any cached code into AOT cache:

 bool for_dump() const { return _for_dump && !_failed; }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an assert here?

bool success = AOTCodeCache::store_code_blob(...);
assert(success || !AOTCodeCache::is_dumping_adapters(), "");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert(success || !AOTCodeCache::is_dumping_adapters(), "");

This condtion !AOTCodeCache::is_dumping_adapters() in the assert is not very intuitive. I think what we need to assert is future stores in the aot code cache are disabled. So having a method like AOTCodeCache::is_store_disabled() would better communicate the intent here. But I don't mind keeping this condition for this initial PR. I will just suggest to add a better assert message like:

assert(success || !AOTCodeCache::is_dumping_adapters(), "storing of adapter must be disabled");

And I think we should also be setting _adapter_caching to false in report_load_failure and report_store_failure to be consistent, otherwise we may end up in a situation where AOTAdapterCaching is false but _adapter_caching is true. In fact, I feel we should only be setting _adapter_caching and not AOTAdapterCaching in report_load/store_failure because _adapter_caching is the flag used to gate store/load operations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @ashu-mehra. You have good points. I will work on them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR I suggested !AOTCodeCache::is_dumping_adapters() because that's the guarding check for AOTCodeCache::store_code_blob() call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add assert message but keep check.

AOTAdapterCaching = false;
}

static void exit_vm_on_store_failure() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting AOTAdapterCaching to false on failure is simple indication that adapter caching is switched off for someone who will look on final state of flag.

But how does it affect execution? What's the intended behavior when a failure happens during store attempt? What are the consistency guarantees for AOT code cache during dumping in presense of store failures? What I see right now is that errors reported by AOTCodeCache::store_code_blob() are silently ignored. How does _cache notice a store failure during dumping phase?

@vnkozlov
Copy link
Contributor Author

vnkozlov commented May 1, 2025

How does _cache notice a store failure during dumping phase?
set_failed() sets flags AOTCodeCache::_failed. Which is checked when we trying to get _cache pointer:

  AOTCodeCache* cache = open_for_use();
  if (cache == nullptr) {
    return nullptr;
  }
  
  or
  
  AOTCodeCache* cache = open_for_dump();
  if (cache == nullptr) {
    return false;
  }

and

AOTCodeCache* AOTCodeCache::open_for_use() {
  if (AOTCodeCache::is_on_for_use()) {
    return AOTCodeCache::cache();
  }
  return nullptr;
}

AOTCodeCache* AOTCodeCache::open_for_dump() {
  if (AOTCodeCache::is_on_for_dump()) {
    AOTCodeCache* cache = AOTCodeCache::cache();
    cache->clear_lookup_failed(); // Reset bit <<<<<<<<<< this is different failure: we did not find one entry
    return cache;
  }
  return nullptr;
}

and 

  static bool is_on_for_use()  { return is_on() && _cache->for_use(); }
  static bool is_on_for_dump() { return is_on() && _cache->for_dump(); }

  bool for_use()  const { return _for_use  && !_failed; }
  bool for_dump() const { return _for_dump && !_failed; }

@vnkozlov
Copy link
Contributor Author

vnkozlov commented May 1, 2025

Thank you, @iwanowww, for review. I think I addressed your comments and answered your question.

Copy link
Contributor

@iwanowww iwanowww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Minor suggestions follow.

entry_offset[1] = handler->get_c2i_entry() - i2c_entry;
entry_offset[2] = handler->get_c2i_unverified_entry() - i2c_entry;
entry_offset[3] = handler->get_c2i_no_clinit_check_entry() - i2c_entry;
AOTCodeCache::store_code_blob(*adapter_blob, AOTCodeEntry::Adapter, id, name, AdapterHandlerEntry::ENTRIES_COUNT, entry_offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an assert here?

bool success = AOTCodeCache::store_code_blob(...);
assert(success || !AOTCodeCache::is_dumping_adapters(), "");

#include <sys/stat.h>
#include <errno.h>

static void load_failure() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe on_(load|store)_failure() or report_(load|store)_failure()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use report_

@vnkozlov vnkozlov requested a review from ashu-mehra May 1, 2025 02:54
@vnkozlov
Copy link
Contributor Author

vnkozlov commented May 1, 2025

Thank you, @iwanowww. I need 2 re-reviews so I asked @ashu-mehra to review again.
Meanwhile I am re-running testing with latest mainline code.

@vnkozlov
Copy link
Contributor Author

vnkozlov commented May 1, 2025

I reverted back log_warning() to log_info() in report_*_failure() methods.

I removed AOTCodeCache::_adapter_caching field and use AOTAdapterCaching flag check instead. Setting the flag to false in case of a failure make more sense now.

@vnkozlov vnkozlov requested a review from iwanowww May 1, 2025 18:58
Copy link
Contributor

@iwanowww iwanowww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@vnkozlov
Copy link
Contributor Author

vnkozlov commented May 1, 2025

/reviewers 1

@openjdk
Copy link

openjdk bot commented May 1, 2025

@vnkozlov
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 1 (with at least 1 Reviewer).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 1, 2025
Copy link
Contributor

@ashu-mehra ashu-mehra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@vnkozlov
Copy link
Contributor Author

vnkozlov commented May 1, 2025

Thank you again, @iwanowww and @ashu-mehra

@vnkozlov
Copy link
Contributor Author

vnkozlov commented May 1, 2025

/integrate

@openjdk
Copy link

openjdk bot commented May 1, 2025

Going to push as commit aae2bb6.
Since your change was applied there have been 14 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 1, 2025
@openjdk openjdk bot closed this May 1, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 1, 2025
@openjdk
Copy link

openjdk bot commented May 1, 2025

@vnkozlov Pushed as commit aae2bb6.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

6 participants