Improve option invalidation granularity and reduce unnecessary cache invalidations #653
base: main
Conversation
🦋 Changeset detected. Latest commit: e835b22. The changes in this PR will be included in the next version bump. This PR includes changesets to release 91 packages.
```js
  return String(value);
}

if (typeof value === 'object') {
```
Instead of hashing every element (which is expensive for large collections), it creates a representative hash
using only the array's length and three sample elements (first, middle, last). This provides a reliable hash
signature that changes when significant array properties change, while avoiding the performance penalty of
processing every element.
How does that work when an element that's not in the hash changes that doesn't affect the length? Are these arrays of options from config getting passed here? Do we actually have any large collections in practice?
We hash multi-megabyte bundles, I'd argue that hashing a string with hundreds of options concatenated can't be too expensive.
> How does that work when an element that's not in the hash changes that doesn't affect the length

Great point, TBH this didn't cross my mind at the time, but it seems quite obvious now you mention it 😬.
Since we don't really have large collections in practice either, I'll opt to remove this 😄
I also think we should hash everything
```js
  return invalidatedKeys;
}

// New behavior with granular path tracking and blocklist
```
The option invalidation blocklist allows users to specify configuration options that shouldn't trigger rebuilds when changed. This improves performance by preventing unnecessary invalidations for options that rarely affect build output.
The implementation:
- Uses a default blocklist for options known to be non-impactful (e.g., 'instanceId')
- Supports user-configured blocklists through options.optionInvalidation.blocklist
- Supports prefix-based blocking with wildcard notation (e.g., 'metrics.*')
- Optionally tracks skipped invalidations for reporting and diagnostics
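The blocklist matching described above can be sketched roughly as follows (function and constant names are hypothetical; only `instanceId` and the `metrics.*` wildcard form are taken from the description):

```javascript
// Hypothetical sketch of blocklist matching: exact keys plus 'prefix.*'
// wildcard entries that block everything under a given option namespace.
const DEFAULT_BLOCKLIST = ['instanceId'];

function isBlocked(optionPath, blocklist = DEFAULT_BLOCKLIST) {
  // optionPath is a dotted key, e.g. 'metrics.sampleRate'
  return blocklist.some((entry) =>
    entry.endsWith('.*')
      ? optionPath.startsWith(entry.slice(0, -1)) // keep the trailing dot
      : optionPath === entry,
  );
}
```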
Nice one! Some minor feedback, but mostly looks good! (TBF I didn't go too deep on the hundreds of lines of new tests... but I appreciate that they're there!)
```js
// Check if we should track object enumeration operations
// This is controlled by a feature flag - the previous behavior was to track these
// but it can cause over-invalidation
const granularOptionInvalidationEnabled = getFeatureFlag(
```
Unnecessary to assign this to a variable? (i.e. just stick the `getFeatureFlag` call inside the `if` - makes it easier to clean up)
```js
  assert.ok(onRead.calledOnce);
});

it('handles Object.keys() operations based on the granularOptionInvalidation feature flag', () => {
```
Should the test instead run with `granularOptionInvalidationEnabled` set to `true` and again set to `false`? (FWIW, with Mocha I've been doing this lately for some tests:

```js
// intentionally verbose so it's obvious rather than just `[true, false].forEach`
[{myFlag: true}, {myFlag: false}].forEach(({myFlag}) => {
  it(`does something (myFlag: ${myFlag.toString()})`, () => {
    // set the feature flag, whether it's in package.json or by using `setFeatureFlags`
  });
});
```
)
```js
  await inputFS.mkdirp(inputDir);
});

async function createSimpleProject() {
```
You can use `fsFixture` for inline test code for integration tests, which makes this kind of setup easier to reason about. Just search the codebase, there are plenty of examples. It avoids the need for physical fixture files as well.
```js
  value: mixed,
): RequestGraphNode => {
  // Normalize option to string format for node ID
  const optionKey = Array.isArray(option) ? option.join('.') : option;
```
I think we should `JSON.stringify` this to handle keys containing `.`, etc.
Updated to serialise array-based option paths using `JSON.stringify(option)` instead of joining with dots, which properly handles keys containing dots, special characters, etc.
I've used an `array:` prefix to try to distinguish string- and array-based keys, e.g.

```js
// Without prefix - ambiguous cases:
JSON.stringify(['a.b']); // → '["a.b"]'
'["a.b"]'; // → could be a string key or a serialized array

// With prefix - unambiguous:
`array:${JSON.stringify(['a.b'])}`; // → 'array:["a.b"]' (clearly an array)
```
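The resulting key normalization can be sketched as (the helper name is hypothetical, but the `array:` prefix scheme follows the comment above):

```javascript
// Hypothetical sketch: serialize an option path to an unambiguous node key.
// Array paths get an 'array:' prefix so the string key '["a.b"]' and the
// array path ['a.b'] can never collide.
function getOptionKey(option) {
  return Array.isArray(option) ? `array:${JSON.stringify(option)}` : option;
}
```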
```js
 * Other reads work normally.
 * NOTE: By default, we DO track Object.keys()/ownKeys operations for backward compatibility,
 * but this can be disabled with the 'granularOptionInvalidation' feature flag. When disabled,
 * property enumeration won't trigger cache invalidation, which prevents unnecessary invalidations
```
Enumerating a field should trigger invalidations when keys are added or removed, because the output of code listing a property might change if there are new items or keys.
Is there code that was triggering this? Maybe we can try to rework it to avoid invalidation.
specific commit - was a bit of a tricky one to figure out, but I'm fairly confident this addresses the root cause
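The enumeration-tracking behavior being debated here can be pictured with a small Proxy sketch (illustrative only; the function and the `__keys__` sentinel are hypothetical names, not the actual implementation):

```javascript
// Illustrative sketch: record which option reads should participate in
// cache invalidation. When trackEnumeration is false, Object.keys() and
// other ownKeys operations are not recorded.
function watchOptions(options, {trackEnumeration}) {
  const reads = new Set();
  const proxy = new Proxy(options, {
    get(target, prop, receiver) {
      if (typeof prop === 'string') reads.add(prop);
      return Reflect.get(target, prop, receiver);
    },
    ownKeys(target) {
      if (trackEnumeration) reads.add('__keys__');
      return Reflect.ownKeys(target);
    },
  });
  return {proxy, reads};
}
```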
packages/core/core/src/types.js
```js
// Maximum number of parent nodes to invalidate per option change
maxInvalidationsPerOption?: number,
// Batch option invalidations together
batchInvalidations?: boolean,
```
What do this and the `maxInvalidationsPerOption` keys do?
This isn't actually used - just needed cleaning up
Motivation

Our build cache system was invalidating more often than necessary due to changes in options that don't actually affect build output. For example:

- Changing `instanceId` would invalidate the cache even though it doesn't affect builds
- Calling `Object.keys()` on option objects would trigger invalidation

This resulted in unnecessary rebuilds and reduced development performance, especially in watch mode where options might change frequently.

Changes

Implemented a more granular option invalidation system with the following improvements:

- Tracks full option paths (e.g. `['featureFlags', 'granularOptionInvalidation']`) instead of just top-level keys
- Skips tracking `Object.keys()` operations when they don't affect build output

The changes maintain backward compatibility while significantly reducing unnecessary cache invalidations.
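The path-tracking improvement can be pictured with a small diff helper (a sketch with a hypothetical function name, not the PR's implementation): instead of invalidating everything under a top-level key like `featureFlags`, only the leaf path that actually changed is reported.

```javascript
// Hypothetical sketch: report the dotted paths of nested options that
// changed between two option objects, recursing into plain objects.
function changedKeys(oldOpts, newOpts, path = []) {
  const keys = new Set([...Object.keys(oldOpts), ...Object.keys(newOpts)]);
  const changed = [];
  for (const key of keys) {
    const a = oldOpts[key];
    const b = newOpts[key];
    if (a && b && typeof a === 'object' && typeof b === 'object') {
      changed.push(...changedKeys(a, b, [...path, key]));
    } else if (a !== b) {
      changed.push([...path, key].join('.'));
    }
  }
  return changed;
}
```

With top-level tracking, flipping one feature flag would invalidate everything that read any part of `featureFlags`; with full paths, only readers of the changed leaf are affected.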
Initial testing shows: