-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Add index version for match_only_text stored field in binary format #130363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add index version for match_only_text stored field in binary format #130363
Conversation
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
if (storedFieldInBinaryFormat) { | ||
final var bytesRef = new BytesRef(utfBytes.bytes(), utfBytes.offset(), utfBytes.length()); | ||
context.doc().add(new StoredField(fieldType().storedFieldNameForSyntheticSource(), bytesRef)); | ||
} else { | ||
context.doc().add(new StoredField(fieldType().storedFieldNameForSyntheticSource(), value.string())); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% sure about this. Maybe we want to continue writing in binary format even for older indices? Since we already updated the mapper to handle mixed string and byteref values, we may as well take advantage of the throughput benefits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this way the logic is simpler? So maybe keep it like this? Indices will rollover at some point and then binary format will be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor comment LGTM otherwise.
@@ -178,6 +178,7 @@ private static Version parseUnchecked(String version) { | |||
public static final IndexVersion UPGRADE_TO_LUCENE_10_2_2 = def(9_030_0_00, Version.LUCENE_10_2_2); | |||
public static final IndexVersion SPARSE_VECTOR_PRUNING_INDEX_OPTIONS_SUPPORT = def(9_031_0_00, Version.LUCENE_10_2_2); | |||
public static final IndexVersion DEFAULT_DENSE_VECTOR_TO_BBQ_HNSW = def(9_032_0_00, Version.LUCENE_10_2_2); | |||
public static final IndexVersion MATCH_ONLY_TEXT_STORED_AS_BYTES = def(9_033_0_00, Version.LUCENE_10_2_2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should already add the 8.19 version (including logic) to this PR, which should make back porting easier.
if (storedFieldInBinaryFormat) { | ||
final var bytesRef = new BytesRef(utfBytes.bytes(), utfBytes.offset(), utfBytes.length()); | ||
context.doc().add(new StoredField(fieldType().storedFieldNameForSyntheticSource(), bytesRef)); | ||
} else { | ||
context.doc().add(new StoredField(fieldType().storedFieldNameForSyntheticSource(), value.string())); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this way the logic is simpler? So maybe keep it like this? Indices will rollover at some point and then binary format will be used.
…lastic#130363) Follow-up to elastic#130049 to gate using the binary format for the stored field in match_only_text fields behind an index version.
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…ormat (#130363) (#130416) * Add index version for match_only_text stored field in binary format (#130363) Follow-up to #130049 to gate using the binary format for the stored field in match_only_text fields behind an index version. (cherry picked from commit a69c484) # Conflicts: # server/src/main/java/org/elasticsearch/index/IndexVersions.java * Fix IndexVersion in testLoadSyntheticSourceFromStringOrBytesRef * Trigger build for auto-merge
…lastic#130363) Follow-up to elastic#130049 to gate using the binary format for the stored field in match_only_text fields behind an index version.
Accidentally used the wrong backport index version in #130363.
…lastic#130363) Follow-up to elastic#130049 to gate using the binary format for the stored field in match_only_text fields behind an index version.
Accidentally used the wrong backport index version in elastic#130363.
Follow-up to #130049 to gate using the binary format for the stored field in match_only_text fields behind an index version (as we should have from the start).