Skip to content

Clean up semantic text field mapper tests #127853

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -210,7 +210,6 @@ protected final MapperService createMapperService(Settings settings, String mapp
protected final MapperService createMapperService(IndexVersion indexVersion, Settings settings, XContentBuilder mappings)
throws IOException {
MapperService mapperService = createMapperService(indexVersion, settings, () -> true, mappings);
merge(mapperService, mappings);
return mapperService;
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,25 +37,44 @@ public void testIsEnabled() {
assertFalse(InferenceMetadataFieldsMapper.isEnabled(settings));

settings = Settings.builder()
.put(
IndexMetadata.SETTING_INDEX_VERSION_CREATED.getKey(),
getRandomCompatibleIndexVersion(true, IndexVersionUtils.getPreviousVersion(IndexVersions.INFERENCE_METADATA_FIELDS))
)
.put(InferenceMetadataFieldsMapper.USE_LEGACY_SEMANTIC_TEXT_FORMAT.getKey(), false)
.put(IndexMetadata.SETTING_INDEX_VERSION_CREATED.getKey(), getRandomCompatibleIndexVersion(false))
.put(InferenceMetadataFieldsMapper.USE_LEGACY_SEMANTIC_TEXT_FORMAT.getKey(), true)
.build();
assertFalse(InferenceMetadataFieldsMapper.isEnabled(settings));

settings = Settings.builder()
.put(IndexMetadata.SETTING_INDEX_VERSION_CREATED.getKey(), getRandomCompatibleIndexVersion(false))
.put(InferenceMetadataFieldsMapper.USE_LEGACY_SEMANTIC_TEXT_FORMAT.getKey(), true)
.put(InferenceMetadataFieldsMapper.USE_LEGACY_SEMANTIC_TEXT_FORMAT.getKey(), false)
.build();
assertTrue(InferenceMetadataFieldsMapper.isEnabled(settings));

// Test that index.mapping.semantic_text.use_legacy_format == false is ignored when the index version is too old to support the new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of the scope of this PR: Is it technically correct that we don't error in this case, since we're requesting a format that can't be applied?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps not, but it's a moot point now as erroring out on old index versions could be a breaking change. Add into that that this is an index setting that should not be used externally and it just doesn't seem worth following up on.

// format
settings = Settings.builder()
.put(
IndexMetadata.SETTING_INDEX_VERSION_CREATED.getKey(),
IndexVersionUtils.randomVersionBetween(
random(),
IndexVersions.SEMANTIC_TEXT_FIELD_TYPE,
IndexVersionUtils.getPreviousVersion(IndexVersions.INFERENCE_METADATA_FIELDS_BACKPORT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: We're starting to get a little confusing with index versions as the names aren't always clear what the change was. Would it make sense to do a favor to our future selves and future maintainers, since we're cleaning up anyway, to create private members to use for testing? e.g.

private static final IndexVersion INDEX_VERSION_THAT_WE_INTRODUCED_NEW_FORMAT  = ... 
... 

Naming for illustrative purposes only, I'm sure there are better ones, but you get the drift 😁

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the comments more descriptive here, I think that has the same effect. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think readability would be enhanced by better names, but the comments are better, thanks for adding them

) // 8.x version range prior to the introduction of the new format
)
.put(InferenceMetadataFieldsMapper.USE_LEGACY_SEMANTIC_TEXT_FORMAT.getKey(), false)
.build();
assertFalse(InferenceMetadataFieldsMapper.isEnabled(settings));

settings = Settings.builder()
.put(IndexMetadata.SETTING_INDEX_VERSION_CREATED.getKey(), getRandomCompatibleIndexVersion(false))
.put(
IndexMetadata.SETTING_INDEX_VERSION_CREATED.getKey(),
IndexVersionUtils.randomVersionBetween(
random(),
IndexVersions.UPGRADE_TO_LUCENE_10_0_0,
IndexVersionUtils.getPreviousVersion(IndexVersions.INFERENCE_METADATA_FIELDS)
) // 9.x version range prior to the introduction of the new format
)
.put(InferenceMetadataFieldsMapper.USE_LEGACY_SEMANTIC_TEXT_FORMAT.getKey(), false)
.build();
assertTrue(InferenceMetadataFieldsMapper.isEnabled(settings));
assertFalse(InferenceMetadataFieldsMapper.isEnabled(settings));
}

public void testIsEnabledByDefault() {
Expand Down Expand Up @@ -117,19 +136,27 @@ public MappedFieldType getMappedFieldType() {
}

static IndexVersion getRandomCompatibleIndexVersion(boolean useLegacyFormat) {
return getRandomCompatibleIndexVersion(useLegacyFormat, IndexVersion.current());
}

static IndexVersion getRandomCompatibleIndexVersion(boolean useLegacyFormat, IndexVersion maxVersion) {
if (useLegacyFormat) {
// Randomly choose an index version compatible with the legacy semantic text format
if (randomBoolean()) {
return IndexVersionUtils.randomVersionBetween(random(), IndexVersions.UPGRADE_TO_LUCENE_10_0_0, maxVersion);
// 9.x+ version
return IndexVersionUtils.randomVersionBetween(random(), IndexVersions.UPGRADE_TO_LUCENE_10_0_0, IndexVersion.current());
}
return IndexVersionUtils.randomPreviousCompatibleVersion(random(), IndexVersions.INFERENCE_METADATA_FIELDS_BACKPORT);

// 8.x version
return IndexVersionUtils.randomVersionBetween(
random(),
IndexVersions.SEMANTIC_TEXT_FIELD_TYPE,
IndexVersionUtils.getPreviousVersion(IndexVersions.UPGRADE_TO_LUCENE_10_0_0)
);
} else {
// Randomly choose an index version compatible with the new semantic text format
if (randomBoolean()) {
return IndexVersionUtils.randomVersionBetween(random(), IndexVersions.INFERENCE_METADATA_FIELDS, maxVersion);
// 9.x+ version
return IndexVersionUtils.randomVersionBetween(random(), IndexVersions.INFERENCE_METADATA_FIELDS, IndexVersion.current());
}

// 8.x version
return IndexVersionUtils.randomVersionBetween(
random(),
IndexVersions.INFERENCE_METADATA_FIELDS_BACKPORT,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -153,35 +153,27 @@ protected Supplier<ModelRegistry> getModelRegistry() {

private MapperService createMapperService(XContentBuilder mappings, boolean useLegacyFormat) throws IOException {
IndexVersion indexVersion = SemanticInferenceMetadataFieldsMapperTests.getRandomCompatibleIndexVersion(useLegacyFormat);
return createMapperService(mappings, useLegacyFormat, indexVersion, indexVersion, false);
return createMapperService(mappings, useLegacyFormat, indexVersion, indexVersion);
}

private MapperService createMapperService(XContentBuilder mappings, boolean useLegacyFormat, IndexVersion minIndexVersion)
throws IOException {
return createMapperService(mappings, useLegacyFormat, minIndexVersion, IndexVersion.current(), false);
return createMapperService(mappings, useLegacyFormat, minIndexVersion, IndexVersion.current());
}

private MapperService createMapperService(
XContentBuilder mappings,
boolean useLegacyFormat,
IndexVersion minIndexVersion,
IndexVersion maxIndexVersion,
boolean propagateIndexVersion
IndexVersion maxIndexVersion
) throws IOException {
validateIndexVersion(minIndexVersion, useLegacyFormat);
IndexVersion indexVersion = IndexVersionUtils.randomVersionBetween(random(), minIndexVersion, maxIndexVersion);
var settings = Settings.builder()
.put(IndexMetadata.SETTING_INDEX_VERSION_CREATED.getKey(), indexVersion)
.put(InferenceMetadataFieldsMapper.USE_LEGACY_SEMANTIC_TEXT_FORMAT.getKey(), useLegacyFormat)
.build();
// TODO - This is added, because we discovered a bug where the index version was not being correctly propagated
// in our mappings even though we were specifying the index version in settings. We will fix this in a followup and
// remove the boolean flag accordingly.
if (propagateIndexVersion) {
return createMapperService(indexVersion, settings, mappings);
} else {
return createMapperService(settings, mappings);
}
return createMapperService(indexVersion, settings, mappings);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ Thanks for the cleanup

}

private static void validateIndexVersion(IndexVersion indexVersion, boolean useLegacyFormat) {
Expand Down Expand Up @@ -1189,8 +1181,7 @@ public void testDefaultIndexOptions() throws IOException {
}),
useLegacyFormat,
IndexVersions.INFERENCE_METADATA_FIELDS,
IndexVersionUtils.getPreviousVersion(IndexVersions.SEMANTIC_TEXT_DEFAULTS_TO_BBQ),
true
IndexVersionUtils.getPreviousVersion(IndexVersions.SEMANTIC_TEXT_DEFAULTS_TO_BBQ)
);
assertSemanticTextField(mapperService, "field", true, null, defaultDenseVectorIndexOptions());

Expand Down