[ML] Implement JSONPath replacement for Inference API #127036
Pinging @elastic/ml-core (Team:ML)
* Uses a subset of the JSONPath schema to extract fields from a map.
* For more information <a href="https://en.wikipedia.org/wiki/JSONPath">see here</a>.
*
* This implementation differs in out it handles lists in that JSONPath will flatten inner lists. This implementation
* This implementation differs in out it handles lists in that JSONPath will flatten inner lists. This implementation
* This implementation differs in how it handles lists in that JSONPath will flatten inner lists. This implementation
var cleanedPath = path.trim();

// Remove the prefix if it exists
if (cleanedPath.startsWith(DOLLAR)) {
Do we have to assert or throw an exception if we don't start with $?
Good point, I'll add an exception.
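One way the stricter prefix handling discussed in this thread could look is sketched below. This is not the code that was merged: the class name `PathPrefixSketch`, the method `stripPrefix`, the exception type, and the messages are all hypothetical. Only `DOLLAR` and `cleanedPath` come from the snippet under review.

```java
final class PathPrefixSketch {
    private static final String DOLLAR = "$";

    // Hypothetical helper: trims the path and strips the leading "$",
    // throwing instead of silently continuing when the prefix is missing.
    static String stripPrefix(String path) {
        if (path == null || path.isBlank()) {
            throw new IllegalArgumentException("Path must not be null or blank");
        }

        String cleanedPath = path.trim();

        if (cleanedPath.startsWith(DOLLAR) == false) {
            throw new IllegalArgumentException("Path [" + cleanedPath + "] must start with " + DOLLAR);
        }

        // Drop the "$" and return the remainder, e.g. ".result.embeddings[*].embedding"
        return cleanedPath.substring(DOLLAR.length());
    }
}
```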
…ner/elasticsearch into ml-custom-model-json-paths
* Adding initial extractor
* Finishing tests
* Addressing feedback
💚 Backport successful
This PR adds a very minimal implementation of something similar to the JSONPath library. This is needed for the custom models PR here: #125679
MapPathExtractor recursively iterates through a provided map, navigating to the specified field and extracting its values. It handles nested maps and lists within the map. This code isn't currently used anywhere outside of the tests that reference it.
Difference from JSONPath
This implementation doesn't support many of the features that JSONPath does. It also deviates from JSONPath in its handling of arrays of maps. When extracting a field from a list of maps, JSONPath flattens the result into a single array, even if multiple arrays had to be traversed to reach the field. This implementation preserves each array that it encounters. This is important so that we can construct internal classes that represent the various result formats. For example, when building the text embedding response we effectively need an array of arrays of floats, so it's helpful to preserve the outer and inner arrays when constructing the objects after we extract the data from the map.
The second example below depicts the difference.
Schema examples
I tried to keep the schema similar to JSONPath. There's no particular reason we need to do that, but hopefully it makes the syntax more familiar to users.
$. starts the path.
A dot (.) is used to traverse nested maps.
[*] indicates that the field is an array.
Example paths:
$.field1.some_array[*].another_field
$.some_array[*].field1
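To make the schema concrete, here is a rough sketch of how such a path could be split into segments. It is an illustration only, not the parser in this PR: the `PathSegmentsSketch` class, the `Segment` type, and the error messages are hypothetical names introduced for the example, and it assumes field names contain no dots.

```java
import java.util.ArrayList;
import java.util.List;

final class PathSegmentsSketch {

    // Hypothetical value type: a field name plus whether it was followed by [*].
    static final class Segment {
        final String field;
        final boolean isArray;

        Segment(String field, boolean isArray) {
            this.field = field;
            this.isArray = isArray;
        }
    }

    static List<Segment> parse(String path) {
        String cleaned = path.trim();
        if (cleaned.startsWith("$.") == false) {
            throw new IllegalArgumentException("Path [" + cleaned + "] must start with $.");
        }

        List<Segment> segments = new ArrayList<>();
        // Each dot-separated part names one field in the next nested map
        for (String part : cleaned.substring(2).split("\\.")) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Path [" + cleaned + "] contains an empty field name");
            }
            // A trailing [*] marks the field as an array whose elements are all visited
            boolean isArray = part.endsWith("[*]");
            String field = isArray ? part.substring(0, part.length() - "[*]".length()) : part;
            segments.add(new Segment(field, isArray));
        }
        return segments;
    }
}
```

With this sketch, `$.field1.some_array[*].another_field` yields three segments: `field1` (map), `some_array` (array), and `another_field` (map).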
Examples
Extracting arrays
MapPathExtractor.extract(map, "$.result.embeddings[*].embedding")
returns [[2, 4], [1, 2]]
Extracting multiple map fields
MapPathExtractor.extract(map, "$.result[*].key[*].a");
returns [[1.1, 2.2], [3.3, 4.4]]
NOTE: JSONPath will return:
[1.1, 2.2, 3.3, 4.4]
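The nesting-preserving behavior can be sketched in a few lines of Java. This is a simplified stand-in, not the `MapPathExtractor` added by this PR: the class `NestedExtractSketch`, its methods, and its error handling are assumptions made for illustration. The `flatten` helper only imitates the flattened shape of the JSONPath result for comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class NestedExtractSketch {

    // Entry point: assumes a well-formed path such as "$.result[*].key[*].a"
    static Object extract(Map<String, Object> map, String path) {
        if (path.startsWith("$.") == false) {
            throw new IllegalArgumentException("Path [" + path + "] must start with $.");
        }
        return walk(map, path.substring(2).split("\\."), 0);
    }

    // Applies segments[i..] to node. Every [*] segment produces a new list,
    // so each traversed array appears as one nesting level in the result.
    private static Object walk(Object node, String[] segments, int i) {
        if (i == segments.length) {
            return node;
        }
        if ((node instanceof Map) == false) {
            throw new IllegalArgumentException("Expected a map at segment [" + segments[i] + "]");
        }
        Map<?, ?> map = (Map<?, ?>) node;
        String segment = segments[i];

        if (segment.endsWith("[*]")) {
            String field = segment.substring(0, segment.length() - "[*]".length());
            Object value = map.get(field);
            if ((value instanceof List) == false) {
                throw new IllegalArgumentException("Field [" + field + "] is not a list");
            }
            List<Object> results = new ArrayList<>();
            for (Object element : (List<?>) value) {
                results.add(walk(element, segments, i + 1));
            }
            return results;
        }
        return walk(map.get(segment), segments, i + 1);
    }

    // For comparison only: collapses nested lists into a single flat list
    static List<Object> flatten(Object value) {
        List<Object> flat = new ArrayList<>();
        if (value instanceof List) {
            for (Object element : (List<?>) value) {
                flat.addAll(flatten(element));
            }
        } else {
            flat.add(value);
        }
        return flat;
    }
}
```

As an assumed input consistent with the second example, take a map whose `result` field holds two maps, each with a `key` list of `{a: ...}` maps carrying the values 1.1, 2.2 and 3.3, 4.4. Calling `extract` with `$.result[*].key[*].a` keeps one inner list per `key` array, while `flatten` on that result produces the single flat list.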