Skip to content

Pinned retriever #126401

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 100 commits into from
May 1, 2025
Merged

Pinned retriever #126401

merged 100 commits into from
May 1, 2025

Conversation

mridula-s109
Copy link
Contributor

@mridula-s109 mridula-s109 commented Apr 7, 2025

Pinned Retriever for Elasticsearch

This PR introduces a new pinned retriever for Elasticsearch, available under the search-business-rules plugin.

The pinned retriever allows users to explicitly pin specific documents (via ids or docs) so they always appear at the top of the search results, regardless of their organic relevance score. It is implemented as a compound retriever that rewrites internally to a PinnedQueryBuilder.


Key Features

  • Pin Documents by ID or Full Doc: Pin specific documents either by their _id or by providing inline doc content.
  • Custom Explanation Support: When used with "explain": true, the search response now includes a custom _explanation field for pinned results.
    Example:
    "description": "Pinned document by ids, original explanation:"
    

Example API Request

POST my-index/_search
{
  "explain": true,
  "retriever": {
    "pinned": {
      "ids": ["1", "2"],
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "title": "machine learning"
            }
          }
        }
      }
    }
  }
}

Sample Response

"_explanation": {
  "value": 1.7014124e+38,
  "description": "Pinned document, original explanation:",
  "details": [
    {
      "value": 1,
      "description": "doc [0] with an original score of [1.7014124E38] is at rank [1] from the following source queries.",
      "details": [
        {
          // additional explanation details...
        }
      ]
    }
  ]
}

@mridula-s109 mridula-s109 added >enhancement auto-backport Automatically create backport pull requests when merged :SearchOrg/Relevance Label for the Search (solution/org) Relevance team labels Apr 7, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @mridula-s109, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/search-eng (Team:SearchOrg)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/search-relevance (Team:Search - Relevance)

@mridula-s109 mridula-s109 enabled auto-merge (squash) April 7, 2025 21:43
@mridula-s109 mridula-s109 requested review from kderusso and a team April 7, 2025 21:44
Copy link
Member

@kderusso kderusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick pass through, I'm surprised this works without registering the retriever?

@mridula-s109 mridula-s109 marked this pull request as draft April 8, 2025 09:44
auto-merge was automatically disabled April 8, 2025 09:44

Pull request was converted to draft

@elasticsearchmachine
Copy link
Collaborator

Hi @mridula-s109, I've created a changelog YAML for you.

kderusso and others added 2 commits April 30, 2025 14:17
* Fix compilation error

* Remove SearchPlugin from META-INF

* Remove duplicate FeatureSpecification

* Add service as test resource

* Move to test

* More file moving

* Remove from module-info, remove file

* Make search business rules plugin extensible

* fix ent search plugin

* Delete dup files

* [CI] Auto commit changes from spotless

* Add module info back in

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Copy link
Member

@kderusso kderusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some hopefully final feedback! Thanks for your iterations on this.
As a best practice it would be good to request review from the Search Relevance team so everyone on the team is aware of this PR.

Awaiting CI and final changes but this is really close!

@mridula-s109 mridula-s109 requested a review from a team April 30, 2025 17:27
@mridula-s109 mridula-s109 requested a review from kderusso April 30, 2025 19:57
@mridula-s109
Copy link
Contributor Author

Some hopefully final feedback! Thanks for your iterations on this. As a best practice it would be good to request review from the Search Relevance team so everyone on the team is aware of this PR.

Awaiting CI and final changes but this is really close!

I have resolved all the comments @kderusso , hopefully the ci build passes through and everything works as intended.

Copy link
Contributor

@ioanatia ioanatia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I tested it locally.
Checked the comments from @kderusso from the last review and it looks to me that all have been addressed.

@@ -231,6 +231,7 @@ static TransportVersion def(int id) {
public static final TransportVersion INTRODUCE_FAILURES_LIFECYCLE = def(9_065_0_00);
public static final TransportVersion PROJECT_METADATA_SETTINGS = def(9_066_00_0);
public static final TransportVersion AGGREGATE_METRIC_DOUBLE_BLOCK = def(9_067_00_0);
public static final TransportVersion PINNED_RETRIEVER = def(9_068_0_00);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a side note - but when we backport this to 8.19, we need to pay extra attention on how we backport this transport version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah @mridula-s109 - we will create a PINNED_RETRIEVER_8_19 transport version

Copy link
Member

@kderusso kderusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for all the iterations!

@@ -231,6 +231,7 @@ static TransportVersion def(int id) {
public static final TransportVersion INTRODUCE_FAILURES_LIFECYCLE = def(9_065_0_00);
public static final TransportVersion PROJECT_METADATA_SETTINGS = def(9_066_00_0);
public static final TransportVersion AGGREGATE_METRIC_DOUBLE_BLOCK = def(9_067_00_0);
public static final TransportVersion PINNED_RETRIEVER = def(9_068_0_00);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah @mridula-s109 - we will create a PINNED_RETRIEVER_8_19 transport version

@mridula-s109 mridula-s109 merged commit b742b02 into main May 1, 2025
18 checks passed
@mridula-s109 mridula-s109 deleted the pinned-retriever branch May 1, 2025 12:50
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 126401

@mridula-s109
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

mridula-s109 added a commit to mridula-s109/elasticsearch that referenced this pull request May 2, 2025
* Created PinnedQueryRetriever Builder

* Added retriever build test file

* Changed the test to accept all the retrievers as its not licensed

* [CI] Auto commit changes from spotless

* Added integration test and fixed the code style issues

* Fixed merge issues

* Added cluster test feature to the integration tests

* Update docs/changelog/126401.yaml

* Registered retriever plugin

* Enhanced changelog description

* Added validation to the constructor creation

* Introduced validations in the code and incorporated in tests - compiling version

* Included PinnedRankDoc to enhance the explain query, validations for inputs anfd sorting. Fixed the tests after these changes

* Working on improving integration test and introducing cluster features

* Removed cluster test temporarily

* Got pinnedretriever yaml to working state without cluster

* [CI] Auto commit changes from spotless

* Resolved no source yaml error

* Cluster loaded successfully

* [CI] Auto commit changes from spotless

* Fixed integration test

* Made validate sort less strict

* Included shareddoc sorting in the validate sort

* All yaml issues resolved

* Explanation yaml files added

* Trying to add clustering to the test

* [CI] Auto commit changes from spotless

* Removed explicit feature specification

* Deleted the empty file

* Put the node feature in the proper place

* Added additional validation in ids and docs

* Cleaned the create pinned query validation

* made unit tests robust

* Remove query.es from version control

* Remove result.json from version control

* [CI] Auto commit changes from spotless

* Everything except explanation is fixed

* added duplicate doc test

* Applied checkstyle fix, spotless and a failing yaml

* Improvements based on PR comments

* [CI] Auto commit changes from spotless

* Modified the unit test to accomodate the change in createPinnedQuery

* Split the yaml to test for basic and trial, cleanedup and acted on all the comments

* Remove result.json and query.es from version control

* Removed unnecessary comments

* Removed redundant file

* Fixing CI build error

* [CI] Auto commit changes from spotless

* Removed pinnedBy as it wasnt necessary

* Removed unnecessary ToXContent Information

* Fixed transport version charges and cleaned up null checks:
"

* [CI] Auto commit changes from spotless

* Retriever status changed to 9.1 version

* cleaned up 2 yamltestsuite for different licenses

* Trying to get the clustering works

* Cleaned up the yaml clustering and also reorganised the yaml tests

* Unnecessary file introduction removed

* Reverted the plugins to the previous state as the changes werent necessary

* Removed unnecessary transport versioning from pinnedrankdoc

* reverted

* [CI] Auto commit changes from spotless

* Edited the SearchBusinessRules to remove the class from getFeatures

* [CI] Auto commit changes from spotless

* Cleaned up Pinned retriever to allow only id or docs

* Added more test to the pinned retriever

* did spotless

* Introduced new transport versioning

* Cleaning it up

* Resolved validate module error

* BWT issues

* [CI] Auto commit changes from spotless

* fix NPE occuring in ci

* trying to fix duplicate feature issue

* [CI] Auto commit changes from spotless

* modified the tests

* Playing with pinned retriever CI  (elastic#127530)

* Fix compilation error

* Remove SearchPlugin from META-INF

* Remove duplicate FeatureSpecification

* Add service as test resource

* Move to test

* More file moving

* Remove from module-info, remove file

* Make search business rules plugin extensible

* fix ent search plugin

* Delete dup files

* [CI] Auto commit changes from spotless

* Add module info back in

---------

Co-authored-by: elasticsearchmachine <[email protected]>

* top document no pinned

* Removing this to see if it works without this transport version check, as i had made changes related to the cluster error

* Fixed the retriever builder comments and error message

* removed the unnencessary explain

* Fixed all the tests in the yaml file

* added extra test case

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: Kathleen DeRusso <[email protected]>
(cherry picked from commit b742b02)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
mridula-s109 added a commit that referenced this pull request May 2, 2025
* Pinned retriever (#126401)

* Created PinnedQueryRetriever Builder

* Added retriever build test file

* Changed the test to accept all the retrievers as its not licensed

* [CI] Auto commit changes from spotless

* Added integration test and fixed the code style issues

* Fixed merge issues

* Added cluster test feature to the integration tests

* Update docs/changelog/126401.yaml

* Registered retriever plugin

* Enhanced changelog description

* Added validation to the constructor creation

* Introduced validations in the code and incorporated in tests - compiling version

* Included PinnedRankDoc to enhance the explain query, validations for inputs anfd sorting. Fixed the tests after these changes

* Working on improving integration test and introducing cluster features

* Removed cluster test temporarily

* Got pinnedretriever yaml to working state without cluster

* [CI] Auto commit changes from spotless

* Resolved no source yaml error

* Cluster loaded successfully

* [CI] Auto commit changes from spotless

* Fixed integration test

* Made validate sort less strict

* Included shareddoc sorting in the validate sort

* All yaml issues resolved

* Explanation yaml files added

* Trying to add clustering to the test

* [CI] Auto commit changes from spotless

* Removed explicit feature specification

* Deleted the empty file

* Put the node feature in the proper place

* Added additional validation in ids and docs

* Cleaned the create pinned query validation

* made unit tests robust

* Remove query.es from version control

* Remove result.json from version control

* [CI] Auto commit changes from spotless

* Everything except explanation is fixed

* added duplicate doc test

* Applied checkstyle fix, spotless and a failing yaml

* Improvements based on PR comments

* [CI] Auto commit changes from spotless

* Modified the unit test to accomodate the change in createPinnedQuery

* Split the yaml to test for basic and trial, cleanedup and acted on all the comments

* Remove result.json and query.es from version control

* Removed unnecessary comments

* Removed redundant file

* Fixing CI build error

* [CI] Auto commit changes from spotless

* Removed pinnedBy as it wasnt necessary

* Removed unnecessary ToXContent Information

* Fixed transport version charges and cleaned up null checks:
"

* [CI] Auto commit changes from spotless

* Retriever status changed to 9.1 version

* cleaned up 2 yamltestsuite for different licenses

* Trying to get the clustering works

* Cleaned up the yaml clustering and also reorganised the yaml tests

* Unnecessary file introduction removed

* Reverted the plugins to the previous state as the changes werent necessary

* Removed unnecessary transport versioning from pinnedrankdoc

* reverted

* [CI] Auto commit changes from spotless

* Edited the SearchBusinessRules to remove the class from getFeatures

* [CI] Auto commit changes from spotless

* Cleaned up Pinned retriever to allow only id or docs

* Added more test to the pinned retriever

* did spotless

* Introduced new transport versioning

* Cleaning it up

* Resolved validate module error

* BWT issues

* [CI] Auto commit changes from spotless

* fix NPE occuring in ci

* trying to fix duplicate feature issue

* [CI] Auto commit changes from spotless

* modified the tests

* Playing with pinned retriever CI  (#127530)

* Fix compilation error

* Remove SearchPlugin from META-INF

* Remove duplicate FeatureSpecification

* Add service as test resource

* Move to test

* More file moving

* Remove from module-info, remove file

* Make search business rules plugin extensible

* fix ent search plugin

* Delete dup files

* [CI] Auto commit changes from spotless

* Add module info back in

---------

Co-authored-by: elasticsearchmachine <[email protected]>

* top document no pinned

* Removing this to see if it works without this transport version check, as i had made changes related to the cluster error

* Fixed the retriever builder comments and error message

* removed the unnencessary explain

* Fixed all the tests in the yaml file

* added extra test case

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: Kathleen DeRusso <[email protected]>
(cherry picked from commit b742b02)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java

* Fixed the compile issue

* changed pinned retriever name
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request May 2, 2025
In elastic#127623 we backported elastic#127299 and added a backport transport version
for it - `ESQL_AGGREGATE_METRIC_DOUBLE_BLOCK_8_19` aka `8_841_0_24`.
This brings that version forwards to `main` and adds support for parsing
streams with that version.

In elastic#127639 we backported elastic#126401 and added a backport transport version
for it - `PINNED_RETRIEVER_8_19` aka `8_841_0_23`. This brings that
version forwards to `main` and adds support for parsing streams with
that versions.

In elastic#127633 we a claimed a backport transport version to backport elastic#127314 -
`INTRODUCE_FAILURES_LIFECYCLE_BACKPORT_8_19` aka `8_841_0_23`. That's
the same versions as `PINNED_RETRIEVER_8_19`. It's just that this one is
in `main` and `PINNED_RETRIEVER_8_19` is in `8.19`. To allow me to bring
`PINNED_RETRIEVER_8_19` for wards I've had to revert elastic#127633.

Closes elastic#127667
elasticsearchmachine pushed a commit that referenced this pull request May 3, 2025
In #127623 we backported #127299 and added a backport transport version
for it - `ESQL_AGGREGATE_METRIC_DOUBLE_BLOCK_8_19` aka `8_841_0_24`.
This brings that version forwards to `main` and adds support for parsing
streams with that version.

In #127639 we backported #126401 and added a backport transport version
for it - `PINNED_RETRIEVER_8_19` aka `8_841_0_23`. This brings that
version forwards to `main` and adds support for parsing streams with
that versions.

In #127633 we a claimed a backport transport version to backport #127314
- `INTRODUCE_FAILURES_LIFECYCLE_BACKPORT_8_19` aka `8_841_0_23`. That's
the same versions as `PINNED_RETRIEVER_8_19`. It's just that this one is
in `main` and `PINNED_RETRIEVER_8_19` is in `8.19`. To allow me to bring
`PINNED_RETRIEVER_8_19` for wards I've had to revert #127633.

Closes #127667
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged backport pending >enhancement :SearchOrg/Relevance Label for the Search (solution/org) Relevance team v8.19.0 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants