CAT API documents count incorrect #127354
Comments
semantic_text utilizes nested documents internally. I am pretty sure that
Pinging @elastic/search-eng (Team:SearchOrg)
Pinging @elastic/search-relevance (Team:Search - Relevance)
@benwtrent is correct, this is expected behavior as
GET _cat/count/<index_name> does return a count consistent with the _count and _search endpoints.
@mrklaney That's because the _cat/count API uses a search query to get document counts, and a search query does not count nested documents. In contrast, the _cat/indices API reports the number of documents in Lucene, which does include nested documents. These are different operations. ++ for @kderusso's suggestion to make this behavior clearer in the documentation.
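The difference described above can be illustrated with a simplified model. It assumes each semantic_text chunk is stored as one hidden nested Lucene document next to its parent document. The function names and chunk counts are hypothetical, chosen only for illustration; this is a sketch of the counting logic, not Elasticsearch code.

```python
def search_visible_count(chunks_per_doc):
    """Count only top-level documents, as a search-based count would."""
    return len(chunks_per_doc)


def lucene_doc_count(chunks_per_doc):
    """Count every Lucene document: each parent plus its nested chunk documents."""
    return sum(1 + chunks for chunks in chunks_per_doc)


# Hypothetical index: three documents whose semantic_text field
# was split into 3, 1, and 2 chunks respectively.
chunks_per_doc = [3, 1, 2]

print(search_visible_count(chunks_per_doc))  # what _count / _cat/count would report
print(lucene_doc_count(chunks_per_doc))      # what docs.count in _cat/indices would report
```

Under this model the two numbers diverge as soon as any document carries at least one chunk, which matches the behavior reported in this issue.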
Elasticsearch Version
8.18.0
Installed Plugins
No response
Java Version
JVM home [.../elasticsearch-8.18.0/jdk.app/Contents/Home], using bundled JDK [true]
OS Version
Darwin Marks-MacBook-Pro.local 24.4.0 Darwin Kernel Version 24.4.0: Fri Apr 11 18:33:39 PDT 2025; root:xnu-11417.101.15~117/RELEASE_ARM64_T6020 arm64
Problem Description
The _cat/indices API reports a docs.count that looks incorrect for indices that have a field of type semantic_text.
The _count and _search APIs agree with each other on what appears to be the correct document count.
Steps to Reproduce
Reindex into a destination index that has a semantic_text field, which generates embeddings.
Tested using both .elser-2-elasticsearch and .multilingual-e5-small models.
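Once the reindex has finished, the three counts can be compared side by side. The following is a minimal sketch, assuming an unsecured cluster reachable at http://localhost:9200 (a default 8.x install uses HTTPS and authentication, so the URL and transport would need adjusting). The helper names http_get_json and compare_counts and the index name are hypothetical.

```python
import json
from urllib.request import urlopen


def http_get_json(url):
    """Fetch a URL and decode the JSON body (assumes no auth and no TLS)."""
    with urlopen(url) as resp:
        return json.load(resp)


def compare_counts(index, base_url="http://localhost:9200", get_json=http_get_json):
    """Return the document counts reported by three different APIs for one index."""
    # _cat/indices reports Lucene-level counts, including hidden nested documents.
    cat_indices = get_json(
        f"{base_url}/_cat/indices/{index}?format=json&h=index,docs.count"
    )
    # _count and _cat/count are search-based and count top-level documents only.
    count = get_json(f"{base_url}/{index}/_count")
    cat_count = get_json(f"{base_url}/_cat/count/{index}?format=json")
    return {
        "cat_indices_docs_count": int(cat_indices[0]["docs.count"]),
        "count_api": int(count["count"]),
        "cat_count": int(cat_count[0]["count"]),
    }
```

Calling compare_counts("my-dest-index") against a running cluster, where my-dest-index stands in for the destination index name, returns a dictionary with the three values. On an index with a populated semantic_text field, the first value is expected to be larger than the other two.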
Logs (if relevant)
No response