Skip to content

Rare terms aggregation false **positive** fix #126884

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 24, 2025

Conversation

iamgd67
Copy link
Contributor

@iamgd67 iamgd67 commented Apr 16, 2025

I found that rare terms aggregation can return false positive results (some of the returned results are not real rear).
I traced it down and find out it's a bug in CuckooFilter's merge methord.

   if (isSetMode == false && other.isSetMode) {
        other.hashes.forEach(this::add);
    }

should be

   if (isSetMode == false && other.isSetMode) {            
        other.hashes.forEach(this::addHash);
    }

@iamgd67 iamgd67 requested a review from a team as a code owner April 16, 2025 01:22
@elasticsearchmachine elasticsearchmachine added v9.1.0 needs:triage Requires assignment of a team area label external-contributor Pull request authored by a developer outside the Elasticsearch team labels Apr 16, 2025
@limotova limotova added :Analytics/Aggregations Aggregations and removed needs:triage Requires assignment of a team area label labels Apr 16, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 16, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@iamgd67
Copy link
Contributor Author

iamgd67 commented Apr 22, 2025

hello there, it's about a week since this pr was created, can anyone review this please?

@nik9000 nik9000 self-assigned this Apr 24, 2025
@nik9000 nik9000 added the >bug label Apr 24, 2025
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'm going to make the robot happy and get this in.

@nik9000 nik9000 enabled auto-merge (squash) April 24, 2025 19:13
@nik9000 nik9000 added auto-backport Automatically create backport pull requests when merged v9.0.1 labels Apr 24, 2025
@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

Changelogbot, I've pushed a changelog.

@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

@iamgd67, what version are you using?

@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

Sorry for the long delay. Busy looking at very different areas, but yeah. Well done. Thanks for waiting!

@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

elasticmachine, test this please

@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

elasticmachine test this please

@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

buildkite run elasticsearch-ci

@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

buildkite test this

@nik9000
Copy link
Member

nik9000 commented Apr 24, 2025

buildkite test this

@nik9000 nik9000 merged commit 77c77f2 into elastic:main Apr 24, 2025
18 of 19 checks passed
nik9000 pushed a commit to nik9000/elasticsearch that referenced this pull request Apr 24, 2025
I found that `rare_terms` aggregation can return **false positive**
results (some of the returned results are not real).  I traced it
down and find out it's a bug in `CuckooFilter`'s merge methord.

```
        if (isSetMode == false && other.isSetMode) {
            other.hashes.forEach(this::add);
        }
```

should be 

```
        if (isSetMode == false && other.isSetMode) {            
            other.hashes.forEach(this::addHash);
        }
```
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.18
8.x
9.0

nik9000 pushed a commit to nik9000/elasticsearch that referenced this pull request Apr 24, 2025
I found that `rare_terms` aggregation can return **false positive**
results (some of the returned results are not real).  I traced it
down and find out it's a bug in `CuckooFilter`'s merge methord.

```
        if (isSetMode == false && other.isSetMode) {
            other.hashes.forEach(this::add);
        }
```

should be 

```
        if (isSetMode == false && other.isSetMode) {            
            other.hashes.forEach(this::addHash);
        }
```
nik9000 pushed a commit to nik9000/elasticsearch that referenced this pull request Apr 24, 2025
I found that `rare_terms` aggregation can return **false positive**
results (some of the returned results are not real).  I traced it
down and find out it's a bug in `CuckooFilter`'s merge methord.

```
        if (isSetMode == false && other.isSetMode) {
            other.hashes.forEach(this::add);
        }
```

should be 

```
        if (isSetMode == false && other.isSetMode) {            
            other.hashes.forEach(this::addHash);
        }
```
elasticsearchmachine pushed a commit that referenced this pull request Apr 24, 2025
I found that `rare_terms` aggregation can return **false positive**
results (some of the returned results are not real).  I traced it
down and find out it's a bug in `CuckooFilter`'s merge methord.

```
        if (isSetMode == false && other.isSetMode) {
            other.hashes.forEach(this::add);
        }
```

should be 

```
        if (isSetMode == false && other.isSetMode) {            
            other.hashes.forEach(this::addHash);
        }
```

Co-authored-by: iamgd67 <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Apr 24, 2025
I found that `rare_terms` aggregation can return **false positive**
results (some of the returned results are not real).  I traced it
down and find out it's a bug in `CuckooFilter`'s merge methord.

```
        if (isSetMode == false && other.isSetMode) {
            other.hashes.forEach(this::add);
        }
```

should be 

```
        if (isSetMode == false && other.isSetMode) {            
            other.hashes.forEach(this::addHash);
        }
```

Co-authored-by: iamgd67 <[email protected]>
@iamgd67
Copy link
Contributor Author

iamgd67 commented Apr 25, 2025

@iamgd67, what version are you using?

thanks for your review and merge. I am using 7.17.8

nik9000 added a commit that referenced this pull request Apr 25, 2025
I found that `rare_terms` aggregation can return **false positive**
results (some of the returned results are not real).  I traced it
down and find out it's a bug in `CuckooFilter`'s merge methord.

```
        if (isSetMode == false && other.isSetMode) {
            other.hashes.forEach(this::add);
        }
```

should be 

```
        if (isSetMode == false && other.isSetMode) {            
            other.hashes.forEach(this::addHash);
        }
```

Co-authored-by: iamgd67 <[email protected]>
@nik9000
Copy link
Member

nik9000 commented Apr 25, 2025

thanks for your review and merge. I am using 7.17.8

Oof. The earliest version I can get this to is 8.17. I've already got it back to 8.18 but would have to manaully pull it one more back if you want it there. Sorry.

@iamgd67 iamgd67 deleted the bugfix/cuckooFilterMergeBug branch April 27, 2025 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations auto-backport Automatically create backport pull requests when merged >bug external-contributor Pull request authored by a developer outside the Elasticsearch team Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.1 v8.19.0 v9.0.1 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants