Skip to content

[Entitlements] Fix: consider case sensitiveness differences #126990

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ldematte
Copy link
Contributor

@ldematte ldematte commented Apr 17, 2025

Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive - see https://discuss.elastic.co/t/not-entitled-component-repository-s3-after-update-from-8-17-4-to-8-18-0/377166 for a real issue where we hit this problem.
This PR fixes the problem by abstracting String comparison operations (moving them to FileUtils) and making them case sensitive or not based on the host OS.

Closes #127047

@ldematte ldematte added >bug test-windows Trigger CI checks on Windows auto-backport Automatically create backport pull requests when merged v8.18.1 v8.19.0 v9.0.1 v9.1.0 :Core/Infra/Entitlements Entitlements infrastructure labels Apr 17, 2025
@ldematte ldematte requested a review from a team as a code owner April 17, 2025 12:51
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Apr 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine
Copy link
Collaborator

Hi @ldematte, I've created a changelog YAML for you.

Copy link
Contributor

@prdoyle prdoyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we'd be better of solving this via normalization rather than changing the comparator? If we normalize to lowercase on Windows, then the comparator will just work.

@ldematte
Copy link
Contributor Author

If we normalize to lowercase on Windows, then the comparator will just work.

I'm not sure this is the right way to do it; probably in the case of a Windows FS it does not matter (the char comparison in WindowsPath do lower case after all), but usually "case insensitive comparison" and "normalize to lower case" are different. Since we already have a comparator, I thought it was a comparable amount of work/comparable change.

@prdoyle
Copy link
Contributor

prdoyle commented Apr 17, 2025

I just worry that our comparator is going to become rather incomprehensible. Adding toLowerCase in the normalization on Windows would be a one-liner.

Do you have an example where normalize-then-compare is different from a case-insensitive compare?

@ldematte
Copy link
Contributor Author

@prdoyle
Copy link
Contributor

prdoyle commented Apr 17, 2025

Do I understand correctly that the concern would be that a user would supply two different path strings that compare equal under String.toLowercase(), but are actually distinct? I don't think the comparator logic in this PR solves that problem any better, because it's also doing toLowercase.

@prdoyle
Copy link
Contributor

prdoyle commented Apr 17, 2025

I notice there's a reply to the above post that mentions toLowercase(toUppercase(c)) as the solution. Perhaps we ought to do that either way, whether we use normalization or custom comparator logic?

@ldematte
Copy link
Contributor Author

I don't think the comparator logic in this PR solves that problem any better, because it's also doing toLowercase.

Yeah

in the case of a Windows FS it does not matter (the char comparison in WindowsPath do lower case after all)

the startsWith thing for isParent is OK though.I wonder if we should "do better", but ... better than WindowsPath? Not sure is worth it at all. Still, I like the approach of fixing the comparator better; I don't like the idea of having a "broken" comparator around, that happens to work because we normalized strings beforehand.

@ldematte
Copy link
Contributor Author

On the other hand it's true that normalization happens once, so there is a little performance benefit there... IDK.

@prdoyle
Copy link
Contributor

prdoyle commented Apr 17, 2025

the startsWith thing for isParent is OK though.I wonder if we should "do better", but ... better than WindowsPath?

Agreed, the idea of overachieving relative to WindowsPath seems very much like a goal not worth pursuing.

Not sure is worth it at all. Still, I like the approach of fixing the comparator better; I don't like the idea of having a "broken" comparator around, that happens to work because we normalized strings beforehand.

But this is the heart of why normalization is beneficial: it reduces the space of cases to consider, and then subsequent processing is greatly simplified. The comparator isn't "broken" if the cases it doesn't handle are ones that never occur thanks to normalization.

(In fact, the only reason I didn't propose normalizing the path separators is that there exists no reasonable separator character that would solve our sorting problem.)

@rjernst rjernst self-requested a review April 17, 2025 14:33
@prdoyle
Copy link
Contributor

prdoyle commented Apr 17, 2025

A note: at the team meeting, we chose to proceed with improved comparator instead of normalization, primarily based on the performance impact of doing a toLowercase() on every string we look up every time.

@ldematte ldematte requested a review from rjernst April 22, 2025 13:32
Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ldematte ldematte enabled auto-merge (squash) April 23, 2025 13:10
return 0;
}
return Character.compare(c1, c2);
return Character.compare(Character.toLowerCase(c1), Character.toLowerCase(c2));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a significance to the fact that we switched from uppercase to lowercase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, just a suggestion from Ryan to make it shorter and more readable

Copy link
Contributor

@prdoyle prdoyle Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toUpperCase is equally short and readable, no?

I just think if we use one that's different from the actual filesystem code you referenced, we're asking for unexpected corner cases.

@ldematte ldematte merged commit 002fef7 into elastic:main Apr 23, 2025
22 checks passed
ldematte added a commit to ldematte/elasticsearch that referenced this pull request Apr 23, 2025
…126990)

Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive.
This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.
ldematte added a commit to ldematte/elasticsearch that referenced this pull request Apr 23, 2025
…126990)

Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive.
This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.18
8.x
9.0

ldematte added a commit to ldematte/elasticsearch that referenced this pull request Apr 23, 2025
…126990)

Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive.
This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.
elasticsearchmachine pushed a commit that referenced this pull request Apr 23, 2025
…#127276)

Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive.
This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.
elasticsearchmachine pushed a commit that referenced this pull request Apr 23, 2025
…#127275)

Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive.
This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.
elasticsearchmachine pushed a commit that referenced this pull request Apr 23, 2025
…#127274)

Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive.
This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.
@ldematte ldematte deleted the entitlements/fileaccesstree-fix-windows-casing branch April 24, 2025 06:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >bug :Core/Infra/Entitlements Entitlements infrastructure Team:Core/Infra Meta label for core/infra team test-windows Trigger CI checks on Windows v8.18.1 v8.19.0 v9.0.1 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Files entitlements need to be case sensitive on Windows
4 participants