-
Notifications
You must be signed in to change notification settings - Fork 10.3k
Fix use of precedence in endpoint routing DFA (#20801) #21200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is a PR against release/3.1 mostly for staging purposes and so we can review the change with the compat switch in place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+need shiproom template
Sure, I don't really intend for this to be ready for tactics yet. We don't have a branch for it to land in right now... |
👀 Tactics can review it ahead of time, and then we can hold it until the branch opens if you want. |
I suppose that's ultimately a good thing. Since we're giving out a build of this to someone, I'd really like to give out a build that has whatever solution we decide (opt-in vs opt-out). |
Hello human! Please make sure you've included the Shiproom Template in a comment or (preferably) the PR description. Also, make sure this PR is not marked as a draft and is ready-to-merge. |
Tactics approved this for 3.1.5 today, but would prefer to have it be an opt-in switch for 3.1 for now. We can revisit that if we continue to hear from people affected by this who aren't finding the switch, but we should default to not changing behavior unless it's from clearly broken to working in servicing. |
Should we set this switch in the template? I would expect we're going to hear a lot more about this. |
? The proxy uses 5.0. |
Ok that helps a lot then. |
994bdf0
to
4f5bfd9
Compare
@JamesNK - can you take a second look. I changed the pattern for the appcontext switch to be similar to our discussion, and I flipped the default behavior from fixed to broken:
|
src/Http/Routing/test/UnitTests/Matching/RouteMatcherConformanceTest.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will test duplication be removable in 5.0 when the default behavior changes? If so, worth adding a comment to those tests for future generations?
src/Http/Routing/test/UnitTests/Matching/DfaMatcherConformanceTest.cs
Outdated
Show resolved
Hide resolved
src/Http/Routing/test/UnitTests/Matching/RouteMatcherConformanceTest.cs
Outdated
Show resolved
Hide resolved
The fix is already in 5.0. This change will/should be noop merged into 5.0 and all of this code will just melt away. We're not porting forward the appcontext switch, etc. |
* Fix use of precedence in endpoint routing DFA Fixes: #18677 Fixes: #16579 This is a change to how sorting is use when building endpoint routing's graph of nodes that is eventually transformed into the route table. There were bugs in how this was done that made it incompatible in some niche scenarios both with previous implementations and how we describe the features in the abstract. There are a wide array of cases that might have been impacted by this bug because routing is a pattern language. Generally the bugs will involve a catch-all, and some something that changes ordering of templates. Issue #18677 has the simplest repro for this, the following templates would not behave as expected: ``` a/{*b} {a}/{b} ``` One would expect any URL Path starting with `/a` to match the first route, but that's not what happens. --- The change supports an opt-in via the following AppContext switch: ``` Microsoft.AspNetCore.Routing.UseCorrectCatchAllBehavior ``` Set to true to enable the correct behavior. --- The root cause of this bug was an issue in how the algorithm used to be build the DFA was designed. Specifically that it uses a BFS to build the graph, and it uses an up-front one-time sort of endpoints in order to drive that BFS. The building of the graph has the expectation that at each level, we will process **all** literal segments (`/a`) and then **all** parameter segments (`/{a}`) and then **all** catch-all segments (`/{*a}`). Routing defines a concept called *precedence* that defines the *conceptual* order in while segments types are ordered. So there are two problems: - We sort based on criteria other than precedence (#16579) - We can't rely on a one-time sort, it needs to be done at each level (#18677) --- The fix is to repeat the sort operation at each level and use precedence as the only key for sorting (as dictated by the graph building algo). We do a sort of the matches of each node *after* building the precedence-based part of the DFA, based on the full sorting criteria, to maintain compatibility. * Add test
4f5bfd9
to
e317089
Compare
This fix is ready to go once we have a branch. I've kept the no-merge label until then. |
Marked as approved. Noticed it was still in TacticView but was approved last meeting :). |
Still No Merge until 3.1.5 opens, as @rynowak said. |
Port of #20801
Description
Endpoint routing in 2.2->3.1 builds its execution graph incorrectly when a catch-all route (ex:
{**path}
) is used with higher order than a non-catchall route that overlaps it/test/route/{id?}
.As a result the matching behavior of routing does not match the documented behavior. The act of adding a route can cause an existing route to stop matching requests that previously worked.
The case that's impacted (using catch-all) is typical for a CMS-like "slug route" or for other frameworks like OData that integrate with routing or for a proxy.
Customer Impact
Yes. #18677 and #16579.
We've also received reports from internal partners.
Routing does not behave as documented. Requests that should match the catch-all route will not be matched. They may either be handled by lower priority routes, or "fall through" and result in a 404.
Depending on the scenario it may be possible to remove or reorder routes. It might be acceptable for the user's use case, but there's no workaround to restore the documented behavior of routing.
The other possible workaround is to disable endpoint routing and use the legacy routing system. This is a big backwards change for many applications. Depending on what features they are using this is not an option - Blazor Server and for instance is not supported on legacy routing.
In practice the cases that I have seen impacted are totally blocked. The user is either blocked from writing an application that functions the way they want, or they are blocked from adding features to an existing application.
Regression?
This is not a regression compared to the previous versions of endpoint routing, this bug existed with endpoint routing in 2.2 and 3.0.
Compared to legacy routing this is a regression. An app that functions on 2.1 could go through our migration guide and be broken by this bug.
Risk
Medium: The bug and fix are well understood and easily testable. We have a matrix of tests that runs with new and old implementations and are able to verify that the behavior is the same with 2.1 after the fix.
However, there's a risk that we could break an application by fixing the bug. Routing is a complex area and it's reasonable that some user might have tested their application and been happy with how it worked.
There's a set of possible cases where the fixed behavior will cause a different route to match than with the bug. Put another way the fix takes the application from a well behaved and stable (but incorrect) state to another well behaved and stable (but correct) state
Fixes: #18677
Fixes: #16579
This is a change to how sorting is use when building endpoint routing's graph of
nodes that is eventually transformed into the route table. There were
bugs in how this was done that made it incompatible in some niche
scenarios both with previous implementations and how we describe the
features in the abstract.
There are a wide array of cases that might have been impacted by this
bug because routing is a pattern language. Generally the bugs will involve a
catch-all, and some something that changes ordering of templates.
Issue #18677 has the simplest repro for this, the following templates
would not behave as expected:
One would expect any URL Path starting with
/a
to match the firstroute, but that's not what happens.
The change supports an opt-out via the following AppContext switch:
Set to true to enable the buggy behavior for compatibility.
The root cause of this bug was an issue in how the algorithm used to be
build the DFA was designed. Specifically that it uses a BFS to build the
graph, and it uses an up-front one-time sort of endpoints in order to
drive that BFS.
The building of the graph has the expectation that at each level, we
will process all literal segments (
/a
) and then all parametersegments (
/{a}
) and then all catch-all segments (/{*a}
). Routingdefines a concept called precedence that defines the conceptual
order in while segments types are ordered.
So there are two problems:
(Odd results mixing fully-tokenized attribute-route and wildcard-map within endpoint-routing #18677)
The fix is to repeat the sort operation at each level and use precedence
as the only key for sorting (as dictated by the graph building algo).
We do a sort of the matches of each node after building the
precedence-based part of the DFA, based on the full sorting criteria, to
maintain compatibility.