-
Notifications
You must be signed in to change notification settings - Fork 63
Test </p> and </br> in SVG #135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Also need to test this in regular parsing mode (without |
Done. |
#data | ||
<math></p><foo> | ||
#errors | ||
10: HTML end tag “p” in a foreign namespace context. | ||
#document | ||
| <html> | ||
| <head> | ||
| <body> | ||
| <math math> | ||
| <p> | ||
| <foo> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test agrees with Firefox but I'm not entirely sure I see how the spec gives this parsing. (I assume the others are similar, this is just the one I happened to look into.)
As I understand it, the <math>
should go through a few insertion modes, inserting html
, head
, and body
elements before being processed in the "in body" insertion mode. The rules say to insert a math
element in the MathML
namespace.
The </p>
is handled by the rules for parsing in a foreign context under the "any other end tag." Step 2 gives a parse error. In step 6, node (the body
element) is an element in the HTML namespace so step 7 says to process according to the rules in the current insertion mode which is still "in body."
The rules for </p>
in the "in body" insertion mode say that if there isn't a p
element in button scope (and there is not), then this is another parse error (which is missing from this test) and to insert a p
element.
Inserting the p
should happen at the appropriate place for inserting a node. No override target is specified so target is the current node (the math
element). Foster parenting isn't enabled so the "adjusted insertion location [is] inside target, after its last child (if any)."
As far as I can tell, this should cause the following errors.
- expected doctype token
- unexpected end tag in foreign content (from the parsing in foreign context)
- unexpected end tag (from in body)
- unexpected EOF (because
foo
is not closed)
And the DOM should have the p
and foo
elements as children of math
.
There must be something wrong with my analysis, but I can't figure out what it is. Any help is appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading your analysis, I'm no longer convinced the test is correct either, but maybe you just managed to confuse me :)
(I didn't check the number of parse errors before.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test agrees with Firefox, Safari, and Chrome. I must be missing something. The Gumbo parser (at least the updated version that appears in Nokogumbo and now in Nokogiri) parses based on my understanding and puts the p
and foo
elements as children of math
.
I just checked html5ever
. It doesn't have an up-to-date html5lib-tests
submodule so I added the test and it produces the same output as Gumbo (including putting the foo
element in the MathML namespace):
input: <math></p><foo>
got:
| <html>
| <head>
| <body>
| <math math>
| <p>
| <math foo>
expected:
| <html>
| <head>
| <body>
| <math math>
| <p>
| <foo>
Could this be a spec bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed an issue in whatwg/html. I suspect I'm just misreading the spec, but I'm not sure where.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also whatwg/html#5113 and whatwg/html#6736
I believe your reading is correct - and that it was pointed out that this created a dangerous loophole for exploiting sanitizer round tripping. I'm not certain but I think maybe due to the nature of the problem vendors and some libraries may have addressed it before the spec was actually updated to reflect the necessary change.
Oh, I merged this before the corresponding spec PR; apologies for the resulting confusion. |
This is because of a change in the HTML5 parsing spec to special case closing </p> and </br> tags in foreign content (svg, mathml). See whatwg/html#5113 and html5lib/html5lib-tests#135
See whatwg/html#6736