-
Notifications
You must be signed in to change notification settings - Fork 39
Fix implied photo parsing #191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Mf2/Parser.php
Outdated
@@ -1166,8 +1166,8 @@ public function parseImpliedPhoto(\DOMElement $e) { | |||
$xpaths = array( | |||
'./img', | |||
'./object', | |||
'./*[not(contains(concat(" ", @class), " h-"))]/img[count(preceding-sibling::img)+count(following-sibling::img)=0]', | |||
'./*[not(contains(concat(" ", @class), " h-"))]/object[count(preceding-sibling::object)+count(following-sibling::object)=0]', | |||
'./*[count(preceding-sibling::*)+count(following-sibling::*)=0][not(contains(concat(" ", @class), " h-"))]/img[count(preceding-sibling::img)+count(following-sibling::img)=0]', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this XPath a little easier to read perhaps? I still need to test it, but seems like something as follows should do the same without requiring arithmetic:
./*[not(contains(concat(" ", @class), " h-")) and count(../*) = 1 and count(img) = 1]/img
count(../*)
counts the number of child elements of the parent. In other words our*
is only matched if it is the only child (= 1
).count(img)
counts the number of childimg
elements within. So*
is only matched if it contains only a singleimg
(= 1
).
This gets rid of all the sibling counting, summations, etc. Could also be adapted for object
elements.
Simplified xpath count() using traversal. Added selectors from spec as comments. Moved child h-* check into xpaths. Added check for xpath query() returning false.
@@ -1078,7 +1078,7 @@ public function parseH(\DOMElement $e, $is_backcompat = false, $has_nested_mf = | |||
$photo = $this->parseImpliedPhoto($e); | |||
|
|||
if ($photo !== false) { | |||
$return['photo'][] = $this->resolveUrl($photo); | |||
$return['photo'][] = $photo; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without resolving here, the implied values from img.h-x[src]
and object.h-x[data]
will never get resolved. Why not keep resolveUrl()
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add resolveUrl to those two. My thinking is parseImpliedPhoto() should return an absolute URL or false.
LGTM 👍 |
Fixes #190