Properly handle nested objects #31

Closed
aaronpk opened this issue Jul 5, 2014 · 7 comments
@aaronpk
Member

aaronpk commented Jul 5, 2014

It seems the parser is not handling nested objects properly.

For example, this URL: http://aaronparecki.com/notes/2014/07/04/2/indiewebcamp-latergram

It appears the comment authors and comment URLs all show up under the main h-entry when in reality they should be under children of the main h-entry as their own h-cite objects.

Compare the result of the PHP parser

@mmitchellg5

I believe this is related to this section of the microformats wiki:

http://microformats.org/wiki/microformats-2 Quote:

FOR PARSERS ONLY:

Without a property class name like 'p-org' holding all the nested objects together, we need to introduce another array for nested children (similar to the existing DOM element notion of children) of a microformat that are not attached to a specific property:

Parsed JSON:

{
  "items": [{
    "type": ["h-card"],
    "properties": {
      "name": ["Mitchell Baker"],
      "url": ["http://blog.lizardwrangler.com/"]
    },
    "children": [{
      "type": ["h-card", "h-org"],
      "properties": {
        "name": ["Mozilla Foundation"],
        "url": ["http://mozilla.org/"]
      }
    }]
  }]
}

Since there's no property class name on the element with classes 'h-card' and 'h-org', the microformat representing that element is collected into the children array.

Such a nested microformat implies some relationship (containment, being related), but is not as useful as if the nested microformat was a specific property of its parent.

For this reason it's recommended that authors should not publish nested microformats without a property class name, and instead, when nesting microformats, authors should always specify a property class name (like 'p-org') on the same element as the root class name(s) of the nested microformat(s) (like 'h-card' and/or 'h-org').

This does not appear to be implemented yet.
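
For anyone working around this in the meantime, here is a minimal Ruby sketch of how a consumer might walk parsed output of the shape quoted from the wiki. It uses only the standard `json` library and plain hashes, not the gem's own objects. The `each_microformat` helper is a hypothetical name introduced for illustration. It visits nested microformats whether they sit in a `children` array or are embedded as a property value.

```ruby
require "json"

# Parsed output in the shape of the wiki example quoted in this comment.
parsed = JSON.parse(<<~EOS)
  {
    "items": [{
      "type": ["h-card"],
      "properties": {
        "name": ["Mitchell Baker"],
        "url": ["http://blog.lizardwrangler.com/"]
      },
      "children": [{
        "type": ["h-card", "h-org"],
        "properties": {
          "name": ["Mozilla Foundation"],
          "url": ["http://mozilla.org/"]
        }
      }]
    }]
  }
EOS

# Hypothetical helper (not part of any library): yields each microformat
# item together with its nesting depth.
def each_microformat(item, depth = 0, &block)
  yield item, depth

  # Nested microformats attached to a property appear as hashes with a "type" key.
  item.fetch("properties", {}).each_value do |values|
    values.each do |value|
      each_microformat(value, depth + 1, &block) if value.is_a?(Hash) && value.key?("type")
    end
  end

  # Nested microformats without a property class name land in "children".
  item.fetch("children", []).each do |child|
    each_microformat(child, depth + 1, &block)
  end
end

found = []
parsed["items"].each do |item|
  each_microformat(item) do |mf, depth|
    found << [depth, mf["type"], mf.fetch("properties", {}).fetch("name", []).first]
  end
end

# Print one line per microformat, indented by nesting depth.
found.each { |depth, types, name| puts "#{'  ' * depth}#{types.join(' ')}: #{name}" }
```

The same traversal should apply to an h-entry whose comments are nested h-cite objects, assuming the parser emits them in this shape.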

@aaronpk
Member Author

aaronpk commented Dec 13, 2014

Any updates? I just had to do an ugly workaround for this, dropping down to use the to_hash version of the parsed data: aaronpk/webmention.io@d2cc836#diff-411ca3c70351e774091d525fab8264b9R330

@mmitchellg5

Not as of yet, unfortunately


@jeena
Collaborator

jeena commented Aug 3, 2015

I tried to parse http://tantek.com/ and it crashed the parser on something like this:

<div class="h-entry">
  <p class="u-comment h-cite">
    test .
  </p>
</div>

with:

URI::InvalidURIError: bad URI(is not URI?): test .

It would be nice if the parser at least didn't crash.
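
One way a parser could avoid the crash is to rescue the error when resolving a `u-*` value and fall back to the element's text. The sketch below assumes that approach. `safe_absolutize` is a hypothetical helper, not the gem's API, and the `/example` path is made up for illustration. It relies only on `URI.join` and `URI::InvalidURIError` from Ruby's standard library.

```ruby
require "uri"

# Hypothetical helper (an assumption, not the parser's actual code):
# resolve a u-* property value against the page URL, and return the
# trimmed text unchanged when it cannot be parsed as a URI.
def safe_absolutize(base, value)
  text = value.to_s.strip
  URI.join(base, text).to_s
rescue URI::InvalidURIError
  text
end

# The value from the markup above is not a URI, so it comes back as plain text.
puts safe_absolutize("http://tantek.com/", "\n    test .\n  ")

# A valid relative reference is still resolved against the base URL.
puts safe_absolutize("http://tantek.com/", "/example")
```

Whether a non-URI `u-*` value should be kept as text or dropped is a design choice for the parser. This sketch keeps it so that no data is lost.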

@veganstraightedge
Contributor

@jeena It seems like the parser is pretty much abandoned by the G5 folks. We got paid to build it (when I was still there) as a part of a larger product. But if it does as much as G5 needs and they're otherwise too busy, it's not likely to get the attention it deserves.

If you're able and willing to submit a pull request with a patch, I could apply it for you.

Unfortunately, (like too many open source projects) this doesn't really have a maintainer anymore. 😕

@jeena
Collaborator

jeena commented Aug 9, 2015

Oh ok, that's sad, but understandable. Perhaps one could note that somewhere in the README so people know, and perhaps someone will be able to take it over. I'm not sure I will be able to fix something like this, but if I can, I will make a pull request.

@veganstraightedge
Contributor

@jeena @aaronpk I believe this is fixed in 3.0. I just did a test and it looks good to me. Please upgrade and run your comparison too. Re-open this issue if necessary.
