php-mf2 is a generic microformats-2 parser. It doesn’t have a hard-coded list of all the different microformats, just a set of procedures to handle different property types (e.g. p- for plaintext, u- for URL, etc.). This allows for a very small and maintainable parser.
Install with Composer by adding "mf2/mf2": "0.1.*" to the require object in your composer.json and running php composer.phar update.
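As a rough illustration, a minimal composer.json containing only this dependency might look like the fragment below. Any other keys your project already has would sit alongside require.

```json
{
    "require": {
        "mf2/mf2": "0.1.*"
    }
}
```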
mf2 is PSR-0 autoloadable, so all you have to do to load it is:
- Include Composer’s auto-generated autoload file (/vendor/autoload.php)
- Declare mf2\Parser in your use statement
- Make a new Parser($input), where $input can either be a string of HTML or a DOMDocument
<?php
// Load Composer's autoloader so the mf2 classes can be found.
require $_SERVER['DOCUMENT_ROOT'] . '/vendor/autoload.php';

use mf2\Parser;

// Parse a fragment of HTML containing a single h-card.
$parser = new Parser('<div class="h-card"><p class="p-name">Barnaby Walters</p></div>');
$output = $parser->parse();

print_r($output);
Parser::parse() should return an array structure mirroring the canonical JSON serialisation introduced with µf2. Printed with print_r, it looks something like this:
Array
(
    [items] => Array
        (
            [0] => Array
                (
                    [type] => Array
                        (
                            [0] => h-card
                        )
                    [properties] => Array
                        (
                            [name] => Barnaby Walters
                        )
                )
        )
)
Note that, whilst the property prefixes are stripped, the prefix of the h-* classname is left on.
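To make the prefix behaviour concrete, here is a small sketch that walks a result array. The array is a hand-written stand-in copied from the print_r output above, not the result of a live parse, so it runs without the library installed.

```php
<?php
// Hand-written stand-in for the array returned by Parser::parse(),
// copied from the print_r output above (not produced by a live parse).
$output = [
    'items' => [
        [
            'type' => ['h-card'],
            'properties' => ['name' => 'Barnaby Walters'],
        ],
    ],
];

foreach ($output['items'] as $item) {
    // The type keeps its h- prefix...
    $type = $item['type'][0];
    // ...while the property key has had its p- prefix stripped.
    $name = $item['properties']['name'];
    echo $type . ': ' . $name . PHP_EOL;
}
```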
A base URL can be provided as the second parameter of mf2\Parser::__construct(); it is prepended to any u- properties which are relative URLs.
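The idea of prepending a base URL can be sketched with a small standalone function. Note that prependBaseUrl() is a hypothetical helper written for illustration, not part of php-mf2, and it ignores cases such as ../ segments and protocol-relative URLs.

```php
<?php
/**
 * Hypothetical helper (not part of php-mf2): prepend a base URL to a
 * relative URL, leaving absolute URLs untouched.
 */
function prependBaseUrl($url, $baseUrl)
{
    // A URL that already has a scheme (http:, https:, mailto:, ...) is absolute.
    if (parse_url($url, PHP_URL_SCHEME) !== null) {
        return $url;
    }
    // Join with exactly one slash between the base and the path.
    return rtrim($baseUrl, '/') . '/' . ltrim($url, '/');
}

echo prependBaseUrl('/photo.jpg', 'http://example.com') . PHP_EOL;
echo prependBaseUrl('http://other.example/me', 'http://example.com') . PHP_EOL;
```

With the parser itself, the equivalent would be to pass the base URL as the second constructor argument and let it handle u- properties.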
Different µf2 property types are returned as different types:
- h-* are associative arrays containing more properties
- p-* and u-* are returned as whitespace-trimmed strings
- dt-* are returned as \DateTime objects
- e-* are returned as non-HTML-encoded strings of markup representing the innerHTML of the element classed as e-*
Little to no filtering of content takes place in mf2\Parser, so treat its output as you would any untrusted data from the source of the parsed document.
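For instance, when echoing a parsed value into an HTML page, one common precaution is to escape it first. The sketch below uses a made-up string as a stand-in for a p-name value taken from a third-party page.

```php
<?php
// Stand-in for a p-name value taken from a parsed third-party page.
$name = '<script>alert("x")</script>Barnaby';

// Escape before echoing into HTML so embedded markup is shown as text.
$safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

echo '<p>' . $safe . '</p>' . PHP_EOL;
```

Values from e-* properties are markup by design, so they would need an HTML sanitiser rather than plain escaping if they are to be rendered as HTML.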
php-mf2 follows the various µf2 parsing guidelines on the microformats wiki. Useful reference:
php-mf2 includes support for implied p-name, u-url and u-photo as per the µf2 parsing process, with the result that every microformat will have a name property whether or not it is explicitly declared. More info on what this is and why it exists can be found in the µf2 FAQ.
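A heavily simplified sketch of the implied-name idea follows. impliedName() is a hypothetical function, not php-mf2's implementation: it only falls back to the element's text content, whereas the real µf2 rules cover more cases (such as img alt text).

```php
<?php
/**
 * Hypothetical, simplified sketch of an implied name: use an explicit
 * p-name if one exists, otherwise fall back to the element's own text.
 */
function impliedName(DOMElement $el, DOMXPath $xpath)
{
    $explicit = $xpath->query(
        './/*[contains(concat(" ", normalize-space(@class), " "), " p-name ")]',
        $el
    );
    if ($explicit->length > 0) {
        return trim($explicit->item(0)->textContent);
    }
    // No explicit p-name: imply the name from the element's text.
    return trim($el->textContent);
}

$doc = new DOMDocument();
$doc->loadHTML('<span class="h-card">Barnaby Walters</span>');
$xpath = new DOMXPath($doc);
$card = $xpath->query(
    '//*[contains(concat(" ", normalize-space(@class), " "), " h-card ")]'
)->item(0);

echo impliedName($card, $xpath) . PHP_EOL;
```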
It also includes an approximate implementation of the Value-Class Pattern, currently acting only on dt-* properties but soon to be rolled out to all property types.
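The value-class idea for a dt-* property can be sketched roughly as follows. valueClassDateTime() is a hypothetical function written for illustration and is not taken from php-mf2. It assumes at least one descendant classed value is present, and joins the parts with a space.

```php
<?php
/**
 * Hypothetical sketch of the value-class pattern for a dt-* element:
 * join the text of every descendant classed "value" and build a DateTime.
 */
function valueClassDateTime(DOMElement $el, DOMXPath $xpath)
{
    $parts = [];
    $values = $xpath->query(
        './/*[contains(concat(" ", normalize-space(@class), " "), " value ")]',
        $el
    );
    foreach ($values as $value) {
        $parts[] = trim($value->textContent);
    }
    // e.g. "2013-03-14" and "12:30" become "2013-03-14 12:30".
    return new DateTime(implode(' ', $parts));
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<p class="dt-start"><span class="value">2013-03-14</span> at '
    . '<span class="value">12:30</span></p>'
);
$xpath = new DOMXPath($doc);
$el = $xpath->query(
    '//*[contains(concat(" ", normalize-space(@class), " "), " dt-start ")]'
)->item(0);

echo valueClassDateTime($el, $xpath)->format('Y-m-d H:i') . PHP_EOL;
```

The surrounding text (" at ") is ignored, which is the point of the pattern: only the marked-up fragments contribute to the machine-readable value.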
When a DOMElement with a classname of e-* is found, the DOMNode::C14N() string value of each of its children is concatenated and returned.
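A minimal standalone sketch of that concatenation, using only PHP's DOM extension, might look like this. The e-content class name and the sample markup are chosen for illustration.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<div class="e-content">Hello <b>world</b>!</div>');
$xpath = new DOMXPath($doc);
$el = $xpath->query(
    '//*[contains(concat(" ", normalize-space(@class), " "), " e-content ")]'
)->item(0);

// Concatenate the canonical (C14N) serialisation of each child node,
// which approximates the element's innerHTML.
$html = '';
foreach ($el->childNodes as $child) {
    $html .= $child->C14N();
}

echo $html . PHP_EOL;
```

Because C14N produces canonical XML, the result may differ slightly from the source markup (for example in how empty elements are written).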
Currently php-mf2 is tested fairly thoroughly, but the code itself is not hugely testable (lots of repetition and redundancy). This is something I’m working on changing.
Tests are written in PHPUnit and are contained within /tests/. Running phpunit . from the root dir will run them all.
There are enough tests to warrant putting them into separate suites for maintenance. The different suites are:
- ParserTest.php: Tests for internal, e-* parsing and sanity checks
- ParseImpliedTest.php: Tests of the implied property patterns
- CombinedMicroformatsTest.php: Tests of nested microformats
- MicroformatsWikiExamplesTest.php: Tests taken directly from the wiki pages about µf2
- Parse*Test.php for P, U and DT: Each contains tests for a particular property type
As of v0.1.6, the only property with any support for value-class is dt-*, so the DT suite currently contains the value-class tests. These should be moved elsewhere as value-class and value-title are abstracted and rolled out to all properties.