[css-pseudo] Consider using Unicode ZWJ and ZWNJ to control :first-letter inclusion #6242

Closed
@faceless2

I'm proposing we introduce control over which letters are considered part of the first letter by using Unicode joiners: the ::first-letter pseudo-element must not break at a ZWJ (U+200D ZERO WIDTH JOINER) and must break at a ZWNJ (U+200C ZERO WIDTH NON-JOINER).

That would allow us to support the example in #3208: two initial "V" letters forming an archaic "W", which could be represented as "V", ZWJ, "V". It would also be a simpler way of solving some, although not all, of the various use cases raised in #2040. I understand the requirements raised in that issue, but the solution is quite complex. Offering a quick and easy way to handle several of those cases in markup might be easier for authors to understand.

ZWJ and ZWNJ are currently not mentioned in this area of the spec at all, but the spec does acknowledge that the first letter might be more than a single base character: the Dutch "IJ" ligature is given as an example. Cases where ZWJ or ZWNJ might already exist in this context are where the first letter is an emoji or comes from the Arabic family (also theoretically seen in Hangul). The intent there is to build a single typographic unit from multiple codepoints, all of which would be part of the first letter if the pseudo-element applied. So there's no compat issue here that I can see.

Finally, and most importantly, this should be very easy to implement. The existing logic that scans the start of the text for punctuation to determine where the first letter ends would just need adjusting to add tests for ZWJ and ZWNJ as well.
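
To make the scanning change concrete, here is a rough Python sketch of how such a scan might look. It is not taken from any browser engine or from the spec: the function name `first_letter_end` and its helpers are hypothetical. It treats every Unicode punctuation category alike, ignores language-specific rules such as Dutch "IJ", does not special-case leading whitespace, and assumes that a ZWNJ itself is left outside the first letter.

```python
import unicodedata

ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def _is_punct(ch: str) -> bool:
    # Simplification (assumption): any P* category counts as punctuation.
    return unicodedata.category(ch).startswith("P")


def _is_mark(ch: str) -> bool:
    # Combining marks (M*) stay attached to the preceding base character.
    return unicodedata.category(ch).startswith("M")


def first_letter_end(text: str) -> int:
    """Return the index just past the (hypothetical) first-letter run."""
    i, n = 0, len(text)

    # 1. Leading punctuation is part of the first letter.
    while i < n and _is_punct(text[i]):
        i += 1
    if i == n:
        return 0  # only punctuation: no first letter in this sketch

    # 2. One base character plus any combining marks.
    i += 1
    while i < n and _is_mark(text[i]):
        i += 1

    # 3. Proposed rule: never break at a ZWJ, so pull in the joiner
    #    and the unit that follows it.
    while i + 1 < n and text[i] == ZWJ:
        i += 2
        while i < n and _is_mark(text[i]):
            i += 1

    # 4. Proposed rule: always break at a ZWNJ (assumed to be excluded).
    if i < n and text[i] == ZWNJ:
        return i

    # 5. Trailing punctuation is part of the first letter.
    while i < n and _is_punct(text[i]):
        i += 1
    return i
```

Under these assumptions, `first_letter_end("V\u200dVord")` covers "V", ZWJ and the second "V", while a ZWNJ placed directly after the first character stops the scan before any trailing punctuation.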

(originally an idea from #3208)

Labels: css-pseudo-4 (Current Work); i18n-tracker (Group bringing to attention of Internationalization, or tracked by i18n but not needing response)
