[css-fonts-5][css-inline-3] Text Edge Metrics Registry #11384

Open
fantasai opened this issue Dec 16, 2024 · 2 comments
Labels
css-fonts-5 css-inline-3 Current Work i18n-afrlreq African language enablement i18n-alreq Arabic language enablement i18n-amlreq Americas Language Enablement i18n-clreq Chinese language enablement i18n-elreq Ethiopic language enablement i18n-eurlreq European language enablement i18n-hlreq Hebrew language enablement i18n-ilreq Indic language enablement i18n-jlreq Japanese language enablement i18n-klreq Korean language enablement i18n-mlreq Mongolian language enablement i18n-sealreq Southeast Asian language enablement i18n-tlreq Tibetan language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response.

Comments

@fantasai
Collaborator

fantasai commented Dec 16, 2024

One of the known issues with the text-box-trim property, the initial-letter property, and font-size-adjust is that we only have metrics for Western and CJK writing systems. To the extent that these happen to correspond to the metrics for other scripts for a given font, authors can use them for other writing systems as well; but the correspondence is not guaranteed. For example, depending on the font, Hebrew's top edge sometimes coincides with the cap height, sometimes the ex height, and sometimes partway in between.

We need to support metrics for all writing systems. And to do that, we need to identify what's missing.

As I mentioned in #5244, I think we need a registry; but until we have one set up, I suggest we collect the information here. Specifically, the information we need for each writing system (ideally from actual graphic designers and typographers) is:

  • Name of the script
  • Name of top-edge metrics used for alignment and spacing (as known by typographers and calligraphers).
  • Name of bottom-edge metrics used for alignment and spacing (same)
  • Whether or not these metrics always correspond to the Latin cap-height/ex-height/baseline or CJK ideographic top/bottom when designed into the same font as Latin or CJK glyphs.
  • If the answer isn't always, design samples (ideally from multiple fonts with differing metrics) showing:
    • a mixture of Latin glyphs and the script's own glyphs (and optionally also CJK glyphs, if relevant and available) from the same font for comparison
    • text wrapped and optically aligned within a rectangular colored box
    • text optically centered against a rectangular colored box
    • text optically top-aligned to a rectangular colored box
    • text optically bottom-aligned to a rectangular colored box
  • Optionally: List of other scripts to which this script's metrics typographically correspond. (For example, the height of Greek capital letters basically always corresponds to the height of Latin capital letters.)
  • Optionally: Ideas on how to figure out the top/bottom edge metrics if they're missing, whether by calculating from existing metrics or by measuring certain glyphs... (For example, CSS suggests a missing ex metric fall back to either 0.5em or a measurement of the letter 'o'.)
  • If not a member of the CSSWG, permission to include this information in a future W3C technical report. :)
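
As a rough illustration, the fields requested in the list above could be held in a small record like the one below. This is only a sketch under stated assumptions: the class and field names (`ScriptMetricsEntry`, `top_edge`, `sample_urls`, and so on) are hypothetical and not part of any spec or agreed registry format, and the Hebrew entry uses placeholder strings where the actual metric names still need to be collected. The `fallback_ex_height` helper mirrors the existing CSS fallback for a missing ex metric mentioned above (a measurement of 'o', else 0.5em).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScriptMetricsEntry:
    """Hypothetical shape of one registry entry; all names are illustrative."""
    script: str                        # name of the script
    top_edge: str                      # typographers' name for the top-edge metric
    bottom_edge: str                   # typographers' name for the bottom-edge metric
    always_matches_latin_or_cjk: bool  # do the edges always match Latin/CJK metrics?
    sample_urls: list[str] = field(default_factory=list)     # design samples
    corresponds_to: list[str] = field(default_factory=list)  # related scripts
    fallback_notes: Optional[str] = None                      # how to derive if missing
    permission_granted: bool = False   # OK to include in a W3C technical report

    def needs_samples(self) -> bool:
        # Samples are requested only when the metrics do not always line up
        # with Latin or CJK metrics and none have been supplied yet.
        return not self.always_matches_latin_or_cjk and not self.sample_urls


def fallback_ex_height(font_size: float, o_glyph_height: Optional[float] = None) -> float:
    """Fallback for a missing ex metric: use a measurement of 'o', else 0.5em."""
    if o_glyph_height is not None:
        return o_glyph_height
    return 0.5 * font_size


# Placeholder entry: the metric names are deliberately left unfilled.
hebrew = ScriptMetricsEntry(
    script="Hebrew",
    top_edge="(to be collected)",
    bottom_edge="(to be collected)",
    always_matches_latin_or_cjk=False,
)
```

Here `hebrew.needs_samples()` is true, since the issue notes that Hebrew's top edge does not consistently match a Latin metric and no samples have been attached yet.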
@fantasai fantasai added css-inline-3 Current Work i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. css-fonts-5 Agenda+ i18n Add to agenda for CSS-i18n calls labels Dec 16, 2024
@w3c w3c deleted a comment Dec 30, 2024
@astearns astearns moved this to FTF agenda items in CSSWG January 2025 meeting Jan 27, 2025
@svgeesus svgeesus moved this from FTF agenda items to Thursday morning in CSSWG January 2025 meeting Jan 28, 2025
@bradkemper
Contributor

Are symbol/dingbat font characters always cap height to baseline?

@css-meeting-bot
Member

The CSS Working Group just discussed [css-fonts-5][css-inline-3] Text Edge Metrics Registry, and agreed to the following:

  • RESOLVED: Start a shared registry on layout-affecting lines as a joint effort between CSS and i18n
The full IRC log of that discussion
<emilio> fantasai: we've created a new feature, text-box-trim, that relies on knowing the strong top/bottom edges of the font
<bramus> ScribeNick: emilio
<emilio> ... we only have metrics for some writing systems
<emilio> ... e.g. OpenType gives you cap height, but the top might or might not be it
<emilio> ... we've requested this 8 years ago
<emilio> ... and they still have not done anything about it, but maybe they don't know what it might look like
<emilio> ... so in order to get things moving somebody needs to collect this info
<emilio> ... so I propose it to be a joint effort
<emilio> ... w3c has the concept of registry
<emilio> ... we can create one where each metric is a writing system
<emilio> ... where we can have what the top / bottom are
<emilio> ... and if they don't correspond with latin then we need more metrics
<emilio> ... also examples and pictures
<emilio> ... and optionally how to derive that metric if you don't have it
<emilio> ... that's basically the idea
<emilio> astearns: if we could collect this info, what would we use it for?
<emilio> fantasai: one, tell opentype so that they can add metrics
<emilio> ... also we need to add keywords in text-box-trim to reference those edges
<florian> q+
<r12a> q+
<emilio> astearns: but if OT hasn't provided a metric in the format and fonts for that writing system aren't consistent there's nothing we can do with that keyword
<astearns> ack florian
<emilio> fantasai: well we could derive it from heuristics / existing metrics
<emilio> florian: agree it'd be useful, and someone needs to do this
<emilio> ... the keywords won't be super useful until fonts get there but they're also not useless
<emilio> ... in terms of who gives us info for the registry, there's 2 groups
<emilio> ... people giving examples
<emilio> ... on this language the top matches x other language or what not
<emilio> ... that could be anyone
<emilio> ... but then there needs to be a second group
<emilio> ... that reviews the former
<emilio> ... and as soon as their line is different from other line and we just collect
<emilio> ... that might give us too many lines
<emilio> ... and some of them might be the same
<emilio> ... so there's an interest in getting as many people as possible to contribute to adding data
<astearns> ack r12a
<emilio> ... but then we need another group to triage
<emilio> r12a: if we had these labels, how is it useful if OT has them?
<emilio> ... we have hanging/roman/alphabetic baseline
<florian> q+
<emilio> ... I had the impression that you'd have those baselines defined but also you have the height right? But not sure how that would look on languages that have both upper and lowercase
<emilio> ... is it script-specific?
<emilio> ... e.g. x-height is not the same in many scripts
<emilio> ... those are questions I have about what this is for
<astearns> ack florian
<emilio> florian: 2 examples
<emilio> ... initial-letter, you align the top of the enlarged letter with the top of the regular text
<emilio> ... what is it? cap height? x height? hebrew top? thai top?
<emilio> ... if the language has a top that doesn't match cap / x height we need to know
<emilio> ... same if you're vertical-aligning to any line that isn't latin / cjk
<emilio> ... same for text trimming, and you're dealing with a metric that isn't cap / x height
<emilio> ... e.g. hebrew
<r12a> qq+
<emilio> ... so we need either opentype to tell us where that line is
<emilio> ... or have a way to compute it
<emilio> fantasai: or adding a font descriptor that tells you
<astearns> ack r12a
<Zakim> r12a, you wanted to react to florian
<emilio> ... I know where the line is even though
<emilio> r12a: so absolute or average line? In my ??? notes I have some diagrams and it really varies by font
<emilio> ... so question is would we be wanting per font metrics rather than per-script metrics?
<emilio> ... there's wide differences even within the same script
<emilio> ... the extent to which ascenders go up and down
<emilio> fantasai: which is why we want the font or font descriptor to provide it, but the line has a name which in latin might be the cap height, but in hebrew might be the aleph line...
<astearns> q+
<emilio> ... if we didn't have a cap height on designers would need to tweak the alignment to match it
<emilio> ... and same in hebrew or thai
<emilio> ... we need to know the name of the line and whether it matches an existing metric
<emilio> florian: so the position of the line is per font
<emilio> ... but the existence of the line is per script
<emilio> astearns: the writing system might have a traditional design line but how different fonts deal with that might or might not match with that line
<r12a> q+
<florian> q+
<emilio> ... so in order for this to be effective we'd need a way of matching the design line with particular font metrics from the font descriptor or so
<emilio> ... so that in one font they get the right line
<astearns> ack astearns
<emilio> ... which wouldn't be in regular CSS but on the font-face descriptor
<ChrisL> q+
<emilio> florian: yeah but in order to allow a descriptor to tell us where the line is we need to agree on the existence and name of the line
<emilio> ... in english we have x/cap height
<ChrisL> rrsagent, here
<RRSAgent> See https://www.w3.org/2025/01/30-css-irc#T16-29-03
<emilio> ... a design for a particular font the os might be higher and not really respect those lines
<emilio> ... yet in latin scripts there's an x and cap height and they're typically relevant
<florian> q-
<emilio> ... and in some scripts those are not defined
<astearns> ack r12a
<emilio> r12a: I'm starting to think that what we're hoping to do here is to define generic lines for fonts or possibly define generic lines per script and once we have defined what lines are appropriate we can start using those
<emilio> ... it might be better to start with "what are the functions for which we want to use these"
<ChrisL> q-
<emilio> ... e.g. what is the line to use to align first-letter?
<emilio> ... what is the line to align with a particular point of the page
<emilio> ... rather than trying to create some generic repository of lines, start from the use
<emilio> fantasai: that's why I think we need to look at the line for alignments
<emilio> r12a: didn't see that reflected on the discussion
<emilio> astearns: we have things people would want to use lines for today, but for some scripts we don't yet have the places where they'd like to use their own layout-affecting lines
<ChrisL> q+
<emilio> ... so I think in creating this registry we will find things that need to be added
<emilio> ... to create layout in less popular writing-systems
<emilio> fantasai: if we focus on the thing that everybody needs to do (align things to other things and spacing) I think we'd get most of the answers we need
<astearns> ack ChrisL
<emilio> ChrisL: Is this registry a temporary thing while we propose it to ?? or are we going on our own direction
<emilio> astearns: this was discussed, we want to nag opentype to do this thing
<emilio> fantasai: and we're not defining a technical feature, just collect this information
<emilio> ... which would provide info on what keywords we add and metrics we need
<emilio> ChrisL: once we have it would be good to give a heads up to the right people, even if it's not really ready yet
<emilio> astearns: so proposal is to start collecting these layout affecting lines in various typographical systems
<emilio> ... and that'd be a joint effort between CSSWG and i18n?
<ChrisL> q+
<r12a> pretty picture: https://r12a.github.io/scripts/arab/ug.html#baselines
<emilio> r12a: I think sounds fine if people can find the time
<emilio> fantasai: florian and I can find the time
<astearns> ack ChrisL
<noamr> In hebrew the equivalent of the x height is the Mem height (ם). I think this information is openly available for some languages?
<emilio> ChrisL: I was thinking that the various gap analysis for different languages would be useful
<emilio> ... so that collects the experts from one language and [missed]
<fantasai> noamr, awesome. I've been wondering about what it's called. :)
<emilio> r12a: I think it'd be cool to collect an initial set of what we need first
<emilio> ChrisL: agreed
<emilio> RESOLVED: Start a shared registry on layout-affecting lines as a joint effort between CSS and i18n
<noamr> fantasai, I think my grandpa defined some of this stuff for Hebrew decades ago :)
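
The distinction drawn in the discussion above (the existence and name of a line is per script, its position is per font, and a missing position might be derived from existing metrics) can be sketched as a lookup with fallbacks. Everything here is hypothetical: the line name `hebrew-top`, the `overrides` mapping standing in for a font descriptor, and the midpoint heuristic are illustrative assumptions only, not proposed keywords or agreed behavior.

```python
from typing import Optional

# Metrics in em units, keyed by line name. "native" stands for what the font
# file provides; "overrides" stands in for an author-supplied font descriptor.
FontMetrics = dict[str, float]


def resolve_line(line: str, native: FontMetrics, overrides: FontMetrics) -> Optional[float]:
    """Return the position of a named line for one font, or None if unknown."""
    # 1. An author-supplied value for this font wins.
    if line in overrides:
        return overrides[line]
    # 2. Otherwise use the metric if the font itself provides it.
    if line in native:
        return native[line]
    # 3. Otherwise fall back to a heuristic built from existing metrics.
    #    The midpoint of ex height and cap height is an assumption made
    #    purely for illustration; real heuristics would come from the registry.
    if line == "hebrew-top" and "ex" in native and "cap" in native:
        return (native["ex"] + native["cap"]) / 2
    return None
```

The ordering reflects one possible reading of the discussion: the script determines which line names are meaningful, while each font (or a descriptor written for it) says where the line actually sits.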

@css-meeting-bot css-meeting-bot removed Agenda+ i18n Add to agenda for CSS-i18n calls Agenda+ F2F labels Jan 30, 2025
@r12a r12a added i18n-sealreq Southeast Asian language enablement i18n-ilreq Indic language enablement i18n-jlreq Japanese language enablement i18n-clreq Chinese language enablement i18n-alreq Arabic language enablement i18n-klreq Korean language enablement i18n-tlreq Tibetan language enablement i18n-mlreq Mongolian language enablement i18n-eurlreq European language enablement i18n-afrlreq African language enablement i18n-elreq Ethiopic language enablement i18n-hlreq Hebrew language enablement i18n-amlreq Americas Language Enablement labels Jan 31, 2025