Support Glyphs3 RTL kerning #778

yanone · 2022-02-21T17:45:11Z

Glyphs 3 kerning is sorted into the new kerningLTR and kerningRTL font data attributes instead of the previous plain kerning. The kerning for RTL is now stored in opposite attributes, contradicting the first/second logic of UFO and G2. @MMK_L_* becomes @MMK_R_* (and vv), and the visually right glyph side’s kerning class is now stored in kernRight as opposed to the previous kernLeft (=first/incoming side).

This means that correctly sorting glyphs into first and second UFO groups now involves checking a glyph’s script and whether or not it’s a number (that are always LTR even for RTL scripts).

This in turn required a fix for the glyph data generator that previously didn't generate the script for unencoded derivative glyphs such as hah-ar.init.swsh. The algorithm then cycles through the glyph name variations by cutting off dot-separated suffices from the right until glyph data could be found (hah-ar.init.swsh->hah-ar.init), and adds only the script value, as the other values were found to be already correctly generated.

…swsh), find it

…yph data

…e identically

yanone · 2022-02-22T08:38:13Z

One sec, I’m gonna rewrite the script construction for the glyph data to be more aligned with the methods already in place for constructing the production name and categories.

…on, now also processes ligatures

yanone · 2022-02-22T09:27:00Z

Okay, I’m done. The script construction is now analog to the category construction, and also covers ligatures as requested in the #TODO comment.
The failure in the Windows tests I believe is unrelated to this PR.

yanone · 2022-02-22T09:30:48Z

It seems wasteful to do the info lookup for each property. Why not add one getGlyphInfo(name, data) method and then get all needed info from it?

I didn’t write the initial state of this method, but what you’re describing is already happening, I believe, in the initial attributes = _lookup_attributes(glyph_name, data).

The subsequent methods construct the missing attributes in case a glyph couldn’t be found in the data by its name or unicode, which is quite the norm for my Arabic use cases (hah-ar.init.swsh)

schriftgestalt · 2022-02-22T09:47:12Z

My comment was more general, not about your changes. I should make a separate issue.

anthrotype · 2022-02-22T09:50:42Z

Lib/glyphsLib/builder/groups.py

+    if glyph.parent.format_version > 2:  # Glyphs 3
+        glyph_data = glyphdata.get_glyph(glyph.name)
+        if (
+            glyph_data.script in ("arabic", "hebrew")


why only "arabic" and "hebrew"?

These are the two RTL scripts to my knowledge. This point is arguably the critical one, also about the numbers. I chose those to my best knowledge. Some algorithm needs to be in place to sort Glyphs’ kerning classes into the correct first/second pairs, and I didn’t know any better.

I can add the writing direction to the glyphdata.xml but there might be a way to query unicode to get the writing direction of each code. Then we don’t need to check for script == arabic and category != Number.

You can get script direction from a function in ufo2ft.utils.

(I’m on the road this week so won’t be able to review.)

I can add the writing direction to the glyphdata.xml but there might be a way to query unicode to get the writing direction of each code. Then we don’t need to check for script == arabic and category != Number.

Yes, I could try this approach. @anthrotype Seems like a good idea?

khaledhosny · 2022-02-22T10:04:41Z

This is getting dangerously convoluted. The designer manually determines which kerning is RTL and which is LTR, then glyphsLib turns this into a single kerning table, then ufo2ft’s kern feature writer will try to deduce which kerning is RTL and which is LTR.

khaledhosny · 2022-02-22T10:06:27Z

Also, some of the work being done here conflicts with #771.

anthrotype · 2022-02-22T10:12:26Z

Lib/glyphsLib/builder/kerning.py

+        if direction == "LTR":
+            match = re.match(r"@MMK_L_(.+)", left)
+        elif direction == "RTL":
+            match = re.match(r"@MMK_R_(.+)", left)


wait.. does Glyphs 3 really call the class on the left-hand side @MMK_R_ and the one on the right-hand side @MMK_L_ for RTL kerning?!

I’ll try to answer: The kerning pairs are still defined in logical order. So the code logic doesn’t change here. First stays first, second stays second. Additionally, for the new RTL-specific kerning table, Georg has switched the L and R to now be visual rather than logical, (as he did with kernLeft and kernRight classes) which I believe shouldn't have happened. We should have silently accepted that L and R were never appropriate names but are used to describe 1st and 2nd.

FYI, you can check the two added RTL_kerning_v2.glyphs and RTL_kerning_v3.glyphs files for reference, in particular the kerning pair reh-ar,hah-ar.init.swsh (in logical order) which is defined in both.

yanone · 2022-02-22T10:15:16Z

This is getting dangerously convoluted. The designer manually determines which kerning is RTL and which is LTR, then glyphsLib turns this into a single kerning table, then ufo2ft’s kern feature writer will try to deduce which kerning is RTL and which is LTR.

We’re hoping for a remedy with UFO 4 to support separate kerning tables for each writing direction, if I understand correctly? Today, G3 sources don’t produce any Arabic kerning at all, so there is an urgency.

Also, some of the work being done here conflicts with #771.

They way I see it, it doesn't conflict with my issue logically, so I could easily wait it out and apply my changes to the updated code after #771. Is that PR under active development?

anthrotype · 2022-02-22T10:17:29Z

I actually agree with Khaled. This is getting too complicated and we need to step back a little. Who decides what pairs get placed in the kerningLTR and kerningRTL? If it's the user, then this classification needs to be carried across to ufo2ft somehow, we can't rely on ufo2ft splitting it again, with the risk of applying a slightly different split.

Second question: in the kerningRTL dictionary, is the order logical (like in OpenType's and UFO's first and second glyph depending on a script writing direction) or is it a visual order, like literally "left" and "right" (as in the proposed but never implemented hkerning.plist in UFO4)?

schriftgestalt · 2022-11-10T22:32:57Z

I have finished the GlyphsInfo stuff. Please have a look at the GlyphsInfo3 branch and pull request.

simoncozens · 2022-12-02T07:27:11Z

I've said this privately to Yan, but I think it's worth talking about here as well:

I have some misgivings about the direction of this PR, but I’m aware that some of that may be to do with the fact that I have some misgivings about the whole concept of a separate RTL kerning dictionary. So it’s hard for me to separate my feelings about the PR implementation from my feelings about what it’s implementing.

I think one of the reasons why you're not getting traction on this PR (at least from me), is that conceptually I can't get my head around it. Splitting kerning by direction at the source level doesn't make sense to me. Most glyphs have inherent directionality so you don't need to store this separately. For those which do not, like kerning two common script glyphs together, then what? Two numbers will be displayed in logical order regardless of direction, so that doesn't need a separate RTL dictionary. So really the only case this deals with is punctuation against punctuation. And then why split by direction? Perhaps your Hebrew is more tightly spaced than Arabic - then what? I'm sure there are reasons why separate LTR and RTL kerning is a thing. But to be honest, none of them convince me. So it's not something I really want to be spending my time thinking about anyway.

And when I do think about it, in terms of the implementation, I feel like this could be much, much simpler. glyphsLib is part of a UFO-based compilation approach where we convert glyphs files to UFO for compilation. There is only one kerning dictionary in UFO. So no matter what we do, everything is going to end up in a single kerning dictionary anyway, at which point it is the responsibility of ufo2ft to turn the contents of that dictionary into GPOS rules. ufo2ft already does the clever stuff about working out whether a kern is LTR or RTL.

So my conclusion is that glyphsLib does not need to do that - it just needs to gather all the kerns wherever they live and put them into a UFO kerning file. And now we have a situation where this is all based on a GlyphsInfo file? That makes even less sense to me.

schriftgestalt · 2022-12-02T08:31:09Z

It is not Jan’s fault that their file structure is how it is.

In Glyphs, there are different aspects about the kerning.

It needs to be able to produce proper GPOS.
It needs to display the data in the UI.
should the pairs be stored in typing or visual order. Same goes for classes.

Splitting it into two kerning dicts made the handling of the UI much easier and fixed some edge cases that were in Glyphs 2.

I’m happy to discuss this in detail and if you can explain how to handle it properly, I can change it for the next version.

simoncozens · 2022-12-02T08:55:50Z

I think the way to handle it properly is to combine kerningLTR and kerningRTL into one dictionary and then let ufo2ft do its thing. But I haven't tested this.

yanone · 2022-12-02T09:20:27Z

I'm eager to try.

_glyph_kerning_attr() (https://github.com/googlefonts/glyphsLib/blob/support_glyphs3_rtl_kerning-3/Lib/glyphsLib/builder/groups.py#L141, this is my latest branch, not this one) uses the glyph direction to determine the group. You're saying, the group and order should be determined strictly by the origin dict of the pair (LRT or RTL)? I'm not sure where to begin here.

This would work, but will break round-tripping, as for UFO to Glyphs conversion we wouldn't know in which dict to move the pairs then.

yanone · 2022-12-02T09:31:14Z

So I think it's really the group where the problem lies here.

The kerning dict conversion in kerning.py probably won't need changing.
But the two separate kerning dicts are only half of the story.

Since Georg also decided to switch the kerning groups from leftKerningGroup to rightKerningGroup and vv for RTL, we need to know the writing direction of each glyph for groups.py, which is how we ended up here in the first place.

yanone · 2022-12-02T09:34:20Z

The only solution to this that I can see without Georg’s updated glyph data is to decide the group sides based on whether or not a glyph can be found as part of the RTL kerning, so walk the dict for each group decision (or compile that once for speed)

yanone · 2022-12-02T09:37:25Z

... which won’t work for bidirectional glyphs. They'll basically always end up with the wrong side groups as soon as they are referenced anywhere in the RTL dict.

simoncozens · 2022-12-02T11:13:52Z

ufo2ft already does the clever stuff about working out whether a kern is LTR or RTL.

You talk about needing to know the writing direction for each glyph, but ufo2ft already works this out: take the writing direction for encoded glyphs from Unicode and extend it via GSUB closure to unencoded glyphs. We don't need to use GlyphsInfo for it.

anthrotype · 2022-12-02T11:18:57Z

So I think it's really the group where the problem lies here... Since Georg also decided to switch the kerning groups from leftKerningGroup to rightKerningGroup and vv for RTL, we need to know the writing direction of each glyph for groups.py, which is how we ended up here in the first place.

Yes, Yanone is right, this is the crux of the problem. It's not sufficient to simply merge the two kerning dicts. We need to find a way to deal with the groups as well.

simoncozens · 2022-12-02T11:23:42Z

The groups are going to be split by the kern splitter, though.

anthrotype · 2022-12-02T11:28:50Z

the problem is _glyph_kerning_attr method, which determines what is to be put in public.kern1 and public.kern2 kerning groups, used here:

glyphsLib/Lib/glyphsLib/builder/groups.py

Lines 75 to 82 in f012805

    
           for glyph in self.font.glyphs.values(): 
        
               for side in 1, 2: 
        
                   if (glyph.name, side) not in recovered: 
        
                       attr = _glyph_kerning_attr(glyph, side) 
        
                       group = getattr(glyph, attr) 
        
                       if group: 
        
                           group = f"public.kern{side}.{group}" 
        
                           groups[group].append(glyph.name)

This returs "leftKerningGroup" or "rightKerningGroup" depending on the UFO group's logical position (1 or 2). For Glyphs-2 kerning, this would always be "rightKerningGroup" for public.kern1 and "leftKerningGroup" for public.kern2.
Now, since Glyphs3, for RTL kerning, these have been swapped: i.e. "leftKerningGroup" for public.kern1 and "rightKerningGroup" for public.kern2.

anthrotype · 2022-12-02T11:32:23Z

will break round-tripping, as for UFO to Glyphs conversion we wouldn't know in which dict to move the pairs then

we could work around this by storing stuff in the lib, but at this stage is more urgent to be able to compile these fonts, we can add support for roundtripping at a later stage

yanone · 2022-12-02T11:35:59Z

You have no idea how delighted I am that someone utters the word urgency in this context other than me. It is indeed the case that this whole disaster is currently leading to Arabic and Hebrew typography being slowed down and disadvantaged. I know of several projects by several studios that are on hold since a while now, they're just not speaking out about it even though I asked them to.

simoncozens · 2022-12-02T11:38:21Z

So how I would attempt to approach this would be to put everything into two groups, e.g. public.kern1.A and public.kern2.RTL.A if it's a left glyph, or public.kern2.A and public.kern1.RTL.A if it's a right glyph, and when merging the two dictionaries together, rewrite the groups depending on whether you're processing kerningLTR or kerningRTL.

yanone · 2022-12-02T11:39:02Z

Deriving the writing direction of a glyph from Unicode will have to go through glyphdata.py to make it work for unencoded derivative glyphs, which will prove again why GlyphData.xml was a mistake in the first place rather than using Unicode on-the-fly.

anthrotype · 2022-12-02T11:47:14Z

hm Simon is onto something, keep going!

yanone · 2022-12-02T11:52:49Z

So basically here?

yanone · 2022-12-02T11:59:40Z

So how I would attempt to approach this would be to put everything into two groups, e.g. public.kern1.A and public.kern2.RTL.A if it's a left glyph, or public.kern2.A and public.kern1.RTL.A if it's a right glyph, and when merging the two dictionaries together, rewrite the groups depending on whether you're processing kerningLTR or kerningRTL.

I understand the first part, but I don't understand where the merging is happening that Simon is speaking of.

simoncozens · 2022-12-02T13:54:06Z

OK, I'm going to avoid using real examples because if I do, I'll get horribly confused with MMK_L_ and MMK_R_ and muck it up, and that won't help explain anything.

So the first step is to store each glyph into @MMK_some_kerning_group and also into @MMK_some_other_kerning_group_RTL.

After we've built all those groups, what we currently do is:

iterate through each pair defined in kerning or kerningLTR
for each pair, if it contains a group, transform the groups into @public.kernX.whatever where X depends on whether it was left or right
then emit the kern pair into the UFO kerning.

We keep doing that. But then next, we also do the same thing but for kerningRTL:

for each pair, if it contains a group, transform the groups into @public.kernX.RTL.whatever where X depends on whether it was right or left. (i.e. switching it round here)
then emit the kern pair into the UFO kerning.

That will work for all cases except for situations where the same glyph pair has both LTR and RTL kerning, with different values in each table. But we were never going to get that right anyway.

anthrotype · 2022-12-02T13:59:01Z

wow.. and the result of this very complicated process is basically to undo Glyphs3's LTR vs RTL kerning split such that ufo2ft can later on redo that in its own way (ie. using cmap+GSUB closure to identify script and pair direction).

schriftgestalt · 2022-12-02T14:01:55Z

That will work for all cases except for situations where the same glyph pair has both LTR and RTL kerning, with different values in each table. But we were never going to get that right anyway.

That is also one reason why I did the split.

@schriftgestalt

if I look at googlefonts/glyphsLib#778, I see the name of the Right-to-left kerning dictionary in the .glyphs plist file uses the correct spelling `kerningRTL`, so the incorrect misspelled `kerningRLT` must have been fixed already and a mention of it is no longer needed in this .md doc? /cc @schriftgestalt @yanone

yanone · 2022-12-08T14:40:17Z

Gentlemen, I’ve implemented Simon’s idea and am happy to announce that I think it works. I made it into a separate PR for clarity: #838

yanone · 2023-03-03T16:48:47Z

This is the first of the three related PRs and has now been fixed in #865, so closing this too.

yanone added 7 commits February 21, 2022 18:32

Attaching previously unattached attributes

d8bda71

When script isn't found for unencoded derivate glyphs (= hah-ar.init.…

698cce4

…swsh), find it

Correctly read newly sorted RTL kerning class values

5cb0fa3

Correctly read newly sorted kerning classes by reading script from gl…

5d1f602

…yph data

Script guessing test

e9c20e5

Fixed someone else’s typo

f09d578

Added G2 and G3 test files with RTL kerning, making sure both generat…

d9295ad

…e identically

yanone requested review from simoncozens and anthrotype February 21, 2022 17:45

yanone added 8 commits February 21, 2022 18:48

Formatting

0cb3787

Formatting

91dee17

Formatting

9e7bc65

Formatting

61a35d1

Oops, forgot to remove print statement

a6f2ace

Formatting

2fe2e9b

Formatting

9e6af8e

Comment was erroneously copied from another test

877f862

yanone added 2 commits February 22, 2022 10:12

Rewrote script construction to align with other glyph data constructi…

bb799de

…on, now also processes ligatures

Formatting

8541e61

anthrotype reviewed Feb 22, 2022

View reviewed changes

schriftgestalt mentioned this pull request Feb 22, 2022

Refactor glyph info lookup #779

Open

anthrotype reviewed Feb 22, 2022

View reviewed changes

eliheuer mentioned this pull request Dec 1, 2022

Solitreo: Version 1.000; ttfautohint (v1.8.4.7-5d5b) added google/fonts#5297

Merged

anthrotype mentioned this pull request Dec 6, 2022

rename kerningRLT => kerningRTL in GlyphsFileFormatv3.md schriftgestalt/GlyphsSDK#81

Merged

yanone mentioned this pull request Dec 8, 2022

Simple G3 RTL Kerning Support #838

Closed

yanone closed this Mar 3, 2023

yanone deleted the support_glyphs3_rtl_kerning branch March 3, 2023 16:48

Support Glyphs3 RTL kerning #778

Support Glyphs3 RTL kerning #778

Conversation

yanone commented Feb 21, 2022

yanone commented Feb 22, 2022

yanone commented Feb 22, 2022

yanone commented Feb 22, 2022

schriftgestalt commented Feb 22, 2022

anthrotype Feb 22, 2022

Choose a reason for hiding this comment

yanone Feb 22, 2022

Choose a reason for hiding this comment

schriftgestalt Feb 22, 2022

Choose a reason for hiding this comment

simoncozens Feb 22, 2022

Choose a reason for hiding this comment

yanone Feb 22, 2022

Choose a reason for hiding this comment

khaledhosny commented Feb 22, 2022

khaledhosny commented Feb 22, 2022

anthrotype Feb 22, 2022 • edited Loading

Choose a reason for hiding this comment

yanone Feb 22, 2022 • edited Loading

Choose a reason for hiding this comment

yanone commented Feb 22, 2022 • edited Loading

anthrotype commented Feb 22, 2022

schriftgestalt commented Nov 10, 2022

simoncozens commented Dec 2, 2022

schriftgestalt commented Dec 2, 2022

simoncozens commented Dec 2, 2022

yanone commented Dec 2, 2022

yanone commented Dec 2, 2022 • edited Loading

yanone commented Dec 2, 2022 • edited Loading

yanone commented Dec 2, 2022

simoncozens commented Dec 2, 2022 • edited Loading

anthrotype commented Dec 2, 2022

simoncozens commented Dec 2, 2022

anthrotype commented Dec 2, 2022

anthrotype commented Dec 2, 2022

yanone commented Dec 2, 2022

simoncozens commented Dec 2, 2022 • edited Loading

yanone commented Dec 2, 2022

anthrotype commented Dec 2, 2022 • edited Loading

yanone commented Dec 2, 2022

yanone commented Dec 2, 2022

simoncozens commented Dec 2, 2022

anthrotype commented Dec 2, 2022

schriftgestalt commented Dec 2, 2022

yanone commented Dec 8, 2022

yanone commented Mar 3, 2023

anthrotype Feb 22, 2022 •

edited

Loading

yanone Feb 22, 2022 •

edited

Loading

yanone commented Feb 22, 2022 •

edited

Loading

yanone commented Dec 2, 2022 •

edited

Loading

yanone commented Dec 2, 2022 •

edited

Loading

simoncozens commented Dec 2, 2022 •

edited

Loading

simoncozens commented Dec 2, 2022 •

edited

Loading

anthrotype commented Dec 2, 2022 •

edited

Loading