-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Glyphs3 RTL kerning #778
Conversation
One sec, I’m gonna rewrite the script construction for the glyph data to be more aligned with the methods already in place for constructing the production name and categories. |
…on, now also processes ligatures
Okay, I’m done. The script construction is now analog to the category construction, and also covers ligatures as requested in the |
I didn’t write the initial state of this method, but what you’re describing is already happening, I believe, in the initial The subsequent methods construct the missing attributes in case a glyph couldn’t be found in the data by its name or unicode, which is quite the norm for my Arabic use cases ( |
My comment was more general, not about your changes. I should make a separate issue. |
if glyph.parent.format_version > 2: # Glyphs 3 | ||
glyph_data = glyphdata.get_glyph(glyph.name) | ||
if ( | ||
glyph_data.script in ("arabic", "hebrew") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why only "arabic" and "hebrew"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the two RTL scripts to my knowledge. This point is arguably the critical one, also about the numbers. I chose those to my best knowledge. Some algorithm needs to be in place to sort Glyphs’ kerning classes into the correct first/second pairs, and I didn’t know any better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add the writing direction to the glyphdata.xml but there might be a way to query unicode to get the writing direction of each code. Then we don’t need to check for script == arabic and category != Number
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get script direction from a function in ufo2ft.utils.
(I’m on the road this week so won’t be able to review.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add the writing direction to the glyphdata.xml but there might be a way to query unicode to get the writing direction of each code. Then we don’t need to check for
script == arabic and category != Number
.
Yes, I could try this approach. @anthrotype Seems like a good idea?
This is getting dangerously convoluted. The designer manually determines which kerning is RTL and which is LTR, then glyphsLib turns this into a single kerning table, then ufo2ft’s kern feature writer will try to deduce which kerning is RTL and which is LTR. |
Also, some of the work being done here conflicts with #771. |
if direction == "LTR": | ||
match = re.match(r"@MMK_L_(.+)", left) | ||
elif direction == "RTL": | ||
match = re.match(r"@MMK_R_(.+)", left) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait.. does Glyphs 3 really call the class on the left-hand side @MMK_R_
and the one on the right-hand side @MMK_L_
for RTL kerning?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ll try to answer: The kerning pairs are still defined in logical order. So the code logic doesn’t change here. First stays first, second stays second. Additionally, for the new RTL-specific kerning table, Georg has switched the L
and R
to now be visual rather than logical, (as he did with kernLeft
and kernRight
classes) which I believe shouldn't have happened. We should have silently accepted that L
and R
were never appropriate names but are used to describe 1st and 2nd.
FYI, you can check the two added RTL_kerning_v2.glyphs
and RTL_kerning_v3.glyphs
files for reference, in particular the kerning pair reh-ar,hah-ar.init.swsh
(in logical order) which is defined in both.
We’re hoping for a remedy with UFO 4 to support separate kerning tables for each writing direction, if I understand correctly? Today, G3 sources don’t produce any Arabic kerning at all, so there is an urgency.
They way I see it, it doesn't conflict with my issue logically, so I could easily wait it out and apply my changes to the updated code after #771. Is that PR under active development? |
I actually agree with Khaled. This is getting too complicated and we need to step back a little. Who decides what pairs get placed in the kerningLTR and kerningRTL? If it's the user, then this classification needs to be carried across to ufo2ft somehow, we can't rely on ufo2ft splitting it again, with the risk of applying a slightly different split. Second question: in the kerningRTL dictionary, is the order logical (like in OpenType's and UFO's first and second glyph depending on a script writing direction) or is it a visual order, like literally "left" and "right" (as in the proposed but never implemented hkerning.plist in UFO4)? |
I have finished the GlyphsInfo stuff. Please have a look at the GlyphsInfo3 branch and pull request. |
I've said this privately to Yan, but I think it's worth talking about here as well: I have some misgivings about the direction of this PR, but I’m aware that some of that may be to do with the fact that I have some misgivings about the whole concept of a separate RTL kerning dictionary. So it’s hard for me to separate my feelings about the PR implementation from my feelings about what it’s implementing. I think one of the reasons why you're not getting traction on this PR (at least from me), is that conceptually I can't get my head around it. Splitting kerning by direction at the source level doesn't make sense to me. Most glyphs have inherent directionality so you don't need to store this separately. For those which do not, like kerning two common script glyphs together, then what? Two numbers will be displayed in logical order regardless of direction, so that doesn't need a separate RTL dictionary. So really the only case this deals with is punctuation against punctuation. And then why split by direction? Perhaps your Hebrew is more tightly spaced than Arabic - then what? I'm sure there are reasons why separate LTR and RTL kerning is a thing. But to be honest, none of them convince me. So it's not something I really want to be spending my time thinking about anyway. And when I do think about it, in terms of the implementation, I feel like this could be much, much simpler. glyphsLib is part of a UFO-based compilation approach where we convert glyphs files to UFO for compilation. There is only one kerning dictionary in UFO. So no matter what we do, everything is going to end up in a single kerning dictionary anyway, at which point it is the responsibility of ufo2ft to turn the contents of that dictionary into GPOS rules. ufo2ft already does the clever stuff about working out whether a kern is LTR or RTL. So my conclusion is that glyphsLib does not need to do that - it just needs to gather all the kerns wherever they live and put them into a UFO kerning file. And now we have a situation where this is all based on a GlyphsInfo file? That makes even less sense to me. |
It is not Jan’s fault that their file structure is how it is. In Glyphs, there are different aspects about the kerning.
Splitting it into two kerning dicts made the handling of the UI much easier and fixed some edge cases that were in Glyphs 2. I’m happy to discuss this in detail and if you can explain how to handle it properly, I can change it for the next version. |
I think the way to handle it properly is to combine kerningLTR and kerningRTL into one dictionary and then let ufo2ft do its thing. But I haven't tested this. |
I'm eager to try.
This would work, but will break round-tripping, as for UFO to Glyphs conversion we wouldn't know in which dict to move the pairs then. |
So I think it's really the group where the problem lies here. The kerning dict conversion in kerning.py probably won't need changing. Since Georg also decided to switch the kerning groups from |
The only solution to this that I can see without Georg’s updated glyph data is to decide the group sides based on whether or not a glyph can be found as part of the RTL kerning, so walk the dict for each group decision (or compile that once for speed) |
... which won’t work for bidirectional glyphs. They'll basically always end up with the wrong side groups as soon as they are referenced anywhere in the RTL dict. |
You talk about needing to know the writing direction for each glyph, but ufo2ft already works this out: take the writing direction for encoded glyphs from Unicode and extend it via GSUB closure to unencoded glyphs. We don't need to use GlyphsInfo for it. |
Yes, Yanone is right, this is the crux of the problem. It's not sufficient to simply merge the two kerning dicts. We need to find a way to deal with the groups as well. |
The groups are going to be split by the kern splitter, though. |
the problem is glyphsLib/Lib/glyphsLib/builder/groups.py Lines 75 to 82 in f012805
This returs "leftKerningGroup" or "rightKerningGroup" depending on the UFO group's logical position (1 or 2). For Glyphs-2 kerning, this would always be "rightKerningGroup" for public.kern1 and "leftKerningGroup" for public.kern2. |
we could work around this by storing stuff in the lib, but at this stage is more urgent to be able to compile these fonts, we can add support for roundtripping at a later stage |
You have no idea how delighted I am that someone utters the word urgency in this context other than me. It is indeed the case that this whole disaster is currently leading to Arabic and Hebrew typography being slowed down and disadvantaged. I know of several projects by several studios that are on hold since a while now, they're just not speaking out about it even though I asked them to. |
So how I would attempt to approach this would be to put everything into two groups, e.g. |
Deriving the writing direction of a glyph from Unicode will have to go through |
hm Simon is onto something, keep going! |
So basically here? |
I understand the first part, but I don't understand where the merging is happening that Simon is speaking of. |
OK, I'm going to avoid using real examples because if I do, I'll get horribly confused with So the first step is to store each glyph into After we've built all those groups, what we currently do is:
We keep doing that. But then next, we also do the same thing but for
That will work for all cases except for situations where the same glyph pair has both LTR and RTL kerning, with different values in each table. But we were never going to get that right anyway. |
wow.. and the result of this very complicated process is basically to undo Glyphs3's LTR vs RTL kerning split such that ufo2ft can later on redo that in its own way (ie. using cmap+GSUB closure to identify script and pair direction). |
That is also one reason why I did the split. |
if I look at googlefonts/glyphsLib#778, I see the name of the Right-to-left kerning dictionary in the .glyphs plist file uses the correct spelling `kerningRTL`, so the incorrect misspelled `kerningRLT` must have been fixed already and a mention of it is no longer needed in this .md doc? /cc @schriftgestalt @yanone
Gentlemen, I’ve implemented Simon’s idea and am happy to announce that I think it works. I made it into a separate PR for clarity: #838 |
This is the first of the three related PRs and has now been fixed in #865, so closing this too. |
Glyphs 3 kerning is sorted into the new
kerningLTR
andkerningRTL
font data attributes instead of the previous plainkerning
. The kerning for RTL is now stored in opposite attributes, contradicting the first/second logic of UFO and G2.@MMK_L_*
becomes@MMK_R_*
(and vv), and the visually right glyph side’s kerning class is now stored inkernRight
as opposed to the previouskernLeft
(=first/incoming side).This means that correctly sorting glyphs into first and second UFO groups now involves checking a glyph’s script and whether or not it’s a number (that are always LTR even for RTL scripts).
This in turn required a fix for the glyph data generator that previously didn't generate the script for unencoded derivative glyphs such as
hah-ar.init.swsh
. The algorithm then cycles through the glyph name variations by cutting off dot-separated suffices from the right until glyph data could be found (hah-ar.init.swsh->hah-ar.init
), and adds only the script value, as the other values were found to be already correctly generated.