Encoding woes #4
Comments
Encoding table is done.
@ctrlcctrlv I'm not up on the font world, but I don't understand points 2 and 3. Would the font with Terry's order not work with Unicode, and if not, might it be confusing to add for the font normies (such as myself)? Regarding point 4, I knew that most of them would have Unicode representations, but I couldn't for the life of me figure out what they were. I was planning to go back and fix it up, but never got around to it. I just dumped them at the beginning of one of the PUAs. If you want to formalize their position in the PUA, then feel free, although it will probably never be an issue for, again, us font normies. ;)
@rendello The issue is that if you try to view a file made on TempleOS with your font, it will not work unless the font is in Terry order. (And some programs have their own limitations around certain codepoints: e.g., Terry likes to put blocks at 0x7F, and some programs will not render these, nor any of the characters between 0x80 and 0x9F.) So the Unicode font should certainly be the default, I agree with you; hence that huge encoding table, which makes it possible to coerce Terry order to Unicode order.
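To make "coerce Terry order to Unicode order" concrete, here is a minimal Python sketch of a table-driven decode. The two table entries are hypothetical placeholders, not rows from the real encoding table, and the pass-through for bytes below 0x7F assumes the encoding is ASCII-compatible in that range.

```python
# Hypothetical fragment of a "Terry order" -> Unicode table.
# These entries are placeholders for illustration, NOT the real table.
TERRY_TO_UNICODE = {
    0x7F: "\u2588",  # assumption: the block at 0x7F mapped to FULL BLOCK
    0x80: "\u00c7",  # assumption: placeholder entry in the 0x80-0x9F range
}


def decode_terry(data: bytes) -> str:
    """Map each byte to a Unicode character using the table above."""
    out = []
    for b in data:
        if b in TERRY_TO_UNICODE:
            out.append(TERRY_TO_UNICODE[b])
        elif b < 0x7F:
            # Assumed ASCII-compatible below 0x7F: pass through unchanged.
            out.append(chr(b))
        else:
            # No mapping known: emit U+FFFD REPLACEMENT CHARACTER.
            out.append("\ufffd")
    return "".join(out)


print(decode_terry(b"Hi\x7f"))
```

With the full table in place, a file written on TempleOS could be decoded once and then displayed with the Unicode-ordered font.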
Today I'm going to generate an ICU-compatible conversion file. (https://unicode-org.github.io/icu/userguide/conversion/data.html#icu-mapping-table-data-files)
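For readers who have not seen one, a rough Python sketch of emitting such a file for a single-byte encoding might look like the following. The header fields and the `|0` / `|3` precision flags follow my reading of the ICU guide linked above and should be checked against it; the converter name and the two table rows are hypothetical.

```python
def ucm_lines(name, table):
    """Yield the lines of a single-byte ICU .ucm mapping file.

    table: iterable of (byte, code_point, flag) triples, where flag 0 is a
    round-trip mapping and flag 3 is a byte -> Unicode only mapping.
    """
    yield f'<code_set_name>               "{name}"'
    yield "<mb_cur_max>                  1"
    yield "<mb_cur_min>                  1"
    yield '<uconv_class>                 "SBCS"'
    yield "<subchar>                     \\x3F"  # assumption: '?' as substitute
    yield "CHARMAP"
    for byte, code_point, flag in table:
        yield f"<U{code_point:04X}> \\x{byte:02X} |{flag}"
    yield "END CHARMAP"


# Hypothetical rows: 0xC1 duplicates the glyph at 0x41, so it only maps
# one way (byte -> Unicode) and is flagged |3.
rows = [(0x41, 0x0041, 0), (0xC1, 0x0041, 3)]
print("\n".join(ucm_lines("templeos-hypothetical", rows)))
```

Marking duplicates as one-way is one possible way to keep the Unicode-to-byte direction unambiguous; it is a design choice in this sketch, not something the thread specifies.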
@ctrlcctrlv I see, thanks for the explanation.
Yes, to do it right there need to be three conversion tables:
Those files are enough for tools like iconv to do the reverse conversions, although some data will inevitably be lost because Terry duplicated characters. So, e.g., if you convert to Unicode and back, you may not get byte-for-byte identical results.
(But the result will still look the same in TempleOS.)
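A small Python sketch of why duplicated characters break byte-for-byte round trips while the rendered text stays the same. The three-entry table is hypothetical, and picking the lowest byte value for the reverse direction is an arbitrary choice made here for determinism.

```python
# Hypothetical table: bytes 0x41 and 0xC1 both carry the same glyph.
DECODE = {0x41: "A", 0xC1: "A", 0x42: "B"}

# Reverse table: when a character has several byte values, keep the
# lowest one (an arbitrary but deterministic choice for this sketch).
ENCODE = {}
for byte, ch in sorted(DECODE.items()):
    ENCODE.setdefault(ch, byte)


def to_unicode(data: bytes) -> str:
    return "".join(DECODE[b] for b in data)


def from_unicode(text: str) -> bytes:
    return bytes(ENCODE[ch] for ch in text)


original = bytes([0x41, 0xC1, 0x42])
round_tripped = from_unicode(to_unicode(original))

print(original == round_tripped)                          # False: bytes differ
print(to_unicode(original) == to_unicode(round_tripped))  # True: same text
```

The duplicate byte 0xC1 comes back as 0x41, so the file changes on disk even though every glyph drawn from it is unchanged.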
By the way, I think that Terry's encoding is the only popular encoding in history which does what's called "LCG unification". This is similar to CJK unification, which is common in encodings and software. Terry did not add a codepoint for, e.g., Cyrillic a; I guess he assumed users would use the Latin one, even though it'd be hilariously ugly because the Cyrillic is regular and the Latin is bold. I won't be trying to fix this.
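For contrast, Unicode does not unify these letters: Latin a and Cyrillic а are separate code points even though they usually look alike. A quick check using only Python's standard library:

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, near-identical in most fonts

for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Distinct code points, so the two strings are not equal.
print(latin_a == cyrillic_a)
```

An encoding that unifies them has only one byte value for both, so converting from Unicode has to collapse the two letters onto the same glyph.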
It doesn't help that Terry was a boomer and added a glyph for a currency that has not existed since 2002 (peseta, ₧), before he even started work on TempleOS lol :o)
The "LCG unification", as you've termed it, was a hilarious thing to figure out when I was first looking at the font. I wonder where the Cyrillic characters are from.
Hahaha, it's not my term. I read it on a Unicode mailing list years ago, when someone was making an argument against some point about CJK unification. I'd never seen LCG unification in practice until today; it's usually an absurdly broken idea you can trot out to make a point in a discussion about character encodings.
I'm working on this now, but this font has a particular issue because it's for an encoding that Terry, essentially, just made up.
I suggest that we handle it thus: