add ctt14651.txt: ISO 14651 collation Common Template Table #928

Merged
merged 1 commit into from
Sep 5, 2024

markusicu
Member

@markusicu markusicu commented Aug 30, 2024

This is the ISO/IEC 14651 Common Template Table.
This is basically a different file format for the same data as in UCA allkeys.txt.

Ken generated this file as ctt14651.txt using the sifter tool but renamed it to CTT_V16_0.txt for publication.
In the unicodetools repo, I assume that we don't need to keep multiple versions in parallel.
Therefore, I am using a version-less filename here.

I could create a new folder for this file, but that seems silly. It is a sibling to allkeys.txt but starts out unversioned, so I am putting it into .../data/uca/.

Before Unicode 17 alpha, we need to update the publication scripts to output this file and rename it with the current-version suffix. If we only generate it at the end of the release cycle, then maybe it needs to be published only in the "final" script.
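
A minimal sketch of that rename step, assuming the suffix keeps the `CTT_V16_0.txt` pattern used for this release (that is, `CTT_V<major>_<minor>.txt`). The function names and directory arguments are hypothetical; the actual publication scripts may be structured quite differently.

```python
from pathlib import Path
import shutil


def versioned_ctt_name(major: int, minor: int = 0) -> str:
    """Build the publication filename, e.g. CTT_V16_0.txt (pattern assumed)."""
    return f"CTT_V{major}_{minor}.txt"


def publish_ctt(source: Path, dest_dir: Path, major: int, minor: int = 0) -> Path:
    """Copy the version-less repo file to dest_dir under its versioned name.

    `source` would be the unversioned ctt14651.txt kept in the repo;
    `dest_dir` is a hypothetical publication output folder.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / versioned_ctt_name(major, minor)
    shutil.copyfile(source, target)
    return target
```

Copying rather than moving leaves the version-less file in place in the repo, which matches the plan of keeping only one copy there.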

Ken-Whistler
Ken-Whistler previously approved these changes Aug 30, 2024
Contributor

@Ken-Whistler Ken-Whistler left a comment

If this is going to be named CTT.txt in the repo, then we should update the sifter code at some point to output "CTT.txt" instead of "ctt14651.txt" as it currently does. In any case, I am not managing unicodetools/data/uca/ so I don't much care where you end up putting it.

@markusicu
Member Author

> If this is going to be named CTT.txt in the repo, then we should update the sifter code at some point to output "CTT.txt" instead of "ctt14651.txt" as it currently does.

I wasn't looking at the sifter. I was just going by the filename that you provided, minus the version suffix.

If the sifter currently writes "ctt14651.txt" then I should rename this file right now before it goes in. Unless you like CTT.txt better and we go the other way.

Your preference?

@Ken-Whistler
Contributor

It doesn't make much difference to me. But, yeah, the current output from the sifter is "ctt14651.txt", so that would be easier. (The output file names are defined in lines 195-200 of unisift.c.) The output file can then be renamed to whatever we need for an official CTT version for 14651 upon deployment.

@markusicu markusicu changed the title from "add CTT.txt: ISO collation Common Template Table" to "add ctt14651.txt: ISO 14651 collation Common Template Table" Sep 3, 2024
@markusicu
Member Author

Renamed, PTAL

@markusicu
Member Author

Ken is extremely busy with the Unicode 16 release. Could someone else please rubber-stamp this simple PR?

Collaborator

@josh-hadley josh-hadley left a comment

[rs]lgtm

@markusicu markusicu merged commit 508846f into unicode-org:main Sep 5, 2024
24 checks passed
@markusicu markusicu deleted the add-ctt branch September 5, 2024 17:54
4 participants