Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): initial normalization 🙀 #9728

Merged
merged 23 commits into from
Oct 20, 2023
Merged
Changes from 1 commit
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
dcb6796
chore(core): Some failing normalization transform tests 🙀
srl295 Oct 9, 2023
8c4642a
Merge branch 'master' into feat/core/9468-normalization-epic-ldml
srl295 Oct 10, 2023
15e6fb6
chore(core): dx: improve test output 🙀
srl295 Oct 10, 2023
daa0523
feat(core): ldml normalization 🙀
srl295 Oct 11, 2023
522d2ef
feat(core): ldml normalization 🙀
srl295 Oct 12, 2023
e64f596
feat(core): ldml normalization 🙀
srl295 Oct 12, 2023
736af4b
feat(core): ldml normalization 🙀
srl295 Oct 12, 2023
a577bd2
feat(core): ldml normalization 🙀
srl295 Oct 12, 2023
84b4836
Merge branch 'master' into feat/core/9468-normalization-epic-ldml
srl295 Oct 12, 2023
b3de1ff
feat(core): ldml normalization 🙀
srl295 Oct 12, 2023
95ad1c4
feat(core): ldml normalization 🙀
srl295 Oct 12, 2023
bf15feb
feat(core): ldml normalization 🙀
srl295 Oct 13, 2023
79f37f7
Merge branch 'master' into feat/core/9468-normalization-epic-ldml
srl295 Oct 16, 2023
9ee183c
fix(core): ldml update to normalization per review 🙀
srl295 Oct 16, 2023
93e1c18
fix(core): ldml update to normalization per review 🙀
srl295 Oct 17, 2023
9dbf70c
fix(core): ldml more updates to normalization per review 🙀
srl295 Oct 18, 2023
f1b12a1
fix(core): ldml more updates to normalization per review 🙀
srl295 Oct 18, 2023
c4c7361
Merge branch 'master' into feat/core/9468-normalization-epic-ldml
srl295 Oct 18, 2023
3723e6b
fix(core): ldml more updates to normalization per review 🙀
srl295 Oct 18, 2023
4afe038
fix(core): ldml more updates to normalization per review 🙀
srl295 Oct 19, 2023
1d9820b
fix(core): ldml more updates to normalization per review 🙀
srl295 Oct 19, 2023
9292410
Merge branch 'master' into feat/core/9468-normalization-epic-ldml
srl295 Oct 19, 2023
4cdc2c3
fix(core): ldml more updates to normalization per review 🙀
srl295 Oct 19, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 24 additions & 18 deletions core/src/ldml/ldml_transforms.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -408,7 +408,6 @@ transform_entry::transform_entry(const std::u32string &from, const std::u32strin
init();
}

// TODO-LDML: How do we return errors from here?
transform_entry::transform_entry(
const std::u32string &from,
const std::u32string &to,
Expand All @@ -417,13 +416,16 @@ transform_entry::transform_entry(
const kmx::kmx_plus &kplus,
bool &valid)
: fFrom(from), fTo(to), fFromPattern(nullptr), fMapFromStrId(mapFrom), fMapToStrId(mapTo) {
valid = false;
if (!valid)
return; // exit early
Comment on lines +419 to +420
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, if valid came in false we leave with it false.

Only change it from true->false on err, don't set it to true here.

Comment on lines +419 to +420
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is similar to the status pattern from ICU. It means more work on the caller side but I guess I am okay with it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it makes sure the c'tor can return failure. Could be a factory model, but the caller is using emplace_back() so the idea is a stack based construction anyway. on failure we're going to bail out of the entire load.

assert(!fFrom.empty()); // TODO-LDML: should not happen?
assert((fMapFromStrId == 0) == (fMapToStrId == 0)); // we have both or we have neither.
assert(kplus.strs != nullptr);
assert(kplus.vars != nullptr);
assert(kplus.elem != nullptr);
valid = init();
if(!init()) {
valid = false;
}

// setup mapFrom
if (fMapFromStrId != 0) {
Expand Down Expand Up @@ -460,21 +462,21 @@ transform_entry::transform_entry(

bool
transform_entry::init() {
if (!fFrom.empty()) {
// TODO-LDML: if we have mapFrom, may need to do other processing.
const std::u16string patstr = km::kbp::kmx::u32string_to_u16string(fFrom);
UErrorCode status = U_ZERO_ERROR;
/* const */ icu::UnicodeString patustr_raw = icu::UnicodeString(patstr.data(), (int32_t)patstr.length());
// add '$' to match to end
patustr_raw.append(u'$');
icu::UnicodeString patustr;
const icu::Normalizer2 *nfd = icu::Normalizer2::getNFDInstance(status);
// NFD normalize on pattern creation
nfd->normalize(patustr_raw, patustr, status);
fFromPattern.reset(icu::RegexPattern::compile(patustr, 0, status));
return (UASSERT_SUCCESS(status));
}
return false; // fFrom should not be empty.
if (fFrom.empty()) {
return false;
}
// TODO-LDML: if we have mapFrom, may need to do other processing.
const std::u16string patstr = km::kbp::kmx::u32string_to_u16string(fFrom);
UErrorCode status = U_ZERO_ERROR;
/* const */ icu::UnicodeString patustr_raw = icu::UnicodeString(patstr.data(), (int32_t)patstr.length());
// add '$' to match to end
patustr_raw.append(u'$');
icu::UnicodeString patustr;
const icu::Normalizer2 *nfd = icu::Normalizer2::getNFDInstance(status);
// NFD normalize on pattern creation
nfd->normalize(patustr_raw, patustr, status);
fFromPattern.reset(icu::RegexPattern::compile(patustr, 0, status));
return (UASSERT_SUCCESS(status));
}

size_t
Expand Down Expand Up @@ -813,6 +815,10 @@ transforms::load(
const std::u32string toStr = kmx::u16string_to_u32string(kplus.strs->get(element->to));
KMX_DWORD mapFrom = element->mapFrom; // copy, because of alignment
KMX_DWORD mapTo = element->mapTo; // copy, because of alignment
assert(!fromStr.empty());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we get an earlier assert

if (fromStr.empty()) {
valid = false;
}
newGroup.emplace_back(fromStr, toStr, mapFrom, mapTo, kplus, valid); // creating a transform_entry
assert(valid);
if(!valid) {
Expand Down