Changing the basis of UUID generation for Mojaloop #103

Open
4 tasks
MichaelJBRichards opened this issue May 18, 2023 · 7 comments

@MichaelJBRichards

MichaelJBRichards commented May 18, 2023

Request Summary:

Can we move to a standard form of UID generation which is at least as secure as UUIDV4 (and preferably more so) but whose result can be encoded in 35 characters or fewer?

Request Details:

  • Deadline: ASAP
  • Impact (Teams): FSPIOP Special Interest Group; Payment Manager
  • Impact (Components): All API definitions which use UIDs will be affected.

The current implementation of the ISO 20022 data model limits the length of a unique identifier to 35 characters. This is (just) not sufficient to hold a UUIDV4, which is the current Mojaloop standard for a GUID: in its canonical form a UUIDV4 is 36 characters long (32 hexadecimal digits plus 4 hyphens). We therefore need to change something. The following alternative proposals have been discussed in the ISO 20022 SIG:

  1. Move Mojaloop away from using a GUID as the standard for identification.
  2. Continue to use a UUIDV4, but omit the hyphens when it is used in the ISO 20022 identifier field. This reduces the identifier's length to 32 characters.
  3. Ask ISO 20022 to revise its identifier policy to make all identifiers of sufficient length to hold a UUIDV4.
  4. Move to a different unique ID generation method whose result can be encoded in fewer than 36 characters.
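
As a rough illustration of proposal 2, the sketch below uses Python's standard `uuid` module to show that a UUIDV4 is 36 characters in canonical form and 32 once the hyphens are dropped, and that the conversion loses no information. The helper names `to_iso_id` and `from_iso_id` and the constant `ISO_20022_MAX_ID_LENGTH` are hypothetical and are not part of any Mojaloop API.

```python
import uuid

# Field width quoted in this issue for ISO 20022 identifiers.
ISO_20022_MAX_ID_LENGTH = 35


def to_iso_id(value: uuid.UUID) -> str:
    """Return the UUID as 32 lowercase hex digits, with no hyphens."""
    return value.hex


def from_iso_id(text: str) -> uuid.UUID:
    """Rebuild the UUID from its 32-character hyphen-free form."""
    return uuid.UUID(hex=text)


original = uuid.uuid4()
canonical = str(original)      # 8-4-4-4-12 form, with hyphens
compact = to_iso_id(original)  # same 128 bits, hyphens removed

print(len(canonical), len(canonical) <= ISO_20022_MAX_ID_LENGTH)
print(len(compact), len(compact) <= ISO_20022_MAX_ID_LENGTH)
print(from_iso_id(compact) == original)
```

The round trip is mechanical, but every participant would have to apply it consistently at the boundary between the ISO 20022 field and any system that expects the hyphenated form.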

A summary of discussion on these points:

  • We do not regard the existing ISO convention as satisfactory because we require Mojaloop identifiers to be reliably unique over long periods of time, and to adhere to a single external standard used by all participants.
  • Proposal 2 would work, but would require additional work by participants and the switch, and is not easy for participants outside Mojaloop to understand.
  • Proposal 3 is formally attractive, but stands very little chance of acceptance by the ISO community due to the amount of work that existing ISO 20022 institutions would need to undertake to implement it.
  • So we considered proposal 4...

The ISO 20022 SIG has proposed moving to the CUID2 algorithm for the generation of unique identifiers. This algorithm appears to offer:

  • Better collision resistance than UUIDV4
  • A result encoded in 32 characters

Most of the work required to implement this change will be on the APIs, and I shall be raising an issue on the FSPIOP API to consider this; but I wanted the DA to consider it from a technical perspective first.

Artifacts:

Dependencies:

  • If Applicable

Accountability:

  • Owner: Michael Richards
  • Raised By: Michael Richards

Decision(s):

  • Approved By:

Details

  • Actual decision made as a result of discussion

Follow-up:

  • Actions to implement the decisions
@millerabel
Member

I like the concept of moving to Cuid2 in place of UUID 4 or other alternatives. Because we can specify the size of the generated ID, with calculable bounds on the probability of collision, we can choose IDs that are sufficiently large yet by definition fit within the available field width. Cuid2 IDs are also constrained to lowercase ASCII letters and the digits 0-9, with no special characters, which simplifies encoding for exchange between systems. The innovative use of multiple sources of entropy, and the obscuring of the source values, limits leakage of source data.

We might move more than just our ISO identifiers to Cuid2. Identifiers used in the system like Transaction ID and Transfer ID might be stronger even in the non-ISO aspects of processing (e.g. in DFSP IOP API).

When thinking about wider application, the novel observation that the Cuid2 algorithm is fast enough, but not too fast, requires some study before applying Cuid2 as a replacement for each ID type: do we generate and use a particular ID type in a way that is consistent with this speed characteristic? Are we generating IDs in places like log entry creation many hundreds or thousands of times per second? If so, use something faster in those cases.

Cuid2 is appropriate for distributed unique ID creation, but should likely not be used where a simple single-host unique ID is needed. Be aware of the real entropy sources in the executing environment: Be wary of dozens of containers all running the same base operating system image, with essentially the same startup time, many on the same physical host with the same entropy sources. Generating thousands of IDs / second simultaneously in these containers might lead to increased risk of collision.

And ensure we don’t make assumptions that the IDs are K-sortable (Cuid2 IDs are not K-sortable, they are opaque) or database generated (too slow to generate Cuid2 using C-callable extensions from the DB layer).

With the proper study of context, I like the idea of moving to Cuid2 across the platform and APIs. Tested algorithms are available for JavaScript, Python, as well as a few languages we don’t use. It’s not yet ported to Zig.
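
The "calculable bounds on the probability of collision" mentioned above can be sketched with the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)), for n IDs drawn from a space of N values. This is only an idealised estimate: it assumes IDs are uniformly distributed over the space, which a real generator only approaches if its entropy sources are good (see the container caveat above). The size used for a 32-character ID assumes a leading letter followed by 31 base-36 characters; treat that, and the function name, as assumptions for illustration.

```python
import math


def collision_probability(n_ids: int, space_size: int) -> float:
    """Birthday approximation for at least one collision among n_ids
    values drawn uniformly at random from space_size possibilities."""
    exponent = -n_ids * (n_ids - 1) / (2 * space_size)
    # -expm1(x) equals 1 - exp(x) but stays accurate for tiny x.
    return -math.expm1(exponent)


# A UUIDV4 carries 122 random bits (6 of its 128 bits are fixed).
UUID4_SPACE = 2 ** 122

# Assumed shape of a 32-character ID: one letter, then 31 base-36 chars.
BASE36_32_SPACE = 26 * 36 ** 31

n = 10 ** 12  # one trillion identifiers, an arbitrary example volume
p_uuid4 = collision_probability(n, UUID4_SPACE)
p_base36 = collision_probability(n, BASE36_32_SPACE)

print(f"UUIDV4:            {p_uuid4:.3e}")
print(f"32-char base-36:   {p_base36:.3e}")
print(f"bits in 32-char space: {math.log2(BASE36_32_SPACE):.1f}")
```

Under these assumptions the 32-character space is larger than the 122 random bits of a UUIDV4, so the estimated collision probability for the same number of IDs is lower.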

@millerabel
Member

(Modified original note to refer to CUID2 as an algorithm, not a standard. It hasn’t been adopted by a standards body.)

@MichaelJBRichards
Author

JB: Probability of collisions needs to be sufficiently low to stop participants needing to add extra checks.
Per @millerabel's comment: we don't need to change everything; internal processes can still use UUID.
Analysis of data layer code: put this on core team backlog? Regex will need changing at API level.
Are there any code instances where the UID class is used?

@bushjames

Related to this ticket in CCB space: mojaloop/mojaloop-specification#120 (comment)

@bushjames

FSPIOP SIG have approved a move to support ULID. This decision needs documenting and accepting by CCB before DA can close this issue.
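
For context on why ULID fits the 35-character limit: a ULID is 26 characters of Crockford Base32, made up of a 48-bit millisecond timestamp (10 characters) followed by 80 bits of randomness (16 characters). The sketch below is a minimal illustration in Python using only the standard library, not the implementation adopted by Mojaloop. It does not handle monotonic ordering within the same millisecond, and the names `new_ulid`, `is_valid_ulid` and `ULID_PATTERN` are hypothetical; the pattern is not taken from the FSPIOP specification.

```python
import os
import re
import time
from typing import Optional

# Crockford Base32 alphabet: digits plus letters, excluding I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Hypothetical validation pattern: 26 characters from the alphabet above.
ULID_PATTERN = re.compile(r"^[0-9A-HJKMNP-TV-Z]{26}$")


def encode_base32(value: int, length: int) -> str:
    """Encode a non-negative integer as fixed-width Crockford Base32."""
    chars = []
    for _ in range(length):
        chars.append(CROCKFORD[value & 0x1F])  # lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Build a ULID-shaped ID: 10 timestamp chars + 16 random chars."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if not 0 <= timestamp_ms < 2 ** 48:
        raise ValueError("timestamp must fit in 48 bits")
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    return encode_base32(timestamp_ms, 10) + encode_base32(randomness, 16)


def is_valid_ulid(text: str) -> bool:
    """Check the shape of a candidate ID against the sketch pattern."""
    return ULID_PATTERN.fullmatch(text) is not None


example = new_ulid()
print(example, len(example), is_valid_ulid(example))
```

Because the timestamp occupies the leading characters, IDs created in different milliseconds sort lexicographically by creation time, which is the K-sortable property that Cuid2 IDs lack.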

@elnyry-sam-k
Member

Here's the confirmation from the FSPIOP SIG, @bushjames : mojaloop/mojaloop-specification#120 (comment)

Thanks to Henrik and the FSPIOP SIG

@bushjames

Implementation is happening as part of the ISO 20022 work (PRs in review). Spec doc changes are complete.
