self-addressing identifier (SAID)

Definition

Any identifier that is deterministically generated from the content it identifies, or from a digest of that content.
Source: Dr. S. Smith

Explanation

An identifier that is deterministically generated from and embedded in the content it identifies, making it and its data mutually tamper-evident.

To generate a SAID

Fully populate the data that the SAID will identify, leaving a placeholder for the value of the SAID itself.
Canonicalize the data, if needed. The result is called the SAID's identifiable basis.
Hash the identifiable basis. The result is the value of the SAID.
Replace the placeholder in the identifiable basis with the newly generated identifier, so the SAID is embedded in the data it identifies. The result is called the saidified data.
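
The generation steps above can be sketched in Python. This is only an illustration under stated assumptions, not a conforming implementation of any particular SAID algorithm: it assumes compact JSON with sorted keys as the canonical form, SHA-256 as the hash, a placeholder of "#" characters the same length as the encoded digest, and unpadded URL-safe Base64 as the encoding. The names saidify, canonicalize and PLACEHOLDER are hypothetical.

```python
import base64
import hashlib
import json

# Assumption: the placeholder has the same length as the encoded digest
# (a 32-byte SHA-256 digest is 43 characters in unpadded Base64).
PLACEHOLDER = "#" * 43


def canonicalize(data: dict) -> bytes:
    # Assumption: compact JSON with sorted keys serves as the canonical form.
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def saidify(data: dict, said_field: str = "id") -> dict:
    # Step 1: populate the data, with a placeholder where the SAID will go.
    basis = dict(data)
    basis[said_field] = PLACEHOLDER
    # Step 2: canonicalize; the result is the identifiable basis.
    identifiable_basis = canonicalize(basis)
    # Step 3: hash the identifiable basis and encode the digest as text.
    digest = hashlib.sha256(identifiable_basis).digest()
    said = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    # Step 4: replace the placeholder with the SAID; this is the saidified data.
    saidified = dict(basis)
    saidified[said_field] = said
    return saidified


contact = {"name": "Ada", "email": "ada@example.com"}
print(saidify(contact, "id"))
```

Because the placeholder and the encoded digest have the same length, swapping one for the other does not change the size of the serialized data.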

To verify that a SAID truly identifies a specific chunk of data

Canonicalize the data, if needed. The result is the claimed saidified data.
In the claimed saidified data, replace the SAID value with a placeholder. The result is the identifiable basis for the SAID.
Hash the identifiable basis.
Compare the hash value to the SAID. If they are equal, then the SAID identifies the claimed saidified data.
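
A minimal verification sketch in Python follows. It makes the same illustrative assumptions as any toy scheme would need to fix in advance: compact sorted-key JSON as the canonical form, SHA-256, a "#" placeholder of the encoded digest's length, and unpadded URL-safe Base64. The function names compute_said and verify_said are hypothetical. In this sketch the placeholder is substituted on the parsed data just before it is serialized, which gives the same identifiable basis as substituting it in the canonical text.

```python
import base64
import hashlib
import json


def compute_said(data: dict, said_field: str) -> str:
    # Rebuild the identifiable basis: the SAID value is swapped for a placeholder.
    basis = dict(data)
    basis[said_field] = "#" * 43  # assumed placeholder, length of the encoded digest
    # Assumption: compact JSON with sorted keys is the canonical form.
    identifiable_basis = json.dumps(
        basis, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    # Hash the identifiable basis and encode the digest as text.
    digest = hashlib.sha256(identifiable_basis).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def verify_said(claimed: dict, said_field: str = "id") -> bool:
    said = claimed.get(said_field)
    if not isinstance(said, str):
        return False
    # Compare the recomputed hash value to the embedded SAID.
    return compute_said(claimed, said_field) == said


record = {"name": "Ada", "email": "ada@example.com", "id": ""}
record["id"] = compute_said(record, "id")
print(verify_said(record))  # True

tampered = dict(record, name="Eve")
print(verify_said(tampered))  # False
```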

Differences in SAID algorithms manifest in the following choices

how data is canonicalized
which hash algorithm is used
which placeholder is used
how the bytes produced by the hash algorithm are encoded
how the SAID value is formatted
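
To make these five choices concrete, the sketch below models each one as a parameter of a hypothetical SaidScheme class. Neither the class nor the two example schemes correspond to any published SAID algorithm; they only show that changing any one choice yields a different, incompatible identifier for the same data.

```python
import base64
import hashlib
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SaidScheme:
    canonicalize: Callable[[dict], bytes]  # how data is canonicalized
    hash_name: str                         # which hash algorithm is used
    placeholder_char: str                  # which placeholder is used
    encode: Callable[[bytes], str]         # how the hash bytes are encoded
    fmt: Callable[[str], str]              # how the SAID value is formatted

    def said_for(self, data: dict, said_field: str) -> str:
        # Size the placeholder to match the final formatted SAID (an assumption).
        sample = hashlib.new(self.hash_name, b"").digest()
        length = len(self.fmt(self.encode(sample)))
        basis = dict(data)
        basis[said_field] = self.placeholder_char * length
        digest = hashlib.new(self.hash_name, self.canonicalize(basis)).digest()
        return self.fmt(self.encode(digest))


def sorted_json(data: dict) -> bytes:
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


# Two illustrative schemes that differ in hash, encoding, and formatting.
BASE64_SHA256 = SaidScheme(sorted_json, "sha256", "#", b64url, lambda s: s)
HEX_SHA3 = SaidScheme(sorted_json, "sha3_256", "#", bytes.hex, lambda s: "sha3-256:" + s)

data = {"name": "Ada", "id": ""}
print(BASE64_SHA256.said_for(data, "id"))
print(HEX_SHA3.said_for(data, "id"))
```

A verifier has to apply exactly the same five choices as the generator, which is why a scheme is treated here as a single immutable value.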

Notation

A terse way to describe a SAID and its data is to write an expression that consists of the token SAID followed by a token with field names in canonical order, where the field containing the SAID itself is marked by the suffix =said. For example, the saidification of a simple ContactInfo data structure might be given as SAID(name, address, phone, email, id=said).
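
As a small illustration, the notation can be produced mechanically from a list of field names. The helper said_notation below is hypothetical, and it assumes the caller already supplies the fields in canonical order.

```python
def said_notation(fields: list[str], said_field: str) -> str:
    # The field that holds the SAID must be one of the listed fields.
    if said_field not in fields:
        raise ValueError(f"{said_field!r} is not one of the fields")
    # Mark the SAID-bearing field with the "=said" suffix; leave the rest as-is.
    parts = [f"{name}=said" if name == said_field else name for name in fields]
    return "SAID(" + ", ".join(parts) + ")"


print(said_notation(["name", "address", "phone", "email", "id"], "id"))
# SAID(name, address, phone, email, id=said)
```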
