Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persistent TLS keys #783

Closed
wants to merge 10 commits into from
Closed

Conversation

hreinecke
Copy link
Collaborator

Hi all,

this pull request implements persistent TLS key handling. The original idea of storing the key serial number in the JSON configuration file doesn't really work out as the key serial number is ephemeral, and might change whenever a key is updated/revoked or somesuch. So after a reboot the key won't be present, the we won't be able to load the keys at all.
With the pull request we now store the TLS keys in 'PSK interchange format' as mandated in the NVMe TCP transport specification, which then contains the actual key data. And, obviously, the parser has been updated to read the TLS keys
from the PSK interchange format, too.
With that we fixup a design flaw in the JSON configuration, and ensure that TLS keys are persistent across reboots.

@hreinecke hreinecke force-pushed the persistent-keys.v1 branch 2 times, most recently from fe58d8b to 182ce01 Compare February 21, 2024 14:56
Add a function to return the payload of a key.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Add a function to export a TLS key in the PSK interchange format
as defined in the NVMe TCP transport specification.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Add a function to import a TLS key in the PSK interchange format
as defined in the NVMe TCP transport specification.

Signed-off-by: Hannes Reinecke <hare@suse.de>
We really should not update existing keys as this will confuse
existing users of that key. Rather we should revoke the old key
and insert a new one.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Use the default '.nvme' keyring when nvme_lookup_keyring() is called
with a NULL argument.

Signed-off-by: Hannes Reinecke <hare@suse.de>
nvme_configure_ctrl() is reading the values from sysfs, so we
should be reading the TLS key, too, if present.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Rather than printing the key serial number (which will only be valid
for this session) we should be exporting the actual key in PSK
interchange format to make it persistent across reboots.

Signed-off-by: Hannes Reinecke <hare@suse.de>
As now the JSON configuration file holds the TLS key in PSK interchange
format we should be parsing that key and inserting it into the kernel
keystore to make it available for TLS.

Signed-off-by: Hannes Reinecke <hare@suse.de>
TLS keys and keyrings are stored as strings, not as integers.

Signed-off-by: Hannes Reinecke <hare@suse.de>
The keyring identifier is a 'long', not an integer.

Signed-off-by: Hannes Reinecke <hare@suse.de>
"tls_key": {
"description": "TLS key for the connection",
"type": "string"
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these changes should be in config-schema.json.in

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course.

src/nvme/linux.c Show resolved Hide resolved
@@ -0,0 +1,100 @@
// SPDX-License-Identifier: GPL-2.0-or-later
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The library is LGPL 2.1. So this is not compatible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, Rustys code is for crc32c (not crc32), and the crc source code from ccan also has GPLv2. So ccan doesn't help. The second implementation should be fine, though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But ever so slightly complicated to use. Switching to the freebsd implementation, which actually has a copy with a permissible license.

@@ -0,0 +1,11 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this license doesn't work for the library.

* @key_len: Length of @key_data
*
* Returns @key_data in the PSK Interchange format as defined in section
* 3.6.1.5 of the NVMe TCP Transport specification.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and NULL with errno set in failure case.

@igaw
Copy link
Collaborator

igaw commented Feb 22, 2024

I suppose updating the docs would also be good.

@mwilck
Copy link
Contributor

mwilck commented Feb 23, 2024

This approach should be generalized in order to be future-proof, and because config.json represents an API of sorts, we should try to get it right, even if we don't implement all functionality yet. While its fine to just use plaintext keys in the config file now, we should define the syntax such that we'd be able to switch to more more secure mechanisms in the future. I suggest to use a model similar to crypttab(5).

Keys can be retrieved directly from the json string text, from a file on the file system, or from a socket. Independently from how they are retrieved, keys can be unencrypted or encrypted, to be decrypted either with a user-supplied pass phrase, a PKCS#11 or FIDO2 key, or a TPM (where in the latter case the decryption can additionally be bound to a set of PCRs), or any other method that the security people may come up with in the future.

Therefore I propose to use some sort of URI format for tls_key. Here's a rough outline of how I think it should look like, probably not perfect.

  • "tls_key": "NVMeTLSkey-1:01:....." (as it is now) for a key stored directly in config.json (well, NVMeTLSkey-1 describes the format of the key, not the source; maybe we want an extra prefix like "tls_key": "json:NVMeTLSkey-1:01:..."?)
  • "tls_key": "file:///etc/secrets/key.txt" for a key stored on disk,
  • "tls_key": "file://LABEL=my-usb-key/dir/key.txt" for a file on an external disk or USB key,
  • "tls_key": "socket:SOCKNAME" for an AF_UNIX socket (we might as well use file: for that, too).
  • "tls_key": "kernel:$keyring:description" for keys to be looked up in the kernel keyring (?)

An addition to that, we could add a tls_key_options attribute that specifies the method to decrypt the key, using a comma-separated list of options. For simplicity and user-friendliness, the syntax should be similar to crypttab's (pkcs11-uri=, tpm2-device=, etc.).

Obviously, we don't need to implement this now, but we should make sure that we don't need to change the syntax when we add features.

@hreinecke
Copy link
Collaborator Author

In principle a good idea. But how do we differentiate which format to use when writing out the json config file?
My original design was to use the kernel keyring such that one could have an external entity storing the keys in the keyring, and libnvme/nvme-cli just use the keys available in the keyring.
So really, there wasn't any need for making the keys persistent as the external entity would take care of that.
And the '--tls_key' option for nvme-cli was just a way to reference a specific key in case the default selection would pick the wrong key. (Remember, pre-TP8018 we couldn't differentiate between keys for the same host/controller combo, so when doing a key rotation it got tricky.)
With that in mind maybe using 'tls_key' for persistently store keys is not the best way, and we really should look at another solution, be it a separate daemon or the firmware idea.
Anyway, you are right that we should allow for different value at 'tls_key'. Out of necessity we should be continue to support the original idea of having a hex value here, referencing a key in the kernel keyring. And also the NVMeTLSKey plain string method. But I'd rather not do anything more here; if that's required one should code a daemon pushing the keys into the keyring.

@hreinecke
Copy link
Collaborator Author

And there another issue. we have the option '--tls' (indicating that we should start TLS, and use the default key), and the option '--tls_key' (indicating that we should be using the specified TLS key. But once the TLS connection is started the negotiated key serial is presented in the 'tls_key' sysfs entry. But there is no indicator whether this was the default TLS key or the user-specified one. So with the current interface we cannot reliably write the configuration file from a given connection, ie 'nvme config --dump --scan' will always write 'tls_key', but won't be able to figure out if 'tls' was given.

@mwilck
Copy link
Contributor

mwilck commented Feb 26, 2024

My original design was to use the kernel keyring such that one could have an external entity storing the keys in the keyring, and libnvme/nvme-cli just use the keys available in the keyring. So really, there wasn't any need for making the keys persistent as the external entity would take care of that. And the '--tls_key' option for nvme-cli was just a way to reference a specific key in case the default selection would pick the wrong key. (Remember, pre-TP8018 we couldn't differentiate between keys for the same host/controller combo, so when doing a key rotation it got tricky.)

A key that can be replaced by key rotation is by definition not "persistent", is it? We shouldn't store such keys in a static configuration file. As far as I understand TP 8018, this applies to generated PSKs. Updating rotaded keys in the kernel keyring should indeed be the responsibility of some external entity 1. IMO that means that we don't need a tls_key option at all, at least with TP 8018, and that the tls_key attribute in config.json should be deprecated. tls_key corresponds to --tls-key and should have the same semantics, otherwise we'd just confuse users. If it's just used to differentiate between keys that can't be differentiated otherwise, we shouldn't overload it with a different meaning for persistent keys.

We could have a persistent_tls_key (retained_tls_key??) option though, which would work along the lines indicated above. Naïvely I would assume that if we can store dhchap_key in config.json, we can also store persistent_tls_key there, or do you still see issues with config-file write-out?

Hope this makes sense, I'm still trying to understand how all this works together.

Footnotes

  1. IIUC AUTH_Negotiate messages with REPLACETLSPK must be sent over the admin queue, which is owned by the kernel, so wouldn't the kernel itself need to take care of this?

@hreinecke
Copy link
Collaborator Author

Remember: 'retained key' != 'persistent key'. A 'retained key' is everything which is not auto-generated via 'secure concatenation'. As the patchset for this is still pending (and needs reviews, BTW) the current code only supports 'retained keys'.
Normally a user would just specify '--tls' to enable TLS encryption, and the kernel code would select a matching key from the keystore. So key rotation is easily possible by just adding a new / different key to the keystore and reset the connection But if the user want to use a specific key from the keystore he can select it via '--tls_key'. Clearly key rotation is not possible here.
The problem is now that in sysfs we only have one setting 'tls_key', which will point to the key ID of the key currently in use for TLS encryption, and we have no means of telling if this key got selected by the in-kernel mechanism or if the user specified it. That's what I mean with 'persistent': once the user specifies a key via '--tls_key' we can't do a key rotation, and the key is 'persistent' in the sense that it cannot be changed for this connection.
Other names are welcome :-)

@martin-belanger
Copy link
Contributor

Other names are welcome :-)

Maybe: fixed_tls_key, permanent_tls_key, or explicit_tls_key.

@mwilck
Copy link
Contributor

mwilck commented Feb 26, 2024

the current code only supports 'retained keys'. Normally a user would just specify '--tls' to enable TLS encryption, and the kernel code would select a matching key from the keystore. So key rotation is easily possible by just adding a new / different key to the keystore and reset the connection

The way I read TP8018, it's possible but not allowed for retained keys. At least that's my understanding of Fig. 18, which says that the retained key is "retained indefinitely", whereas the generated PSK is "replaced periodically or on demand".

But if the user want to use a specific key from the keystore he can select it via '--tls_key'. Clearly key rotation is not possible here.

I think this kind of key rotation, irrespective whether allowed or not, could be made work by changing the key selection code in nvme_tcp_alloc_admin_queue() to fall back to the default key if the configured key wasn't found. If we did that, there would be no difference between connections set up with --tls-key or just --tls wrt key rotation.

@hreinecke
Copy link
Collaborator Author

Authoritative answer from NVMexpress: The statement in figure 518 for Retained PSK: (retained indefinitely, associate with the hash of the selected cipher suite) is not intended to read 'retained permanently'. 'Indefinitely' here simply means: without a defined limit, ie no limit is given by the spec. But there might be limit imposed externally, and retained keys can be rotated.

@hreinecke
Copy link
Collaborator Author

Superseded by #786 .

@hreinecke hreinecke closed this Feb 28, 2024
@hreinecke hreinecke deleted the persistent-keys.v1 branch February 28, 2024 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants