DPS: update client library documentation #152

Merged 5 commits on Apr 23, 2024
@@ -32,11 +32,23 @@ The API is defined in the public header files provided in the distributed packag
A DPS client device is an extension of an [IoTivity](https://github.com/iotivity/iotivity-lite) device. Define your desired device and add DPS code to automatically provision the device.

Start up the DPS initialization by calling the `plgd_dps_init` function, which allocates and initializes required data structures.
Use setters `plgd_dps_set_endpoint`, `plgd_dps_set_manager_callbacks`, `plgd_dps_set_skip_verify`, `plgd_dps_set_configuration_resource`, `plgd_dps_time_configure`, `plgd_dps_set_retry_configuration`, `plgd_dps_set_cloud_observer_configuration` and `plgd_dps_pki_set_expiring_limit` to configure the device.
Use setters `plgd_dps_set_manager_callbacks`, `plgd_dps_set_skip_verify`, `plgd_dps_set_configuration_resource`, `plgd_dps_time_configure`, `plgd_dps_set_retry_configuration`, `plgd_dps_set_cloud_observer_configuration` and `plgd_dps_pki_set_expiring_limit` to configure the device.

### Set DPS Endpoint

To set the DPS endpoint call the `plgd_dps_set_endpoint(plgd_dps_context_t *ctx, const char *endpoint)` function.
DPS endpoints are stored in a list ordered by priority (i.e., the first endpoint has the highest priority and should be tried first).

- To add an endpoint to the list of DPS server endpoints, use the `plgd_dps_add_endpoint_address(plgd_dps_context_t *ctx, const char *uri, size_t uri_len, const char *name, size_t name_len)` function.
- To remove an endpoint, use `plgd_dps_remove_endpoint_address(plgd_dps_context_t *ctx, const oc_endpoint_address_t *address)`.
- To iterate over the list, use `plgd_dps_iterate_server_addresses(const plgd_dps_context_t *ctx, oc_endpoint_addresses_iterate_fn_t iterate_fn, void *iterate_fn_data)`.
- To select the endpoint from the list that is used during DPS provisioning, use `plgd_dps_select_endpoint_address(plgd_dps_context_t *ctx, const oc_endpoint_address_t *address)` (by default, the first endpoint added is selected).
- To get the currently selected endpoint, use `plgd_dps_selected_endpoint_address(const plgd_dps_context_t *ctx)`.

The `plgd_dps_set_endpoint` function, which previously configured the DPS server endpoint when it was a single element, has been deprecated. Use the functions above to work with the list of endpoints.

{{< note >}}
On repeated DPS provisioning failure with the currently selected endpoint, the default retry mechanism changes the selected endpoint to the next one in the list of endpoints. For details, see the [retry mechanism](/docs/services/device-provisioning-service/retry-mechanism) documentation.
{{< /note >}}

### Set status callbacks

@@ -101,6 +113,12 @@ dps_status_handler(plgd_dps_context_t *ctx, plgd_dps_status_t status, void *data
if ((status & PLGD_DPS_HAS_OWNER) != 0) {
PRINT("\t\t-Has owner\n");
}
if ((status & PLGD_DPS_GET_CLOUD) != 0) {
PRINT("\t\t-Get cloud configuration\n");
}
if ((status & PLGD_DPS_HAS_CLOUD) != 0) {
PRINT("\t\t-Has cloud configuration\n");
}
if ((status & PLGD_DPS_GET_CREDENTIALS) != 0) {
PRINT("\t\t-Get credentials\n");
}
@@ -113,12 +131,6 @@ dps_status_handler(plgd_dps_context_t *ctx, plgd_dps_status_t status, void *data
if ((status & PLGD_DPS_HAS_ACLS) != 0) {
PRINT("\t\t-Has set acls\n");
}
if ((status & PLGD_DPS_GET_CLOUD) != 0) {
PRINT("\t\t-Get cloud configuration\n");
}
if ((status & PLGD_DPS_HAS_CLOUD) != 0) {
PRINT("\t\t-Has cloud configuration\n");
}
if ((status & PLGD_DPS_CLOUD_STARTED) != 0) {
PRINT("\t\t-Started cloud\n");
}
@@ -142,9 +154,11 @@ The resource type of the DPS configuration resource is `x.plgd.dps.conf` and the

| Property Title | Property Name | Type | Access Mode | Mandatory | Description |
| -------------- | ------------- | -----| ----------- | --------- | ----------- |
| Endpoint | endpoint | string | RW | No | Device provisioning server endpoint in format `coaps+tcp://{domain}:{port}` |
| Endpoint | endpoint | string | RW | No | Selected device provisioning server endpoint in format `coaps+tcp://{domain}:{port}` |
| Endpoint name | endpointName | string | RW | No | Name associated with the selected device provisioning server endpoint (currently unused by DPS). |
| Endpoints | endpoints | array of objects | RW | No | Array of device provisioning server endpoints. Each item is a pair of (`uri`, `name`) values, where `uri` is the endpoint address in the format `coaps+tcp://{domain}:{port}` and `name` is a string name associated with the endpoint. (Note: the property is generated only if there are at least 2 endpoints set) |
| Last error code | lastErrorCode | string | R | No | Provides last error code when provision status is in `failed` state (see list below for possible values). |
| Force reprovision | forceReprovision | bool | RW | No | Connect to dps service and reprovision credentials, acls and cloud configuration. |
| Force reprovision | forceReprovision | bool | RW | No | Connect to dps service and reprovision time, owner, cloud configuration, credentials and acls. |
| Provisioning status | provisionStatus | enum(string) | R | No | String representation of the provisioning status (see list below for possible values). |

Last error code values:
@@ -157,6 +171,7 @@ Last error code values:
- `5` (`PLGD_DPS_ERROR_SET_CLOUD`): cannot apply cloud configuration
- `6` (`PLGD_DPS_ERROR_START_CLOUD`): cannot start cloud
- `7` (`PLGD_DPS_ERROR_GET_OWNER`): cannot retrieve device owner
- `8` (`PLGD_DPS_ERROR_GET_TIME`): cannot retrieve current time

Provisioning status values:

@@ -170,6 +185,8 @@ Provisioning status values:
- `provisioned acls`: acls are provisioned
- `provisioning cloud`: provisioning of cloud configuration has been started
- `provisioned cloud`: cloud configuration is provisioned
- `provisioning time`: time synchronization has been started
- `provisioned time`: time is synchronized
- `provisioned`: device is fully provisioned and configured
- `renew credentials`: renewing expired or expiring certificates
- `transient failure`: provisioning failed with a transient error and the failed step is being retried
@@ -390,16 +407,25 @@ manufacturer_setup(plgd_dps_context_t *dps_ctx)
oc_free_string(&dev->name);
oc_new_string(&dev->name, dps_device_name, strlen(dps_device_name));
}
plgd_dps_set_manager_callbacks(dps_ctx, dps_status_handler, /* on_change_data */ NULL, cloud_status_handler,
/* on_cloud_change_data */ NULL);
plgd_dps_manager_callbacks_t callbacks = {
.on_status_change = dps_status_handler,
.on_status_change_data = NULL,
.on_cloud_status_change = cloud_status_handler,
.on_cloud_status_change_data = NULL,
};
plgd_dps_set_manager_callbacks(dps_ctx, callbacks);
if (g_expiration_limit != -1) {
plgd_dps_pki_set_expiring_limit(dps_ctx, (uint16_t)g_expiration_limit);
}
if (g_observer_max_retry != -1) {
plgd_dps_set_cloud_observer_configuration(dps_ctx, (uint8_t)g_observer_max_retry, 1);
}
plgd_dps_set_skip_verify(dps_ctx, g_skip_ca_verification != 0);
plgd_dps_set_endpoint(dps_ctx, dps_endpoint);
size_t dps_endpoint_len = strlen(g_dps_endpoint);
if (dps_endpoint_len > 0 && !plgd_dps_add_endpoint_address(dps_ctx, g_dps_endpoint, dps_endpoint_len, /*name*/NULL, /*name_len*/0)) {
printf("ERROR: failed to add endpoint address\n");
return -1;
}
if (dps_add_certificates(dps_ctx, dps_cert_dir) != 0) {
DPS_ERR("failed to add initial certificates on factory reset");
return -1;
@@ -71,18 +71,16 @@ d -> dps: **4** Get DPS service time to sync clocks
dps --> d
d -> dps: **5** Get owner of the device
dps --> d
d -> dps: **6** Send Identity Certificate Signing Request
d -> dps: **6** Get plgd hub connection configuration
dps --> d
d -> dps: **7** Retrieve ACL configuration
d -> dps: **7** Send Identity Certificate Signing Request
dps --> d
d -> dps: **8** Get initial configuration
dps --> d
d -> dps: **9** Get plgd hub connection configuration
d -> dps: **8** Retrieve ACL configuration
dps --> d
d ->x dps: Disconnect
deactivate dps
deactivate d
d -> hub: **10** Connect and authenticate
d -> hub: **9** Connect and authenticate
activate d

@enduml
@@ -94,16 +92,15 @@ activate d
3. The DPS finds the Enrollment Group with matching Manufacturer Certificate CA or TPM's endorsement key.
4. In order to validate TLS certificates and rotate them, the device synchronizes its clock with the DPS by obtaining the current time.
5. The device's owner is granted access by the DPS, enabling the device to operate in Device-to-Device scenarios.
6. The device issues Certificate Signing Request (CSR) for the unique device Identity and requests the DPS to sign it. The CSR is signed by the separate service, running next to the DPS or within the plgd hub deployment. Custom Identity CA can be used. Identity Certificate is then securely stored on the device and used to for unique identification and secure connection to the plgd hub.
7. The device requests resources' ACLs for the Device-to-Device as well as Device-to-Cloud communication and applies them.
8. If the operator provided initial configuration for device resources, the devices retrieves and applies it.
9. The device retrieves connection configuration and OAuth2.0 access token which authorizes the device to communicate with the plgd hub APIs.
10. The device connects to the configured plgd hub instance, authenticates and encrypts the session using Identity Certificate and authorizes using the OAuth2.0 access token.
6. The device retrieves connection configuration and OAuth2.0 access token which authorizes the device to communicate with the plgd hub APIs.
7. The device issues a Certificate Signing Request (CSR) for the unique device Identity and requests the DPS to sign it. The CSR is signed by a separate service, running next to the DPS or within the plgd hub deployment. A custom Identity CA can be used. The Identity Certificate is then securely stored on the device and used for unique identification and a secure connection to the plgd hub.
8. The device requests resources' ACLs for the Device-to-Device as well as Device-to-Cloud communication and applies them.
9. The device connects to the configured plgd hub instance, authenticates and encrypts the session using Identity Certificate and authorizes using the OAuth2.0 access token.

{{< note >}}
Step number 4 and 10 are optional.
Steps 4 and 9 are optional.

4. Time synchronization is not required when the device's time is synchronized by another method, such as NTP. In such a case, the device can skip this step.
10. Device provisioning doesn't require to connect the device to the plgd hub. In such a case, device is ready to be securely used for your Device-to-Device scenarios.
9. Device provisioning doesn't require connecting the device to the plgd hub. In such a case, the device is ready to be used securely in your Device-to-Device scenarios.

{{< /note >}}
@@ -20,6 +20,6 @@ weight: 2
### Planned features

- **&#x2610; Device attestation via TPM:** This upcoming feature will enable devices to attest their identity using Trusted Platform Modules (TPM).
- **&#x2610; Initial device configuration:** This planned feature aims to provide users with a mechanism to perform the initial configuration of devices during the provisioning process.
- **&#x2610; Blacklisting and whitelisting devices:** With this feature, users will have the ability to blacklist or whitelist specific devices for enhanced access control.
- **&#x2610; Manual approval for device configuration:** This planned feature enables a mechanism where devices require manual approval from the user in order to receive configuration settings. Instead of automatic configuration, users will have control over granting permission for device configuration.
- **&#x2610; Verify Common Name:** Verify that the Common Name of DPS endpoint certificate matches the name of the selected endpoint by dps client.
@@ -2,6 +2,7 @@
title: 'Retry mechanism'
description: 'How is recoverable failure handled?'
date: '2022-06-28'
lastmod: '2024-04-22'
categories: [zero-touch, provisioning]
keywords: [retry, recovery, failure]
weight: 5
@@ -17,7 +18,7 @@ The used retry interval is determined by the retry counter and the retry configu

## Configuration

The retry configuration consists of an array of non-zero values, which are interpreted as retry intervals in seconds. The maximal size of the configuration array is 8. By default, the retry configuration is initialized to the following array of values:
The retry configuration consists of an array of non-zero values, which are interpreted as retry intervals in seconds. The configuration array must contain at least one value and its maximal size is 8. By default, the retry configuration is initialized to the following array of values:

```C
[10, 20, 40, 80, 120]
```
@@ -27,26 +28,59 @@ The configuration can be changed by the `plgd_dps_set_retry_configuration` funct

## Failures during provisioning

Provisioning consists of 3 main steps:
Provisioning consists of 5 main steps:

* synchronization of time
* requesting and setting of the device owner
* requesting and applying of plgd hub connection configuration
* sending of the certificate signing request
* requesting and applying of ACLs
* requesting and applying of plgd hub connection configuration

Each step sends a request to the DPS service and waits for a response. After a request is sent, the retry interval is used as a deadline. If the response is not received before this deadline, the operation times out and the request is resent. If the response is received in time, its status code is checked. We distinguish between transient and non-transient errors.

Non-transient errors force a full provisioning on retry. Transient errors first try to repeat the failed step in case the problem clears up. However, if a transient failure occurs consecutively 3 times, then a full reprovisioning is forced.

The retry counter starts with zero. Each retry, after either a transient or a non-transient error or a timeout, increments the retry counter. The counter is reset to zero after a provisioning step successfully finishes. If the value of the retry counter is greater than the maximal index of the configuration array, then the counter is also reset back to zero.

### Retry action

Each failure or timeout triggers a retry action that calculates the retry delay (how long after a failure the step or the full reprovisioning is retried) and the timeout based on the configuration. By default, they are calculated like this:

```pseudocode

timeout = take the timeout value from configuration array indexed by the retry counter

delay = timeout / 2;
// Include a random delay to prevent multiple devices from attempting to
// connect or make requests simultaneously.
delay += random value % delay;

```
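A minimal C sketch of the default calculation above. It assumes the configured intervals are in seconds and tracks the delay in milliseconds so that a one-second interval does not truncate to zero; the unit used internally by the library is not stated here. `retry_schedule_t` and `compute_retry` are hypothetical names used only for illustration and are not part of the DPS client API. The random value is passed in by the caller so the result is reproducible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: not part of the DPS client API. */
typedef struct {
  uint16_t timeout_s; /* deadline for the response, in seconds */
  uint32_t delay_ms;  /* wait before the retry is attempted, in milliseconds */
} retry_schedule_t;

/* Picks the timeout indexed by the retry counter and derives a jittered
 * delay from it: half of the timeout plus a random part smaller than that
 * half. The caller supplies random_value (for example from rand()). */
retry_schedule_t
compute_retry(const uint16_t *intervals_s, size_t count, size_t retry_counter,
              uint32_t random_value)
{
  assert(intervals_s != NULL && count > 0 && retry_counter < count);
  assert(intervals_s[retry_counter] != 0); /* intervals must be non-zero */

  retry_schedule_t schedule;
  schedule.timeout_s = intervals_s[retry_counter];

  /* delay = timeout / 2, expressed in milliseconds */
  uint32_t half_ms = (uint32_t)schedule.timeout_s * 1000u / 2u;
  /* delay += random value % delay, which spreads retries of many devices */
  schedule.delay_ms = half_ms + random_value % half_ms;
  return schedule;
}
```

With the default configuration and a retry counter of 0, the timeout is 10 seconds and the delay falls between 5 and 10 seconds.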

Moreover, once the retry counter exceeds the maximal index of the configuration array, not only is the counter reset back to zero, but the library also attempts to change the selected DPS endpoint. If more than one DPS endpoint is configured, the selected endpoint is changed to the next one in the list (the list is considered circular, so the endpoint after the last one is the first one).
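The bookkeeping described in this section can be sketched as follows. This is an illustrative model only: `retry_state_t`, `on_failure` and `on_step_success` are hypothetical names, and the sketch assumes that the third consecutive transient failure is the one that forces a full reprovisioning. Whether a timeout counts as transient is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: the third consecutive transient failure escalates. */
#define MAX_TRANSIENT_FAILURES 3

typedef enum {
  RETRY_FAILED_STEP,      /* repeat only the step that failed */
  RETRY_FULL_PROVISIONING /* start again from the first step */
} retry_kind_t;

/* Hypothetical state: not the library's internal representation. */
typedef struct {
  size_t retry_counter;        /* index into the retry configuration array */
  size_t config_size;          /* number of configured intervals (1 to 8) */
  size_t selected_endpoint;    /* index of the selected DPS endpoint */
  size_t endpoint_count;       /* number of configured DPS endpoints */
  unsigned transient_failures; /* consecutive transient failures so far */
} retry_state_t;

/* Records one failure and returns what should be retried. */
retry_kind_t
on_failure(retry_state_t *s, bool transient)
{
  assert(s != NULL && s->config_size > 0 && s->endpoint_count > 0);

  retry_kind_t kind = RETRY_FULL_PROVISIONING;
  if (transient && s->transient_failures + 1 < MAX_TRANSIENT_FAILURES) {
    ++s->transient_failures;
    kind = RETRY_FAILED_STEP;
  } else {
    s->transient_failures = 0;
  }

  /* Every retry increments the counter; past the last index it wraps to
   * zero and the next endpoint in the circular list is selected. */
  ++s->retry_counter;
  if (s->retry_counter >= s->config_size) {
    s->retry_counter = 0;
    if (s->endpoint_count > 1) {
      s->selected_endpoint = (s->selected_endpoint + 1) % s->endpoint_count;
    }
  }
  return kind;
}

/* A successfully finished step resets the counters. */
void
on_step_success(retry_state_t *s)
{
  assert(s != NULL);
  s->retry_counter = 0;
  s->transient_failures = 0;
}
```

Keeping the endpoint switch tied to the counter wrap means that every configured interval is tried against one endpoint before the next endpoint is selected.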

To set up a custom retry action use `plgd_dps_set_schedule_action`.
Contributor comment:

Add a comma for clarity in the instruction.

- To set up a custom retry action use `plgd_dps_set_schedule_action`.
+ To set up a custom retry action, use `plgd_dps_set_schedule_action`.

This correction improves the readability of the instruction.




## Failures of cloud manager connection and authentication

After a successful provisioning, the device disconnects from the DPS service and starts up the cloud manager in `IoTivity-lite`. If the cloud manager fails to start, then a full reprovisioning is triggered. If the cloud manager starts successfully, then a cloud status observer also starts to operate.
After successfully provisioning, the device disconnects from the DPS service and initiates the cloud manager within `IoTivity-lite`. If the cloud manager fails to start, it triggers a full reprovisioning process. Conversely, if the cloud manager starts successfully, a cloud status observer also activates.

The cloud status observer operates as a simple polling mechanism, checking the cloud status value 30 times at 1-second intervals. It waits for the status to have both `OC_CLOUD_REGISTERED` and `OC_CLOUD_LOGGED_IN` flags set. Once these flags are set, the polling stops. The observer restarts the polling mechanism if the connection to the plgd hub is lost.

You can configure the limit of polling checks (default: 30) and the interval (default: 1 second) using the `plgd_dps_set_cloud_observer_configuration` function.

{{< note >}}
Successful authentication of the cloud manager relies on a valid access token. If the access token retrieved during provisioning is not permanent, it will eventually expire. To prevent the plgd hub from closing the connection to the device, the access token must be refreshed. This refresh operation is handled internally by the `IoTivity-lite` library, which schedules a token refresh operation before the access token expires.
{{< /note >}}

### Changing cloud servers on repeated failures

If the limit of polling checks is reached and the required flags are still not set, the cloud manager attempts to connect using other cloud server addresses available in the configuration. When attempting different cloud server addresses, some DPS steps may need to be redone.

Cloud status observer is a simple polling mechanism which examines the cloud status value 30 times in 1 second intervals. The observer checks the cloud status and waits for the status to have both `OC_CLOUD_REGISTERED` and `OC_CLOUD_LOGGED_IN` flags sets. If the flags are set, then the polling stops. If the limit of polling checks is reached and the flags are still not set, then the cloud manager is stopped and a full reprovisioning is forced. The polling mechanism is restarted as soon as the connection to the plgd hub is lost.
If the IDs of the cloud servers in the configuration differ, it implies that certificates and ACLs might also be different. Therefore, reprovisioning of credentials and ACLs is triggered. If the IDs are the same, DPS provisioning is not triggered, and the cloud manager is simply restarted with the new address.

The limit of polling checks (default: 30) and the interval (default: 1 second) can be configured by the `plgd_dps_set_cloud_observer_configuration` function.
If the observer goes through all the addresses without establishing a successful connection, the cloud manager is stopped, and a full DPS reprovisioning is forced.
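The decision described in this section can be sketched as a small function. Everything here is hypothetical and for illustration only: `cloud_server_t`, `next_cloud_action` and the action names are not part of the DPS client or `IoTivity-lite` APIs, and the sketch assumes that server IDs can be compared as strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
  ACTION_RESTART_CLOUD_MANAGER,            /* same ID: only the address changes */
  ACTION_REPROVISION_CREDENTIALS_AND_ACLS, /* different ID: redo these DPS steps */
  ACTION_FULL_REPROVISION                  /* every address has been tried */
} cloud_retry_action_t;

/* Hypothetical view of one configured cloud server. */
typedef struct {
  const char *uri; /* cloud server address */
  const char *id;  /* cloud server ID */
} cloud_server_t;

/* Chooses what to do after the polling limit was reached for servers[current].
 * attempted is the number of addresses tried so far, including the current
 * one. On success *next holds the index of the address to try. */
cloud_retry_action_t
next_cloud_action(const cloud_server_t *servers, size_t count, size_t current,
                  size_t attempted, size_t *next)
{
  assert(servers != NULL && next != NULL && count > 0 && current < count);

  if (attempted >= count) {
    return ACTION_FULL_REPROVISION;
  }
  *next = (current + 1) % count;
  if (strcmp(servers[current].id, servers[*next].id) != 0) {
    return ACTION_REPROVISION_CREDENTIALS_AND_ACLS;
  }
  return ACTION_RESTART_CLOUD_MANAGER;
}
```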

{{< note >}}
Valid authentication of cloud manager depends on a valid access token. If the access token retrieved during provisioning is not permanent, it will eventually expire. It must be refreshed, because otherwise #plgd hub will close the connection to the device. This is handled internally by IoTivity-lite library, which schedules a refresh token operation before the access token expires.
To learn how to set up multiple cloud server addresses in an `IoTivity-lite` device, see [Cloud: support for multiple servers](https://github.com/iotivity/iotivity-lite/wiki/Cloud:-support-for-multiple-servers).
{{< /note >}}