-
Notifications
You must be signed in to change notification settings - Fork 7
/
app-exp-peripherals.tex
188 lines (150 loc) · 16 KB
/
app-exp-peripherals.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
CHERI's design focuses on the `main' CPU core(s), in which there is a single operating system, and capabilities are used within virtual address spaces, mediated via an MMU and a memory-coherency system.
Many systems are composed of distributed compute elements that share memory. In various contexts these are termed `peripherals', `DMA engines', `accelerators' or `remote DMA network cards'. These may be on a single system-on-chip using fabrics such as AXI, or across interconnect such as PCI Express, Thunderbolt or Infiniband.
When capabilities are used in such a system, there is a requirement to
protect them from inappropriate modification by cores that might be outside the purview of the primary operating system. Additionally, it would be advantageous for such cores to use capabilities for their own code and data, without having to mediate them from centralized authority. Furthermore, such systems frequently use multiple levels of address translation -- not just a virtual address space (as capabilities in this document primarily refer to), but a patchwork of multiple physical address spaces (including the guest physical address spaces used by hypervisors), as well as virtual address spaces used by accelerators and other cores.
There are two challenges: first, preventing a core from modifying a
capability it does not own, and second, handling the case that capabilities
can alias if they refer to an incorrect address space.
To achieve these goals, we propose several architectural features.
\subsection{Scope and threat model}
This feature assumes that peripherals are capability-aware, in that they are able to load, store and manipulate capabilities and their tags. A number of scenarios with trustworthy hardware and software, untrustworthy software on trustworthy hardware, or untrustworthy hardware and software may be envisaged. Hardware that is not capability-aware and uses integers as pointers is out of scope for this extension, although it may be constrained or otherwise capability-wrapped by some other structure.
\subsection{Address-space coloring}
\label{app:exp:coloring}
We may deconstruct systems into regions of address-space colors (ASCs). A region with a common color has addresses with a single unambiguous meaning. Generalizing, a color could apply to an application's virtual address space, the system's hardware physical address space, the guest physical address space of a virtual machine, or a piece of memory on a peripheral.
A Processing Element (PE -- processor, DMA engine or other core) is assigned a color based on its physical location in the system topography.
Colors also represent single regions of authority. Within a colored region, it is assumed that every device that can synthesize a capability has rights to do so. If a device is untrustworthy, it should be segmented into a different colored region.
An address space may be connected to a different address space via an Address Translation and Protection Unit (ATPU). Examples of ATPUs might be MMUs, IOMMUs, and hypervisor page translation, but also more limited cases such as PCI BAR mapping or driving upper address bits from a page register. An ATPU may provide no translation between mutually distrusting hardware that happens to share an address space, but still apply protection between them.
We generalize an ATPU as a bridge by which requests come in from one address space and are dispatched into another. The ATPU may itself make memory requests to determine the translation, such as when walking page tables. These would potentially occur in an address space of a third color.
\subsection{Capability coloring}
Capabilities refer to addresses in particular address spaces, hence capabilities are given a color that is stored within the capability. It is now possible to disambiguate the address within a capability with the address space to which it refers.
\subsubsection{Representation}
We describe architecturally the notion of address space color without specifying the specific representation.
However microarchitecturally we expect that the \cotype{} field in a capability would be reused, based on a tagging scheme to distinguish them from a software-defined \cotype{}. Given the limited number of bits available in the \cotype{} for the otype-color, it may be impossible to represent all the colors within these bits. It is not necessary for the otype-color field to be unique, only that it is possible to disambiguate which address space region is referred by a capability. For example, the upper bits of the address may be used to distinguish two regions with the same otype-color field which are each smaller than 64-bit addressing. Architecturally such regions would be thought of as having different colors.
\paragraph{otype reuse}
\label{app:exp:otype-reuse}
Since 128-bit capabilities are constrained by size, we propose using the \cotype{} field to represent some or all of the ASC. To disambiguate from the softwaredefined \cotype{}, the data structure should be tagged.
To avoid reducing the bits for the \cotype{}, we propose a variable-length tag. This also allows embedding one instance of a variety of other metadata in the \cotype{} field. For example:
\begin{verbatim}
0x_xxxx_xxxx_xxxx_xxxx: Software-defined otype (17 bits)
10_xxxx_xxxx_xxxx_xxxx: Metadata type A (16 bits)
11_0xxx_xxxx_xxxx_xxxx: Metadata type B (15 bits)
...
11_1110_cccx_xxxx_xxxx: Metadata type E (8 alternatives of 9 bits each)
\end{verbatim}
\subsubsection{Operations on colored capabilities}
Colors are used to enforce policy by processing elements and ATPUs. A processing element has, in its hardware, an awareness of the color of its local address space.
In this area exist a number of possibilities, subject to further research.
Most conservatively, a PE could deal only with capabilities of its own color. A capability with another color is treated as if the tag is cleared. This would allow PEs to use capabilities internally, without sharing between them. A privileged process (boot loader, management processor, hypervisor, operating system on an application core) is used to generate initial colored capabilities for each PE, from where they are used internally.
In this scenario, capability conversion between colors would be minimal or require a call to the privileged process.
Conversion would require translation between address spaces, with a chance that a capability could not be directly represented in the target address space (if it represents disparate physical pages, for instance).
Other approaches are possible. For instance, colored capabilities could be treated as sealed. This would enable devices to be given `handles' to memory in another address space that they cannot access, but can pass around to other data structures. For instance, networking data structures might contain linked list pointers in network stack address space -- the NIC can build its own linked list, without the ability to access the data being pointed to. Much care would be required here to avoid confused deputy attacks.
%We consider the following operations by PEs:
%\begin{description}
%\item[Read of a capability, local color.] Allowed. A PE can make free use of capabilities of its local color. All the %regular operations on capabilities have their usual effect.
%\item[Write of a capability, local color.] Allowed. The capability is stored in memory with its color (potentially %transformed by ATPUs as described below).
%\item[Read/write of a capability, non-local color.] Non-local capabilities are treated as sealed. They can be stored in %memory and passed to other devices without modification.
%\item[Unseal capability, non-local color.] The unseal operation transforms a capability from a foreign address space into %the current address space, converting it to the local color. This may require a call-out to ATPU(s) to receive the %translation, potentially passing through several ATPUs.
%\item[Seal capability, local color.] Local use of sealing is still permitted, generating a sealed capability of a local %color. Given microarchitectural constraints, a smaller \cotype{} space may be available.
%\item[Seal capability, non-local color.] May change to the representation of a sealed capability with the target color.
%\item[Change color on capability.] A local capability may be converted to a capability of a different color by passing it %through ATPU(s), which return the appropriate non-local capability. The rights on the returned capability may be reduced %compared to the input.
%\end{description}
\subsubsection{Enforcement}
Both PEs and system bridges are tasked with enforcing the capability model:
\paragraph{Bridges} enforce operations on colored capabilities. For example, a bridge may disallow capabilities of other colors to pass through it. Bridges are viewed as more trustworthy than devices they connect, although a hierarchy exists -- bridges closer to DRAM are able to disallow capabilities that are accepted by bridges further away. Bridges have an awareness of whether hardware might be untrustworthy (for instance, plugged in to a motherboard slot or external port) and apply external enforcement of properties where the hardware might be untrustworthy.
ATPUs could also transform capabilities that pass through them according to their address space remapping -- e.g., allowing a PE to store capabilities with its local address space, but remap them to physical addresses when storing to DRAM.\tmnote{Does this make sense? Reversibility? Representability? Need to consider what happens when something looks at the integer address of such a transformed cap.}
%\paragraph{PEs} are tasked with enforcing sealing of non-local capabilities.
%\tmnote{What happens when malicious PE hardware violates the sealing of a non-local cap? Can the bridge efficiently enforce that?}
\paragraph{PEs} are tasked with enforcing the capability model within their local software. Bridges enforce colors, but PEs enforce the remainder of the capability model (monotonicity, tagging, etc). An untrustworthy PE may corrupt its own capabilities, but since the coloring is enforced by the bridge it will only have detrimental effects on its own software.
\subsubsection{Implementation outline}
For a minimalist implementation, the following might be done:
\begin{enumerate}
\item Choose a representation for the bits that indicate a color within a capability
\item Implement a CSR that configures the current color of a processor core. The
permissions on this register are up for debate but access control might be
similar to that of the IOMMU page table base register - potentially
set externally rather than internal to a PE
\item In user mode, loading or dereferencing capabilities checks the color
matches the current one, and causes an exception if mismatched
\item A privileged instruction takes two capabilities, an input capability
and an authorization capability. The output of the instruction is the input
capability with the color changed to match the authorization capability.
\tmnote{Is this something that should be restricted by a permission? Or by
a mode?}
\item Checks may be necessary to verify the input capability is a subset of the
authorization capability, bearing in mind they may be in different address
spaces without a contiguous mapping.
\item It may be necessary for a PE that wishes to change the color of its
capability to call out to a more trustworthy component, such as a bridge or
another processor with more authority. This could be modeled by the PE not
possessing a suitable authorization capability, and thus making an API call to
another PE or ATPU. ATPUs may implement such translation in hardware to
make it efficient.
\end{enumerate}
\subsection{Shared Host and Peripheral Virtual Address Spaces} % <<<
An alternative to tagging capabilities with an address space color would be
to arrange for all capabilities to represent the same address space, preventing
ambiguity.
Unlike in section \ref{app:exp:physcap}, when an IOMMU is present capabilities
may hold virtual addresses. The question is how should I/O virtual address
spaces and capabilities interact.
Parallels can be drawn to with the implementation of
capabilities in Unix processes. There, capabilities are used for fine-grained
protection while the MMU may be used for address translation.
In the I/O space, we can similarly use capabilities for fine-grained
protection of memory objects while using an IOMMU to acheive address
translation between the address spaces of different initiators. Relegating the
IOMMU to a secondary role gives the opportunity to
simplify its hardware, potentially allowing fewer levels of page translation
or even a non-paged IOMMU.
As with virtually addressed capabilities in operating systems, there comes
the challenge that capabilities without an address-space color (as sketched in
section \ref{app:exp:coloring}) must be distinguished as to which
address space they relate to. In software the OS is in charge of process
switching, maintains an awareness of the current address space, and prevents
capabilities from leaking between address spaces. In I/O this is not true:
devices are making transactions independently and without coordination,
hence transactions in different address spaces can and do occur
concurrently. Unlike traditional IOMMU usage, we are concerned not
just with the address on the memory interconnect (where we can tell the address
space based on the transaction's origin in the system topology) but also
with addresses in memory (which have no clear provenance).
An alternative approach is to use a simplified IOMMU to retain
topological address provenance, but establish a \emph{shared I/O virtual
address space}.
In this way all devices share the same address mappings, but
capabilities are used to restrict those accesses (either by device hardware
being trustworthy and playing by the rules, or externally checking the outputs
of untrusted hardware).
One way to implement this is to ringfence a single range of virtual address
space
in the OS that is set up so that both OS and all peripherals' virtual address
spaces align -- i.e. an address in this region refers to the same piece of
memory whether it comes from any software on the host or peripheral device.
An analogy may be drawn with operating systems' \emph{direct mapped region},
a chunk of virtual address space where physical memory is mapped contiguously
-- this mapping exists in every process and is never switched out.
This means that any capability that refers to this address space is
unambiguous to what it refers. Hence there is never a disambiguation
problem and hence no need to color the capabilities. The downside is that
we cannot use such capabilities with physical addresses (unless the CPU lets
software use physical addresses, which may be true on a microcontroller but
less in an application-class system) and hence translation is required.
Such translation could be relatively lightweight however -- eg simply
adding a fixed offset to a physical address to turn it into a virtual
address in this region.
In this scenario access would require a capability to access a certain range of
shared I/O virtual address space, and a way to securely revoke capabilities
from devices once they were no longer required. The IOMMU would be limited to
ensuring that devices could not access memory outside the shared address space.
A simplified IOMMU would map this region to physical address space.
Alternatively a more complex page-based IOMMU could instead
dynamically provision this space with physical pages to allow sparse mappings
and contiguous blocks in fragmented physical memory -- but it could do so at a
larger granularity than that of software objects (eg megabyte-sized pages
rather than 4~KiB). Such design choices would depend on the kinds of accesses
expected by the system designer, but some tradeoffs could be made at runtime --
for example, a traditional paged IOMMU could be populated with 'huge' pages
(saving translation costs) since it is no longer needed to use
the IOMMU for protection of individual objects. Additionally, by handling
revocation via capabilities rather than in the IOMMU, it reduces IOTLB churn
and invalidation delays.