
<pre class='metadata'>
Title: Web Neural Network API
Shortname: webnn
Level: None
Status: w3c/ED
Group: webmlwg
TR: https://www.w3.org/TR/webnn/
URL: https://webmachinelearning.github.io/webnn/
Editor: Ningxin Hu 68202, Intel Corporation https://intel.com
Editor: Chai Chaoweeraprasit 120203, Microsoft Corporation https://microsoft.com
Abstract: This document describes a dedicated low-level API for neural network inference hardware acceleration.
Repository: https://github.com/webmachinelearning/webnn
Test Suite: https://github.com/web-platform-tests/wpt/tree/master/webnn
!Explainer: <a href="https://github.com/webmachinelearning/webnn/blob/master/explainer.md">explainer.md</a>
!Polyfill: <a href="https://github.com/webmachinelearning/webnn-polyfill">webnn-polyfill</a> / <a href="https://github.com/webmachinelearning/webnn-samples">webnn-samples</a>
Markup Shorthands: markdown yes
Markup Shorthands: dfn yes
Markup Shorthands: idl yes
Markup Shorthands: css no
Logo: https://webmachinelearning.github.io/webmachinelearning-logo.png
</pre>
<pre class="anchors">
urlPrefix: https://www.khronos.org/registry/webgl/specs/latest/1.0/; spec: WEBGL-1
    type: interface
        text: WebGLRenderingContext; url: 5.14
        text: WebGLBuffer; url: 5.4
        text: WebGLTexture; url: 5.9
urlPrefix: https://gpuweb.github.io/gpuweb/; spec: WEBGPU
    type: interface
        text: GPUDevice; url: gpu-device
        text: GPUBuffer; url: buffer-interface
        text: GPUTexture; url: texture-interface
</pre>
<pre class="biblio">
{
	"WEBGPU": {
		"authors": [
			"Dzmitry Malyshau",
			"Kai Ninomiya"
		],
		"href": "https://gpuweb.github.io/gpuweb/",
		"title": "WebGPU",
		"status": "ED",
		"publisher": "W3C",
		"deliveredBy": [
			"https://www.w3.org/2020/gpu/"
		]
	}
}
</pre>

<pre class="link-defaults">
spec:html;
    type:interface; text:Navigator
spec:webidl;
    type:dfn; text:record
    type:dfn; text:resolve
</pre>

<style>
/* Make <dl> blocks more distinct from their surroundings. */
main dl:not(.switch) {
    border-left: thin solid #f3e48c;
    padding-left: .5em;
}

/* <p> by default has these margins. Update ul/ol/dl to match,
 * since they are also put in places where paragraphs go. */
p, ul, ol, dl {
    margin: 1em 0;
}

/* Box for Valid Usage requirements. */
div.validusage {
    padding: .5em;
    border: thin solid #88e !important;
    border-radius: .5em;
}

/*
 * Stylistic labels, for clarity of presentation of these blocks.
 *
 * NOTE: This text is non-accessible and non-selectable; surrounding
 * text must also explain the context.
 */
.validusage {
    position: relative;
}
.validusage::before {
    font-weight: bold;
    font-style: italic;
    font-size: 130%;
    color: rgba(0, 0, 0, 0.15);
    color: var(--watermark-text);
    position: absolute;
    right: .3em;
    top: -.1em;
}
.validusage::before {
    content: "Valid Usage";
}

/*
 * Ensure that argumentdef blocks don't overflow algorithm section borders. This is made far harder
 * than it needs to be because the top-level W3C stylesheet has several @media + min-width variants
 * that mark themselves as !important and then proceed to do the wrong thing.
 */
@media screen and (min-width: 78em) {
    body:not(.toc-inline) .algorithm .overlarge {
        margin-right: auto !important;
    }
}
@media screen and (min-width: 90em) {
    body:not(.toc-inline) .algorithm .overlarge {
        margin-right: auto !important;
    }
}
.algorithm .overlarge {
    margin-right: auto !important;
}

/*
 * The default algorithm style has a caption that doesn't suit this spec's
 * formatting particularly well. Hide it.
 */
.algorithm .argumentdef {
    margin-top: 0;
}
.algorithm .argumentdef>caption {
    display: none;
}

/*
 * Add vertical lines to demarcate multi-column cells.
 */
table.data td[colspan] {
    border-left-style: dotted;
    border-right-style: dotted;
}

table.data.no-colspan-center td[colspan],
table.data.no-colspan-center th[colspan] {
    text-align: unset;
}

table.data tr.row-continuation td,
table.data tr.row-continuation th {
    border-top: none;
}

/*
 * Sticky table headers.
 */
.overlarge {
    /* position: sticky doesn't work inside scrollable elements. */
    overflow-x: unset;
}
thead.stickyheader th, th.stickyheader {
    position: sticky;
    top: 0;
    background: #f8f8f8;
    background: var(--stickyheader-background);
}

/*
 * Generic table format.
 */
th {
  text-align: left;
}

th, td {
  border-bottom: 1px solid black;
  border-collapse: collapse;
  padding-left: 5px;
  padding-right: 5px;
}

/*
 * Darkmode colors
 */
:root {
    --watermark-text: rgba(0, 0, 0, 15%);
    --stickyheader-background: #f8f8f8;
    --tint-red: rgba(255, 0, 0, 6%);
    --tint-green: rgba(0, 255, 0, 10%);
    --tint-blue: rgba(0, 0, 255, 5%);
    --tint-purple: rgba(255, 0, 255, 5%);
}
@media (prefers-color-scheme:dark) {
    :root {
        --watermark-text: rgba(255, 255, 255, 25%);
        --stickyheader-background: #181818;
        --tint-red: rgba(255, 0, 0, 20%);
        --tint-green: rgba(0, 255, 0, 18%);
        --tint-blue: rgba(0, 130, 255, 24%);
        --tint-purple: rgba(255, 0, 255, 22%);
    }
}

</style>

Introduction {#intro}
=====================

The Web Neural Network API defines a web-friendly hardware-agnostic abstraction layer that makes use of Machine Learning capabilities of operating systems and underlying hardware platforms without being tied to platform-specific capabilities. The abstraction layer addresses the requirements of key Machine Learning JavaScript frameworks and also allows web developers familiar with the ML domain to write custom code without the help of libraries. A complementary <a href="https://webmachinelearning.github.io/model-loader/">Model Loader API</a> defines a higher-level abstraction targeting primarily web developers.

For an illustrated introduction, please see the <a href="https://github.com/webmachinelearning/webnn/blob/master/explainer.md">explainer</a>.

Use cases {#usecases}
=====================

## Application Use Cases ## {#usecases-application}

This section illustrates application-level use cases for neural network
inference hardware acceleration. All applications in those use cases can be
built on top of pre-trained deep neural network (DNN) [[models]].

### Person Detection ### {#usecase-person-detection}

A user opens a web-based video conferencing application, but she temporarily
leaves her room. The application checks whether she is in front of her PC by
using object detection (for example, approaches such as [[SSD]] or [[YOLO]]
that use a single DNN) to detect regions in a camera input frame that include
persons.

When she comes back, the application automatically detects her and notifies
other online users that she is active now.

### Semantic Segmentation ### {#usecase-segmentation}

A user joins a teleconference via a web-based video conferencing application at
her desk since no meeting room in her office is available. During the
teleconference, she does not want her room and the people in the background to
be visible. To protect the privacy of the other people and the surroundings, the
application runs a machine learning model such as [[DeepLabv3+]] or
[[MaskR-CNN]] to semantically split an image into segments and replaces the
segments that represent other people and the background with another picture.

### Skeleton Detection ### {#usecase-skeleton-detection}

A web-based video conferencing application tracks the pose of the user's
skeleton by running a machine learning model that allows for real-time human
pose estimation, such as [[PoseNet]], to recognize her gestures and body
language. When she raises her hand, her microphone is automatically unmuted and
she can start speaking on the teleconference.

### Face Recognition ### {#usecase-face-recognition}

There are multiple people in the conference room and they join an online meeting
using a web-based video conferencing application. The application detects the
faces of participants by using object detection (for example, approaches such
as [[SSD]]) and checks whether each face was present at the previous meeting by
running a machine learning model such as [[FaceNet]], which verifies whether
two faces are identical.

### Facial Landmark Detection ### {#usecase-facial-landmarks}

A user wants to find new glasses that fit her beautifully in an online glasses
store. The online store offers a web-based try-on simulator that runs a machine
learning model such as Face Alignment Network [[FAN]] to detect facial landmarks
like eyes, nose, mouth, etc. When she chooses a pair of glasses, the simulator
renders the selected glasses at the detected position of the eyes on her
facial image.

### Style Transfer ### {#usecase-style-transfer}

A user is looking for cosmetics in an online store and wondering which color may
suit her face. The online store shows sample facial makeup images of cosmetics,
and offers a makeup simulator that runs a machine learning model like
[[ContextualLoss]] or [[PairedCycleGAN]] to transfer the makeup style of the
sample makeup image to her facial image. She can use the simulator to check how
the selected makeup looks on her face.

### Super Resolution ### {#usecase-super-resolution}

A web-based video conferencing application is receiving a video stream from its
peer, but the resolution of the video becomes lower due to network congestion.
To prevent degradation of the perceived video quality, the application runs a
machine learning model for super-resolution such as [[SRGAN]] to generate
higher-resolution video frames.

### Image Captioning ### {#usecase-image-captioning}

For better accessibility, a web-based presentation application provides
automatic image captioning by running a machine learning model such as
[[im2txt]], which predicts explanatory words for the presentation slides.

### Machine Translation ### {#usecase-translation}

Multiple people from various countries are talking via a web-based real-time
text chat application. The application translates their conversation by using a
machine learning model such as [[GNMT]] or [[OpenNMT]], which translates each
text into a different language.

### Emotion Analysis ### {#usecase-emotion-analysis}

A user is talking to her friend via a web-based real-time text chat application,
and she is wondering how the friend feels because she cannot see the friend's
face. The application analyses the friend's emotion by using a machine learning
model such as [[DeepMoji]], which infers emotion from input texts, and displays
an emoji that represents the estimated emotion.

### Video Summarization ### {#usecase-video-summalization}

A web-based video conferencing application records received video streams, and
it needs to reduce the amount of recorded video data to be stored. The
application generates a short version of the recorded video by using a machine
learning model for video summarization such as
[[Video-Summarization-with-LSTM]].

### Noise Suppression ### {#usecase-noise-suppression}

A web-based video conferencing application records received audio streams, but
background noise is usually present. The application applies real-time noise
suppression using a recurrent neural network such as [[RNNoise]] to suppress
dynamic background noise, like a baby crying or a dog barking, and so improve
the audio experience in video conferences.

### Detecting fake video ### {#usecase-detecting-fake-video}

A user is exposed to realistic fake videos generated by ‘deepfake’ techniques on
the web. A fake video can swap the speaker’s face with the president’s face to
incite a user politically or to manipulate the user’s opinion. Deepfake
detection applications such as [[FaceForensics++]] analyze the videos and
protect a user against fake videos or images. When she watches a fake video on
the web, the detection application alerts her to the fraudulent video in real
time.

## Framework Use Cases ## {#usecases-framework}

This section collects framework-level use cases for a dedicated low-level API
for neural network inference hardware acceleration. It is expected that Machine
Learning frameworks will be key consumers of the Web Neural Network API (WebNN
API) and the low-level details exposed through the WebNN API are abstracted out
from typical web developers. However, it is also expected that web developers
with specific interest and competence in Machine Learning will want to interface
with the WebNN API directly instead of a higher-level ML framework.

### Custom Layer ### {#usecase-custom-layer}

A web application developer wants to run a DNN model on the WebNN API. However,
she has found that some activation functions like [[LeakyReLU]], [[ELU]],
etc. are not included in the WebNN API. To address this issue, she constructs
custom layers for the additional activation functions on top of the WebNN API.
Note that the scope of custom layers may include convolution, normalization,
etc. as well as activation.

### Network Concatenation ### {#usecase-network-concat}

A web application uses a DNN model, and its model data of upper convolutional
layers and lower fully-connected layers are stored in separate files, since
model data of the fully-connected layers are periodically updated due to fine
tuning at the server side.

Therefore, the application first downloads both partial model files and
concatenates them into a single model. When the model is updated, the
application downloads the fine-tuned part of the model and replaces only the
fully-connected layers with it.

### Performance Adaptation ### {#usecase-perf-adapt}

A web application developer is concerned about the performance of her DNN model
on mobile devices. She has confirmed that it may run too slowly on mobile
devices which do not have GPU acceleration. To address this issue, her web
application queries the WebNN API to confirm whether acceleration is available,
so that the application can display a warning on devices without acceleration.

After several weeks, she has developed a tiny DNN model that can run even on a
CPU. In order to accommodate CPU execution, she modifies the application
so that it loads the tiny model on CPU-only devices.

### Operation Level Execution ### {#usecase-op-level-exec}

A JavaScript ML framework is responsible for loading, interpreting and executing an ML model. During the model execution phase, the framework iterates through the operations of the model and executes each operation on a hardware device, such as a CPU, GPU or ML accelerator. To avoid unnecessary data copying across devices, the framework selects the same device to execute the operations. For a compute-intensive operation, such as convolution 2D or matrix multiplication, the framework uses the WebNN API to execute it with the ML-specific acceleration available on the selected device.

### Integration with real-time video processing ### {#usecase-real-time-video-processing}

The user experience of WebRTC-based video conferencing is enhanced using real-time video processing. For example, background blur implemented using a [[#usecase-segmentation]] model blurs the background in the user's live camera feed. To satisfy the performance requirements of this use case, the WebNN API integrates with primitives from other Web APIs that make up the media pipeline to allow WebNN API-based transformation of real-time video streams.

Security Considerations {#security}
===================================

This API is disabled by default in all cross-origin frames using the [[#permissions-policy-integration]]. This prevents third-party content from using this API unless the embedding page explicitly sets a policy that grants permission.

This API allows creation of an {{MLContext}} from a {{GPUDevice}} or {{WebGLRenderingContext}} defined by WebGPU and WebGL specifications respectively. See <a href="https://gpuweb.github.io/gpuweb/#security">WebGPU Security Considerations</a> and <a href="https://www.khronos.org/registry/webgl/specs/latest/1.0/#4">WebGL Security Consideration</a> for more information regarding security characteristics of these contexts.

Privacy Considerations {#privacy}
===================================

This API enhances privacy compared to cloud-based inference, since input data such as locally sourced images or video streams stay within the browser's sandbox.

This API exposes the minimum amount of information necessary to address the identified [[#usecases]] for the best performance and reliability of results.

No information from the underlying platform is exposed directly. An execution time analysis may indirectly reveal the performance of the underlying platform's neural network hardware acceleration capabilities relative to another underlying platform.

Note: The group is <a href="https://github.com/webmachinelearning/webnn/issues/85">soliciting further input</a> on the proposed execution time analysis fingerprinting vector and will augment this section with more information and mitigations to inform the implementers of this API.

Implementers of this API are expected to be familiar with the <a href="https://gpuweb.github.io/gpuweb/#security-privacy">WebGPU Privacy Considerations</a>.

Ethical Considerations {#ethics}
===================================

The Working Group has started documenting ethical issues associated with using Machine Learning on the Web, to help identify what mitigations its normative specifications should take into account. This work currently happens in a dedicated <a href="https://github.com/webmachinelearning/ethical-webmachinelearning">GitHub repository</a>.

# Programming Model # {#programming-model}
## Overview ## {#programming-model-overview}

At the heart of neural networks is a computational graph of mathematical operations.
These operations are the building blocks of modern machine learning technologies in
computer vision, natural language processing, and robotics.
The WebNN API is a specification for constructing, compiling, and executing computational
graphs of neural networks.

The {{MLGraph}} interface represents a compiled computational graph (that is, a model) and exposes
a compute method to perform inference.

The {{MLGraphBuilder}} interface serves as a builder (factory) to create an {{MLGraph}}.
An {{MLOperand}} is a representation of data that flows within the computational graph,
which includes input values for inference, constants (including trained weights)
used for inference, intermediate values (often referred to as activations) computed
during inference, as well as the output values of inference.
At inference time, every {{MLOperand}} will be bound to a tensor (the actual data).

The {{MLGraphBuilder}} interface enables the creation of {{MLOperand}}s.
A key part of the {{MLGraphBuilder}} interface is its set of operations (such as
{{MLGraphBuilder/gemm()}} and {{MLGraphBuilder/softmax()}}). The operations have
functional semantics, with no side effects.
Each operation invocation conceptually returns a distinct new value, without
changing the value of any other {{MLOperand}}.

The {{MLGraphBuilder/build()}} method of the {{MLGraphBuilder}} interface is used to compile and optimize
the computation graph used to compute one or more specified outputs. The key
purpose of the compilation step is to enable optimizations that span two or
more operations, such as operation or loop fusion.

The {{MLGraph/compute()}} method of the {{MLGraph}} interface is used to execute the
compiled computation graph (to perform inference). The caller supplies the input
values using {{MLNamedInputs}}, binding the input {{MLOperand}}s to their values.
The caller supplies pre-allocated buffers for output {{MLOperand}}s using {{MLNamedOutputs}}.
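
The build-then-compute flow described above can be modeled in a few lines of plain JavaScript. The sketch below is a toy and not the WebNN API: `ToyBuilder`, its node representation and its two element-wise operations are hypothetical names introduced only to illustrate that every operation returns a new value, that compilation happens once, and that the caller supplies the output buffers.

```javascript
// Toy model of the build/compute flow. Not the WebNN API: all names here
// are hypothetical and only element-wise add and mul are modeled.
class ToyBuilder {
  // Declare a named graph input; its value is bound at compute time.
  input(name) {
    return Object.freeze({kind: 'input', name});
  }
  // Wrap constant data (for example trained weights).
  constant(data) {
    return Object.freeze({kind: 'constant', data: Float32Array.from(data)});
  }
  // Each operation returns a new frozen node and never mutates its operands.
  add(a, b) {
    return Object.freeze({kind: 'op', fn: (x, y) => x + y, args: [a, b]});
  }
  mul(a, b) {
    return Object.freeze({kind: 'op', fn: (x, y) => x * y, args: [a, b]});
  }
  // "Compile" the graph for the named outputs. A real implementation could
  // fuse operations here; this toy only captures the output nodes.
  build(namedOutputs) {
    const evaluate = (node, inputs) => {
      if (node.kind === 'input') return inputs[node.name];
      if (node.kind === 'constant') return node.data;
      const [x, y] = node.args.map((arg) => evaluate(arg, inputs));
      return x.map((value, i) => node.fn(value, y[i]));
    };
    return {
      // Bind inputs by name and write results into pre-allocated buffers.
      compute(inputs, outputs) {
        for (const [name, node] of Object.entries(namedOutputs)) {
          outputs[name].set(evaluate(node, inputs));
        }
      },
    };
  }
}

const toy = new ToyBuilder();
const a = toy.input('a');
const w = toy.constant([10, 20, 30]);
const sum = toy.add(a, w);
const scaled = toy.mul(sum, toy.constant([2, 2, 2]));
const toyGraph = toy.build({scaled});

const result = {scaled: new Float32Array(3)};
toyGraph.compute({a: new Float32Array([1, 2, 3])}, result);
console.log(Array.from(result.scaled)); // [22, 44, 66]
```

The same compiled toy graph can be computed repeatedly with different input buffers, which mirrors the separation between graph construction and inference described above.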

The runtime values (of {{MLOperand}}s) are tensors, which are essentially multidimensional
arrays. The representation of the tensors is implementation dependent, but it typically
includes the array data stored in some buffer (memory) and some metadata describing the
array data (such as its shape). 

As mentioned above, the operations have functional semantics. This allows the implementation
to potentially share the array data between multiple tensors. For example, the implementation
of operations such as reshape, slice, or squeeze may return a view of the input tensor
that shares the same buffer as the input tensor. (In the case of reshape or squeeze,
the entire data is shared, while in the case of slice, a part of the input data is shared.)
The implementation may use views, as above, for intermediate values.
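
Buffer sharing of this kind can be illustrated with JavaScript typed arrays, whose `subarray()` method returns a view over the same `ArrayBuffer`. The sketch below is only an analogy under the assumption of a row-major layout; the `data`/`shape` object form is hypothetical and says nothing about how a user agent actually represents tensors.

```javascript
// One buffer holding a 2x3 tensor in row-major order.
const storage = new Float32Array([0, 1, 2, 3, 4, 5]);

// A "reshape" to 3x2 needs no copy: only the shape metadata changes.
const original = {data: storage, shape: [2, 3]};
const reshaped = {data: storage, shape: [3, 2]};

// A "slice" of the second row is a view over part of the same buffer.
const secondRow = {data: storage.subarray(3, 6), shape: [1, 3]};

console.log(reshaped.data.buffer === original.data.buffer); // true
console.log(secondRow.data.buffer === storage.buffer);      // true
console.log(secondRow.data.byteOffset);                     // 12
```

Because operations have no side effects, this kind of sharing is not observable to the caller.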

## Device Selection ## {#programming-model-device-selection}

The {{MLContext}} interface represents a global state of neural network execution. One of the important context states is the underlying execution device that manages the resources and facilitates the compilation and the eventual execution of the neural network graph. An {{MLContext}} can be created from a specific GPU device such as a {{GPUDevice}} or {{WebGLRenderingContext}} that is already in use by the application, in which case the corresponding {{GPUBuffer}} or {{WebGLBuffer}} resources used as graph constants, as well as the {{GPUTexture}} and {{WebGLTexture}} resources used as graph inputs, must also be created from the same device. In a multi-adapter configuration, the device used for the {{MLContext}} must be created from the same adapter as the device used to allocate the resources referenced in the graph.

When a GPU context executes a graph with a constant or an input in system memory as an {{ArrayBufferView}}, the input content is automatically uploaded from system memory to GPU memory, and the result is downloaded back to the system memory of an {{ArrayBufferView}} output buffer at the end of the graph execution. These data upload and download cycles occur only when the execution device requires the data to be copied out of and back into system memory, as in the case of the GPU. They do not occur when the device is a CPU device. Additionally, the result of the graph execution is in a known layout format. While the execution may be optimized for a native memory access pattern in an intermediate result within the graph, the output of the last operation of the graph must convert the content back to a known layout format at the end of the graph in order to maintain the expected behavior from the caller's perspective.

When an {{MLContext}} is created with {{MLContextOptions}}, the user agent selects and creates the underlying execution device by taking into account the application's preference specified in the {{MLPowerPreference}} and the {{MLDevicePreference}} options:
- The *"gpu"* device provides the broadest range of achievable performance across graphics hardware platforms from consumer devices to professional workstations. 
- The *"cpu"* device provides the broadest reach of software compute availability, but with limited scalability of execution performance on the more complex neural networks. 
- When the device preference is not specified (*"default"*), the user agent selects the most suitable device to use. 

The following table summarizes the types of resources supported by the selected device.

<div class="note">
<table>
  <tr><th>Device Type<th>ArrayBufferView<th>GPUBuffer<th>GPUTexture<th>WebGLBuffer<th>WebGLTexture
  <tr><td>GPUDevice<td>Yes<td>Yes<td>Yes<td>No<td>No
  <tr><td>WebGLRenderingContext<td>Yes<td>No<td>No<td>Yes<td>Yes
  <tr><td>default<td>Yes<td>No<td>No<td>No<td>No
  <tr><td>gpu<td>Yes<td>No<td>No<td>No<td>No
  <tr><td>cpu<td>Yes<td>No<td>No<td>No<td>No
</table>
</div>
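
The table above can be transcribed into a small lookup that an application might consult before choosing how to pass constants and inputs. This is a minimal sketch: the helper name `isResourceSupported` and the map layout are hypothetical, and the entries simply mirror the rows of the table.

```javascript
// Resource types accepted per device type, transcribed from the table above.
const RESOURCES_BY_DEVICE = new Map(Object.entries({
  GPUDevice: ['ArrayBufferView', 'GPUBuffer', 'GPUTexture'],
  WebGLRenderingContext: ['ArrayBufferView', 'WebGLBuffer', 'WebGLTexture'],
  default: ['ArrayBufferView'],
  gpu: ['ArrayBufferView'],
  cpu: ['ArrayBufferView'],
}));

// Hypothetical helper: returns true when the table lists the resource type
// as supported for the given device type.
function isResourceSupported(deviceType, resourceType) {
  const supported = RESOURCES_BY_DEVICE.get(deviceType);
  if (!supported) {
    throw new TypeError(`Unknown device type: ${deviceType}`);
  }
  return supported.includes(resourceType);
}

console.log(isResourceSupported('GPUDevice', 'GPUTexture')); // true
console.log(isResourceSupported('gpu', 'GPUBuffer'));        // false
```

Note that a context created with the *"gpu"* device preference accepts only {{ArrayBufferView}} resources, whereas a context created from an existing {{GPUDevice}} also accepts that device's buffers and textures.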

API {#api}
=====================

## navigator.ml ## {#api-navigator-ml}

An {{ML}} object is available in the {{Window}} and {{DedicatedWorkerGlobalScope}} contexts through the {{Navigator}}
and {{WorkerNavigator}} interfaces respectively and is exposed via `navigator.ml`:

<script type=idl>
interface mixin NavigatorML {
  [SecureContext, SameObject] readonly attribute ML ml;
};
Navigator includes NavigatorML;
WorkerNavigator includes NavigatorML;
</script>

## ML ## {#api-ml}
<script type=idl>
enum MLDevicePreference {
  "default",
  "gpu",
  "cpu"
};

enum MLPowerPreference {
  // Let the user agent select the most suitable behavior.
  "default",

  // Prioritizes execution speed over power consumption.
  "high-performance",
  
  // Prioritizes power consumption over other considerations such as execution speed.
  "low-power"
};

dictionary MLContextOptions {
  // Preferred kind of device used
  MLDevicePreference devicePreference = "default";

  // Preference as related to power consumption
  MLPowerPreference powerPreference = "default";
};

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface ML {
  // Create a context with options
  MLContext createContext(optional MLContextOptions options = {});

  // Create a context from WebGL rendering context
  MLContext createContext(WebGLRenderingContext glContext);

  // Create a context from WebGPU device
  MLContext createContext(GPUDevice gpuDevice);
};
</script>

The {{ML/createContext()}} method steps are:
1. If the [=responsible document=] is not [=allowed to use=] the [=webnn-feature|webnn=] feature, then throw a "{{SecurityError!!exception}}" {{DOMException}} and abort these steps.
1. Let |context| be a new {{MLContext}} object.
1. Switch on the method's first argument:
    <dl class=switch>
    <dt>{{MLContextOptions}}
    <dd>Set |context|'s [=context type=] to [=default-context|default=].

    <dt>{{WebGLRenderingContext}}
    <dd>Set |context|'s [=context type=] to [=webgl-context|webgl=].

    <dt>{{GPUDevice}}
    <dd>Set |context|'s [=context type=] to [=webgpu-context|webgpu=].

    <dt>Otherwise
    <dd>Set |context|'s [=context type=] to [=default-context|default=].
    </dl>
1. Return |context|.
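
From the caller's side, the steps above suggest a defensive pattern: check that `navigator.ml` is exposed, then handle the "SecurityError" raised when the permissions policy blocks the feature. The following is a minimal sketch under those assumptions; the helper name `tryCreateContext` and its navigator-like parameter are hypothetical, and the call is synchronous as in the IDL of this section.

```javascript
// Hypothetical helper: `nav` is a Navigator-like object (pass `navigator` in a
// page or worker). Returns an MLContext, or null when WebNN is not exposed or
// the permissions policy blocks it.
function tryCreateContext(nav, options = {}) {
  if (!nav || !nav.ml) {
    return null; // WebNN is not exposed, for example in a non-secure context.
  }
  try {
    // `options` is an MLContextOptions dictionary; omitted members fall back
    // to "default".
    return nav.ml.createContext(options);
  } catch (error) {
    if (error && error.name === 'SecurityError') {
      return null; // The "webnn" feature is not allowed for this document.
    }
    throw error;
  }
}

const mlContext = tryCreateContext(globalThis.navigator, {
  devicePreference: 'gpu',
  powerPreference: 'low-power',
});
console.log(mlContext ? 'WebNN context created' : 'WebNN is unavailable');
```

Passing the navigator object as a parameter keeps the helper usable in both window and dedicated worker scopes.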

### Permissions Policy Integration ### {#permissions-policy-integration}

This specification defines a <a>policy-controlled feature</a> identified by the
string "<code><dfn data-lt="webnn-feature">webnn</dfn></code>".
Its <a>default allowlist</a> is <code>'self'</code>.

## MLContext ## {#api-mlcontext}
The {{MLContext}} interface represents a global state of neural network compute workload and execution processes.
<script type=idl>
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLContext {};
</script>

The <dfn>context type</dfn> for an {{MLContext}} is either "<code><dfn data-lt="default-context">default</dfn></code>", "<code><dfn data-lt="webgl-context">webgl</dfn></code>" or "<code><dfn data-lt="webgpu-context">webgpu</dfn></code>".

## MLOperandDescriptor ## {#api-mloperanddescriptor}
<script type=idl>
enum MLInputOperandLayout {
  "nchw",
  "nhwc"
};

enum MLOperandType {
  "float32",
  "float16",
  "int32",
  "uint32",
  "int8",
  "uint8"
};

dictionary MLOperandDescriptor {
  // The operand type.
  required MLOperandType type;

  // The dimensions field is only required for tensor operands.
  // A negative value means an unknown dimension.
  sequence<long> dimensions;
};
</script>
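
As an illustration of how a descriptor relates to buffer sizes, the sketch below computes the byte length implied by an {{MLOperandDescriptor}}-shaped object. The helper name `operandByteLength` is hypothetical, and treating a descriptor without `dimensions` as a single-element scalar is an assumption drawn from the comment in the IDL above.

```javascript
// Bytes per element for each MLOperandType value.
const BYTES_PER_ELEMENT = new Map(Object.entries({
  float32: 4,
  float16: 2,
  int32: 4,
  uint32: 4,
  int8: 1,
  uint8: 1,
}));

// Hypothetical helper: returns the byte length needed for an operand, or
// null when any dimension is negative (unknown until inference time).
function operandByteLength(desc) {
  const elementSize = BYTES_PER_ELEMENT.get(desc.type);
  if (elementSize === undefined) {
    throw new TypeError(`Unknown operand type: ${desc.type}`);
  }
  // Assumption: a descriptor without dimensions describes a scalar.
  const dimensions = desc.dimensions ?? [];
  if (dimensions.some((d) => d < 0)) {
    return null;
  }
  const elementCount = dimensions.reduce((count, d) => count * d, 1);
  return elementCount * elementSize;
}

console.log(operandByteLength({type: 'float32', dimensions: [1, 3, 224, 224]})); // 602112
console.log(operandByteLength({type: 'float16', dimensions: [-1, 10]}));         // null
console.log(operandByteLength({type: 'uint8'}));                                 // 1
```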

## MLOperand ## {#api-mloperand}

An {{MLOperand}} represents an intermediary graph being constructed as a result of compositing parts of an operation into a fully composed operation.

For instance, an {{MLOperand}} may represent a constant feeding to an operation or the result from combining multiple constants together into an operation. See also [[#programming-model]].

<script type=idl>
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLOperand {};
</script>

## MLOperator ## {#api-mloperator}

Objects implementing the {{MLOperator}} interface represent activation function types. As a generic construct, this interface may be reused for other types in a future version of this specification.

<script type=idl>
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLOperator {};
</script>

<div class="note">
These activation function types are used to create other operations. One such use of this interface is when an activation function is fused into another operation such as [[#api-mlgraphbuilder-conv2d]] or [[#api-mlgraphbuilder-batchnorm]] during a graph construction session.
</div>

<div class="note">
The implementation of the {{MLOperator}} interface can simply be a struct that holds the type of the activation function as a string, along with any other properties needed. The actual creation of the activation function, e.g. a [[#api-mlgraphbuilder-sigmoid]] or [[#api-mlgraphbuilder-relu]], can then be deferred until the rest of the graph is ready to connect with it, for example during the construction of [[#api-mlgraphbuilder-conv2d]].
</div>
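
The deferred-creation idea in the note above can be sketched as follows. This is an implementation-side illustration only: `makeOperator`, `applyOperator` and the recording stand-in for a builder are hypothetical names, and a real user agent is free to structure this differently.

```javascript
// Hypothetical sketch of the struct described above: record the activation
// type and its options now, create the activation later.
function makeOperator(type, options = {}) {
  return Object.freeze({type, options: Object.freeze({...options})});
}

// Called once the producing operation (for example conv2d) is being
// constructed and has an output operand to connect the activation to.
function applyOperator(builder, operator, operand) {
  if (typeof builder[operator.type] !== 'function') {
    throw new TypeError(`Unsupported activation: ${operator.type}`);
  }
  return builder[operator.type](operand, operator.options);
}

// Minimal stand-in for a builder, used only to show the deferred call.
const recordingBuilder = {
  clamp: (x, options) => ({op: 'clamp', input: x, options}),
};

const relu6 = makeOperator('clamp', {minValue: 0, maxValue: 6});
const fused = applyOperator(recordingBuilder, relu6, 'convOutput');
console.log(fused.op, fused.options.maxValue); // clamp 6
```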

## MLGraphBuilder ## {#api-mlgraphbuilder}

The {{MLGraphBuilder}} interface defines a set of operations as identified by the [[#usecases]] that can be composed into a computational graph. It also represents the intermediate state of a graph building session.

<script type=idl>
typedef record<DOMString, MLOperand> MLNamedOperands;

dictionary MLBufferResourceView {
  required (WebGLBuffer or GPUBuffer) resource;
  unsigned long long offset = 0;
  unsigned long long size;
};

typedef (ArrayBufferView or MLBufferResourceView) MLBufferView;

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraphBuilder {
  // Construct the graph builder from the context.
  constructor(MLContext context);

  // Create an operand for a graph input.
  MLOperand input(DOMString name, MLOperandDescriptor desc);

  // Create an operand for a graph constant.
  MLOperand constant(MLOperandDescriptor desc, MLBufferView bufferView);

  // Create a single-value operand from the specified number of the specified type.
  MLOperand constant(double value, optional MLOperandType type = "float32");

  // Compile the graph up to the specified output operands
  MLGraph build(MLNamedOperands outputs);
};
</script>
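
A short sketch of how these methods might be combined is shown below. It uses only methods that appear in this specification (`input()`, both `constant()` overloads, `build()`, and the element-wise `add()` and `mul()` operations used in the batch normalization example); the function name `buildHalfSum` and the operand names are hypothetical.

```javascript
// Builds y = (x + w) * 0.5 for a 2x2 float32 input named "x".
// `builder` is expected to be an MLGraphBuilder.
function buildHalfSum(builder) {
  const desc = {type: 'float32', dimensions: [2, 2]};
  // A graph input: its value is supplied at inference time.
  const x = builder.input('x', desc);
  // A tensor constant backed by an ArrayBufferView.
  const w = builder.constant(desc, new Float32Array([1, 2, 3, 4]));
  // A single-value constant; the type defaults to "float32".
  const half = builder.constant(0.5);
  const y = builder.mul(builder.add(x, w), half);
  // The keys of the record name the graph outputs.
  return builder.build({y});
}
```

In a page or worker where WebNN is available, the function would be called as `buildHalfSum(new MLGraphBuilder(context))`, with `context` obtained from `navigator.ml.createContext()`.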

### batchNormalization ### {#api-mlgraphbuilder-batchnorm}
Normalize the tensor values of input features across the batch dimension using [[Batch-Normalization]]. For each input feature, the mean and variance values supplied as parameters to this calculation are computed beforehand across the batch dimension of the input during the training phase of the model.
<script type=idl>
dictionary MLBatchNormalizationOptions {
  MLOperand scale;
  MLOperand bias;
  long axis = 1;
  float epsilon = 1e-5;
  MLOperator activation;
};

partial interface MLGraphBuilder {
  MLOperand batchNormalization(MLOperand input, MLOperand mean, MLOperand variance,
                             optional MLBatchNormalizationOptions options = {});
};
</script>
<div algorithm=batchnorm>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input N-D tensor.
        - *mean*: an {{MLOperand}}. The 1-D tensor of the mean values of the input features across the batch whose length is equal to the size of the input dimension denoted by *options.axis*.
        - *variance*: an {{MLOperand}}. The 1-D tensor of the variance values of the input features across the batch whose length is equal to the size of the input dimension denoted by *options.axis*.
        - *options*: an optional {{MLBatchNormalizationOptions}}. The optional parameters of the operation.
              - *scale*: an {{MLOperand}}. The 1-D tensor of the scaling values whose length is equal to the size of the input dimension denoted by *options.axis*.
              - *bias*: an {{MLOperand}}. The 1-D tensor of the bias values whose length is equal to the size of the input dimension denoted by *options.axis*.
              - *axis*: a {{long}} scalar. The index to the feature count dimension of the input shape for which the mean and variance values are computed. When it's not specified, the default value is 1.
              - *epsilon*: a {{float}} scalar. A small value to prevent computational error due to divide-by-zero. The default value is 0.00001 when not specified.
              - *activation*: an {{MLOperator}}. The optional activation function that immediately follows the normalization operation.
        
    **Returns:** an {{MLOperand}}. The batch-normalized N-D tensor of the same shape as the input tensor.

    When *input* is a 4-D tensor of the *"nchw"* or *"nhwc"* layout, *options.axis* should be set to 1 or 3 respectively. The axis value designates the feature or channel count dimension of the input tensor.

    <div class="note">
    The behavior of this operation when the input tensor is 4-D of the *"nchw"* layout and the activation is of operator type *relu* can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
    <pre highlight="js">
    const shape = [1,-1,1,1];
    return builder.relu(
      builder.add(
        builder.mul(
          builder.reshape(options.scale, shape),
          builder.div(
            builder.sub(input, builder.reshape(mean, shape)),
            builder.pow(
              builder.add(builder.reshape(variance, shape), builder.constant(options.epsilon)),
              builder.constant(0.5))
            )),
        builder.reshape(options.bias, shape)));
    </pre>
    </div>
</div>
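
The arithmetic behind the emulation above can be checked with a small non-normative sketch in plain JavaScript. The helper names `batchNormElement` and `batchNorm2d` are hypothetical and are not part of this API. The sketch assumes a 2-D input stored as nested arrays with the feature dimension at axis 1, and it assumes that a missing *scale* behaves as 1 and a missing *bias* behaves as 0.

```javascript
// Hypothetical helper, not part of WebNN: normalize a single value.
function batchNormElement(x, mean, variance, scale = 1, bias = 0, epsilon = 1e-5) {
  return scale * (x - mean) / Math.sqrt(variance + epsilon) + bias;
}

// Hypothetical helper, not part of WebNN: normalize a [batches, features]
// tensor (nested arrays) along axis 1. mean, variance, scale and bias are
// 1-D arrays whose length equals the number of features.
function batchNorm2d(input, mean, variance, options = {}) {
  const { scale, bias, epsilon = 1e-5 } = options;
  return input.map((row) =>
    row.map((x, c) =>
      batchNormElement(
        x, mean[c], variance[c],
        scale ? scale[c] : 1,   // assumption: absent scale acts as 1
        bias ? bias[c] : 0,     // assumption: absent bias acts as 0
        epsilon)));
}

// Two batches, two features; epsilon is set to 0 to keep the numbers exact.
const normalized = batchNorm2d(
  [[1, 10], [3, 30]], [2, 20], [1, 100], { epsilon: 0 });
console.log(normalized); // [[-1, -1], [1, 1]]
```

Each feature column is shifted by its own mean and divided by its own standard deviation, which is why the *mean* and *variance* tensors must match the size of the dimension selected by *options.axis*.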

### clamp ### {#api-mlgraphbuilder-clamp}
Clamp the input tensor element-wise within a range specified by the minimum and maximum values.
<script type=idl>
dictionary MLClampOptions {
  float minValue;
  float maxValue;
};

partial interface MLGraphBuilder {
  MLOperand clamp(MLOperand x, optional MLClampOptions options = {});
  MLOperator clamp(optional MLClampOptions options = {});
};
</script>
<div algorithm=clamp>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLClampOptions}}. The optional parameters of the operation.
            - *minValue*: a {{float}} scalar. Specifies the minimum value of the range. When it is not specified, the clamping is not performed on the lower limit of the range.
            - *maxValue*: a {{float}} scalar. Specifies the maximum value of the range. When it is not specified, the clamping is not performed on the upper limit of the range.

    **Returns:** 
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the clamp operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged from a
    performance standpoint.
    <pre highlight="js">
    if (options.minValue === undefined) {
      if (options.maxValue === undefined) {
        return x;
      } else {
        return builder.min(x, builder.constant(options.maxValue));
      }
    } else {
      if (options.maxValue === undefined) {
        return builder.max(x, builder.constant(options.minValue));
      } else {
        return builder.min(
            builder.max(x, builder.constant(options.minValue)),
            builder.constant(options.maxValue));
      }
    }
    </pre>
    </div>
</div>
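
As a non-normative illustration of how the optional bounds behave, the following plain JavaScript sketch applies the same rule to a single number. The helper name `clampValue` is hypothetical and is not part of this API.

```javascript
// Hypothetical helper, not part of WebNN: clamp one value, skipping any
// bound that is left undefined.
function clampValue(x, { minValue, maxValue } = {}) {
  let y = x;
  if (minValue !== undefined) y = Math.max(y, minValue);
  if (maxValue !== undefined) y = Math.min(y, maxValue);
  return y;
}

console.log([-2, 0.5, 3].map((v) => clampValue(v, { minValue: 0, maxValue: 1 })));
// [0, 0.5, 1]
console.log(clampValue(-2, { maxValue: 1 })); // -2, no lower limit is applied
```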

### concat ### {#api-mlgraphbuilder-concat}
Concatenates the input tensors along a given axis.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand concat(sequence<MLOperand> inputs, long axis);
};
</script>
<div algorithm=concat>
    **Arguments:**
        - *inputs*: a sequence of {{MLOperand}}. All input tensors must have the
            same shape, except for the size of the dimension to concatenate on.
        - *axis*: a {{long}} scalar. The axis that the inputs concatenate along, with
            the value in the interval [0, N) where N is the rank of all the
            inputs.
        
    **Returns:** an {{MLOperand}}. The concatenated tensor of all the inputs along
    the *axis*. The output tensor has the same shape except on the dimension
    that all the inputs concatenated along. The size of that dimension is
    computed as the sum of all the input sizes of the same dimension.
</div>
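
The output shape rule can be sketched as follows. This is a non-normative plain JavaScript illustration that works on shape arrays only, and the helper name `concatShape` is hypothetical. Throwing on invalid input is an assumption of the sketch and does not describe the error handling of this API.

```javascript
// Hypothetical helper, not part of WebNN: compute the shape of the tensor
// produced by concatenating tensors with the given shapes along `axis`.
function concatShape(shapes, axis) {
  const rank = shapes[0].length;
  if (axis < 0 || axis >= rank) {
    throw new RangeError(`axis must be in the interval [0, ${rank})`);
  }
  const out = shapes[0].slice();
  for (const shape of shapes.slice(1)) {
    if (shape.length !== rank) {
      throw new TypeError("all inputs must have the same rank");
    }
    for (let d = 0; d < rank; ++d) {
      if (d === axis) {
        out[d] += shape[d];          // sizes add up along the concat axis
      } else if (shape[d] !== out[d]) {
        throw new TypeError(`dimension ${d} must match across inputs`);
      }
    }
  }
  return out;
}

console.log(concatShape([[2, 3], [2, 5]], 1)); // [2, 8]
console.log(concatShape([[1, 3], [4, 3]], 0)); // [5, 3]
```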

### conv2d ### {#api-mlgraphbuilder-conv2d}
Compute a 2-D convolution given 4-D input and filter tensors.
<script type=idl>
enum MLConv2dFilterOperandLayout {
  "oihw",
  "hwio",
  "ohwi",
  "ihwo"
};

enum MLAutoPad {
  "explicit",
  "same-upper",
  "same-lower"
};

dictionary MLConv2dOptions {
  sequence<long> padding;
  sequence<long> strides;
  sequence<long> dilations;
  MLAutoPad autoPad = "explicit";
  long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConv2dFilterOperandLayout filterLayout = "oihw";
  MLOperand bias;
  MLOperator activation;
};

partial interface MLGraphBuilder {
  MLOperand conv2d(MLOperand input, MLOperand filter, optional MLConv2dOptions options = {});
};
</script>
<div algorithm=conv2d>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input 4-D tensor. The logical shape
            is interpreted according to the value of *options.inputLayout*.
        - *filter*: an {{MLOperand}}. The filter 4-D tensor. The logical shape is
            interpreted according to the value of *options.filterLayout* and *options.groups*.
        - *options*: an optional {{MLConv2dOptions}}. The optional parameters of the operation.
            - *padding*: a sequence of {{long}} of length 4. The additional rows and columns added to the beginning and ending of each spatial dimension of *input*, [beginning_height, ending_height, beginning_width, ending_width]. If not present, the values are assumed to be [0,0,0,0].
            - *strides*: a sequence of {{long}} of length 2. The stride of the sliding window for each spatial dimension of *input*, [stride_height, stride_width]. If not present, the values are assumed to be [1,1].
            - *dilations*: a sequence of {{long}} of length 2. The dilation factor for each spatial dimension of *input*, [dilation_height, dilation_width]. If not present, the values are assumed to be [1,1].
            - *autoPad*: an {{MLAutoPad}}. The automatic input padding options. By default, this argument is set to *"explicit"*, which means that the values in the *options.padding* array should be used for input padding. When the option is set to a value other than *"explicit"*, the values in the *options.padding* array are ignored. With the *"same-upper"* option, the padding values are automatically computed such that the additional ending padding of the spatial input dimensions would allow all of the input values in the corresponding dimension to be filtered. The *"same-lower"* option is similar, but the padding is applied to the beginning of the spatial input dimensions instead of the ending.
            - *groups*: a {{long}} scalar. The number of groups that input channels and output channels are divided into. The default value is 1.
            - *inputLayout*: an {{MLInputOperandLayout}}. The default value is *"nchw"*. This option specifies the layout format of the input and output tensor as follows:

                "nchw":
                    - input tensor: [batches, input_channels, height, width]
                    - output tensor: [batches, output_channels, height, width]

                "nhwc":
                    - input tensor: [batches, height, width, input_channels]
                    - output tensor: [batches, height, width, output_channels]

            - *filterLayout*: an {{MLConv2dFilterOperandLayout}}. The default value is *"oihw"*. This option specifies the layout format of the filter tensor as follows:

                "oihw":
                    - [output_channels, input_channels/groups, height, width]
                    
                "hwio":
                    - [height, width, input_channels/groups, output_channels]
                    
                "ohwi":
                    - [output_channels, height, width, input_channels/groups]

                "ihwo":
                    - [input_channels/groups, height, width, output_channels]

            - *bias*: an {{MLOperand}}. The additional 1-D tensor with the shape of [output_channels] whose values are to be added to the convolution result.
            - *activation*: an {{MLOperator}}. The optional activation function that immediately follows the convolution operation. 

    **Returns:** an {{MLOperand}}. The output 4-D tensor that contains the convolution result. The output shape is interpreted according to the *options.inputLayout* value. More specifically, the spatial dimensions, i.e. the sizes of the last two dimensions of the output tensor for the *"nchw"* input layout, can be calculated as follows:

    *output size = 1 + (input size - filter size - (filter size - 1) ** *(dilation - 1) + beginning padding + ending padding) / stride*

    <div class="note">
    A *depthwise* conv2d operation is a variant of grouped convolution, used in models such as MobileNet, where *options.groups* = input_channels = output_channels and the shape of the filter tensor is [options.groups, 1, height, width]
    for the *"oihw"* layout, [height, width, 1, options.groups] for the *"hwio"* layout, [options.groups, height, width, 1] for the *"ohwi"* layout and [1, height, width, options.groups] for the *"ihwo"* layout.
    </div>
</div>
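
The output size formula above can be evaluated per spatial dimension with a small non-normative sketch in plain JavaScript. The helper name `conv2dOutputSize` and its parameter names are hypothetical and are not part of this API. Rounding the division down with `Math.floor` is an assumption of the sketch, since the formula does not state how a non-integer quotient is handled.

```javascript
// Hypothetical helper, not part of WebNN: output size of one spatial
// dimension of a 2-D convolution.
function conv2dOutputSize({
  inputSize,
  filterSize,
  stride = 1,
  dilation = 1,
  beginningPadding = 0,
  endingPadding = 0,
}) {
  // A dilated filter covers filterSize + (filterSize - 1) * (dilation - 1)
  // input elements.
  const effectiveFilterSize = filterSize + (filterSize - 1) * (dilation - 1);
  const span = inputSize - effectiveFilterSize + beginningPadding + endingPadding;
  return 1 + Math.floor(span / stride); // assumption: round down
}

console.log(conv2dOutputSize({ inputSize: 5, filterSize: 3 })); // 3
console.log(conv2dOutputSize({
  inputSize: 5, filterSize: 3, beginningPadding: 1, endingPadding: 1,
})); // 5
console.log(conv2dOutputSize({ inputSize: 7, filterSize: 3, stride: 2 })); // 3
```

With a 3-wide filter, one element of padding on each side keeps the output the same size as the input, which is the effect the automatic padding options aim for at stride 1.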

### convTranspose2d ### {#api-mlgraphbuilder-convtranspose2d}
Compute a 2-D transposed convolution given 4-D input and filter tensors.
<script type=idl>

enum MLConvTranspose2dFilterOperandLayout {
  "iohw",
  "hwoi",
  "ohwi"
};

dictionary MLConvTranspose2dOptions {
  sequence<long> padding;
  sequence<long> strides;
  sequence<long> dilations;
  sequence<long> outputPadding;
  sequence<long> outputSizes;
  MLAutoPad autoPad = "explicit";
  long groups = 1;
  MLInputOperandLayout inputLayout = "nchw";
  MLConvTranspose2dFilterOperandLayout filterLayout = "iohw";
  MLOperand bias;
  MLOperator activation;
};

partial interface MLGraphBuilder {
  MLOperand convTranspose2d(MLOperand input, MLOperand filter,
                            optional MLConvTranspose2dOptions options = {});
};
</script>
<div algorithm=convtranspose2d>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input 4-D tensor. The logical shape
            is interpreted according to the value of *options.inputLayout*.
        - *filter*: an {{MLOperand}}. The filter 4-D tensor. The logical shape is
            interpreted according to the value of *options.filterLayout* and *options.groups*.
        - *options*: an optional {{MLConvTranspose2dOptions}}. The optional parameters of the operation.
            - *padding*: a sequence of {{long}} of length 4. The additional rows and columns added to the beginning and ending of each spatial dimension of *input*, [beginning_height, ending_height, beginning_width, ending_width]. If not present, the values are assumed to be [0,0,0,0].
            - *strides*: a sequence of {{long}} of length 2. The stride of the sliding window for each spatial dimension of *input*, [stride_height, stride_width]. If not present, the values are assumed to be [1,1].
            - *dilations*: a sequence of {{long}} of length 2. The dilation factor for each spatial dimension of *input*, [dilation_height, dilation_width]. If not present, the values are assumed to be [1,1].
            - *outputPadding*: a sequence of {{long}} of length 2. The padding values applied to each spatial dimension of the output tensor. These explicit padding values are needed to disambiguate the output tensor shape for transposed convolution when a value in *options.strides* is greater than 1. Note that these values are only used to disambiguate the output shape when needed; they do not necessarily cause any padding value to be written to the output tensor. If not specified, the values are assumed to be [0,0].
            - *outputSizes*: a sequence of {{long}} of length 2. The sizes of the last two dimensions of the output tensor. When the output sizes are explicitly specified, the output padding values in *options.outputPadding* are ignored. If not specified, the output sizes are automatically computed.
            - *autoPad*: an {{MLAutoPad}}. The automatic input padding options. By default, this argument is set to *"explicit"*, which means that the values in the *options.padding* array should be used for input padding. When the option is set to a value other than *"explicit"*, the values in the *options.padding* array are ignored. With the *"same-upper"* option, the padding values are automatically computed such that the additional ending padding of the spatial input dimensions would allow all of the input values in the corresponding dimension to be filtered. The *"same-lower"* option is similar, but the padding is applied to the beginning of the spatial input dimensions instead of the ending.
            - *groups*: a {{long}} scalar. The number of groups that input channels and output channels are divided into. The default value is 1.
            - *inputLayout*: an {{MLInputOperandLayout}}. The default value is *"nchw"*. This option specifies the layout format of the input and output tensor as follows:

                "nchw":
                    - input tensor: [batches, input_channels, height, width]
                    - output tensor: [batches, output_channels, height, width]

                "nhwc":
                    - input tensor: [batches, height, width, input_channels]
                    - output tensor: [batches, height, width, output_channels]

            - *filterLayout*: an {{MLConvTranspose2dFilterOperandLayout}}. The default value is *"iohw"*. This option specifies the layout format of the filter tensor as follows:

                "iohw":
                    - [input_channels, output_channels/groups, height, width]

                "hwoi":
                    - [height, width, output_channels/groups, input_channels]

                "ohwi":
                    - [output_channels/groups, height, width, input_channels]

            - *bias*: an {{MLOperand}}. The additional 1-D tensor with the shape of [output_channels] whose values are to be added to the transposed convolution result.
            - *activation*: an {{MLOperator}}. The optional activation function that immediately follows the transposed convolution operation.

    **Returns:** an {{MLOperand}}. The output 4-D tensor that contains the transposed convolution result. The output shape is interpreted according to the *options.inputLayout* value. More specifically, unless the *options.outputSizes* values are explicitly specified, the *options.outputPadding* may be needed to compute the spatial dimension values of the output tensor as follows:

    *output size = (input size - 1) ** *stride + filter size + (filter size - 1) ** *(dilation - 1) - beginning padding - ending padding + output padding*
</div>
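
The output size formula above can be evaluated per spatial dimension with the following non-normative sketch in plain JavaScript. The helper name `convTranspose2dOutputSize` and its parameter names are hypothetical and are not part of this API.

```javascript
// Hypothetical helper, not part of WebNN: output size of one spatial
// dimension of a 2-D transposed convolution.
function convTranspose2dOutputSize({
  inputSize,
  filterSize,
  stride = 1,
  dilation = 1,
  beginningPadding = 0,
  endingPadding = 0,
  outputPadding = 0,
}) {
  return (inputSize - 1) * stride
    + filterSize
    + (filterSize - 1) * (dilation - 1)
    - beginningPadding
    - endingPadding
    + outputPadding;
}

console.log(convTranspose2dOutputSize({
  inputSize: 3, filterSize: 3, stride: 2,
})); // 7
console.log(convTranspose2dOutputSize({
  inputSize: 3, filterSize: 3, stride: 2, outputPadding: 1,
})); // 8
```

The two results show why *options.outputPadding* exists. Assuming the quotient in the forward convolution is rounded down, a stride-2 convolution with a 3-wide filter maps inputs of size 7 and size 8 to the same output size 3, so the transposed operation needs the extra value to select which of the two sizes to produce.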

### element-wise binary operations ### {#api-mlgraphbuilder-binary}
Compute the element-wise binary addition, subtraction, multiplication, division,
maximum, minimum and power of the two input tensors.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand add(MLOperand a, MLOperand b);
  MLOperand sub(MLOperand a, MLOperand b);
  MLOperand mul(MLOperand a, MLOperand b);
  MLOperand div(MLOperand a, MLOperand b);
  MLOperand max(MLOperand a, MLOperand b);
  MLOperand min(MLOperand a, MLOperand b);
  MLOperand pow(MLOperand a, MLOperand b);
};
</script>
<div algorithm=binary>
    **Arguments:**
        - *a*: an {{MLOperand}}. The first input tensor.
        - *b*: an {{MLOperand}}. The second input tensor.

    **Returns:** an {{MLOperand}}. The output tensor that contains the result of
    element-wise binary operation of the two input tensors.

    The element-wise binary operation is broadcast according to
    [[!numpy-broadcasting-rule]]. The rank of the output tensor is the maximum
    rank of the input tensors. For each dimension of the output tensor, its size
    is the maximum size along that dimension of the input tensors.

    **Operation types:**
        - *add*: Add the values of the two input tensors, element-wise.
        - *sub*: Subtract the values of the second input tensor from the values of the first input tensor, element-wise.
        - *mul*: Multiply the values of the two input tensors, element-wise.
        - *div*: Divide the values of the first input tensor by the values of the second input tensor, element-wise.
        - *max*: Select the greater values of the two input tensors, element-wise.
        - *min*: Select the lesser values of the two input tensors, element-wise.
        - *pow*: Compute the values of the first input tensor to the power of the values of the second input tensor, element-wise.
</div>
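
The output shape under broadcasting can be sketched as follows. This is a non-normative plain JavaScript illustration of the NumPy-style rule on shape arrays, and the helper name `broadcastShape` is hypothetical. Throwing on incompatible shapes is an assumption of the sketch and does not describe the error handling of this API.

```javascript
// Hypothetical helper, not part of WebNN: compute the broadcast output
// shape of two input shapes, aligning dimensions from the right.
function broadcastShape(shapeA, shapeB) {
  const rank = Math.max(shapeA.length, shapeB.length);
  const out = new Array(rank);
  for (let i = 0; i < rank; ++i) {
    // A missing leading dimension is treated as size 1.
    const a = shapeA[shapeA.length - 1 - i] ?? 1;
    const b = shapeB[shapeB.length - 1 - i] ?? 1;
    if (a !== b && a !== 1 && b !== 1) {
      throw new TypeError(`sizes ${a} and ${b} cannot be broadcast together`);
    }
    out[rank - 1 - i] = (a === 1 ? b : a);
  }
  return out;
}

console.log(broadcastShape([2, 3, 4], [4]));       // [2, 3, 4]
console.log(broadcastShape([8, 1, 6, 1], [7, 1, 5])); // [8, 7, 6, 5]
```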

### element-wise unary operations ### {#api-mlgraphbuilder-unary}
Compute the element-wise unary operation for the input tensor.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand abs(MLOperand x);
  MLOperand ceil(MLOperand x);
  MLOperand cos(MLOperand x);
  MLOperand exp(MLOperand x);
  MLOperand floor(MLOperand x);
  MLOperand log(MLOperand x);
  MLOperand neg(MLOperand x);
  MLOperand sin(MLOperand x);
  MLOperand tan(MLOperand x);
};
</script>
<div algorithm=unary>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.

    **Returns:** an {{MLOperand}}. The output tensor that contains the result of
    the element-wise unary operation on the input tensor. The shape of the output
    tensor is the same as the shape of the input tensor.

    **Operation types:**
        - *abs*: Compute the absolute value of the input tensor, element-wise.
        - *ceil*: Compute the ceiling of the input tensor, element-wise.
        - *cos*: Compute the cosine of the input tensor, element-wise.
        - *exp*: Compute the exponential of the input tensor, element-wise.
        - *floor*: Compute the floor of the input tensor, element-wise.
        - *log*: Compute the natural logarithm of the input tensor, element-wise.
        - *neg*: Compute the numerical negative value of the input tensor, element-wise.
        - *sin*: Compute the sine of the input tensor, element-wise.
        - *tan*: Compute the tangent of the input tensor, element-wise.
</div>

### elu ### {#api-mlgraphbuilder-elu}
Calculate the <a href="https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#ELU"> exponential linear unit function</a> on the input tensor element-wise. The calculation follows the expression `max(0, x) + alpha * (exp(min(0, x)) - 1)`.
<script type=idl>
dictionary MLEluOptions {
  float alpha = 1;
};

partial interface MLGraphBuilder {
  MLOperand elu(MLOperand x, optional MLEluOptions options = {});
  MLOperator elu(optional MLEluOptions options = {});
};
</script>
<div algorithm=elu>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLEluOptions}}. The optional parameters of the operation.
            - *alpha*: a {{float}} scalar multiplier. The default value is 1.

    **Returns:** 
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the elu operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged from a
    performance standpoint.
    <pre highlight="js">
    return builder.add(
              builder.max(builder.constant(0), x),
              builder.mul(
                builder.constant(options.alpha), 
                builder.sub(
                  builder.exp(builder.min(builder.constant(0), x)), 
                  builder.constant(1))));
    </pre>
    </div>
</div>
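
For reference, the expression `max(0, x) + alpha * (exp(min(0, x)) - 1)` can be evaluated on a single number with the following non-normative plain JavaScript sketch. The helper name `eluValue` is hypothetical and is not part of this API.

```javascript
// Hypothetical helper, not part of WebNN: evaluate elu on one value.
function eluValue(x, alpha = 1) {
  return Math.max(0, x) + alpha * (Math.exp(Math.min(0, x)) - 1);
}

console.log(eluValue(2));  // 2, positive inputs pass through unchanged
console.log(eluValue(0));  // 0
console.log(eluValue(-1)); // about -0.632, i.e. exp(-1) - 1
```

For positive inputs the exponential term is `exp(0) - 1 = 0`, so only the linear part remains. For negative inputs the linear part is 0 and the output approaches `-alpha` as the input decreases.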

### gemm ### {#api-mlgraphbuilder-gemm}
Calculate the [general matrix multiplication of the Basic Linear Algebra Subprograms](https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3). The calculation follows the expression `alpha * A * B + beta * C`, where `A` is a 2-D tensor with shape [M, K] or [K, M], `B` is a 2-D tensor with shape [K, N] or [N, K], and `C` is broadcastable to the shape [M, N]. `A` and `B` may optionally be transposed prior to the calculation.
<script type=idl>
dictionary MLGemmOptions {
  MLOperand c;
  float alpha = 1.0;
  float beta = 1.0;
  boolean aTranspose = false;
  boolean bTranspose = false;
};

partial interface MLGraphBuilder {
  MLOperand gemm(MLOperand a, MLOperand b, optional MLGemmOptions options = {});
};
</script>
<div algorithm=gemm>
    **Arguments:**
        - *a*: an {{MLOperand}}. The first input 2-D tensor with shape [M, K] if *aTranspose* is false, or [K, M] if *aTranspose* is true.
        - *b*: an {{MLOperand}}. The second input 2-D tensor with shape [K, N] if *bTranspose* is false, or [N, K] if *bTranspose* is true.
        - *options*: an optional {{MLGemmOptions}}. The optional parameters of the operation.
            - *c*: an {{MLOperand}}. The third input tensor. It is either a scalar, or of the shape that is unidirectionally broadcastable to the shape [M, N] according to [[!numpy-broadcasting-rule]]. When it is not specified, the computation is done as if *c* is a scalar 0.0.
            - *alpha*: a {{float}} scalar multiplier for the first input. The default value is 1.0.
            - *beta*: a {{float}} scalar multiplier for the third input. The default value is 1.0.
            - *aTranspose*: a {{boolean}} indicating if the first input should be transposed prior to calculating the output. The default value is false.
            - *bTranspose*: a {{boolean}} indicating if the second input should be transposed prior to calculating the output. The default value is false.

    **Returns:** an {{MLOperand}}. The output 2-D tensor of shape [M, N] that contains the calculated product of all the inputs.

    <div class="note">
    The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
    <pre highlight="js">
    if (options.aTranspose)
      a = builder.transpose(a);

    if (options.bTranspose)
      b = builder.transpose(b);

    let ab = builder.matmul(builder.mul(builder.constant(options.alpha), a), b);
    return (options.c ? builder.add(ab, builder.mul(builder.constant(options.beta), options.c)) : ab);
    </pre>
    </div>
</div>
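
The expression `alpha * A * B + beta * C` can also be worked through numerically. The following is a non-normative sketch in plain JavaScript on nested arrays. The helper names `transpose2d`, `cValueAt` and `gemmReference` are hypothetical and are not part of this API. As a simplifying assumption, the sketch only accepts *c* as a number, a 1-D array of length N, or a 2-D array of shape [M, N], which is a subset of the broadcastable shapes described above.

```javascript
// Hypothetical helper: transpose a 2-D nested array.
function transpose2d(m) {
  return m[0].map((_, j) => m.map((row) => row[j]));
}

// Hypothetical helper: read the value of c that applies to output [i][j].
// Assumption: c is undefined, a number, a 1-D array [N], or a 2-D array [M, N].
function cValueAt(c, i, j) {
  if (c === undefined) return 0;
  if (typeof c === "number") return c;
  return Array.isArray(c[0]) ? c[i][j] : c[j];
}

// Hypothetical reference computation, not part of WebNN.
function gemmReference(a, b, options = {}) {
  const { c, alpha = 1, beta = 1, aTranspose = false, bTranspose = false } = options;
  const A = aTranspose ? transpose2d(a) : a; // [M, K]
  const B = bTranspose ? transpose2d(b) : b; // [K, N]
  const K = B.length;
  const N = B[0].length;
  if (A[0].length !== K) {
    throw new TypeError("inner dimensions of a and b must match");
  }
  return A.map((rowA, i) => {
    const outRow = [];
    for (let j = 0; j < N; ++j) {
      let sum = 0;
      for (let k = 0; k < K; ++k) sum += rowA[k] * B[k][j];
      outRow.push(alpha * sum + beta * cValueAt(c, i, j));
    }
    return outRow;
  });
}

const a = [[1, 2], [3, 4]];
const b = [[5, 6], [7, 8]];
console.log(gemmReference(a, b));                                 // [[19, 22], [43, 50]]
console.log(gemmReference(a, b, { alpha: 2, beta: 3, c: [1, 1] })); // [[41, 47], [89, 103]]
```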

### gru ### {#api-mlgraphbuilder-gru}
Gated Recurrent Unit [[GRU]] recurrent network using an update gate and a reset gate to compute the hidden state that rolls into the output across the temporal sequence of the network.
<script type=idl>
enum MLRecurrentNetworkWeightLayout {
  "zrn",  // update-reset-new gate ordering
  "rzn"   // reset-update-new gate ordering
};

enum MLRecurrentNetworkDirection {
  "forward",
  "backward",
  "both"
};

dictionary MLGruOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  MLOperand initialHiddenState;
  boolean resetAfter = true;
  boolean returnSequence = false;
  MLRecurrentNetworkDirection direction = "forward";
  MLRecurrentNetworkWeightLayout layout = "zrn";
  sequence<MLOperator> activations;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> gru(MLOperand input, MLOperand weight, MLOperand recurrentWeight, 
                        long steps, long hiddenSize, optional MLGruOptions options = {});
};
</script>
<div algorithm=gru>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input 3-D tensor of shape [steps, batch_size, input_size]. 
        - *weight*: an {{MLOperand}}. The 3-D input weight tensor of shape [num_directions, 3 * hidden_size, input_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the *layout* argument.
        - *recurrentWeight*: an {{MLOperand}}. The 3-D recurrent weight tensor of shape [num_directions, 3 * hidden_size, hidden_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the *layout* argument.
        - *steps*: a {{long}} scalar. The number of time steps in the recurrent network. The value must be greater than 0.
        - *hiddenSize*: a {{long}} scalar. The value of the third dimension of the cell output tensor shape. It indicates the number of features in the hidden state.
        - *options*: an optional {{MLGruOptions}}. The optional parameters of the operation.
            - *bias*: an {{MLOperand}}. The 2-D input bias tensor of shape [num_directions, 3 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to the *options.layout* argument.
            - *recurrentBias*: an {{MLOperand}}. The 2-D recurrent bias tensor of shape [num_directions, 3 * hidden_size]. The ordering of the bias vectors in the second dimension of the tensor shape is specified according to the *options.layout* argument.
            - *initialHiddenState*: an {{MLOperand}}. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, it's assumed to be a tensor filled with zeros.
            - *resetAfter*: a {{boolean}} indicating whether to apply the reset gate after or before matrix multiplication. Defaults to true.
            - *returnSequence*: a {{boolean}} indicating whether to also return the entire sequence, containing the cell output from every time step, in addition to the cell output of the last time step. Defaults to false.
            - *direction*: an {{MLRecurrentNetworkDirection}}. The processing direction of the input sequence. When set to *"both"*, the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
            - *layout*: an {{MLRecurrentNetworkWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the *update (z)*, *reset (r)*, and *new (n)* gate, as indicated in the second dimension of the weight and bias tensor shape. When not specified, the default layout is *"zrn"*.
            - *activations*: a sequence of {{MLOperator}}. A pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, they are assumed to be the sigmoid (*"sigmoid"*) and the hyperbolic tangent (*"tanh"*) functions respectively.

    **Returns:** a sequence of {{MLOperand}}. The first element of the sequence is a 3-D tensor of shape [num_directions, batch_size, hidden_size], the cell output from the last time step of the network. Additionally, if *returnSequence* is set to true, the second element is the 4-D output tensor of shape [steps, num_directions, batch_size, hidden_size] containing the cell outputs from every time step in the temporal sequence.

    <div class="note">
    The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged from a performance standpoint.
    <pre highlight="js">
    const numDirections = (options.direction == "both" ? 2 : 1);
    let hiddenState = options.initialHiddenState;

    if (!hiddenState) {
      const desc = { type: 'float32', dimensions: [numDirections, 1, hiddenSize] };
      const totalSize = numDirections * hiddenSize;
      hiddenState = builder.constant(desc, new Float32Array(totalSize).fill(0));
    }

    let sequence = null;
    let cellWeight = [];
    let cellRecurrentWeight = [];
    let cellBias = [];
    let cellRecurrentBias = [];

    for (let slot = 0; slot < numDirections; ++slot) {
      cellWeight.push(builder.squeeze(builder.slice(weight, [slot, 0, 0], [1, -1, -1]), { axes: [0] }));
      cellRecurrentWeight.push(builder.squeeze(builder.slice(recurrentWeight, [slot, 0, 0], [1, -1, -1]), { axes: [0] }));
      cellBias.push(options.bias ? (builder.squeeze(builder.slice(options.bias, [slot, 0], [1, -1]), { axes: [0] })) : null);
      cellRecurrentBias.push(options.recurrentBias ? 
        (builder.squeeze(builder.slice(options.recurrentBias, [slot, 0], [1, -1]), { axes: [0] })) : null);
    }

    for (let step = 0; step < steps; ++step) {
      let cellHidden = [];
      let cellOutput = null;

      for (let slot = 0; slot < numDirections; ++slot) {
        cellHidden.push(builder.squeeze(builder.slice(hiddenState, [slot, 0, 0], [1, -1, -1]), { axes: [0] }));
      }

      for (let slot = 0; slot < numDirections; ++slot) {
        let slice = (slot == 1 || options.direction == "backward" ? steps - step - 1 : step);
        let cellInput = builder.squeeze(builder.slice(input, [slice, 0, 0], [1, -1, -1]), { axes: [0] });

        let result = builder.reshape(
          builder.gruCell(
            cellInput, cellWeight[slot], cellRecurrentWeight[slot],
            cellHidden[slot], hiddenSize, { bias: cellBias[slot],
            recurrentBias: cellRecurrentBias[slot], resetAfter: options.resetAfter,
            layout: options.layout, activations: options.activations }),
          [1, -1, hiddenSize]);

        cellOutput = (cellOutput ? builder.concat([cellOutput, result], 0) : result);
      }

      hiddenState = cellOutput;

      if (options.returnSequence) {
        cellOutput = builder.reshape(cellOutput, [1, numDirections, -1, hiddenSize]);
        sequence = (sequence ? builder.concat([sequence, cellOutput], 0) : cellOutput);
      }
    }

    return (sequence ? [hiddenState, sequence] : [hiddenState]);
    </pre>
    </div>
</div>
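
The shapes of the returned tensors can be summarized with a small non-normative sketch in plain JavaScript. The helper name `gruOutputShapes` is hypothetical and is not part of this API. It follows the emulation above in taking num_directions to be 2 when *direction* is *"both"* and 1 otherwise.

```javascript
// Hypothetical helper, not part of WebNN: shapes of the operands that
// gru() returns for the given sizes and options.
function gruOutputShapes(steps, batchSize, hiddenSize, options = {}) {
  const { direction = "forward", returnSequence = false } = options;
  const numDirections = (direction === "both" ? 2 : 1);
  // The first element is always the cell output of the last time step.
  const shapes = [[numDirections, batchSize, hiddenSize]];
  if (returnSequence) {
    // The second element holds the cell output of every time step.
    shapes.push([steps, numDirections, batchSize, hiddenSize]);
  }
  return shapes;
}

console.log(gruOutputShapes(5, 3, 4));
// [[1, 3, 4]]
console.log(gruOutputShapes(5, 3, 4, { direction: "both", returnSequence: true }));
// [[2, 3, 4], [5, 2, 3, 4]]
```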

### gruCell ### {#api-mlgraphbuilder-grucell}
A single time step of the Gated Recurrent Unit [[GRU]] recurrent network using an update gate and a reset gate to compute the hidden state that rolls into the output across the temporal sequence of a recurrent network.
<script type=idl>
dictionary MLGruCellOptions {
  MLOperand bias;
  MLOperand recurrentBias;
  boolean resetAfter = true;
  MLRecurrentNetworkWeightLayout layout = "zrn";
  sequence<MLOperator> activations;
};

partial interface MLGraphBuilder {
  MLOperand gruCell(MLOperand input, MLOperand weight, MLOperand recurrentWeight, 
                  MLOperand hiddenState, long hiddenSize, optional MLGruCellOptions options = {});
};
</script>
<div algorithm=grucell>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input 2-D tensor of shape [batch_size, input_size]. 
        - *weight*: an {{MLOperand}}. The 2-D input weight tensor of shape [3 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *layout* argument.
        - *recurrentWeight*: an {{MLOperand}}. The 2-D recurrent weight tensor of shape [3 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *layout* argument.
        - *hiddenState*: an {{MLOperand}}. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
        - *hiddenSize*: a {{long}} scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
        - *options*: an optional {{MLGruCellOptions}}. The optional parameters of the operation.
            - *bias*: an {{MLOperand}}. The 1-D input bias tensor of shape [3 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
            - *recurrentBias*: an {{MLOperand}}. The 1-D recurrent bias tensor of shape [3 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
            - *resetAfter*: a {{boolean}} indicating whether to apply the reset gate after or before matrix multiplication. Defaults to true.
            - *layout*: an {{MLRecurrentNetworkWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the *update (z)*, *reset (r)*, and *new (n)* gate, as indicated in the first dimension of the weight and bias tensor shapes. When not specified, the default layout is *"zrn"*.
            - *activations*: a sequence of {{MLOperator}}. A pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, they default to the sigmoid (*"sigmoid"*) and the hyperbolic tangent (*"tanh"*) functions respectively.

    **Returns:** an {{MLOperand}}. The 2-D tensor of shape [batch_size, hidden_size], the cell output hidden state of a single time step of the recurrent network.

    <div class="note">
    The behavior of this operation when the activations of the update/reset gates and the new gate are of the operator types *sigmoid* and *tanh* respectively can be generically emulated using other operations, as follows. However, user agents typically have a more efficient implementation for it, so its use is encouraged for performance reasons.
    <pre highlight="js">
    const one = builder.constant(1);
    const zero = builder.constant(0);

    // update gate
    let z = builder.sigmoid(
      builder.add(
        builder.add(
          (options.bias ? builder.slice(options.bias, [0], [hiddenSize]) : zero), 
          (options.recurrentBias ? builder.slice(options.recurrentBias, [0], [hiddenSize]) : zero)
          ),
        builder.add(
          builder.matmul(
            input, 
            builder.transpose(builder.slice(weight, [0, 0], [hiddenSize, -1]))
            ),
          builder.matmul(
            hiddenState,
            builder.transpose(builder.slice(recurrentWeight, [0, 0], [hiddenSize, -1]))
            )
          )
        )
      );

    // reset gate
    let r = builder.sigmoid(
      builder.add(
        builder.add(
          (options.bias ? builder.slice(options.bias, [hiddenSize], [hiddenSize]) : zero),
          (options.recurrentBias ? builder.slice(options.recurrentBias, [hiddenSize], [hiddenSize]) : zero)
          ),
        builder.add(
          builder.matmul(
            input, 
            builder.transpose(builder.slice(weight, [hiddenSize, 0], [hiddenSize, -1]))
            ),
          builder.matmul(
            hiddenState, 
            builder.transpose(builder.slice(recurrentWeight, [hiddenSize, 0], [hiddenSize, -1]))
            )
          )
        )
      );

    // new gate
    let n;
    if (options.resetAfter) {
      n = builder.tanh(
        builder.add(
          (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
          builder.add(
            builder.matmul(
              input, 
              builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, -1]))
              ),
            builder.mul(
              r,
              builder.add(
                (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero),
                builder.matmul(
                  hiddenState, 
                  builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, -1]))
                  )
                )
              )
            )
          )
        );
    }
    else {
      n = builder.tanh(
        builder.add(
          builder.add(
            (options.bias ? builder.slice(options.bias, [2 * hiddenSize], [hiddenSize]) : zero),
            (options.recurrentBias ? builder.slice(options.recurrentBias, [2 * hiddenSize], [hiddenSize]) : zero)
            ),
          builder.add(
            builder.matmul(
              input, 
              builder.transpose(builder.slice(weight, [2 * hiddenSize, 0], [hiddenSize, -1]))
              ),
            builder.matmul(
              builder.mul(r, hiddenState),
              builder.transpose(builder.slice(recurrentWeight, [2 * hiddenSize, 0], [hiddenSize, -1]))
              )
            )
          )
        );
    }

    // compute the new hidden state
    return builder.add(builder.mul(z, hiddenState), builder.mul(n, builder.sub(one, z)));
    </pre>
    </div>
</div>
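
The gate arithmetic in the emulation above can also be sketched in plain JavaScript for a single batch entry. This is only an illustration under stated assumptions: `gruCellRef` and its array-based signature are hypothetical helpers (not part of this API), the layout is fixed to *"zrn"*, the activations are sigmoid and tanh, and absent biases are treated as zero.

```javascript
// Plain-JavaScript sketch of one GRU step for batch size 1, "zrn" layout.
// `gruCellRef` is a hypothetical helper, not a WebNN method.
// x: number[inputSize], h: number[hiddenSize],
// weight: number[3 * hiddenSize][inputSize],
// recurrentWeight: number[3 * hiddenSize][hiddenSize].
function gruCellRef(x, weight, recurrentWeight, h, hiddenSize, options = {}) {
  const { bias, recurrentBias, resetAfter = true } = options;
  const sigmoid = (v) => 1 / (1 + Math.exp(-v));
  const dot = (u, v) => u.reduce((acc, ui, i) => acc + ui * v[i], 0);
  const b = (i) => (bias ? bias[i] : 0);                    // assumed 0 if absent
  const rb = (i) => (recurrentBias ? recurrentBias[i] : 0); // assumed 0 if absent

  // The update and reset gates share the same form, at different weight rows.
  const gate = (row) =>
    sigmoid(b(row) + rb(row) + dot(weight[row], x) + dot(recurrentWeight[row], h));

  const z = h.map((_, i) => gate(i));              // update gate: rows [0, H)
  const r = h.map((_, i) => gate(hiddenSize + i)); // reset gate: rows [H, 2H)
  const rh = h.map((hi, i) => r[i] * hi);          // used when resetAfter is false

  return h.map((hi, i) => {
    const row = 2 * hiddenSize + i;                // new gate: rows [2H, 3H)
    const n = resetAfter
      ? Math.tanh(b(row) + dot(weight[row], x) +
                  r[i] * (rb(row) + dot(recurrentWeight[row], h)))
      : Math.tanh(b(row) + rb(row) + dot(weight[row], x) +
                  dot(recurrentWeight[row], rh));
    // New hidden state: z * h + (1 - z) * n.
    return z[i] * hi + (1 - z[i]) * n;
  });
}
```

The two branches differ only in where the reset gate is applied: after the recurrent matrix product (and its bias) when `resetAfter` is true, or to the hidden state before the product otherwise.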

### hardSigmoid ### {#api-mlgraphbuilder-hard-sigmoid}
Calculate the <a href="https://en.wikipedia.org/wiki/Hard_sigmoid">non-smooth function</a> used in place of a sigmoid function on the input tensor, element-wise. The calculation follows the expression `max(0, min(1, alpha * x + beta))`.
<script type=idl>
dictionary MLHardSigmoidOptions {
  float alpha = 0.2;
  float beta = 0.5;
};

partial interface MLGraphBuilder {
  MLOperand hardSigmoid(MLOperand x, optional MLHardSigmoidOptions options = {});
  MLOperator hardSigmoid(optional MLHardSigmoidOptions options = {});
};
</script>
<div algorithm=hard-sigmoid>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLHardSigmoidOptions}}. The optional parameters of the operation.
            - *alpha*: a {{float}} scalar multiplier. The default value is 0.2.
            - *beta*: a {{float}} scalar addition. The default value is 0.5.

    **Returns:** 
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the hard sigmoid operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.max(
               builder.min(
                   builder.add(
                       builder.mul(builder.constant(options.alpha), x),
                       builder.constant(options.beta)), 
                   builder.constant(1)),
               builder.constant(0));
    </pre>
    </div>
</div>
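
For checking expected values by hand, the per-element calculation can be sketched in plain JavaScript. `hardSigmoidRef` is a hypothetical scalar helper used only for illustration; it is not part of this API.

```javascript
// Scalar sketch of hardSigmoid: clamp (alpha * x + beta) to the range [0, 1].
// The defaults mirror MLHardSigmoidOptions.
function hardSigmoidRef(x, { alpha = 0.2, beta = 0.5 } = {}) {
  return Math.max(0, Math.min(1, alpha * x + beta));
}
```

With the default options the function is linear on the interval from -2.5 to 2.5 and saturates at 0 and 1 outside it.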

### hardSwish ### {#api-mlgraphbuilder-hard-swish}
Compute the nonlinear function `y = x * max(0, min(6, (x + 3))) / 6` introduced by [[MobileNetV3]] on the input tensor, element-wise.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand hardSwish(MLOperand x);
  MLOperator hardSwish();
};
</script>
<div algorithm=hard-swish>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the hard-swish operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.div(
               builder.mul(
                   x,
                   builder.max(
                       builder.constant(0),
                       builder.min(
                           builder.constant(6),
                           builder.add(x, builder.constant(3))))),
               builder.constant(6));
    </pre>
    </div>
</div>

### instanceNormalization ### {#api-mlgraphbuilder-instancenorm}
Normalize the input features using [[Instance-Normalization]]. Unlike [[#api-mlgraphbuilder-batchnorm]] where the mean and variance values used in the calculation are previously computed across the batch dimension during the model training phase, the mean and variance values used in the calculation of an instance normalization are computed internally on the fly per input feature.
<script type=idl>
dictionary MLInstanceNormalizationOptions {
  MLOperand scale;
  MLOperand bias;
  float epsilon = 1e-5;
  MLInputOperandLayout layout = "nchw";
};

partial interface MLGraphBuilder {
  MLOperand instanceNormalization(MLOperand input, 
                                optional MLInstanceNormalizationOptions options = {});
};
</script>
<div algorithm=instancenorm>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input 4-D tensor.
        - *options*: an optional {{MLInstanceNormalizationOptions}}. The optional parameters of the operation.
              - *scale*: an {{MLOperand}}. The 1-D tensor of the scaling values whose length is equal to the size of the feature dimension of the input e.g. for the input tensor with *nchw* layout, the feature dimension is 1.
              - *bias*: an {{MLOperand}}. The 1-D tensor of the bias values whose length is equal to the size of the feature dimension of the input e.g. for the input tensor with *nchw* layout, the feature dimension is 1.
              - *epsilon*: a {{float}} scalar. A small value to prevent computational error due to divide-by-zero. The default value is 0.00001 when not specified.
              - *layout*: an {{MLInputOperandLayout}}. This option specifies the layout format of the input. The default value is *"nchw"*.
        
    **Returns:** an {{MLOperand}}. The instance-normalized 4-D tensor of the same shape as the input tensor.

    <div class="note">
    The behavior of this operation when the input tensor is 4-D of the *"nchw"* layout can be generically emulated 
    using other operations, as follows. However, user agents typically have a more efficient implementation for it, 
    so its use is encouraged for performance reasons.
    <pre highlight="js">
    // The mean reductions happen over the spatial dimensions of the input
    // e.g. axis 2 and 3 of the input tensor.
    const reduceOptions = { axes: [2,3], keepDimensions: true };
    const mean = builder.reduceMean(input, reduceOptions);
    const variance = builder.reduceMean(
      builder.pow(
        builder.sub(input, mean), 
        builder.constant(2)),
      reduceOptions
      );

    // The scale and bias values are applied per input feature
    // e.g. axis 1 of the input tensor.
    const shape = [1,-1,1,1];
    return builder.add(
      builder.mul(
        builder.reshape(options.scale, shape),
        builder.div(
          builder.sub(input, mean),
          builder.pow(
            builder.add(variance, builder.constant(options.epsilon)), 
            builder.constant(0.5))
          )
        ),
      builder.reshape(options.bias, shape)
      );
    </pre>
    </div>
</div>
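
The per-feature statistics can be made concrete with a plain JavaScript sketch over a flat array in *"nchw"* order. `instanceNormNchw` is a hypothetical helper, and treating an absent *scale* as 1 and an absent *bias* as 0 is an assumption of this sketch rather than something stated above.

```javascript
// Sketch of instance normalization over a flat array laid out as "nchw".
// Each (batch, channel) plane of h * w values gets its own mean and variance.
function instanceNormNchw(input, [n, c, h, w], { scale, bias, epsilon = 1e-5 } = {}) {
  const output = new Float64Array(input.length);
  const plane = h * w;
  for (let i = 0; i < n * c; ++i) {
    const channel = i % c;   // plane index i = batch * c + channel
    const begin = i * plane;

    let mean = 0;
    for (let j = 0; j < plane; ++j) mean += input[begin + j];
    mean /= plane;

    let variance = 0;
    for (let j = 0; j < plane; ++j) variance += (input[begin + j] - mean) ** 2;
    variance /= plane;

    const s = scale ? scale[channel] : 1; // assumption: identity when absent
    const b = bias ? bias[channel] : 0;   // assumption: zero when absent
    const denominator = Math.sqrt(variance + epsilon);
    for (let j = 0; j < plane; ++j) {
      output[begin + j] = s * (input[begin + j] - mean) / denominator + b;
    }
  }
  return output;
}
```

Because the statistics are computed per plane, a plane of identical values normalizes to the bias value, with *epsilon* keeping the division well defined.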

### leakyRelu ### {#api-mlgraphbuilder-leakyrelu}
Calculate the <a href="https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Leaky_ReLU">leaky version of rectified linear function</a> on the input tensor element-wise. The calculation follows the expression `max(0, x) + alpha * min(0, x)`.
<script type=idl>
dictionary MLLeakyReluOptions {
  float alpha = 0.01;
};

partial interface MLGraphBuilder {
  MLOperand leakyRelu(MLOperand x, optional MLLeakyReluOptions options = {});
  MLOperator leakyRelu(optional MLLeakyReluOptions options = {});
};
</script>
<div algorithm=leakyrelu>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLLeakyReluOptions}}. The optional parameters of the operation.
            - *alpha*: a {{float}} scalar multiplier. The default value is 0.01.

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the leaky relu operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.add(builder.max(builder.constant(0), x),
              builder.mul(builder.constant(options.alpha), builder.min(builder.constant(0), x)));
    </pre>
    </div>
</div>

### matmul ### {#api-mlgraphbuilder-matmul}
Compute the matrix product of two input tensors.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand matmul(MLOperand a, MLOperand b);
};
</script>
<div algorithm=matmul>
    **Arguments:**
        - *a*: an {{MLOperand}}. The first input N-D tensor.
        - *b*: an {{MLOperand}}. The second input N-D tensor.

    **Returns:** an {{MLOperand}}. The output N-D tensor that contains the matrix
    product of two input tensors.

    Compute the matrix product of two input tensors. It behaves as follows:
        - If both *a* and *b* are 2-D, they are multiplied like conventional
            matrices and produce a 2-D tensor as the output.
        - If either *a* or *b* is N-D, N > 2, it is treated as a stack of
            matrices with dimensions corresponding to the last two indices. The
            matrix multiplication is broadcast accordingly by following
            [[!numpy-broadcasting-rule]]. The output is an N-D tensor whose rank
            is the maximum rank of the input tensors. For each dimension, except
            the last two, of the output tensor, its size is the maximum size
            along that dimension of the input tensors.
        - If *a* is 1-D, it is converted to a 2-D tensor by prepending a 1 to
            its dimensions.
        - If *b* is 1-D, it is converted to a 2-D tensor by appending a 1 to
            its dimensions.
        - If both *a* and *b* are 1-D, the operation is a vector dot-product,
            which produces a scalar output.
</div>
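
The output shape rules for inputs of rank 2 or higher can be sketched as follows. `matmulOutputShape` is a hypothetical helper for illustration only; it deliberately leaves out the 1-D cases and rejects batch dimensions that cannot be broadcast.

```javascript
// Sketch: output shape of matmul for inputs of rank >= 2.
// The last two dimensions multiply as matrices; the leading (batch)
// dimensions are broadcast against each other, aligned from the right.
function matmulOutputShape(aShape, bShape) {
  if (aShape.length < 2 || bShape.length < 2) {
    throw new Error('this sketch only covers inputs of rank 2 or higher');
  }
  const [m, k] = aShape.slice(-2);
  const [k2, n] = bShape.slice(-2);
  if (k !== k2) throw new Error(`inner dimensions differ: ${k} vs ${k2}`);

  const aBatch = aShape.slice(0, -2);
  const bBatch = bShape.slice(0, -2);
  const rank = Math.max(aBatch.length, bBatch.length);
  const batch = [];
  for (let i = 0; i < rank; ++i) {
    // Missing leading dimensions behave like size 1.
    const da = aBatch[aBatch.length - rank + i] ?? 1;
    const db = bBatch[bBatch.length - rank + i] ?? 1;
    if (da !== db && da !== 1 && db !== 1) {
      throw new Error(`cannot broadcast batch sizes ${da} and ${db}`);
    }
    batch.push(Math.max(da, db));
  }
  return [...batch, m, n];
}
```

For example, shapes `[5, 1, 2, 3]` and `[4, 3, 6]` describe a 5 by 4 stack of 2 by 3 matrices multiplied by 3 by 6 matrices.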

### linear ### {#api-mlgraphbuilder-linear}
Calculate a linear function `y = alpha * x + beta` on the input tensor.
<script type=idl>
dictionary MLLinearOptions {
  float alpha = 1;
  float beta = 0;
};

partial interface MLGraphBuilder {
  MLOperand linear(MLOperand x, optional MLLinearOptions options = {});
  MLOperator linear(optional MLLinearOptions options = {});
};
</script>
<div algorithm=linear>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLLinearOptions}}. The optional parameters of the operation.
            - *alpha*: a {{float}} scalar multiplier. The default value is 1.
            - *beta*: a {{float}} scalar addition. The default value is 0.

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the linear operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.add(
              builder.mul(x, builder.constant(options.alpha)), 
              builder.constant(options.beta));
    </pre>
    </div>
</div>

### pad ### {#api-mlgraphbuilder-pad}
Inflate the tensor with constant or mirrored values on the edges.
<script type=idl>
enum MLPaddingMode {
  "constant",
  "edge",
  "reflection",
  "symmetric"
};

dictionary MLPadOptions {
  MLPaddingMode mode = "constant";
  float value = 0;
};

partial interface MLGraphBuilder {
  MLOperand pad(MLOperand input, MLOperand padding, optional MLPadOptions options = {});
};
</script>
<div algorithm=pad>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input tensor.
        - *padding*: an {{MLOperand}}. The 2-D tensor of integer values indicating the number of padding values to add at the beginning and end of each input dimension. The tensor has shape [*n*, 2] where *n* is the rank of the input tensor. For each dimension *D* of *input*, *padding[D, 0]* indicates how many values to add before the content in that dimension, and *padding[D, 1]* indicates how many values to add after the content in that dimension.
        - *options*: an optional {{MLPadOptions}}. The optional parameters of the operation.
            - *mode*: an {{MLPaddingMode}}. The different ways to pad the tensor. When not set, it is assumed to be *"constant"*.
            - *value*: a {{float}}. The pad value when the *options.mode* is set to *"constant"*. When not set, it is assumed to be 0.

    **Returns:** an {{MLOperand}}. The padded output tensor.
    <div class="example">
    <pre highlight="js">
    // input: [[1,2,3], [4,5,6]]
    const input = builder.constant(
      { type: 'float32', dimensions: [2,3] }, new Float32Array([1,2,3,4,5,6]));

    // padding: [[1,1], [2,2]]
    const padding = builder.constant(
      { type: 'int32', dimensions: [2,2] }, new Int32Array([1,1,2,2]));

    // "constant" padded:
    //    [[0,0,0,0,0,0,0],
    //     [0,0,1,2,3,0,0],
    //     [0,0,4,5,6,0,0],
    //     [0,0,0,0,0,0,0]]
    builder.pad(input, padding);

    // "edge" padded:
    //    [[1,1,1,2,3,3,3],
    //     [1,1,1,2,3,3,3],
    //     [4,4,4,5,6,6,6],
    //     [4,4,4,5,6,6,6]]
    builder.pad(input, padding, { mode: "edge" });

    // "reflection" padded:
    //    [[6,5,4,5,6,5,4],
    //     [3,2,1,2,3,2,1],
    //     [6,5,4,5,6,5,4],
    //     [3,2,1,2,3,2,1]]
    builder.pad(input, padding, { mode: "reflection" });

    // "symmetric" padded:
    //    [[2,1,1,2,3,3,2],
    //     [2,1,1,2,3,3,2],
    //     [5,4,4,5,6,6,5],
    //     [5,4,4,5,6,6,5]]
    builder.pad(input, padding, { mode: "symmetric" });
    </pre>
    </div>
</div>
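
The four padding modes differ only in how an out-of-range output position maps back to a source position. The sketch below illustrates that mapping for a 2-D input held as an array of rows. `sourceIndex` and `pad2d` are hypothetical helpers, and the sketch assumes the padding amount never exceeds what a single mirror of the input can supply.

```javascript
// Map an output index along one axis to a source index in [0, size).
// Returns -1 when the position must be filled with the constant value.
function sourceIndex(outIndex, before, size, mode) {
  const i = outIndex - before;
  if (i >= 0 && i < size) return i;
  switch (mode) {
    case 'constant': return -1;
    case 'edge': return i < 0 ? 0 : size - 1;
    // Mirror around the edge element, without repeating it.
    case 'reflection': return i < 0 ? -i : 2 * (size - 1) - i;
    // Mirror around the edge, repeating the edge element.
    case 'symmetric': return i < 0 ? -i - 1 : 2 * size - 1 - i;
    default: throw new Error(`unknown padding mode: ${mode}`);
  }
}

// Pad a 2-D array of rows; padding is [[top, bottom], [left, right]].
function pad2d(input, padding, mode = 'constant', value = 0) {
  const rows = input.length;
  const cols = input[0].length;
  const [[top, bottom], [left, right]] = padding;
  const output = [];
  for (let r = 0; r < rows + top + bottom; ++r) {
    const sr = sourceIndex(r, top, rows, mode);
    const row = [];
    for (let c = 0; c < cols + left + right; ++c) {
      const sc = sourceIndex(c, left, cols, mode);
      row.push(sr < 0 || sc < 0 ? value : input[sr][sc]);
    }
    output.push(row);
  }
  return output;
}
```

Calling `pad2d([[1,2,3],[4,5,6]], [[1,1],[2,2]], mode)` with each mode reproduces the four padded results listed in the example above.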

### pooling operations ### {#api-mlgraphbuilder-pool2d}
Compute a *mean*, *L2 norm*, or *max* reduction operation across all the elements within the moving window over the input tensor. See the description of each type of reduction in [[#api-mlgraphbuilder-reduce]].
<script type=idl>
enum MLRoundingType {
  "floor",
  "ceil"
};

dictionary MLPool2dOptions {
  sequence<long> windowDimensions;
  sequence<long> padding;
  sequence<long> strides;
  sequence<long> dilations;
  MLAutoPad autoPad = "explicit";
  MLInputOperandLayout layout = "nchw";
  MLRoundingType roundingType = "floor";
  sequence<long> outputSizes;
};

partial interface MLGraphBuilder {
  MLOperand averagePool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand l2Pool2d(MLOperand input, optional MLPool2dOptions options = {});
  MLOperand maxPool2d(MLOperand input, optional MLPool2dOptions options = {});
};
</script>
<div algorithm=pool2d>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input 4-D tensor. The logical shape
            is interpreted according to the value of *options.layout*.
        - *options*: an optional {{MLPool2dOptions}}. The optional parameters of the operation.
            - *windowDimensions*: a sequence of {{long}} of length 2. The dimensions of the sliding window,
                [window_height, window_width]. If not present, the window dimensions are assumed to be the height  
                and width dimensions of the input shape. 
            - *padding*: a sequence of {{long}} of length 4. The additional rows and columns added to the beginning and ending of each spatial dimension of *input*, [beginning_height, ending_height, beginning_width, ending_width]. If not present, the values are assumed to be [0,0,0,0].
            - *strides*: a sequence of {{long}} of length 2. The stride of the
                sliding window for each spatial dimension of *input*,
                [stride_height, stride_width]. If not present, the values are assumed to be [1,1].
            - *dilations*: a sequence of {{long}} of length 2. The dilation factor
                for each spatial dimension of *input*, [dilation_height, dilation_width].
                If not present, the values are assumed to be [1,1].
            - *autoPad*: an {{MLAutoPad}}. The automatic input padding options. By default, this argument is set to *"explicit"*, which means that the values in the *options.padding* array should be used for input padding. When the option is set to a value other than *"explicit"*, the values in the *options.padding* array are ignored. With the *"same-upper"* option, the padding values are automatically computed such that the additional ending padding of the spatial input dimensions would allow all of the input values in the corresponding dimension to be filtered. The *"same-lower"* option is similar, but the padding is applied to the beginning of the spatial input dimensions instead of the end.
            - *layout*: an {{MLInputOperandLayout}}. The default value is *"nchw"*. This option specifies the
                layout format of the input and output tensor as follows:

                "nchw":
                    - input tensor: [batches, channels, height, width]
                    - output tensor: [batches, channels, height, width]

                "nhwc":
                    - input tensor: [batches, height, width, channels]
                    - output tensor: [batches, height, width, channels]
            - *roundingType*: an {{MLRoundingType}}. The option specifies the rounding function used to compute the output shape. The default value is *"floor"*.
            - *outputSizes*: a sequence of {{long}} of length 2. The sizes of the two spatial dimensions of the output tensor. When the output sizes are explicitly specified, *options.roundingType* is ignored. If not specified, the output sizes are automatically computed.

    **Returns:** an {{MLOperand}}. The output 4-D tensor that contains the
    result of the reduction. The logical shape is interpreted according to the
    value of *layout*. More specifically, if the *options.roundingType* is *"floor"*, the spatial dimensions of the output tensor can be calculated as follows:

    *output size = floor(1 + (input size - window size + beginning padding + ending padding) / stride)*

    or if *options.roundingType* is *"ceil"*:

    *output size = ceil(1 + (input size - window size + beginning padding + ending padding) / stride)*

    <div class="note">
    A *global* pooling operation, such as global max pooling, is a variant of pooling where the window dimensions are the spatial dimensions (last two dimensions) of the input shape, as follows.
    <pre highlight="js">
    // 'global' max pooling
    builder.maxPool2d(input);
    </pre>
    </div>
</div>
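
The output size formulas above can be evaluated for one spatial dimension with a small helper. `pool2dOutputSize` is a hypothetical function for illustration; like the formulas, it does not account for *options.dilations*.

```javascript
// Sketch: output size of one spatial dimension of a pooling operation.
function pool2dOutputSize(inputSize, windowSize, beginPadding, endPadding,
                          stride, roundingType = 'floor') {
  const raw = 1 + (inputSize - windowSize + beginPadding + endPadding) / stride;
  return roundingType === 'ceil' ? Math.ceil(raw) : Math.floor(raw);
}
```

The two rounding types only differ when the stride does not evenly divide the padded extent left after placing the first window. For a global pooling operation, where the window covers the whole dimension, the result is 1.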

### reduction operations ### {#api-mlgraphbuilder-reduce}
Reduce the input along the dimensions given in *axes*.
<script type=idl>
dictionary MLReduceOptions {
  sequence<long> axes;
  boolean keepDimensions = false;
};

partial interface MLGraphBuilder {
  MLOperand reduceL1(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceL2(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceLogSumExp(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMax(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMean(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceMin(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceProduct(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSum(MLOperand input, optional MLReduceOptions options = {});
  MLOperand reduceSumSquare(MLOperand input, optional MLReduceOptions options = {});
};
</script>
<div algorithm=reduce>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLReduceOptions}}. The optional parameters of the operation.
            - *axes*: a sequence of {{long}}. The dimensions to reduce where -1 means the last dimension.
                If not present, all dimensions are reduced.
            - *keepDimensions*: a {{boolean}}. If true, retains reduced dimensions with size of 1.
                The default value is false.

    **Returns:** an {{MLOperand}}. The reduced output tensor.

    **Reduction types:**
        - *L1*: Compute the <a href="https://mathworld.wolfram.com/L1-Norm.html">L1 norm</a> of all the input values along the axes.
        - *L2*: Compute the <a href="https://mathworld.wolfram.com/L2-Norm.html">L2 norm</a> of all the input values along the axes.
        - *LogSum*: Compute the log value of the sum of all the input values along the axes.
        - *LogSumExp*: Compute the log value of the sum of the exponential of all the input values along the axes.
        - *Max*: Compute the maximum value of all the input values along the axes.
        - *Mean*: Compute the average value of all the input values along the axes.
        - *Min*: Compute the minimum value of all the input values along the axes.
        - *Product*: Compute the product of all the input values along the axes.
        - *Sum*: Compute the sum of all the input values along the axes.
        - *SumSquare*: Compute the sum of the square of all the input values along the axes.
</div>
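
The reduction types and the effect of *axes* and *keepDimensions* on the output shape can be sketched in plain JavaScript. Both `reduceValues` and `reducedShape` are hypothetical helpers. The first operates on the flat list of values that fall along the reduced axes; the second assumes that any negative axis counts back from the last dimension, which generalizes the -1 case described above.

```javascript
// Sketch: apply one reduction type to the values being reduced together.
function reduceValues(type, values) {
  const sum = (v) => v.reduce((acc, x) => acc + x, 0);
  switch (type) {
    case 'L1': return sum(values.map((x) => Math.abs(x)));
    case 'L2': return Math.sqrt(sum(values.map((x) => x * x)));
    case 'LogSum': return Math.log(sum(values));
    case 'LogSumExp': return Math.log(sum(values.map((x) => Math.exp(x))));
    case 'Max': return Math.max(...values);
    case 'Mean': return sum(values) / values.length;
    case 'Min': return Math.min(...values);
    case 'Product': return values.reduce((acc, x) => acc * x, 1);
    case 'Sum': return sum(values);
    case 'SumSquare': return sum(values.map((x) => x * x));
    default: throw new Error(`unknown reduction type: ${type}`);
  }
}

// Sketch: shape of the reduced output tensor.
function reducedShape(shape, { axes, keepDimensions = false } = {}) {
  const rank = shape.length;
  // When axes is not present, all dimensions are reduced.
  const all = shape.map((_, i) => i);
  const reduced = new Set((axes ?? all).map((a) => (a < 0 ? a + rank : a)));
  const out = [];
  shape.forEach((size, i) => {
    if (!reduced.has(i)) out.push(size);
    else if (keepDimensions) out.push(1); // retained with size 1
  });
  return out;
}
```

Keeping the reduced dimensions with size 1 is what lets a reduced result be broadcast back against the original input.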

### relu ### {#api-mlgraphbuilder-relu}
Compute the <a href="https://en.wikipedia.org/wiki/Rectifier_(neural_networks)">rectified linear function</a> of the input tensor.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand relu(MLOperand x);
  MLOperator relu();
};
</script>
<div algorithm=relu>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the relu operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.max(builder.constant(0), x);
    </pre>
    </div>
</div>

### resample2d ### {#api-mlgraphbuilder-resample2d}
Resample the tensor values from the source to the destination spatial dimensions according to the scaling factors.
<script type=idl>
enum MLInterpolationMode {
  "nearest-neighbor",
  "linear"
};

dictionary MLResample2dOptions {
  MLInterpolationMode mode = "nearest-neighbor";
  sequence<float> scales;
  sequence<long> sizes;
  sequence<long> axes;
};

partial interface MLGraphBuilder {
  MLOperand resample2d(MLOperand input, optional MLResample2dOptions options = {});
};
</script>
<div algorithm=resample2d>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input 4-D tensor.
        - *options*: an optional {{MLResample2dOptions}}. The optional parameters of the operation.
            - *mode*: an {{MLInterpolationMode}}. The interpolation algorithm used to fill the output tensor values.
                If not set, it is assumed to be the *Nearest Neighbor* interpolation.
            - *scales*: a sequence of {{float}} of length 2. Each value represents the scaling factor used to scale in each spatial dimension of the input, [scale_height, scale_width]. If not set, the values are assumed to be [1.0, 1.0].
            - *sizes*: a sequence of {{long}} of length 2. The target sizes for each spatial dimension of the input, [size_height, size_width]. When the target sizes are specified, the *options.scales* argument is ignored as the scaling factor values are derived from the target sizes of each spatial dimension of the input.
            - *axes*: a sequence of {{long}} of length 2. The two consecutive dimensions of the input tensor to which the interpolation algorithm applies. The valid values in the sequence are [0, 1], [1, 2] or [2, 3]. When not specified, the sequence is assumed to be [2, 3].

    **Returns:** an {{MLOperand}}. The output 4-D tensor.
</div>
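
How *scales*, *sizes* and *axes* interact can be sketched by computing the output shape. `resample2dOutputShape` is a hypothetical helper, and rounding the scaled size down with `Math.floor` is an assumption of this sketch, since the rounding behavior is not specified above.

```javascript
// Sketch: output shape of resample2d for a 4-D input shape.
// When sizes is given it takes precedence and scales is ignored.
function resample2dOutputShape(inputShape,
                               { scales = [1.0, 1.0], sizes, axes = [2, 3] } = {}) {
  const out = [...inputShape];
  axes.forEach((axis, i) => {
    out[axis] = sizes
      ? sizes[i]
      : Math.floor(inputShape[axis] * scales[i]); // assumption: round down
  });
  return out;
}
```

With the default *axes* of [2, 3] the spatial dimensions are the last two, which matches an *"nchw"* input; passing [1, 2] covers an *"nhwc"* input.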

### reshape ### {#api-mlgraphbuilder-reshape}
Alter the shape of a tensor to a new shape. Reshape does not copy or change the content of the tensor. It just changes the tensor's logical dimensions for the subsequent operations.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand reshape(MLOperand input, sequence<long> newShape);
};
</script>
<div algorithm=reshape>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input tensor.
        - *newShape*: a sequence of {{long}}. The shape of the output tensor.
            The number of elements implied by *newShape* must be the same as the
            number of elements in the input tensor. Only one component of
            *newShape* can be the special value of -1. The size of the dimension
            with the value -1 is computed so that the total size remains
            constant.

    **Returns:** an {{MLOperand}}. The output tensor. The values of the output
    tensor are the same as values of the input tensor. The shape of the output
    tensor is specified by the *newShape* argument.
</div>
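
The handling of the special value -1 in *newShape* can be sketched as follows. `resolveShape` is a hypothetical helper shown for illustration only.

```javascript
// Sketch: resolve a newShape that may contain a single -1 component.
function resolveShape(inputShape, newShape) {
  const total = inputShape.reduce((acc, d) => acc * d, 1);
  const wildcards = newShape.filter((d) => d === -1).length;
  if (wildcards > 1) throw new Error('only one component may be -1');

  // Product of the explicitly specified components.
  const known = newShape.reduce((acc, d) => (d === -1 ? acc : acc * d), 1);
  if (wildcards === 1) {
    if (known === 0 || total % known !== 0) {
      throw new Error('cannot infer the -1 component');
    }
  } else if (known !== total) {
    throw new Error('element count must not change');
  }
  // The -1 component is whatever keeps the total size constant.
  return newShape.map((d) => (d === -1 ? total / known : d));
}
```

For instance, reshaping a tensor of shape [2, 3, 4] with a *newShape* of [4, -1] gives the inferred component 24 / 4 = 6.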

### sigmoid ### {#api-mlgraphbuilder-sigmoid}
Compute the <a href="https://en.wikipedia.org/wiki/Sigmoid_function">sigmoid function</a> of the input tensor. The calculation follows the expression `1 / (exp(-x) + 1)`.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand sigmoid(MLOperand x);
  MLOperator sigmoid();
};
</script>
<div algorithm=sigmoid>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the sigmoid operation.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.div(
              builder.constant(1),
              builder.add(
                builder.exp(builder.neg(x)), 
                builder.constant(1)));
    </pre>
    </div>
</div>

### slice ### {#api-mlgraphbuilder-slice}
Produce a slice of the input tensor.
<script type=idl>
dictionary MLSliceOptions {
  sequence<long> axes;
};

partial interface MLGraphBuilder {
  MLOperand slice(MLOperand input, sequence<long> starts, sequence<long> sizes,
                optional MLSliceOptions options = {});
};
</script>
<div algorithm=slice>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input tensor.
        - *starts*: a sequence of {{long}}. The starting indices to slice of the corresponding axes of the input shape. A negative index value is interpreted as counting back from the end. For example, the value -1 refers to the last element of the given axis.
        - *sizes*: a sequence of {{long}}. The lengths to slice of the corresponding axes of the input shape.
            The length value of -1 selects all the remaining elements from the starting index of the given axis.
        - *options*: an optional {{MLSliceOptions}}. The optional parameters of the operation.
            - *axes*: a sequence of {{long}}. The dimensions of the input shape to which *starts* and *sizes* apply. The values in the sequence are either within the [0, *r*-1] range where *r* is the input tensor rank, or the [*-r*, -1] range where negative values mean counting back from the end of the input shape. When not specified, the sequence is assumed to be [0,1,..*r-1*].  

    **Returns:** an {{MLOperand}}. The output tensor of the same rank as the input tensor with tensor values stripped to the specified starting and ending indices in each dimension.
</div>
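
The interpretation of negative *starts*, a *sizes* value of -1, and *options.axes* can be sketched by resolving them into half-open index ranges, one per input dimension. `sliceRanges` is a hypothetical helper, and leaving the dimensions not listed in *axes* at their full extent is an assumption of this sketch.

```javascript
// Sketch: resolve slice arguments into [begin, end) ranges per dimension.
function sliceRanges(inputShape, starts, sizes, axes) {
  const rank = inputShape.length;
  const targetAxes = axes ?? inputShape.map((_, i) => i);
  // Assumption: dimensions not listed in axes keep their full extent.
  const ranges = inputShape.map((size) => [0, size]);
  targetAxes.forEach((axis, i) => {
    const a = axis < 0 ? axis + rank : axis;                   // negative axis
    const dim = inputShape[a];
    const start = starts[i] < 0 ? starts[i] + dim : starts[i]; // negative start
    const size = sizes[i] === -1 ? dim - start : sizes[i];     // -1: the rest
    ranges[a] = [start, start + size];
  });
  return ranges;
}
```

For a GRU weight tensor of shape [6, 4] with a hidden size of 2, `sliceRanges([6, 4], [2, 0], [2, -1])` selects rows 2 and 3 with all columns, which corresponds to the reset gate weights in the *"zrn"* layout.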

### softmax ### {#api-mlgraphbuilder-softmax}
Compute the [softmax](https://en.wikipedia.org/wiki/Softmax_function) values of
the 2-D input tensor along axis 1.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand softmax(MLOperand x);
};
</script>
<div algorithm=softmax>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input 2-D tensor.

    **Returns:** an {{MLOperand}}. The output 2-D tensor that contains the softmax
    results, of the same shape as the input tensor.

    <div class="note">
    The behavior of this operation can be generically emulated using other
    operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    // This sample uses a well-known implementation trick [1] to compute the
    // exponentials of the distances to the max value, instead of the exponentials
    // of the input values themselves, in order to increase the numerical stability
    // of the result.
    // [1]: https://cs231n.github.io/linear-classify/#softmax
    const max_x = builder.reduceMax(x, { axes: [1], keepDimensions: true });
    const exp_x = builder.exp(builder.sub(x, max_x));
    return builder.div(exp_x, builder.reduceSum(exp_x, { axes: [1], keepDimensions: true }));
    </pre>
    </div>
</div>
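
<div class="example">
As an illustration of the arithmetic only, the following plain JavaScript sketch applies the same max-subtraction trick to a 2-D tensor given as an array of rows. The helper name `softmaxReference` is hypothetical and is not part of this API.

```javascript
// Hypothetical reference helper, not part of the WebNN API.
// Computes softmax along axis 1 of a 2-D tensor given as an array of rows.
function softmaxReference(rows) {
  return rows.map((row) => {
    // Subtract the row maximum before exponentiating for numerical stability.
    const max = Math.max(...row);
    const exps = row.map((value) => Math.exp(value - max));
    const sum = exps.reduce((acc, value) => acc + value, 0);
    return exps.map((value) => value / sum);
  });
}

// Each output row sums to 1, even for large input values.
console.log(softmaxReference([[1, 2, 3], [1000, 1000, 1000]]));
```
</div>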

### softplus ### {#api-mlgraphbuilder-softplus}
Compute the <a href="https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus">softplus function</a> of the input tensor. The calculation follows the expression `ln(1 + exp(steepness * x)) / steepness`.
<script type=idl>
dictionary MLSoftplusOptions {
  float steepness = 1;
};

partial interface MLGraphBuilder {
  MLOperand softplus(MLOperand x, optional MLSoftplusOptions options = {});
  MLOperator softplus(optional MLSoftplusOptions options = {});
};
</script>
<div algorithm=softplus>
    **Arguments:**
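        - *x*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLSoftplusOptions}}. The optional parameters of the operation.
            - *steepness*: a {{float}}. The steepness factor used in the calculation. Defaults to 1.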

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the softplus operation.

    <div class="note">
    The behavior of this operation can be generically emulated using
    other operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.div(
              builder.log(
                builder.add(
                  builder.exp(builder.mul(x, builder.constant(options.steepness))),
                  builder.constant(1))),
              builder.constant(options.steepness));
    </pre>
    </div>
</div>
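
<div class="example">
The expression `ln(1 + exp(steepness * x)) / steepness` can be checked element by element in plain JavaScript. The sketch below uses a hypothetical helper, `softplusReference`, that operates on an ordinary array rather than an {{MLOperand}}.

```javascript
// Hypothetical reference helper, not part of the WebNN API.
// Applies ln(1 + exp(steepness * x)) / steepness to every element.
function softplusReference(values, steepness = 1) {
  return values.map(
      (x) => Math.log(1 + Math.exp(steepness * x)) / steepness);
}

// With the default steepness, softplus(0) is ln(2).
console.log(softplusReference([-1, 0, 1]));
```
</div>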

### softsign ### {#api-mlgraphbuilder-softsign}
Compute the <a href="https://pytorch.org/docs/stable/generated/torch.nn.Softsign.html">softsign function</a> of the input tensor. The calculation follows the expression `x / (1 + |x|)`.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand softsign(MLOperand x);
  MLOperator softsign();
};
</script>
<div algorithm=softsign>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the softsign operation.

    <div class="note">
    The behavior of this operation can be generically emulated using
    other operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.div(x, builder.add(builder.constant(1), builder.abs(x)));
    </pre>
    </div>
</div>
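
<div class="example">
For reference, the expression `x / (1 + |x|)` evaluated over an ordinary JavaScript array looks as follows. The helper `softsignReference` is hypothetical and only illustrates the arithmetic.

```javascript
// Hypothetical reference helper, not part of the WebNN API.
// Applies x / (1 + |x|) to every element.
function softsignReference(values) {
  return values.map((x) => x / (1 + Math.abs(x)));
}

console.log(softsignReference([-3, 0, 1]));
// [-0.75, 0, 0.5]
```
</div>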

### split ### {#api-mlgraphbuilder-split}
Split the input tensor into a number of sub tensors along the given axis.
<script type=idl>
dictionary MLSplitOptions {
  long axis = 0;
};

partial interface MLGraphBuilder {
  sequence<MLOperand> split(MLOperand input,
                          (unsigned long or sequence<unsigned long>) splits,
                          optional MLSplitOptions options = {});
};
</script>
<div algorithm=split>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input tensor.
        - *splits*: an {{unsigned long}} or a sequence of {{unsigned long}}. If an {{unsigned long}}, it specifies the number of output tensors along the axis. The number must evenly divide the dimension size of *input* along *options.axis*. If a sequence of {{unsigned long}}, it specifies the size of each output tensor along *options.axis*. The sum of the sizes must equal the dimension size of *input* along *options.axis*.
        - *options*: an optional {{MLSplitOptions}}. The optional parameters of the operation.
            - *axis*: a {{long}}. The dimension along which to split. Defaults to 0. A negative value is interpreted as counting back from the end.

    **Returns:** a sequence of {{MLOperand}}. The split output tensors. If *splits* is an {{unsigned long}}, the length of the output sequence equals *splits*. The shape of each output tensor is the same as *input* except that the dimension size along *axis* equals the quotient of dividing the dimension size of *input* along *axis* by *splits*. If *splits* is a sequence of {{unsigned long}}, the length of the output sequence equals the length of *splits*. The shape of the i-th output tensor is the same as *input* except along *axis*, where the dimension size is *splits[i]*.

    <div class="note">
    The behavior of this operation can be generically emulated using
    other operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    // This sample shows the case that the splits parameter is an array.
    const outputs = [];
    let start = 0;
    for (const size of splits) {
      outputs.push(builder.slice(input, [start], [size], { axes: [options.axis] }));
      start += size;
    }
    return outputs;
    </pre>
    </div>
</div>
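
<div class="example">
The output shapes described above can be computed without running a graph. The following sketch assumes shapes are plain arrays of positive integers; the helper `splitOutputShapes` is hypothetical and only mirrors the rules stated for *splits* and *axis*.

```javascript
// Hypothetical helper, not part of the WebNN API: returns the shapes of the
// tensors that split() is described to produce for a given input shape.
function splitOutputShapes(inputShape, splits, axis = 0) {
  // A negative axis counts back from the end of the shape.
  const resolvedAxis = axis < 0 ? axis + inputShape.length : axis;
  const axisSize = inputShape[resolvedAxis];
  let sizes;
  if (Array.isArray(splits)) {
    // A sequence gives the size of each output along the axis.
    const total = splits.reduce((acc, size) => acc + size, 0);
    if (total !== axisSize) {
      throw new RangeError('The sum of splits must equal the axis size.');
    }
    sizes = splits;
  } else {
    // A single number gives the count of equally sized outputs.
    if (splits <= 0 || axisSize % splits !== 0) {
      throw new RangeError('splits must evenly divide the axis size.');
    }
    sizes = new Array(splits).fill(axisSize / splits);
  }
  return sizes.map((size) => {
    const shape = inputShape.slice();
    shape[resolvedAxis] = size;
    return shape;
  });
}

console.log(splitOutputShapes([6, 4], 3));        // three tensors of shape [2, 4]
console.log(splitOutputShapes([6, 4], [1, 3], -1)); // shapes [6, 1] and [6, 3]
```
</div>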

### squeeze ### {#api-mlgraphbuilder-squeeze}
Reduce the rank of a tensor by eliminating dimensions with size 1 of the tensor shape. Squeeze only affects the tensor's logical dimensions. It does not copy or change the content in the tensor.
<script type=idl>
dictionary MLSqueezeOptions {
  sequence<long> axes;
};

partial interface MLGraphBuilder {
  MLOperand squeeze(MLOperand input, optional MLSqueezeOptions options = {});
};
</script>
<div algorithm=squeeze>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input tensor.
        - *options*: an optional {{MLSqueezeOptions}}. The optional parameters of the operation.
            - *axes*: a sequence of {{long}}. Indices to the shape dimensions of size 1 to eliminate. When not specified, every shape dimension of size 1 in the tensor is eliminated.

    **Returns:** an {{MLOperand}}. The output tensor of the same or reduced rank with the shape dimensions of size 1 eliminated.
</div>
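
<div class="example">
Because squeeze only changes the logical shape, its effect can be illustrated on a shape array alone. The sketch below uses a hypothetical helper, `squeezeShape`, and assumes that every entry of *axes* is a non-negative index that refers to a dimension of size 1.

```javascript
// Hypothetical helper, not part of the WebNN API: returns the shape that
// results from removing dimensions of size 1.
function squeezeShape(shape, axes) {
  if (axes === undefined) {
    // No axes given: drop every dimension of size 1.
    return shape.filter((size) => size !== 1);
  }
  for (const axis of axes) {
    if (shape[axis] !== 1) {
      throw new RangeError('Only dimensions of size 1 can be squeezed.');
    }
  }
  // Drop only the listed dimensions.
  return shape.filter((_, index) => !axes.includes(index));
}

console.log(squeezeShape([1, 3, 1, 2]));      // [3, 2]
console.log(squeezeShape([1, 3, 1, 2], [0])); // [3, 1, 2]
```
</div>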

### tanh ### {#api-mlgraphbuilder-tanh}
Compute the <a href="https://en.wikipedia.org/wiki/Hyperbolic_functions">hyperbolic tangent function</a> of the input tensor. The calculation follows the expression `(exp(2 * x) - 1) / (exp(2 * x) + 1)`.
<script type=idl>
partial interface MLGraphBuilder {
  MLOperand tanh(MLOperand x);
  MLOperator tanh();
};
</script>
<div algorithm=tanh>
    **Arguments:**
        - *x*: an {{MLOperand}}. The input tensor.

    **Returns:**
        - an {{MLOperand}}. The output tensor of the same shape as *x*.
        - an {{MLOperator}}. The operator representing the tanh operation.

    <div class="note">
    The behavior of this operation can be generically emulated using
    other operations, as follows. However, user agents typically have a more
    efficient implementation for it, so its use is encouraged for
    performance reasons.
    <pre highlight="js">
    return builder.div(
              builder.sub(builder.exp(builder.mul(builder.constant(2), x)), builder.constant(1)),
              builder.add(builder.exp(builder.mul(builder.constant(2), x)), builder.constant(1)));
    </pre>
    </div>
</div>
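
<div class="example">
The expression `(exp(2 * x) - 1) / (exp(2 * x) + 1)` can be compared against the built-in `Math.tanh` in plain JavaScript. The helper `tanhReference` below is hypothetical and evaluates the expression literally.

```javascript
// Hypothetical reference helper, not part of the WebNN API.
// Applies (exp(2x) - 1) / (exp(2x) + 1) to every element.
function tanhReference(values) {
  return values.map((x) => {
    const e = Math.exp(2 * x);
    return (e - 1) / (e + 1);
  });
}

const samples = [-1, 0, 0.5, 2];
console.log(tanhReference(samples));
console.log(samples.map((x) => Math.tanh(x)));
```

Evaluated literally, the expression overflows for large positive inputs because `exp(2 * x)` becomes infinite, which is one reason a dedicated implementation may be preferable.
</div>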

### transpose ### {#api-mlgraphbuilder-transpose}
Permute the dimensions of the input tensor according to the *permutation* argument.
<script type=idl>
dictionary MLTransposeOptions {
  sequence<long> permutation;
};

partial interface MLGraphBuilder {
  MLOperand transpose(MLOperand input, optional MLTransposeOptions options = {});
};
</script>
<div algorithm=transpose>
    **Arguments:**
        - *input*: an {{MLOperand}}. The input N-D tensor.
        - *options*: an optional {{MLTransposeOptions}}. The optional parameters of the operation.
            - *permutation*: a sequence of {{long}} values. The values used to permute the output shape. When it's not specified, it's set to `[N-1...0]`, where `N` is the rank of the input tensor. These default values cause the output to become a transposed tensor of the input. When specified, the number of values in the sequence must be the same as the rank of the input tensor, and the values in the sequence must be within the range from 0 to N-1 with no duplicate values.

    **Returns:** an {{MLOperand}}. The permuted or transposed N-D tensor. 
</div>
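
<div class="example">
The effect of *permutation* on the output shape can be sketched on a shape array. The helper `transposeShape` is hypothetical, and it assumes the common convention in which dimension `i` of the output takes the size of dimension `permutation[i]` of the input.

```javascript
// Hypothetical helper, not part of the WebNN API: returns the output shape
// of a transpose, assuming outputShape[i] = inputShape[permutation[i]].
function transposeShape(inputShape, permutation) {
  const rank = inputShape.length;
  // Default permutation is [N-1, ..., 0].
  const perm = permutation === undefined ?
      Array.from({length: rank}, (_, i) => rank - 1 - i) :
      permutation;
  if (perm.length !== rank) {
    throw new RangeError('permutation length must equal the input rank.');
  }
  const seen = new Set();
  for (const axis of perm) {
    if (!Number.isInteger(axis) || axis < 0 || axis >= rank || seen.has(axis)) {
      throw new RangeError('permutation values must be unique and in [0, N-1].');
    }
    seen.add(axis);
  }
  return perm.map((axis) => inputShape[axis]);
}

console.log(transposeShape([2, 3, 4]));            // [4, 3, 2]
console.log(transposeShape([2, 3, 4], [0, 2, 1])); // [2, 4, 3]
```
</div>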

## MLGraph ## {#api-mlgraph}
The {{MLGraph}} interface represents a compiled computational graph. A compiled graph once constructed is immutable and cannot be subsequently changed.

<script type=idl>
typedef (MLBufferView or WebGLTexture or GPUTexture) MLResource;

dictionary MLInput {
  required MLResource resource;
  required sequence<long> dimensions;
};

typedef record<DOMString, (MLResource or MLInput)> MLNamedInputs;
typedef record<DOMString, MLResource> MLNamedOutputs;

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraph {
  undefined compute(MLNamedInputs inputs, MLNamedOutputs outputs);
};
</script>

{{MLGraph}} has the following internal slots:

<dl dfn-type=attribute dfn-for="MLGraph">
    : <dfn>\[[context]]</dfn> of type {{MLContext}}
    ::
        The context of type {{MLContext}} associated with this {{MLGraph}}.

    : <dfn>\[[inputDescriptors]]</dfn> of type [=record=]&lt;{{DOMString}}, {{MLOperandDescriptor}}&gt;
    ::
        Maps the name of an input {{MLOperand}} to its {{MLOperandDescriptor}} for all input {{MLOperand}}s of this {{MLGraph}}.

    : <dfn>\[[outputNames]]</dfn> of type [=sequence=]&lt;{{DOMString}}&gt;
    ::
        Contains the names of all output {{MLOperand}}s of this {{MLGraph}}.

    : <dfn>\[[implementation]]</dfn>
    ::
        The underlying implementation provided by the User Agent.
</dl>

<dl dfn-type=method dfn-for=MLGraph>
    : <dfn>compute(inputs, outputs)</dfn>
    ::
        Compute the {{MLGraph}} given {{MLNamedInputs}} and {{MLNamedOutputs}}. Return once the compute has completed and the results in {{MLNamedOutputs}} are ready to be consumed.

        <div algorithm=MLGraph.compute>
            **Called on:** {{MLGraph}} |this|.

            **Arguments:**
            <pre class=argumentdef for="MLGraph/compute(inputs, outputs)">
                |inputs|: an {{MLNamedInputs}}. The resources and optional dimensions of inputs for the compute.
                |outputs|: an {{MLNamedOutputs}}. The pre-allocated resources of required outputs for the compute.
            </pre>

            **Returns:** {{undefined}}.

            1. If any of the following requirements are unmet, then throw a {{DataError}} {{DOMException}} and stop.
                <div class=validusage>
                    1. For each |key| -> |value| of |inputs|:
                        1. |this|.{{MLGraph/[[inputDescriptors]]}}[|key|] must exist.
                        1. Let |inputDesc| be |this|.{{MLGraph/[[inputDescriptors]]}}[|key|].
                        1. Let |inputSize| be 1.
                        1. If |value| is an {{MLInput}}, then:
                            1. The length of |value|.{{MLInput/dimensions}} must be the same as the length of |inputDesc|.{{MLOperandDescriptor/dimensions}}.
                            1. Let |i| be 0.
                            1. While true:
                                1. Let |dimension| be |value|.{{MLInput/dimensions}}[|i|].
                                1. |dimension| must be greater than 0.
                                1. If |inputDesc|.{{MLOperandDescriptor/dimensions}}[|i|] is greater than 0, then |dimension| must be equal to |inputDesc|.{{MLOperandDescriptor/dimensions}}[|i|].
                                1. Set |inputSize| to the product of |inputSize| and |dimension|.
                                1. Increment |i| by 1.
                                1. If |i| is equal to the length of |value|.{{MLInput/dimensions}}, then break.
                        1. Else:
                            1. For each |dimension| of |inputDesc|.{{MLOperandDescriptor/dimensions}}:
                                1. The value of |dimension| must be greater than 0.
                                1. Set |inputSize| to the product of |inputSize| and |dimension|.
                        1. If |value| is an {{MLInput}}, then let |resource| be |value|.{{MLInput/resource}}.
                        1. If |value| is an {{MLResource}}, then let |resource| be |value|.
                        1. If |resource| is an {{ArrayBufferView}}, then:
                            1. The kind of |resource| must be compatible with |inputDesc|.{{MLOperandDescriptor/type}} according to [this table](#appendices-mloperandtype-arraybufferview-compatibility).
                            1. The length of |resource| must be the same as |inputSize|.

                    1. For each |key| -> |value| of |outputs|:
                        1. |this|.{{MLGraph/[[outputNames]]}} must contain |key|.
                </div>
            <!-- Compute -->
            1. For each |key| -> |value| of |inputs|:
                1. Let |inputDesc| be |this|.{{MLGraph/[[inputDescriptors]]}}[|key|].
                1. Let |inputTensor| be a new tensor for |this|.{{MLGraph/[[implementation]]}} of data type that is compatible with |inputDesc|.{{MLOperandDescriptor/type}}.
                1. If |value| is an {{MLInput}}, then:
                    1. Set the dimensions of |inputTensor| to |value|.{{MLInput/dimensions}}.
                1. Else:
                    1. Set the dimensions of |inputTensor| to |inputDesc|.{{MLOperandDescriptor/dimensions}}.
                1. If |value| is an {{MLInput}}, then:
                    1. Set the values of |inputTensor| to the values of |value|.{{MLInput/resource}}.
                1. If |value| is an {{MLResource}}, then:
                    1. Set the values of |inputTensor| to the values of |value|.
                1. Set the input of |this|.{{MLGraph/[[implementation]]}} that is associated with |key| to |inputTensor|.
            1. For each |key| -> |value| of |outputs|:
                1. Issue a compute request for output of |this|.{{MLGraph/[[implementation]]}} that is associated with |key|.
                1. Wait for the compute request to be completed.
                1. If there is an error returned by |this|.{{MLGraph/[[implementation]]}}, then:
                    1. Throw an {{OperationError}} {{DOMException}} and stop.
                1. Else:
                    1. Let |outputTensor| be the output tensor returned by |this|.{{MLGraph/[[implementation]]}}.
                    1. If the kind of |value| is not compatible with the value type of |outputTensor|, then throw a {{DataError}} {{DOMException}} and stop.
                    1. Let |outputSize| be 1.
                    1. For each |dimension| of dimensions of |outputTensor|:
                        1. Set |outputSize| to the product of |outputSize| and |dimension|.
                    1. If |outputSize| is greater than the length of |value|, then:
                        1. Throw a {{DataError}} {{DOMException}} and stop.
                    1. Else:
                        1. Set the values of |value| to the values of |outputTensor|.
            1. Return {{undefined}}.

            Issue: Describe the algorithm steps for |this|.{{MLGraph/[[context]]}} created from {{WebGLRenderingContext}} and {{GPUDevice}}.
        </div>
</dl>
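
<div class="example">
The input validation steps above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the normative algorithm: the helper names `dataError` and `validateInputSize` are hypothetical, dimensions are plain arrays, and a generic `Error` named `DataError` stands in for the {{DOMException}}.

```javascript
// Hypothetical stand-in for throwing a DataError DOMException.
function dataError(message) {
  const error = new Error(message);
  error.name = 'DataError';
  return error;
}

// Hypothetical helper: checks the supplied dimensions against the
// descriptor dimensions and returns the expected element count.
function validateInputSize(descDimensions, inputDimensions, resourceLength) {
  let dimensions;
  if (inputDimensions !== undefined) {
    // Dimensions supplied at compute time, as with an MLInput.
    if (inputDimensions.length !== descDimensions.length) {
      throw dataError('Rank mismatch.');
    }
    inputDimensions.forEach((dimension, i) => {
      if (dimension <= 0) throw dataError('Dimensions must be positive.');
      // A positive descriptor dimension is fixed and must match exactly.
      if (descDimensions[i] > 0 && dimension !== descDimensions[i]) {
        throw dataError('Dimension differs from the descriptor.');
      }
    });
    dimensions = inputDimensions;
  } else {
    // No dimensions supplied: the descriptor must be fully specified.
    if (descDimensions.some((dimension) => dimension <= 0)) {
      throw dataError('Dynamic dimensions require explicit dimensions.');
    }
    dimensions = descDimensions;
  }
  const inputSize = dimensions.reduce((acc, d) => acc * d, 1);
  if (resourceLength !== inputSize) {
    throw dataError('Resource length does not match the input size.');
  }
  return inputSize;
}

console.log(validateInputSize([-1, 4], [3, 4], 12)); // 12
```
</div>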

### Examples ### {#compilation-examples}

<div class="example">
The following code showcases the computation with dynamic input dimensions.
<pre highlight="js">
function sizeOfShape(array) {
  return array.reduce(
      (accumulator, currentValue) => accumulator * currentValue);
}

const context = navigator.ml.createContext();

// Create a graph with dynamic shaped inputs.
const builder = new MLGraphBuilder(context);
const descA = {type: 'float32', dimensions: [-1, 4]};
const a = builder.input('a', descA);
const descB = {type: 'float32', dimensions: [4, -1]};
const b = builder.input('b', descB);
const c = builder.matmul(a, b);
const graph = builder.build({'c': c});

function allocateAndCompute(shapeA, shapeB, shapeC) {
  const bufferA = new Float32Array(sizeOfShape(shapeA)).fill(0.5);
  const bufferB = new Float32Array(sizeOfShape(shapeB)).fill(0.5);
  const bufferC = new Float32Array(sizeOfShape(shapeC));

  // Specify the shape of inputs when computing.
  const inputs = {
    'a': {resource: bufferA, dimensions: shapeA},
    'b': {resource: bufferB, dimensions: shapeB},
  };
  const outputs = {'c': bufferC};
  graph.compute(inputs, outputs);
  console.log(&#96;values: ${bufferC}&#96;);
}

allocateAndCompute([3, 4], [4, 3], [3, 3]);
allocateAndCompute([4, 4], [4, 4], [4, 4]);
allocateAndCompute([5, 4], [4, 5], [5, 5]);
</pre>
</div>
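
<div class="example">
To predict the values that the first call above produces, the matrix product can be evaluated in plain JavaScript. The helper `matmulReference` is hypothetical and assumes 2-D tensors stored in row-major order.

```javascript
// Hypothetical reference helper, not part of the WebNN API.
// Multiplies an [m, k] matrix by a [k, n] matrix, both in row-major order.
function matmulReference(a, shapeA, b, shapeB) {
  const [m, k] = shapeA;
  const [k2, n] = shapeB;
  if (k !== k2) {
    throw new RangeError('Inner dimensions must match.');
  }
  const c = new Float32Array(m * n);
  for (let i = 0; i < m; ++i) {
    for (let j = 0; j < n; ++j) {
      let sum = 0;
      for (let p = 0; p < k; ++p) {
        sum += a[i * k + p] * b[p * n + j];
      }
      c[i * n + j] = sum;
    }
  }
  return c;
}

const refA = new Float32Array(3 * 4).fill(0.5);
const refB = new Float32Array(4 * 3).fill(0.5);
console.log('values: ' + matmulReference(refA, [3, 4], refB, [4, 3]));
// values: 1,1,1,1,1,1,1,1,1
```

Each output element is the sum of four products of 0.5 and 0.5, which is 1.
</div>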

<div class="example">
The following code showcases the computation with optional outputs.
<pre highlight="js">
const context = navigator.ml.createContext();

// Build a graph with two outputs.
const builder = new MLGraphBuilder(context);
const descA = {type: 'float32', dimensions: [3, 4]};
const a = builder.input('a', descA);
const descB = {type: 'float32', dimensions: [4, 3]};
const bufferB = new Float32Array(sizeOfShape(descB.dimensions)).fill(0.5);
const b = builder.constant(descB, bufferB);
const descC = {type: 'float32', dimensions: [3, 3]};
const bufferC = new Float32Array(sizeOfShape(descC.dimensions)).fill(1);
const c = builder.constant(descC, bufferC);
const d = builder.matmul(a, b);
const e = builder.add(d, c);
const graph = builder.build({'d': d, 'e': e});

const bufferA = new Float32Array(sizeOfShape(descA.dimensions)).fill(0.5);
const inputs = {'a': bufferA};

// Compute d.
const bufferD = new Float32Array(sizeOfShape([3, 3]));
graph.compute(inputs, {'d': bufferD});
console.log(&#96;values: ${bufferD}&#96;);

// Compute e.
const bufferE = new Float32Array(sizeOfShape([3, 3]));
graph.compute(inputs, {'e': bufferE});
console.log(&#96;values: ${bufferE}&#96;);
</pre>
</div>

Examples {#examples}
=====================

<div class="example">
The following code gets the MLContext object.
<pre highlight="js">
const context = navigator.ml.createContext({powerPreference: 'low-power'});
</pre>
</div>

<div class="example">
The following code builds a graph as:
<pre>
constant1 ---+
             +--- Add ---> intermediateOutput1 ---+
input1    ---+                                    |
                                                  +--- Mul ---> output
constant2 ---+                                    |
             +--- Add ---> intermediateOutput2 ---+
input2    ---+
</pre>
<pre highlight="js">
// Use tensors in 4 dimensions.
const TENSOR_DIMS = [1, 2, 2, 2];
const TENSOR_SIZE = 8;

const builder = new MLGraphBuilder(context);

// Create MLOperandDescriptor object.
const desc = {type: 'float32', dimensions: TENSOR_DIMS};

// constant1 is a constant MLOperand with the value 0.5.
const constantBuffer1 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant1 = builder.constant(desc, constantBuffer1);

// input1 is one of the input MLOperands. Its value will be set before execution.
const input1 = builder.input('input1', desc);

// constant2 is another constant MLOperand with the value 0.5.
const constantBuffer2 = new Float32Array(TENSOR_SIZE).fill(0.5);
const constant2 = builder.constant(desc, constantBuffer2);

// input2 is another input MLOperand. Its value will be set before execution.
const input2 = builder.input('input2', desc);

// intermediateOutput1 is the output of the first Add operation.
const intermediateOutput1 = builder.add(constant1, input1);

// intermediateOutput2 is the output of the second Add operation.
const intermediateOutput2 = builder.add(constant2, input2);

// output is the output MLOperand of the Mul operation.
const output = builder.mul(intermediateOutput1, intermediateOutput2);
</pre>
</div>

<div class="example">
Compile the graph up to the output operand.
<pre highlight="js">
// Compile the constructed graph.
const graph = builder.build({'output': output});
</pre>
</div>

<div class="example">
The following code executes the compiled graph.
<pre highlight="js">
// Setup the input buffers with value 1.
const inputBuffer1 = new Float32Array(TENSOR_SIZE).fill(1);
const inputBuffer2 = new Float32Array(TENSOR_SIZE).fill(1);
const outputBuffer = new Float32Array(TENSOR_SIZE);

// Execute the compiled graph with the specified inputs.
const inputs = {
  'input1': inputBuffer1,
  'input2': inputBuffer2,
};
const outputs = {'output': outputBuffer};
graph.compute(inputs, outputs);

console.log('Output value: ' + outputBuffer);
// Output value: 2.25,2.25,2.25,2.25,2.25,2.25,2.25,2.25
</pre>
</div>
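
<div class="example">
The expected output value above can be cross-checked without the API by evaluating the same element-wise expression, `(constant1 + input1) * (constant2 + input2)`, in plain JavaScript. The variable names in this sketch are illustrative only.

```javascript
// Plain JavaScript cross-check of the graph above; no WebNN API is used.
const SIZE = 8;
const constantValues = new Float32Array(SIZE).fill(0.5);
const inputValues1 = new Float32Array(SIZE).fill(1);
const inputValues2 = new Float32Array(SIZE).fill(1);
const expectedOutput = new Float32Array(SIZE);

for (let i = 0; i < SIZE; ++i) {
  // Both Add operations use a constant of 0.5; Mul combines their results.
  const sum1 = constantValues[i] + inputValues1[i];
  const sum2 = constantValues[i] + inputValues2[i];
  expectedOutput[i] = sum1 * sum2;
}

console.log('Expected value: ' + expectedOutput);
// Expected value: 2.25,2.25,2.25,2.25,2.25,2.25,2.25,2.25
```
</div>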

# Appendices # {#appendices}

## {{MLOperandType}} and {{ArrayBufferView}} compatibility ## {#appendices-mloperandtype-arraybufferview-compatibility}

<table class='data'>
    <thead class=stickyheader>
        <tr>
            <th>{{MLOperandType}}
            <th>{{ArrayBufferView}}
    </thead>
    <tr>
        <td>{{MLOperandType/float32}}
        <td>{{Float32Array}}
    <tr>
        <td>{{MLOperandType/int32}}
        <td>{{Int32Array}}
    <tr>
        <td>{{MLOperandType/uint32}}
        <td>{{Uint32Array}}
    <tr>
        <td>{{MLOperandType/int8}}
        <td>{{Int8Array}}
    <tr>
        <td>{{MLOperandType/uint8}}
        <td>{{Uint8Array}}
</table>
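
<div class="example">
The table above can be expressed as a lookup in script. The names `COMPATIBLE_VIEWS` and `isCompatibleView` in this sketch are hypothetical; `float16` is deliberately left out because its mapping is not defined by the table.

```javascript
// Hypothetical lookup mirroring the compatibility table; not part of the API.
const COMPATIBLE_VIEWS = new Map([
  ['float32', Float32Array],
  ['int32', Int32Array],
  ['uint32', Uint32Array],
  ['int8', Int8Array],
  ['uint8', Uint8Array],
]);

// Returns true when the view's kind matches the operand type per the table.
function isCompatibleView(operandType, view) {
  const constructor = COMPATIBLE_VIEWS.get(operandType);
  return constructor !== undefined && view instanceof constructor;
}

console.log(isCompatibleView('float32', new Float32Array(4))); // true
console.log(isCompatibleView('int8', new Uint8Array(4)));      // false
```
</div>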

Issue(webmachinelearning/webnn#127): clarify the usage of {{ArrayBufferView}} for {{MLOperandType/float16}}.

<h2 id="acknowledgements">Acknowledgements</h2>

This specification follows the concepts of the Android Neural Networks API C
API.

Thanks to Tomoyuki Shimizu, Ningxin Hu, Zhiqiang Yu and Belem Zhang for the use
cases.

Thanks to Nikhil Thorat, Daniel Smilkov, Ganesan Ramalingam, Rafael Cintron and
Benjamin Poulain for their contributions to the API specification.

Thanks to Sangwhan Moon and the W3C Technical Architecture Group for review of this specification for web architecture fit, design consistency and developer ergonomics.

Thanks to the W3C Privacy Interest Group for privacy and security review and feedback.
<pre class="biblio">
{
  "Models": {
      "href": "https://github.com/webmachinelearning/webnn/blob/master/op_compatibility/first_wave_models.md",
      "title": "The first-wave models",
      "authors": ["Machine Learning for the Web Community Group"],
      "date": "2020"
  },
  "numpy-broadcasting-rule": {
    "href": "https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html#general-broadcasting-rules",
    "title": "General Broadcasting Rules of NumPy",
    "authors": ["The SciPy community"],
    "date": "July 2019"
  },
  "SSD": {
    "href": "https://arxiv.org/abs/1512.02325",
    "title": "SSD: Single Shot MultiBox Detector",
    "authors": [
      "Wei Liu",
      "Dragomir Anguelov",
      "Dumitru Erhan",
      "Christian Szegedy",
      "Scott Reed",
      "Cheng-Yang Fu",
      "Alexander C. Berg"
    ],
    "date": "December 2016"
  },
  "YOLO": {
    "href": "https://arxiv.org/abs/1506.02640",
    "title": "You Only Look Once: Unified, Real-Time Object Detection",
    "authors": [
      "Joseph Redmon",
      "Santosh Divvala",
      "Ross Girshick",
      "Ali Farhadi"
    ],
    "date": "May 2016"
  },
  "DeepLabv3+": {
    "href": "https://arxiv.org/abs/1802.02611",
    "title": "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation",
    "authors": [
      "Liang-Chieh Chen",
      "Yukun Zhu",
      "George Papandreou",
      "Florian Schroff",
      "Hartwig Adam"
    ],
    "date": "August 2018"
  },
  "MaskR-CNN": {
    "href": "https://arxiv.org/abs/1703.06870",
    "title": "Mask R-CNN",
    "authors": [
      "Kaiming He",
      "Georgia Gkioxari",
      "Piotr Dollár",
      "Ross Girshick"
    ],
    "date": "January 2018"
  },
  "PoseNet": {
    "href": "https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5",
    "title": "Real-time Human Pose Estimation in the Browser with TensorFlow.js",
    "authors": [
      "Dan Oved"
    ],
    "date": "May 2018"
  },
  "FaceNet": {
    "href": "https://arxiv.org/abs/1503.03832",
    "title": "FaceNet: A Unified Embedding for Face Recognition and Clustering",
    "authors": [
      "Florian Schroff",
      "Dmitry Kalenichenko",
      "James Philbin"
    ],
    "date": "June 2015"
  },
  "FAN": {
    "href": "https://arxiv.org/abs/1703.07332",
    "title": "How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)",
    "authors": [
      "Adrian Bulat",
      "Georgios Tzimiropoulos"
    ],
    "date": "September 2017"
  },
  "ContextualLoss": {
    "href": "https://arxiv.org/abs/1803.02077",
    "title": "The Contextual Loss for Image Transformation with Non-Aligned Data",
    "authors": [
      "Roey Mechrez",
      "Itamar Talmi",
      "Lihi Zelnik-Manor"
    ],
    "date": "July 2018"
  },
  "PairedCycleGAN": {
    "href": "http://openaccess.thecvf.com/content_cvpr_2018/html/Chang_PairedCycleGAN_Asymmetric_Style_CVPR_2018_paper.html",
    "title": "PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup",
    "authors": [
      "Huiwen Chang",
      "Jingwan Lu",
      "Fisher Yu",
      "Adam Finkelstein"
    ],
    "date": "June 2018"
  },
  "SRGAN": {
    "href": "https://arxiv.org/abs/1609.04802",
    "title": "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network",
    "authors": [
      "Christian Ledig",
      "Lucas Theis",
      "Ferenc Huszar",
      "Jose Caballero",
      "Andrew Cunningham",
      "Alejandro Acosta",
      "Andrew Aitken",
      "Alykhan Tejani",
      "Johannes Totz",
      "Zehan Wang",
      "Wenzhe Shi"
    ],
    "date": "May 2017"
  },
  "im2txt": {
    "href": "https://arxiv.org/abs/1609.06647",
    "title": "Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge",
    "authors": [
      "Oriol Vinyals",
      "Alexander Toshev",
      "Samy Bengio",
      "Dumitru Erhan"
    ],
    "date": "September 2016"
  },
  "GNMT": {
    "href": "https://github.com/tensorflow/nmt",
    "title": "Neural Machine Translation (seq2seq) Tutorial",
    "authors": [
      "Minh-Thang Luong",
      "Eugene Brevdo",
      "Rui Zhao"
    ],
    "date": "May 2017"
  },
  "OpenNMT": {
    "href": "https://arxiv.org/abs/1701.02810",
    "title": "OpenNMT: Open-Source Toolkit for Neural Machine Translation",
    "authors": [
      "Guillaume Klein",
      "Yoon Kim",
      "Yuntian Deng",
      "Jean Senellart",
      "Alexander M. Rush"
    ],
    "date": "March 2017"
  },
  "DeepMoji": {
    "href": "https://arxiv.org/abs/1708.00524",
    "title": "Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm",
    "authors": [
      "Bjarke Felbo",
      "Alan Mislove",
      "Anders Søgaard",
      "Iyad Rahwan",
      "Sune Lehmann"
    ],
    "date": "October 2017"
  },
  "Video-Summarization-with-LSTM": {
    "href": "http://www-scf.usc.edu/~zhan355/ke_eccv2016.pdf",
    "title": "Video summarization with long short-term memory",
    "authors": [
      "Ke Zhang",
      "Wei-Lun Chao",
      "Fei Sha",
      "Kristen Grauman"
    ],
    "date": "October 2016"
  },
  "LeakyReLU": {
    "href": "https://pdfs.semanticscholar.org/367f/2c63a6f6a10b3b64b8729d601e69337ee3cc.pdf",
    "title": "Rectifier Nonlinearities Improve Neural Network Acoustic Models",
    "authors": [
      "Andrew L. Maas",
      "Awni Y. Hannun",
      "Andrew Y. Ng"
    ],
    "date": "June 2013"
  },
  "ELU": {
    "href": "https://arxiv.org/abs/1511.07289",
    "title": "Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)",
    "authors": [
      "Djork-Arné Clevert",
      "Thomas Unterthiner",
      "Sepp Hochreiter"
    ],
    "date": "February 2016"
  },
  "RNNoise": {
    "href": "https://github.com/xiph/rnnoise",
    "title": "Recurrent neural network for audio noise reduction",
    "authors": [
      "Jean-Marc Valin"
    ],
    "date": "September 2017"
  },
  "GRU": {
    "href": "https://arxiv.org/pdf/1406.1078.pdf",
    "title": "Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation",
    "authors": [
      "Kyunghyun Cho",
      "Bart van Merrienboer",
      "Caglar Gulcehre",
      "Dzmitry Bahdanau",
      "Fethi Bougares",
      "Holger Schwenk",
      "Yoshua Bengio"
    ],
    "date": "September 2014"
  },
  "Batch-Normalization": {
    "href": "https://arxiv.org/abs/1502.03167",
    "title": "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift",
    "authors": [
      "Sergey Ioffe",
      "Christian Szegedy"
    ],
    "date": "March 2015"
  },
  "Instance-Normalization": {
    "href": "https://arxiv.org/abs/1607.08022",
    "title": "Instance Normalization: The Missing Ingredient for Fast Stylization",
    "authors": [
      "Dmitry Ulyanov",
      "Andrea Vedaldi",
      "Victor Lempitsky"
    ],
    "date": "July 2016"
  },
  "FaceForensics++": {
    "href": "https://github.com/ondyari/FaceForensics",
    "title": "FaceForensics++",
    "authors": [
      "Andreas Rössler",
      "Davide Cozzolino",
      "Luisa Verdoliva",
      "Christian Riess",
      "Justus Thies",
      "Matthias Nießner"
    ],
    "date": "January 2019"
  },
  "MobileNetV3": {
    "href": "https://arxiv.org/pdf/1905.02244",
    "title": "Searching for MobileNetV3",
    "authors": [
      "Andrew Howard",
      "Mark Sandler",
      "Grace Chu",
      "Liang-Chieh Chen",
      "Bo Chen",
      "Mingxing Tan",
      "Weijun Wang",
      "Yukun Zhu",
      "Ruoming Pang",
      "Vijay Vasudevan",
      "Quoc V. Le",
      "Hartwig Adam"
    ],
    "date": "November 2019"
  }
}
</pre>