diff --git a/samples/azure-quantum/resource-estimation/estimation-profiling.ipynb b/samples/azure-quantum/resource-estimation/estimation-profiling.ipynb new file mode 100644 index 000000000000..f041f53b978c --- /dev/null +++ b/samples/azure-quantum/resource-estimation/estimation-profiling.ipynb @@ -0,0 +1,839 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Resource estimation profiling in quantum adders\n", + "\n", + "In this sample, we are implementing two quantum adders and then inspecting them\n", + "using the profiling feature in the Azure Quantum Resource Estimator. The\n", + "profiling feature allows you to analyze how sub-operations in the quantum\n", + "algorithm impact the overall resources.\n", + "\n", + "In particular, you will learn how to use the `profiling.call_stack_depth` and\n", + "`profiling.inline_functions` job parameters to enable profiles in resource\n", + "estimation jobs, and how to use the `call_graph` and `profile` properties on\n", + "resource estimation results to show call graphs and resource estimation\n", + "profiles, respectively.\n", + "\n", + "The notebook is structured as follows. First, we connect to an Azure Quantum\n", + "workspace and import the necessary functions, then we describe the two quantum\n", + "adder implementations. It's okay to skip the implementation part and jump right\n", + "into the next session, which shows how to use the profiling feature to construct\n", + "call graphs and resource estimation profiles for time and space. Finally, we\n", + "show how the detailed profiling information can be used for advanced analyses." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from azure.quantum import Workspace\n", + "from azure.quantum.target.microsoft import MicrosoftEstimator\n", + "import pandas\n", + "import qsharp\n", + "import re" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "workspace = Workspace (\n", + " resource_id = \"\",\n", + " location = \"\"\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Quantum adders\n", + "\n", + "In this section, we implement two Q# operations `RippleCarryAdd` and\n", + "`LookAheadAdd` together with estimation entry points `EstimateRippleCarryAdd`\n", + "and `EstimateLookAheadAdd`, respectively, which each can be provided a\n", + "bit-width. Feel free to skip over the implementation details of these adders,\n", + "right to the next section in which we are using the profiling feature to analyze\n", + "them.\n", + "\n", + "In the realm of quantum computing, the ability to perform addition is crucial\n", + "for a wide range of applications. In this section, we will introduce two\n", + "different quantum adders that can be used to perform addition: the ripple-carry\n", + "adder and the carry-lookahead adder. Given two qubit $n$-bit registers\n", + "$|x\\rangle = |(x_{n-1}\\dots x_1x_0)_2\\rangle$ and $|y\\rangle = |(y_{n-1}\\dots\n", + "y_1y_0)_2\\rangle$, the goal is compute an $n+1$ bit register $|z\\rangle = |x +\n", + "y\\rangle$.\n", + "\n", + "Both implementations will also support the overflowing variant, in which the\n", + "output register has $n$ bits, and we compute $|z\\rangle = |(x + y) \\bmod\n", + "2^n\\rangle$." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Ripple-carry adder\n", + "\n", + "We review the ripple-carry adder described in [Halving the cost of quantum\n", + "addition](https://arxiv.org/abs/1709.06648). We focus on the out-of-place\n", + "variant, in which the sum of input registers $|x\\rangle$ and $|y\\rangle$ is\n", + "computed into a $|0\\rangle$-initialized output register $|z\\rangle$. A\n", + "ripple-carry adder adds the bits of $|x\\rangle$ and $|y\\rangle$ one by one,\n", + "starting from the rightmost bit, and propagates the carry generated by the\n", + "addition to the next bit to the left. This process is repeated until all the\n", + "bits have been added, resulting in a final sum and carry-out. The core\n", + "component of the ripple-carry adder is the full adder, which is a one-bit adder\n", + "that takes a carry input bit, and returns a carry output bit, in addition to the\n", + "sum bit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "internal operation FullAdder(carryIn : Qubit, x : Qubit, y : Qubit, carryOut : Qubit) : Unit is Adj {\n", + " CNOT(x, y);\n", + " CNOT(x, carryIn);\n", + " ApplyAnd(y, carryIn, carryOut);\n", + " CNOT(x, y);\n", + " CNOT(x, carryOut);\n", + " CNOT(y, carryIn);\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With this operation at hand, implementing the ripple carry adder is as simple as\n", + "carrying the carry bit through successive full adder invocations. The two CNOT\n", + "operations at the end handle the case in which $|z\\rangle$ has $n$ bits, in\n", + "which the sum for the most significant bit is not computed by a full adder\n", + "(since the corresponding carry out bit does not exist)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "open Microsoft.Quantum.Arrays;\n", + "open Microsoft.Quantum.Diagnostics;\n", + "\n", + "operation RippleCarryAdd(xs : Qubit[], ys : Qubit[], zs : Qubit[]) : Unit is Adj {\n", + " let n = Length(xs);\n", + " let m = Length(zs);\n", + "\n", + " Fact(Length(ys) == n, \"Registers xs and ys must be of same length\");\n", + " Fact(m == n or m == n + 1, \"Register zs must be same length as xs or one bit larger\");\n", + "\n", + " for k in 0..m - 2 {\n", + " FullAdder(zs[k], xs[k], ys[k], zs[k + 1]);\n", + " }\n", + "\n", + " if n > 0 and n == Length(zs) {\n", + " CNOT(Tail(xs), Tail(zs));\n", + " CNOT(Tail(ys), Tail(zs));\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we are creating an entry point for resource estimation, in which we are\n", + "calling the ripple-carry adder for a given bitwidth, which can be provided as\n", + "an input argument. (We are considering the $n+1$ output register in this\n", + "notebook.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "operation EstimateRippleCarryAdd(bitwidth : Int) : Unit {\n", + " use xs = Qubit[bitwidth];\n", + " use ys = Qubit[bitwidth];\n", + " use zs = Qubit[bitwidth + 1];\n", + "\n", + " RippleCarryAdd(xs, ys, zs);\n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Carry-lookahead adder\n", + "\n", + "In [_A logarithmic-depth quantum carry-lookahead adder_](https://arxiv.org/abs/quant-ph/0406142), the authors present an out-of-place quantum adder implementation based on the [classical carry-lookahead adder method](https://en.wikipedia.org/wiki/Carry-lookahead_adder). To briefly summarize, the idea of carry-lookahead adder is to compute all carry bits $c_i$ based on propagate bits $p_i = x_i \\oplus y_i$ and generate bits $g_i = x_i \\land y_i$, without requiring other carry bits except for the carry-in $c_0$.\n", + "\n", + "For example, the first carry bit can be computed as $c_1 = g_0 \\oplus (p_0 \\land c_0)$, since either it is generated from bits $x_0$ and $y_0$ (when both are 1, and therefore $g_0 = 1$) or the carry bit $c_0$ is propagated (if either $x_0$ or $y_0$ is 1, and therefore $p_0 = 1$). More significant carry bits are computed in a similar way, for example $c_3 = g_2 \\oplus (g_1 \\land p_2) \\oplus (g_0 \\land p_1 \\land p_2) \\oplus (c_0 \\land p_0 \\land p_1 \\land p_2)$. That is, $c_3$ is either generated from bits at index 2, or generated from bits at index 1 _and_ propagated from bits at index 2, and so on.\n", + "\n", + "In order to minimize AND gates, these intermediate products can be computed in a clever way, as well as in logarithmic depth. We are now looking at an implementation of the carry-lookahead adder in Q#, and start by implementing a helper function to compute the number of 1-bits in an integer, also called Hamming weight, using a compact implementation based on a sequence of bitwise manipulations (you can learn more about these constants and why it works [on this article in Wikipedia](https://en.wikipedia.org/wiki/Hamming_weight#Efficient_implementation))." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "function HammingWeight(n : Int) : Int {\n", + " mutable i = n - ((n >>> 1) &&& 0x5555555555555555);\n", + " set i = (i &&& 0x3333333333333333) + ((i >>> 2) &&& 0x3333333333333333);\n", + " return (((i + (i >>> 4)) &&& 0xF0F0F0F0F0F0F0F) * 0x101010101010101) >>> 56;\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we are going to implement internal routines that compute the generalized propagate bits, generalized generate, as well as the carry bits from them. These are called `PRounds`, `GRounds`, and `CRounds`, and descriptions of their implementations can be found in the paper above." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "open Microsoft.Quantum.Arrays;\n", + "open Microsoft.Quantum.Convert;\n", + "open Microsoft.Quantum.Math;\n", + "\n", + "internal operation PRounds(pWorkspace : Qubit[][]) : Unit is Adj {\n", + " for ws in Windows(2, pWorkspace) {\n", + " let (current, next) = (Rest(ws[0]), ws[1]);\n", + "\n", + " for (m, target) in Enumerated(next) {\n", + " ApplyAnd(current[2 * m], current[2 * m + 1], target);\n", + " }\n", + " }\n", + "}\n", + "\n", + "internal operation GRounds(pWorkspace : Qubit[][], gs : Qubit[]) : Unit is Adj {\n", + " let numRounds = Length(pWorkspace);\n", + " let n = Length(gs);\n", + "\n", + " for t in 1..numRounds {\n", + " let length = Floor(IntAsDouble(n) / IntAsDouble(2^t)) - 1;\n", + " let ps = pWorkspace[t - 1][0..2...];\n", + "\n", + " for m in 0..length {\n", + " CCNOT(gs[2^t * m + 2^(t - 1) - 1], ps[m], gs[2^t * m + 2^t - 1]);\n", + " }\n", + " }\n", + "}\n", + "\n", + "internal operation CRounds(pWorkspace : Qubit[][], gs : Qubit[]) : Unit is Adj {\n", + " let n = Length(gs);\n", + "\n", + " let start = Floor(Lg(IntAsDouble(2 * n) / 3.0));\n", + " for t in start..-1..1 {\n", + " let length = Floor(IntAsDouble(n - 2^(t - 1)) / IntAsDouble(2^t));\n", + " let ps = pWorkspace[t - 1][1..2...];\n", + "\n", + " for m in 1..length {\n", + " CCNOT(gs[2^t * m - 1], ps[m - 1], gs[2^t * m + 2^(t - 1) - 1]);\n", + " }\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With these operations we can build an operation that computes the carry bits\n", + "from the initial propagate and generate bits. Note that the generalized\n", + "propagate bits are computed out-of-place into some helper qubits `qs`, whereas\n", + "the generalized generate and carry bits are computed in-place into the register\n", + "`gs`, which contains the initial generate bits." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "internal operation ComputeCarries(ps : Qubit[], gs : Qubit[]) : Unit is Adj {\n", + " let n = Length(gs);\n", + "\n", + " let numRounds = Floor(Lg(IntAsDouble(n)));\n", + " use qs = Qubit[n - HammingWeight(n) - numRounds];\n", + "\n", + " let registerPartition = MappedOverRange(t -> Floor(IntAsDouble(n) / IntAsDouble(2^t)) - 1, 1..numRounds - 1);\n", + " let pWorkspace = [ps] + Partitioned(registerPartition, qs);\n", + "\n", + " within {\n", + " PRounds(pWorkspace);\n", + " } apply {\n", + " GRounds(pWorkspace, gs);\n", + " CRounds(pWorkspace, gs);\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we are all set up to implement the carry-lookahead adder, by first\n", + "computing the initial propagate and generate bits, then computing the carry\n", + "bits, and finally computing the output bits using the sums (initial propagate\n", + "bits) together with the carry bits. Note that this implementation supports both\n", + "an variant where the output register is 1 bit larger and does not overflow, as\n", + "well as a variant in which the sum is computed modulo $2^n$. The latter uses\n", + "the former by using a special treatment of the most-significant bits of the\n", + "input registers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "open Microsoft.Quantum.Arrays;\n", + "open Microsoft.Quantum.Canon;\n", + "open Microsoft.Quantum.Diagnostics;\n", + "\n", + "operation LookAheadAdd(xs : Qubit[], ys : Qubit[], zs : Qubit[]) : Unit is Adj {\n", + " let n = Length(xs);\n", + " let m = Length(zs);\n", + "\n", + " Fact(Length(ys) == n, \"Registers xs and ys must be of same length\");\n", + " Fact(m == n or m == n + 1, \"Register zs must be same length as xs or one bit larger\");\n", + "\n", + " if m == n + 1 { // with carry-out\n", + " // compute initial generate values\n", + " for k in 0..n - 1 {\n", + " ApplyAnd(xs[k], ys[k], zs[k + 1]);\n", + " }\n", + "\n", + " within {\n", + " // compute initial propagate values\n", + " ApplyToEachA(CNOT, Zipped(xs, ys));\n", + " } apply {\n", + " if n > 1 {\n", + " ComputeCarries(Rest(ys), Rest(zs));\n", + " }\n", + "\n", + " // compute sum into carries\n", + " for k in 0..n - 1 {\n", + " CNOT(ys[k], zs[k]);\n", + " }\n", + " }\n", + " } else { // without carry-out\n", + " LookAheadAdd(Most(xs), Most(ys), zs);\n", + " CNOT(Tail(xs), Tail(zs));\n", + " CNOT(Tail(ys), Tail(zs));\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we are creating an entry point for resource estimation, in which we are\n", + "calling the carry-lookahead adder for a given bitwidth, which can be provided as\n", + "an input argument." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "operation EstimateLookAheadAdd(bitwidth : Int) : Unit {\n", + " use xs = Qubit[bitwidth];\n", + " use ys = Qubit[bitwidth];\n", + " use zs = Qubit[bitwidth + 1];\n", + "\n", + " LookAheadAdd(xs, ys, zs);\n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Profiling\n", + "\n", + "With these two adder implementations, we are now ready for some profiling. The\n", + "profiling feature in Azure Quantum Resource Estimator will create a resource\n", + "estimation profile that will show how sub operations in the program (e.g.,\n", + "`FullAdder` inside `RippleCarryAdd`) contribute to the overall costs.\n", + "\n", + "There are two important concepts to review to best understand the outputs of the\n", + "profiling feature, a *call graph* and a *call tree*. The *call graph* is static\n", + "representation of the quantum program which informs which operations call which\n", + "other operations. For example, `RippleCarryAdder` calls `FullAdder`, but both\n", + "of these operations call `CNOT`. The call graph contains a node for each\n", + "operation and a directed edge for each calling relation. The call graph may\n", + "contain cycles, e.g., in the case of recursive operations.\n", + "\n", + "In contrast, a call tree is a dynamic representation of the program execution in\n", + "which there are no cycles and for each node there is a clear path from the root\n", + "node. For example, distinguishes the calls to `CCNOT` from `GRounds` and\n", + "`CRounds` within the `ComputeCarries` operation in the carry-lookahead adder.\n", + "\n", + "But let's start looking at concrete examples, and create a resource estimator\n", + "instance together with a parameter object to configure the bitwidth argument and\n", + "enable the profiling feature." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "estimator = MicrosoftEstimator(workspace)\n", + "\n", + "params = estimator.make_params()\n", + "params.arguments[\"bitwidth\"] = 32\n", + "params.profiling.call_stack_depth = 10" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We enable the profiling feature by setting the `call_stack_depth` variable in\n", + "the `profiling` group to a number that indicates the maximum level up to which\n", + "sub operations are tracked. More precisely, the entry point operation is at\n", + "level 0. Any operation called from the entry point is at level 1, any operation\n", + "therein at 2, and so on. The call stack depth is setting a maximum value to an\n", + "operation's level in the call stack for which we track resources in the profile.\n", + "\n", + "Next, we submit a resource estimation job for the ripple carry adder by\n", + "providing the Q# operation `EstimateRippleCarryAdd` and the job parameter\n", + "object. We store there the results of the job in the variable `result_rca`,\n", + "where RCA is an abbreviation for ripple-carry adder." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "job = estimator.submit(EstimateRippleCarryAdd, input_params=params)\n", + "result_rca = job.get_results()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can inspect the call graph by calling the `call_graph` property on the result\n", + "object. It displays the call graph with the node corresponding to the entry\n", + "point operation at the top and aligns other operations top-down according to\n", + "their level." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result_rca.call_graph" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that the operation names are mangled, i.e., they contain additional\n", + "information. In this case, this can be a prefix `SNIPPET` or `ENTRYPOINT`,\n", + "which is generated by `qsharp` Python library from the Q# that is integrated\n", + "into the notebook. Some operations are prefixed by their namespace, and\n", + "operations have a suffix to indicate their variant (e.g., `body` for their\n", + "`body` implementation, and `adj` for their `adjoint` implementation).\n", + "\n", + "As described above, we can see that `RippleCarryAdd` calls both the `FullAdder`\n", + "and `CNOT`, and that the `FullAdder` also calls `CNOT` and `ApplyAnd`. In fact,\n", + "the `CNOT` operation is called from 3 other operations.\n", + "\n", + "Next, let's also generate resource estimates for the carry-lookahead\n", + "implementation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "job = estimator.submit(EstimateLookAheadAdd, input_params=params)\n", + "result_cla = job.get_results()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Again, we can look at its call graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result_cla.call_graph" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This call graph is much larger since more operations are involved. We can see\n", + "the `PRounds`, `GRounds`, and `CRounds` operations, and see that `GRounds` and\n", + "`CRounds` call `CCNOT`, but `PRounds` calls `ApplyAnd`. We also see that the\n", + "`adjoint` variant of `PRounds` calls the `adjoint` variant of `ApplyAnd`.\n", + "\n", + "It's possible to obtain a more compact call graph (and resource estimation\n", + "profile) by inlining some functions, e.g., those that just call a different\n", + "function and have no other logic inside. We can use the `inline_functions`\n", + "parameter for that:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "params.profiling.inline_functions = True\n", + "job = estimator.submit(EstimateLookAheadAdd, input_params=params)\n", + "result_cla_inline = job.get_results()\n", + "result_cla_inline.call_graph" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Although the call graph is now smaller, certain details—such as the rounds in\n", + "`ComputeCarries`—have been inlined and are no longer available for analysis.\n", + "Thus, the value for the `inline` parameter is typically chosen based on the\n", + "desired type of analysis. For the purposes of our remaining analysis, we will be\n", + "considering profiles for the adder operations without inlining." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we are inspecting the resource estimation profiles of both results. These\n", + "will provide the sub operations' contributions to runtime, logical depth,\n", + "physical qubits for the algorithm, and logical qubits for the algorithm. Let's\n", + "first have an overview of the total counts for each result:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "breakdown_rca = result_rca['physicalCounts']['breakdown']\n", + "breakdown_cla = result_cla['physicalCounts']['breakdown']\n", + "\n", + "pandas.DataFrame(data = {\n", + " \"Runtime\": [f\"{result_rca['physicalCounts']['runtime']} ns\", f\"{result_cla['physicalCounts']['runtime']} ns\"],\n", + " \"Logical depth\": [breakdown_rca['logicalDepth'], breakdown_cla['logicalDepth']],\n", + " \"Physical qubits\": [breakdown_rca['physicalQubitsForAlgorithm'], breakdown_cla['physicalQubitsForAlgorithm']],\n", + " \"Logical qubits\": [breakdown_rca['algorithmicLogicalQubits'], breakdown_cla['algorithmicLogicalQubits']],\n", + "}, index = [\"Ripple-carry adder\", \"Carry-lookahead adder\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The resource estimation profile is generated as a JSON file that can be read\n", + "using the speedscope interactive online profile viewer. In order to generate\n", + "and download the file call the `profile` property on a result object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result_rca.profile" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After downloading and opening the profile, we first see the runtime profile. For\n", + "this ripple-carry adder, we can readily see that all the runtime cost is caused\n", + "by the And operation inside the full adder. No other operation contributes cost\n", + "the overall runtime. The top most box is the entry point operation and the root\n", + "note of the call tree. In the next layer below are all operations that are\n", + "called from the root node and that contribute to the overall runtime.\n", + "\n", + "