diff --git a/samples/azure-quantum/resource-estimation/estimation-profiling.ipynb b/samples/azure-quantum/resource-estimation/estimation-profiling.ipynb new file mode 100644 index 000000000000..f041f53b978c --- /dev/null +++ b/samples/azure-quantum/resource-estimation/estimation-profiling.ipynb @@ -0,0 +1,839 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Resource estimation profiling in quantum adders\n", + "\n", + "In this sample, we are implementing two quantum adders and then inspecting them\n", + "using the profiling feature in the Azure Quantum Resource Estimator. The\n", + "profiling feature allows you to analyze how sub-operations in the quantum\n", + "algorithm impact the overall resources.\n", + "\n", + "In particular, you will learn how to use the `profiling.call_stack_depth` and\n", + "`profiling.inline_functions` job parameters to enable profiles in resource\n", + "estimation jobs, and how to use the `call_graph` and `profile` properties on\n", + "resource estimation results to show call graphs and resource estimation\n", + "profiles, respectively.\n", + "\n", + "The notebook is structured as follows. First, we connect to an Azure Quantum\n", + "workspace and import the necessary functions, then we describe the two quantum\n", + "adder implementations. It's okay to skip the implementation part and jump right\n", + "into the next section, which shows how to use the profiling feature to construct\n", + "call graphs and resource estimation profiles for time and space. Finally, we\n", + "show how the detailed profiling information can be used for advanced analyses."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from azure.quantum import Workspace\n", + "from azure.quantum.target.microsoft import MicrosoftEstimator\n", + "import pandas\n", + "import qsharp\n", + "import re" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "workspace = Workspace(\n", + " resource_id = \"\",\n", + " location = \"\"\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Quantum adders\n", + "\n", + "In this section, we implement two Q# operations `RippleCarryAdd` and\n", + "`LookAheadAdd` together with estimation entry points `EstimateRippleCarryAdd`\n", + "and `EstimateLookAheadAdd`, respectively, each of which takes a\n", + "bit-width as input. Feel free to skip over the implementation details of these adders,\n", + "right to the next section in which we are using the profiling feature to analyze\n", + "them.\n", + "\n", + "In the realm of quantum computing, the ability to perform addition is crucial\n", + "for a wide range of applications. In this section, we will introduce two\n", + "different quantum adders that can be used to perform addition: the ripple-carry\n", + "adder and the carry-lookahead adder. Given two $n$-qubit registers\n", + "$|x\\rangle = |(x_{n-1}\\dots x_1x_0)_2\\rangle$ and $|y\\rangle = |(y_{n-1}\\dots\n", + "y_1y_0)_2\\rangle$, the goal is to compute an $(n+1)$-qubit register $|z\\rangle = |x +\n", + "y\\rangle$.\n", + "\n", + "Both implementations will also support the overflowing variant, in which the\n", + "output register has $n$ bits, and we compute $|z\\rangle = |(x + y) \\bmod\n", + "2^n\\rangle$." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Ripple-carry adder\n", + "\n", + "We review the ripple-carry adder described in [Halving the cost of quantum\n", + "addition](https://arxiv.org/abs/1709.06648).
We focus on the out-of-place\n", + "variant, in which the sum of input registers $|x\\rangle$ and $|y\\rangle$ is\n", + "computed into a $|0\\rangle$-initialized output register $|z\\rangle$. A\n", + "ripple-carry adder adds the bits of $|x\\rangle$ and $|y\\rangle$ one by one,\n", + "starting from the rightmost bit, and propagates the carry generated by the\n", + "addition to the next bit to the left. This process is repeated until all the\n", + "bits have been added, resulting in a final sum and carry-out. The core\n", + "component of the ripple-carry adder is the full adder, which is a one-bit adder\n", + "that takes a carry input bit, and returns a carry output bit, in addition to the\n", + "sum bit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "internal operation FullAdder(carryIn : Qubit, x : Qubit, y : Qubit, carryOut : Qubit) : Unit is Adj {\n", + " CNOT(x, y);\n", + " CNOT(x, carryIn);\n", + " ApplyAnd(y, carryIn, carryOut);\n", + " CNOT(x, y);\n", + " CNOT(x, carryOut);\n", + " CNOT(y, carryIn);\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With this operation at hand, implementing the ripple carry adder is as simple as\n", + "carrying the carry bit through successive full adder invocations. The two CNOT\n", + "operations at the end handle the case in which $|z\\rangle$ has $n$ bits, in\n", + "which the sum for the most significant bit is not computed by a full adder\n", + "(since the corresponding carry out bit does not exist)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "open Microsoft.Quantum.Arrays;\n", + "open Microsoft.Quantum.Diagnostics;\n", + "\n", + "operation RippleCarryAdd(xs : Qubit[], ys : Qubit[], zs : Qubit[]) : Unit is Adj {\n", + " let n = Length(xs);\n", + " let m = Length(zs);\n", + "\n", + " Fact(Length(ys) == n, \"Registers xs and ys must be of same length\");\n", + " Fact(m == n or m == n + 1, \"Register zs must be same length as xs or one bit larger\");\n", + "\n", + " for k in 0..m - 2 {\n", + " FullAdder(zs[k], xs[k], ys[k], zs[k + 1]);\n", + " }\n", + "\n", + " if n > 0 and n == Length(zs) {\n", + " CNOT(Tail(xs), Tail(zs));\n", + " CNOT(Tail(ys), Tail(zs));\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we are creating an entry point for resource estimation, in which we are\n", + "calling the ripple-carry adder for a given bitwidth, which can be provided as\n", + "an input argument. (We are considering the $n+1$ output register in this\n", + "notebook.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "operation EstimateRippleCarryAdd(bitwidth : Int) : Unit {\n", + " use xs = Qubit[bitwidth];\n", + " use ys = Qubit[bitwidth];\n", + " use zs = Qubit[bitwidth + 1];\n", + "\n", + " RippleCarryAdd(xs, ys, zs);\n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Carry-lookahead adder\n", + "\n", + "In [_A logarithmic-depth quantum carry-lookahead adder_](https://arxiv.org/abs/quant-ph/0406142), the authors present an out-of-place quantum adder implementation based on the [classical carry-lookahead adder method](https://en.wikipedia.org/wiki/Carry-lookahead_adder). 
To briefly summarize, the idea of the carry-lookahead adder is to compute all carry bits $c_i$ based on propagate bits $p_i = x_i \\oplus y_i$ and generate bits $g_i = x_i \\land y_i$, without requiring other carry bits except for the carry-in $c_0$.\n", + "\n", + "For example, the first carry bit can be computed as $c_1 = g_0 \\oplus (p_0 \\land c_0)$, since either it is generated from bits $x_0$ and $y_0$ (when both are 1, and therefore $g_0 = 1$) or the carry bit $c_0$ is propagated (if exactly one of $x_0$ and $y_0$ is 1, and therefore $p_0 = 1$). More significant carry bits are computed in a similar way, for example $c_3 = g_2 \\oplus (g_1 \\land p_2) \\oplus (g_0 \\land p_1 \\land p_2) \\oplus (c_0 \\land p_0 \\land p_1 \\land p_2)$. That is, $c_3$ is either generated from bits at index 2, or generated from bits at index 1 _and_ propagated from bits at index 2, and so on.\n", + "\n", + "In order to minimize AND gates, these intermediate products can be computed in a clever way, and in logarithmic depth. We now look at an implementation of the carry-lookahead adder in Q#, and start by implementing a helper function to compute the number of 1-bits in an integer, also called the Hamming weight, using a compact implementation based on a sequence of bitwise manipulations (you can learn more about these constants and why they work [in this Wikipedia article](https://en.wikipedia.org/wiki/Hamming_weight#Efficient_implementation))."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "function HammingWeight(n : Int) : Int {\n", + " mutable i = n - ((n >>> 1) &&& 0x5555555555555555);\n", + " set i = (i &&& 0x3333333333333333) + ((i >>> 2) &&& 0x3333333333333333);\n", + " return (((i + (i >>> 4)) &&& 0xF0F0F0F0F0F0F0F) * 0x101010101010101) >>> 56;\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we are going to implement internal routines that compute the generalized propagate bits, generalized generate, as well as the carry bits from them. These are called `PRounds`, `GRounds`, and `CRounds`, and descriptions of their implementations can be found in the paper above." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "open Microsoft.Quantum.Arrays;\n", + "open Microsoft.Quantum.Convert;\n", + "open Microsoft.Quantum.Math;\n", + "\n", + "internal operation PRounds(pWorkspace : Qubit[][]) : Unit is Adj {\n", + " for ws in Windows(2, pWorkspace) {\n", + " let (current, next) = (Rest(ws[0]), ws[1]);\n", + "\n", + " for (m, target) in Enumerated(next) {\n", + " ApplyAnd(current[2 * m], current[2 * m + 1], target);\n", + " }\n", + " }\n", + "}\n", + "\n", + "internal operation GRounds(pWorkspace : Qubit[][], gs : Qubit[]) : Unit is Adj {\n", + " let numRounds = Length(pWorkspace);\n", + " let n = Length(gs);\n", + "\n", + " for t in 1..numRounds {\n", + " let length = Floor(IntAsDouble(n) / IntAsDouble(2^t)) - 1;\n", + " let ps = pWorkspace[t - 1][0..2...];\n", + "\n", + " for m in 0..length {\n", + " CCNOT(gs[2^t * m + 2^(t - 1) - 1], ps[m], gs[2^t * m + 2^t - 1]);\n", + " }\n", + " }\n", + "}\n", + "\n", + "internal operation CRounds(pWorkspace : Qubit[][], gs : Qubit[]) : Unit is Adj {\n", + " let n = Length(gs);\n", + "\n", + " let start = Floor(Lg(IntAsDouble(2 * n) / 3.0));\n", + " for t in 
start..-1..1 {\n", + " let length = Floor(IntAsDouble(n - 2^(t - 1)) / IntAsDouble(2^t));\n", + " let ps = pWorkspace[t - 1][1..2...];\n", + "\n", + " for m in 1..length {\n", + " CCNOT(gs[2^t * m - 1], ps[m - 1], gs[2^t * m + 2^(t - 1) - 1]);\n", + " }\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With these operations we can build an operation that computes the carry bits\n", + "from the initial propagate and generate bits. Note that the generalized\n", + "propagate bits are computed out-of-place into some helper qubits `qs`, whereas\n", + "the generalized generate and carry bits are computed in-place into the register\n", + "`gs`, which contains the initial generate bits." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "internal operation ComputeCarries(ps : Qubit[], gs : Qubit[]) : Unit is Adj {\n", + " let n = Length(gs);\n", + "\n", + " let numRounds = Floor(Lg(IntAsDouble(n)));\n", + " use qs = Qubit[n - HammingWeight(n) - numRounds];\n", + "\n", + " let registerPartition = MappedOverRange(t -> Floor(IntAsDouble(n) / IntAsDouble(2^t)) - 1, 1..numRounds - 1);\n", + " let pWorkspace = [ps] + Partitioned(registerPartition, qs);\n", + "\n", + " within {\n", + " PRounds(pWorkspace);\n", + " } apply {\n", + " GRounds(pWorkspace, gs);\n", + " CRounds(pWorkspace, gs);\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we are all set up to implement the carry-lookahead adder, by first\n", + "computing the initial propagate and generate bits, then computing the carry\n", + "bits, and finally computing the output bits using the sums (initial propagate\n", + "bits) together with the carry bits. Note that this implementation supports both\n", + "a variant where the output register is 1 bit larger and does not overflow, as\n", + "well as a variant in which the sum is computed modulo $2^n$.
The latter uses\n", + "the former by using a special treatment of the most-significant bits of the\n", + "input registers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "open Microsoft.Quantum.Arrays;\n", + "open Microsoft.Quantum.Canon;\n", + "open Microsoft.Quantum.Diagnostics;\n", + "\n", + "operation LookAheadAdd(xs : Qubit[], ys : Qubit[], zs : Qubit[]) : Unit is Adj {\n", + " let n = Length(xs);\n", + " let m = Length(zs);\n", + "\n", + " Fact(Length(ys) == n, \"Registers xs and ys must be of same length\");\n", + " Fact(m == n or m == n + 1, \"Register zs must be same length as xs or one bit larger\");\n", + "\n", + " if m == n + 1 { // with carry-out\n", + " // compute initial generate values\n", + " for k in 0..n - 1 {\n", + " ApplyAnd(xs[k], ys[k], zs[k + 1]);\n", + " }\n", + "\n", + " within {\n", + " // compute initial propagate values\n", + " ApplyToEachA(CNOT, Zipped(xs, ys));\n", + " } apply {\n", + " if n > 1 {\n", + " ComputeCarries(Rest(ys), Rest(zs));\n", + " }\n", + "\n", + " // compute sum into carries\n", + " for k in 0..n - 1 {\n", + " CNOT(ys[k], zs[k]);\n", + " }\n", + " }\n", + " } else { // without carry-out\n", + " LookAheadAdd(Most(xs), Most(ys), zs);\n", + " CNOT(Tail(xs), Tail(zs));\n", + " CNOT(Tail(ys), Tail(zs));\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we are creating an entry point for resource estimation, in which we are\n", + "calling the carry-lookahead adder for a given bitwidth, which can be provided as\n", + "an input argument." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%qsharp\n", + "operation EstimateLookAheadAdd(bitwidth : Int) : Unit {\n", + " use xs = Qubit[bitwidth];\n", + " use ys = Qubit[bitwidth];\n", + " use zs = Qubit[bitwidth + 1];\n", + "\n", + " LookAheadAdd(xs, ys, zs);\n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Profiling\n", + "\n", + "With these two adder implementations, we are now ready for some profiling. The\n", + "profiling feature in the Azure Quantum Resource Estimator will create a resource\n", + "estimation profile that will show how sub-operations in the program (e.g.,\n", + "`FullAdder` inside `RippleCarryAdd`) contribute to the overall costs.\n", + "\n", + "There are two important concepts to review to best understand the outputs of the\n", + "profiling feature: a *call graph* and a *call tree*. The *call graph* is a static\n", + "representation of the quantum program that shows which operations call which\n", + "other operations. For example, `RippleCarryAdd` calls `FullAdder`, and both\n", + "of these operations call `CNOT`. The call graph contains a node for each\n", + "operation and a directed edge for each calling relation. The call graph may\n", + "contain cycles, e.g., in the case of recursive operations.\n", + "\n", + "In contrast, a call tree is a dynamic representation of the program execution in\n", + "which there are no cycles and for each node there is a unique path from the root\n", + "node. For example, it distinguishes the calls to `CCNOT` made from `GRounds` from\n", + "those made from `CRounds` within the `ComputeCarries` operation in the carry-lookahead adder.\n", + "\n", + "But let's start looking at concrete examples, and create a resource estimator\n", + "instance together with a parameter object to configure the bitwidth argument and\n", + "enable the profiling feature."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "estimator = MicrosoftEstimator(workspace)\n", + "\n", + "params = estimator.make_params()\n", + "params.arguments[\"bitwidth\"] = 32\n", + "params.profiling.call_stack_depth = 10" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We enable the profiling feature by setting the `call_stack_depth` variable in\n", + "the `profiling` group to a number that indicates the maximum level up to which\n", + "sub-operations are tracked. More precisely, the entry point operation is at\n", + "level 0. Any operation called from the entry point is at level 1, any operation\n", + "called therein is at level 2, and so on. The call stack depth sets the maximum\n", + "level in the call stack up to which we track an operation's resources in the profile.\n", + "\n", + "Next, we submit a resource estimation job for the ripple-carry adder by\n", + "providing the Q# operation `EstimateRippleCarryAdd` and the job parameter\n", + "object. We store the results of the job in the variable `result_rca`,\n", + "where RCA is an abbreviation for ripple-carry adder." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "job = estimator.submit(EstimateRippleCarryAdd, input_params=params)\n", + "result_rca = job.get_results()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can inspect the call graph by accessing the `call_graph` property on the result\n", + "object. It displays the call graph with the node corresponding to the entry\n", + "point operation at the top and aligns other operations top-down according to\n", + "their level."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result_rca.call_graph" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that the operation names are mangled, i.e., they contain additional\n", + "information. In this case, this can be a prefix `SNIPPET` or `ENTRYPOINT`,\n", + "which is generated by the `qsharp` Python library from the Q# code that is integrated\n", + "into the notebook. Some operations are prefixed by their namespace, and\n", + "operations have a suffix to indicate their variant (e.g., `body` for their\n", + "`body` implementation, and `adj` for their `adjoint` implementation).\n", + "\n", + "As described above, we can see that `RippleCarryAdd` calls both `FullAdder`\n", + "and `CNOT`, and that `FullAdder` also calls `CNOT` and `ApplyAnd`. In fact,\n", + "the `CNOT` operation is called from 3 other operations.\n", + "\n", + "Next, let's also generate resource estimates for the carry-lookahead\n", + "implementation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "job = estimator.submit(EstimateLookAheadAdd, input_params=params)\n", + "result_cla = job.get_results()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Again, we can look at its call graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result_cla.call_graph" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This call graph is much larger since more operations are involved. We can see\n", + "the `PRounds`, `GRounds`, and `CRounds` operations, and see that `GRounds` and\n", + "`CRounds` call `CCNOT`, whereas `PRounds` calls `ApplyAnd`.
We also see that the\n", + "`adjoint` variant of `PRounds` calls the `adjoint` variant of `ApplyAnd`.\n", + "\n", + "It's possible to obtain a more compact call graph (and resource estimation\n", + "profile) by inlining some functions, e.g., those that just call a different\n", + "function and have no other logic inside. We can use the `inline_functions`\n", + "parameter for that:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "params.profiling.inline_functions = True\n", + "job = estimator.submit(EstimateLookAheadAdd, input_params=params)\n", + "result_cla_inline = job.get_results()\n", + "result_cla_inline.call_graph" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Although the call graph is now smaller, certain details, such as the rounds in\n", + "`ComputeCarries`, have been inlined and are no longer available for analysis.\n", + "Thus, the value for the `inline_functions` parameter is typically chosen based on the\n", + "desired type of analysis. For the purposes of our remaining analysis, we will be\n", + "considering profiles for the adder operations without inlining." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we are inspecting the resource estimation profiles of both results. These\n", + "will provide the sub-operations' contributions to runtime, logical depth,\n", + "physical qubits for the algorithm, and logical qubits for the algorithm.
Let's\n", + "first get an overview of the total counts for each result:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "breakdown_rca = result_rca['physicalCounts']['breakdown']\n", + "breakdown_cla = result_cla['physicalCounts']['breakdown']\n", + "\n", + "pandas.DataFrame(data = {\n", + " \"Runtime\": [f\"{result_rca['physicalCounts']['runtime']} ns\", f\"{result_cla['physicalCounts']['runtime']} ns\"],\n", + " \"Logical depth\": [breakdown_rca['logicalDepth'], breakdown_cla['logicalDepth']],\n", + " \"Physical qubits\": [breakdown_rca['physicalQubitsForAlgorithm'], breakdown_cla['physicalQubitsForAlgorithm']],\n", + " \"Logical qubits\": [breakdown_rca['algorithmicLogicalQubits'], breakdown_cla['algorithmicLogicalQubits']],\n", + "}, index = [\"Ripple-carry adder\", \"Carry-lookahead adder\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The resource estimation profile is generated as a JSON file that can be read\n", + "using the speedscope interactive online profile viewer. In order to generate\n", + "and download the file, access the `profile` property on a result object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result_rca.profile" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After downloading and opening the profile, we first see the runtime profile. For\n", + "this ripple-carry adder, we can readily see that all the runtime cost is caused\n", + "by the `ApplyAnd` operation inside the full adder. No other operation contributes cost\n", + "to the overall runtime. The topmost box is the entry point operation and the root\n", + "node of the call tree. In the next layer below are all operations that are\n", + "called from the root node and that contribute to the overall runtime.\n", + "\n", + "
\n", + "\n", + "On the top center is a button that displays _Runtime (1/4)_, which we can press\n", + "to look at profiles for the other metrics. If we click on physical algorithmic\n", + "qubits, we get this profile:\n", + "\n", + "
\n", + "\n", + "For the number of qubits we account how many new qubits were allocated by some\n", + "operation and track the maximum number of allocated qubits. In this sample,\n", + "`EstimateRippleCarryAdd` allocated all qubits for the adder and there are no\n", + "additional helper qubits, e.g., in the implementation of `FullAdder`. The entry\n", + "point operation accounts for the additional qubits that are required for the\n", + "padding in the 2D layout on the surface code.\n", + "\n", + "Let's also generate the profile for the carry-lookahead adder." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result_cla.profile" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The runtime profile for the carry-lookahead adder contains more operations:\n", + "\n", + "
\n", + "\n", + "We can see that the AND gates in the `LookAheadAdd` operation and the\n", + "`ComputeCarries` operation contribute to the overall runtime. Inside the\n", + "`ComputeCarries` operation, we can analyze the contribution of the\n", + "sub-operations for the different rounds, and, e.g., note that the `adjoint` variant\n", + "of `PRounds` takes the least time, whereas the other three rounds are of\n", + "similar complexity.\n", + "\n", + "In the qubit profile, we see that some additional helper qubits are allocated in\n", + "the `ComputeCarries` operation to hold the auxiliary $p$ bits, which are computed\n", + "out-of-place and required in the computation of the `GRounds` and `CRounds`.\n", + "\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Advanced analysis\n", + "\n", + "In this notebook's final section, we will be conducting advanced analysis on the\n", + "profiling data. Specifically, we will programmatically access the profiling data\n", + "and examine the impact of `PRounds`, `GRounds`, and `CRounds` on the\n", + "carry-lookahead adder's runtime for increasing bitwidths.\n", + "\n", + "To begin, we will establish parameters for a batching job with bitwidths ranging\n", + "from $2^1 = 2$ to $2^{10} = 1024$, with power-of-2 steps. We will also set the\n", + "call stack depth to 10, as demonstrated in the examples above. It's worth noting\n", + "that while the call stack depth parameter applies to all items in the batching\n", + "parameters, the bitwidth must be specified for each item individually." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "bitwidths = [2**k for k in range(1, 11)]\n", + "params = estimator.make_params(num_items=len(bitwidths))\n", + "\n", + "params.profiling.call_stack_depth = 10\n", + "for i, bitwidth in enumerate(bitwidths):\n", + " params.items[i].arguments['bitwidth'] = bitwidth" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are submitting the job and store its results in `results_all`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "job = estimator.submit(EstimateLookAheadAdd, input_params=params)\n", + "results_all = job.get_results()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The raw profiling data can be accessed in JSON format via the `'profile'` key in\n", + "a resource estimation result. For example, to access the profiling data for the\n", + "first item in `results_all`, we would use `results_all[0]['profile']`. 
The\n", + "format follows a custom scheme for speedscope and is documented in detail\n", + "[here](https://github.com/jlfwong/speedscope/blob/master/src/lib/file-format-spec.ts),\n", + "with the corresponding JSON schema available\n", + "[here](https://www.speedscope.app/file-format-schema.json).\n", + " \n", + "Each node in the tree is assigned a frame with an index, and the profile\n", + "contains samples organized by calling order, with each sample assigned a weight\n", + "(e.g. runtime). To compute the runtime contributed by a given round (e.g.\n", + "`PRounds`), the following Python function can be used: it finds the frame, then\n", + "identifies all samples that contain it, and sums up the corresponding weights." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def rounds_runtime(profile, round_name):\n", + " # Find the index of the first frame whose name contains `round_name` followed by \"body\".\n", + " frame_index = [i for (i, frame) in enumerate(profile['shared']['frames']) if re.match(f'.*{round_name}.+body', frame['name'])][0]\n", + "\n", + " # The runtime profile is the first profile\n", + " runtime_profile = profile['profiles'][0]\n", + "\n", + " # Bind the samples and weights fields of the runtime profile to local variables\n", + " samples = runtime_profile['samples']\n", + " weights = runtime_profile['weights']\n", + "\n", + " # Sum up all the weights that correspond to samples that contain the operation,\n", + " # i.e., that contain the frame_index\n", + " return sum(weight for (sample, weight) in zip(samples, weights) if frame_index in sample)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We now extract the profile for each job result item and use the `rounds_runtime`\n", + "function to obtain the runtime for each round, add it to a data frame together\n", + "with the total runtime, and plot the result."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "entries = []\n", + "for idx in range(len(results_all)):\n", + " bitwidth = bitwidths[idx]\n", + " result = results_all[idx]\n", + " profile = result['profile']\n", + "\n", + " total_runtime = result['physicalCounts']['runtime']\n", + " entries.append([\n", + " rounds_runtime(profile, \"PRounds\"),\n", + " rounds_runtime(profile, \"GRounds\"),\n", + " rounds_runtime(profile, \"CRounds\"),\n", + " total_runtime\n", + " ])\n", + "\n", + "df = pandas.DataFrame(data=entries, index=bitwidths, columns=[\"PRounds\", \"GRounds\", \"CRounds\", \"Total\"])\n", + "ax = df.plot(logx=True, xticks=bitwidths, xlabel=\"Bitwidth\", ylabel=\"Runtime (ns)\")\n", + "# show all xticks\n", + "from matplotlib.text import Text\n", + "ax.set_xticklabels([Text(b, 0, b) for b in bitwidths]);" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note how the total runtime grows much faster than the runtime of the\n", + "rounds. The reason is that we need $O(n)$ AND gates in the preparation part of\n", + "`LookAheadAdd` but only $O(\\log(n))$ AND and CCNOT gates in the `ComputeCarries`\n", + "operation.\n", + "\n", + "Further note that the logical depth of the carry-lookahead adder is also\n", + "logarithmic in $n$, since on the logical level, all AND and CCNOT gates, in both\n", + "the preparation parts and the rounds, can be applied in parallel. However,\n", + "when mapping to surface code operations using Parallel Synthesis Sequential\n", + "Pauli Computation (PSSPC), these operations are sequentialized (see Appendix D\n", + "in [Assessing requirements to scale to practical quantum\n", + "advantage](https://arxiv.org/pdf/2211.07629.pdf))." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Next steps" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Great job!
You now know how to use the profiling feature of the Azure Quantum\n", + "Resource Estimator to analyze how different parts of your program contribute to\n", + "overall resource estimates.\n", + "\n", + "To summarize, you can use the `profiling.call_stack_depth` and\n", + "`profiling.inline_functions` job parameters to enable profiles in resource\n", + "estimation jobs. In the resource estimation results that you receive after\n", + "successful job submission, the `call_graph` and `profile` properties show call\n", + "graphs and resource estimation profiles, respectively.\n", + "\n", + "We encourage you to further explore this feature and try out the following\n", + "ideas:\n", + " \n", + "* Experiment with changing the `call_stack_depth` parameter\n", + "* Investigate call graphs and profiles of recursive programs\n", + "* Generate profiles from programs in other notebooks\n", + "* Perform an advanced profile analysis to compare the number of helper qubits to\n", + " the number of input and output qubits in the carry-lookahead implementation" + ] + } + ], + "metadata": {}, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/samples/azure-quantum/resource-estimation/profile_cla_qubits.png b/samples/azure-quantum/resource-estimation/profile_cla_qubits.png new file mode 100644 index 000000000000..1efb277058bb Binary files /dev/null and b/samples/azure-quantum/resource-estimation/profile_cla_qubits.png differ diff --git a/samples/azure-quantum/resource-estimation/profile_cla_runtime.png b/samples/azure-quantum/resource-estimation/profile_cla_runtime.png new file mode 100644 index 000000000000..2b9d89052670 Binary files /dev/null and b/samples/azure-quantum/resource-estimation/profile_cla_runtime.png differ diff --git a/samples/azure-quantum/resource-estimation/profile_rca_qubits.png b/samples/azure-quantum/resource-estimation/profile_rca_qubits.png new file mode 100644 index 000000000000..ff7f7c078882 Binary files /dev/null and
b/samples/azure-quantum/resource-estimation/profile_rca_qubits.png differ diff --git a/samples/azure-quantum/resource-estimation/profile_rca_runtime.png b/samples/azure-quantum/resource-estimation/profile_rca_runtime.png new file mode 100644 index 000000000000..e5ac29a59100 Binary files /dev/null and b/samples/azure-quantum/resource-estimation/profile_rca_runtime.png differ