Commit

Deploying to gh-pages from @ 7cee6e3 🚀
facebook-github-bot committed Jan 10, 2024
1 parent 9223069 commit 8134fea
Showing 2 changed files with 4 additions and 14 deletions.
@@ -1381,11 +1381,6 @@ <h1>Source code for fbgemm_gpu.split_table_batched_embeddings_ops_training</h1><
<span class="n">indice_weights</span><span class="o">=</span><span class="n">per_sample_weights</span><span class="p">,</span>
<span class="n">feature_requires_grad</span><span class="o">=</span><span class="n">feature_requires_grad</span><span class="p">,</span>
<span class="n">lxu_cache_locations</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">lxu_cache_locations</span><span class="p">,</span>
-<span class="c1"># Pass the local_uvm_cache_stats bc only that information is</span>
-<span class="c1"># relevant for the current iteration</span>
-<span class="n">uvm_cache_stats</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">local_uvm_cache_stats</span>
-<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">gather_uvm_cache_stats</span>
-<span class="k">else</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">output_dtype</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">output_dtype</span><span class="p">,</span>
<span class="n">vbe_metadata</span><span class="o">=</span><span class="n">vbe_metadata</span><span class="p">,</span>
<span class="n">is_experimental</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">is_experimental</span><span class="p">,</span>
@@ -1581,12 +1576,6 @@ <h1>Source code for fbgemm_gpu.split_table_batched_embeddings_ops_training</h1><
<span class="k">if</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">lxu_cache_weights</span><span class="o">.</span><span class="n">numel</span><span class="p">():</span>
<span class="k">return</span>

-<span class="c1"># Clear the local_uvm_cache_stats before the prefetch instead of after</span>
-<span class="c1"># the prefetch step, since it will be used in the CommonArgs in the</span>
-<span class="c1"># forward step</span>
-<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">gather_uvm_cache_stats</span><span class="p">:</span>
-<span class="bp">self</span><span class="o">.</span><span class="n">local_uvm_cache_stats</span><span class="o">.</span><span class="n">zero_</span><span class="p">()</span>

<span class="n">linear_cache_indices</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">ops</span><span class="o">.</span><span class="n">fbgemm</span><span class="o">.</span><span class="n">linearize_cache_indices</span><span class="p">(</span>
<span class="bp">self</span><span class="o">.</span><span class="n">cache_hash_size_cumsum</span><span class="p">,</span>
<span class="n">indices</span><span class="p">,</span>
@@ -1663,11 +1652,12 @@ <h1>Source code for fbgemm_gpu.split_table_batched_embeddings_ops_training</h1><

<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">gather_uvm_cache_stats</span><span class="p">:</span>
<span class="c1"># Accumulate local_uvm_cache_stats (int32) into uvm_cache_stats (int64).</span>
-<span class="c1"># We may want to do this accumulation atomically, but as it&#39;s only</span>
-<span class="c1"># for monitoring, slightly inaccurate result may be acceptable.</span>
+<span class="c1"># We may wanna do this accumulation atomically, but as it&#39;s only for monitoring,</span>
+<span class="c1"># slightly inaccurate result may be acceptable.</span>
<span class="bp">self</span><span class="o">.</span><span class="n">uvm_cache_stats</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">add</span><span class="p">(</span>
<span class="bp">self</span><span class="o">.</span><span class="n">uvm_cache_stats</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">local_uvm_cache_stats</span>
<span class="p">)</span>
+<span class="bp">self</span><span class="o">.</span><span class="n">local_uvm_cache_stats</span><span class="o">.</span><span class="n">zero_</span><span class="p">()</span>

<span class="k">def</span> <span class="nf">_prefetch_tensors_record_stream</span><span class="p">(</span>
<span class="bp">self</span><span class="p">,</span> <span class="n">forward_stream</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">Stream</span>
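The last hunk above folds the per-iteration counters into the running totals and then clears the per-iteration counters. A minimal sketch of that accumulate-then-reset pattern follows. It is not the fbgemm_gpu implementation: it uses the standard-library `array` module in place of torch tensors, and the class `CacheStatsSketch` and its methods `record` and `accumulate_and_reset` are hypothetical names chosen for illustration. Only the attribute names `uvm_cache_stats` and `local_uvm_cache_stats` come from the diff.

```python
from array import array


class CacheStatsSketch:
    """Hypothetical stand-in for the two stats buffers named in the diff."""

    def __init__(self, num_counters: int) -> None:
        # "q" is a signed 64-bit integer: the long-lived running totals.
        self.uvm_cache_stats = array("q", [0] * num_counters)
        # "i" is a C int (32-bit on common platforms): per-iteration counters.
        self.local_uvm_cache_stats = array("i", [0] * num_counters)

    def record(self, index: int, amount: int = 1) -> None:
        # Assumption: stands in for whatever updates the local counters
        # during one iteration.
        self.local_uvm_cache_stats[index] += amount

    def accumulate_and_reset(self) -> None:
        # Add the narrow per-iteration counters into the wide totals.
        # Like the original, this is not atomic; the source comment accepts
        # slight inaccuracy because the values are only used for monitoring.
        for i, value in enumerate(self.local_uvm_cache_stats):
            self.uvm_cache_stats[i] += value
        # Clear the per-iteration counters so the next iteration starts at zero.
        for i in range(len(self.local_uvm_cache_stats)):
            self.local_uvm_cache_stats[i] = 0


stats = CacheStatsSketch(num_counters=3)
stats.record(0, 5)
stats.record(2)
stats.accumulate_and_reset()
```

Resetting immediately after the accumulation, as in this sketch, keeps the per-iteration buffer from being counted twice if the accumulation runs again before new values are recorded.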
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.
