[Feature] UnbatchedTensor #1170

Merged
merged 4 commits into from
Jan 9, 2025

Conversation

vmoens (Contributor) commented Jan 9, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 9, 2025
ghstack-source-id: 982fbef0214e38841dcce82c34116ae991473798
Pull Request resolved: #1170
@facebook-github-bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jan 9, 2025

github-actions bot commented Jan 9, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}20$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 53.3500μs 20.5679μs 48.6195 KOps/s 48.6590 KOps/s $\color{#d91a1a}-0.08\%$
test_plain_set_stack_nested 49.3620μs 20.9773μs 47.6706 KOps/s 48.7385 KOps/s $\color{#d91a1a}-2.19\%$
test_plain_set_nested_inplace 66.1530μs 22.5404μs 44.3649 KOps/s 44.8090 KOps/s $\color{#d91a1a}-0.99\%$
test_plain_set_stack_nested_inplace 66.2230μs 22.3847μs 44.6734 KOps/s 44.3933 KOps/s $\color{#35bf28}+0.63\%$
test_items 57.0660μs 4.1750μs 239.5237 KOps/s 244.6243 KOps/s $\color{#d91a1a}-2.09\%$
test_items_nested 0.6230ms 0.3936ms 2.5410 KOps/s 2.5421 KOps/s $\color{#d91a1a}-0.05\%$
test_items_nested_locked 0.6748ms 0.3912ms 2.5565 KOps/s 2.5507 KOps/s $\color{#35bf28}+0.23\%$
test_items_nested_leaf 0.1382ms 76.9874μs 12.9891 KOps/s 13.0276 KOps/s $\color{#d91a1a}-0.30\%$
test_items_stack_nested 0.5805ms 0.3944ms 2.5355 KOps/s 2.5205 KOps/s $\color{#35bf28}+0.60\%$
test_items_stack_nested_leaf 0.1562ms 76.8054μs 13.0199 KOps/s 12.6675 KOps/s $\color{#35bf28}+2.78\%$
test_items_stack_nested_locked 0.7431ms 0.3955ms 2.5285 KOps/s 2.5083 KOps/s $\color{#35bf28}+0.80\%$
test_keys 43.9620μs 3.5328μs 283.0584 KOps/s 280.6814 KOps/s $\color{#35bf28}+0.85\%$
test_keys_nested 0.2866ms 0.1642ms 6.0916 KOps/s 6.2066 KOps/s $\color{#d91a1a}-1.85\%$
test_keys_nested_locked 0.6924ms 0.1699ms 5.8851 KOps/s 5.9376 KOps/s $\color{#d91a1a}-0.88\%$
test_keys_nested_leaf 0.2551ms 0.1434ms 6.9742 KOps/s 7.0904 KOps/s $\color{#d91a1a}-1.64\%$
test_keys_stack_nested 0.2648ms 0.1638ms 6.1062 KOps/s 5.9645 KOps/s $\color{#35bf28}+2.38\%$
test_keys_stack_nested_leaf 0.2252ms 0.1436ms 6.9626 KOps/s 7.1610 KOps/s $\color{#d91a1a}-2.77\%$
test_keys_stack_nested_locked 0.2604ms 0.1702ms 5.8759 KOps/s 5.8804 KOps/s $\color{#d91a1a}-0.08\%$
test_values 7.2536μs 1.0332μs 967.8897 KOps/s 867.8004 KOps/s $\textbf{\color{#35bf28}+11.53\%}$
test_values_nested 0.1206ms 62.3655μs 16.0345 KOps/s 16.3538 KOps/s $\color{#d91a1a}-1.95\%$
test_values_nested_locked 0.1363ms 62.2160μs 16.0730 KOps/s 16.4133 KOps/s $\color{#d91a1a}-2.07\%$
test_values_nested_leaf 0.1253ms 70.8958μs 14.1052 KOps/s 13.9567 KOps/s $\color{#35bf28}+1.06\%$
test_values_stack_nested 0.1399ms 61.7020μs 16.2069 KOps/s 15.9368 KOps/s $\color{#35bf28}+1.70\%$
test_values_stack_nested_leaf 0.1708ms 70.6482μs 14.1546 KOps/s 14.0110 KOps/s $\color{#35bf28}+1.03\%$
test_values_stack_nested_locked 0.1296ms 61.8814μs 16.1599 KOps/s 15.8828 KOps/s $\color{#35bf28}+1.74\%$
test_membership 61.6250μs 0.8934μs 1.1193 MOps/s 1.1797 MOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_membership_nested 34.8750μs 2.9366μs 340.5331 KOps/s 352.8445 KOps/s $\color{#d91a1a}-3.49\%$
test_membership_nested_leaf 28.4130μs 2.9306μs 341.2244 KOps/s 351.5709 KOps/s $\color{#d91a1a}-2.94\%$
test_membership_stacked_nested 47.3480μs 2.9280μs 341.5308 KOps/s 351.6117 KOps/s $\color{#d91a1a}-2.87\%$
test_membership_stacked_nested_leaf 28.1930μs 2.9121μs 343.3942 KOps/s 342.5905 KOps/s $\color{#35bf28}+0.23\%$
test_membership_nested_last 31.0180μs 4.4214μs 226.1727 KOps/s 235.8693 KOps/s $\color{#d91a1a}-4.11\%$
test_membership_nested_leaf_last 45.1450μs 4.4435μs 225.0501 KOps/s 235.9800 KOps/s $\color{#d91a1a}-4.63\%$
test_membership_stacked_nested_last 28.0430μs 4.3594μs 229.3895 KOps/s 228.9228 KOps/s $\color{#35bf28}+0.20\%$
test_membership_stacked_nested_leaf_last 51.3560μs 4.4027μs 227.1329 KOps/s 233.4638 KOps/s $\color{#d91a1a}-2.71\%$
test_nested_getleaf 52.8290μs 10.3560μs 96.5624 KOps/s 98.7977 KOps/s $\color{#d91a1a}-2.26\%$
test_nested_get 52.0870μs 9.9248μs 100.7575 KOps/s 101.4317 KOps/s $\color{#d91a1a}-0.66\%$
test_stacked_getleaf 50.0130μs 10.4114μs 96.0490 KOps/s 95.6232 KOps/s $\color{#35bf28}+0.45\%$
test_stacked_get 39.3230μs 9.8581μs 101.4392 KOps/s 100.3051 KOps/s $\color{#35bf28}+1.13\%$
test_nested_getitemleaf 45.7550μs 11.0635μs 90.3872 KOps/s 90.9636 KOps/s $\color{#d91a1a}-0.63\%$
test_nested_getitem 41.4370μs 10.5195μs 95.0611 KOps/s 97.7230 KOps/s $\color{#d91a1a}-2.72\%$
test_stacked_getitemleaf 44.5130μs 11.0863μs 90.2012 KOps/s 90.8823 KOps/s $\color{#d91a1a}-0.75\%$
test_stacked_getitem 49.5530μs 10.5590μs 94.7055 KOps/s 98.6709 KOps/s $\color{#d91a1a}-4.02\%$
test_lock_nested 1.2966ms 0.4756ms 2.1025 KOps/s 1.6123 KOps/s $\textbf{\color{#35bf28}+30.41\%}$
test_lock_stack_nested 0.8264ms 0.4491ms 2.2266 KOps/s 2.0790 KOps/s $\textbf{\color{#35bf28}+7.10\%}$
test_unlock_nested 1.1007ms 0.3931ms 2.5439 KOps/s 2.3286 KOps/s $\textbf{\color{#35bf28}+9.25\%}$
test_unlock_stack_nested 0.7082ms 0.3634ms 2.7518 KOps/s 2.5794 KOps/s $\textbf{\color{#35bf28}+6.68\%}$
test_flatten_speed 0.1776ms 99.1364μs 10.0871 KOps/s 9.9111 KOps/s $\color{#35bf28}+1.78\%$
test_unflatten_speed 0.9477ms 0.5158ms 1.9387 KOps/s 1.9237 KOps/s $\color{#35bf28}+0.78\%$
test_common_ops 5.1514ms 0.8284ms 1.2072 KOps/s 1.2146 KOps/s $\color{#d91a1a}-0.61\%$
test_creation 25.8780μs 2.4875μs 402.0110 KOps/s 407.4576 KOps/s $\color{#d91a1a}-1.34\%$
test_creation_empty 59.3610μs 12.1824μs 82.0857 KOps/s 85.3025 KOps/s $\color{#d91a1a}-3.77\%$
test_creation_nested_1 59.5610μs 15.3004μs 65.3580 KOps/s 68.0419 KOps/s $\color{#d91a1a}-3.94\%$
test_creation_nested_2 74.2280μs 19.6982μs 50.7661 KOps/s 51.6774 KOps/s $\color{#d91a1a}-1.76\%$
test_clone 49.9430μs 13.5858μs 73.6064 KOps/s 72.8769 KOps/s $\color{#35bf28}+1.00\%$
test_getitem[int] 1.4638ms 13.0965μs 76.3562 KOps/s 77.4859 KOps/s $\color{#d91a1a}-1.46\%$
test_getitem[slice_int] 0.1658ms 25.0657μs 39.8951 KOps/s 40.2189 KOps/s $\color{#d91a1a}-0.81\%$
test_getitem[range] 0.2215ms 50.4593μs 19.8179 KOps/s 20.3555 KOps/s $\color{#d91a1a}-2.64\%$
test_getitem[tuple] 0.1693ms 20.2805μs 49.3084 KOps/s 48.6413 KOps/s $\color{#35bf28}+1.37\%$
test_getitem[list] 0.2109ms 45.6506μs 21.9055 KOps/s 22.4210 KOps/s $\color{#d91a1a}-2.30\%$
test_setitem_dim[int] 67.7960μs 26.3557μs 37.9425 KOps/s 40.9427 KOps/s $\textbf{\color{#d91a1a}-7.33\%}$
test_setitem_dim[slice_int] 0.1187ms 52.4323μs 19.0722 KOps/s 19.3270 KOps/s $\color{#d91a1a}-1.32\%$
test_setitem_dim[range] 0.1200ms 74.6818μs 13.3901 KOps/s 13.8317 KOps/s $\color{#d91a1a}-3.19\%$
test_setitem_dim[tuple] 85.3290μs 41.4222μs 24.1416 KOps/s 25.1835 KOps/s $\color{#d91a1a}-4.14\%$
test_setitem 77.9140μs 21.2929μs 46.9641 KOps/s 47.0667 KOps/s $\color{#d91a1a}-0.22\%$
test_set 68.4680μs 20.6032μs 48.5361 KOps/s 47.7877 KOps/s $\color{#35bf28}+1.57\%$
test_set_shared 4.9751ms 0.1829ms 5.4689 KOps/s 5.4115 KOps/s $\color{#35bf28}+1.06\%$
test_update 0.4708ms 23.4809μs 42.5878 KOps/s 42.9975 KOps/s $\color{#d91a1a}-0.95\%$
test_update_nested 0.3800ms 33.4083μs 29.9326 KOps/s 29.5808 KOps/s $\color{#35bf28}+1.19\%$
test_update__nested 0.5539ms 33.8538μs 29.5388 KOps/s 28.9596 KOps/s $\color{#35bf28}+2.00\%$
test_set_nested 0.4360ms 22.5578μs 44.3306 KOps/s 43.4893 KOps/s $\color{#35bf28}+1.93\%$
test_set_nested_new 91.4000μs 27.0916μs 36.9119 KOps/s 35.8379 KOps/s $\color{#35bf28}+3.00\%$
test_select 0.1015ms 43.0460μs 23.2309 KOps/s 22.4102 KOps/s $\color{#35bf28}+3.66\%$
test_select_nested 0.1301ms 61.9627μs 16.1387 KOps/s 15.8082 KOps/s $\color{#35bf28}+2.09\%$
test_exclude_nested 0.1500ms 79.5998μs 12.5628 KOps/s 12.2784 KOps/s $\color{#35bf28}+2.32\%$
test_empty[True] 0.5412ms 0.4034ms 2.4789 KOps/s 2.4479 KOps/s $\color{#35bf28}+1.27\%$
test_empty[False] 45.0540μs 1.4953μs 668.7813 KOps/s 723.4456 KOps/s $\textbf{\color{#d91a1a}-7.56\%}$
test_unbind_speed 0.4540ms 0.2686ms 3.7235 KOps/s 3.7044 KOps/s $\color{#35bf28}+0.51\%$
test_unbind_speed_stack0 0.4705ms 0.2672ms 3.7432 KOps/s 3.7333 KOps/s $\color{#35bf28}+0.26\%$
test_unbind_speed_stack1 0.1253s 0.8557ms 1.1687 KOps/s 1.4989 KOps/s $\textbf{\color{#d91a1a}-22.03\%}$
test_split 0.1270s 1.8147ms 551.0476 Ops/s 499.2232 Ops/s $\textbf{\color{#35bf28}+10.38\%}$
test_chunk 1.9985ms 1.6077ms 621.9900 Ops/s 499.0749 Ops/s $\textbf{\color{#35bf28}+24.63\%}$
test_consolidate_njt[False-None] 0.1331s 9.7196ms 102.8846 Ops/s 108.8334 Ops/s $\textbf{\color{#d91a1a}-5.47\%}$
test_creation[device0] 0.3445ms 92.3694μs 10.8261 KOps/s 10.4468 KOps/s $\color{#35bf28}+3.63\%$
test_creation_from_tensor 5.0527ms 95.6403μs 10.4558 KOps/s 10.3654 KOps/s $\color{#35bf28}+0.87\%$
test_add_one[memmap_tensor0] 0.8522ms 5.0786μs 196.9065 KOps/s 197.5912 KOps/s $\color{#d91a1a}-0.35\%$
test_contiguous[memmap_tensor0] 36.1970μs 0.5136μs 1.9469 MOps/s 1.9449 MOps/s $\color{#35bf28}+0.10\%$
test_stack[memmap_tensor0] 0.2133ms 3.6089μs 277.0915 KOps/s 286.0283 KOps/s $\color{#d91a1a}-3.12\%$
test_memmaptd_index 1.1609ms 0.2417ms 4.1382 KOps/s 3.9760 KOps/s $\color{#35bf28}+4.08\%$
test_memmaptd_index_astensor 0.6655ms 0.3318ms 3.0141 KOps/s 2.9116 KOps/s $\color{#35bf28}+3.52\%$
test_memmaptd_index_op 1.3498ms 0.6169ms 1.6211 KOps/s 1.5944 KOps/s $\color{#35bf28}+1.68\%$
test_serialize_model 0.1360s 0.1278s 7.8251 Ops/s 7.8968 Ops/s $\color{#d91a1a}-0.91\%$
test_serialize_model_pickle 0.5097s 0.4042s 2.4738 Ops/s 2.4422 Ops/s $\color{#35bf28}+1.30\%$
test_serialize_weights 0.1302s 0.1225s 8.1616 Ops/s 6.7889 Ops/s $\textbf{\color{#35bf28}+20.22\%}$
test_serialize_weights_returnearly 0.2831s 0.1818s 5.4995 Ops/s 6.0130 Ops/s $\textbf{\color{#d91a1a}-8.54\%}$
test_serialize_weights_pickle 0.4837s 0.4070s 2.4572 Ops/s 2.5235 Ops/s $\color{#d91a1a}-2.63\%$
test_serialize_weights_filesystem 0.1599s 0.1530s 6.5366 Ops/s 6.2744 Ops/s $\color{#35bf28}+4.18\%$
test_serialize_model_filesystem 0.1687s 0.1578s 6.3371 Ops/s 5.7230 Ops/s $\textbf{\color{#35bf28}+10.73\%}$
test_reshape_pytree 0.1283ms 26.5931μs 37.6037 KOps/s 37.5755 KOps/s $\color{#35bf28}+0.08\%$
test_reshape_td 84.8080μs 32.5948μs 30.6798 KOps/s 29.8670 KOps/s $\color{#35bf28}+2.72\%$
test_view_pytree 0.1138ms 26.7303μs 37.4107 KOps/s 37.5031 KOps/s $\color{#d91a1a}-0.25\%$
test_view_td 0.1136ms 37.5066μs 26.6619 KOps/s 25.7395 KOps/s $\color{#35bf28}+3.58\%$
test_unbind_pytree 0.1174ms 29.4598μs 33.9445 KOps/s 33.8915 KOps/s $\color{#35bf28}+0.16\%$
test_unbind_td 0.3818ms 39.4114μs 25.3733 KOps/s 25.1346 KOps/s $\color{#35bf28}+0.95\%$
test_split_pytree 85.9500μs 29.7720μs 33.5886 KOps/s 33.5579 KOps/s $\color{#35bf28}+0.09\%$
test_split_td 0.5631ms 45.6722μs 21.8952 KOps/s 21.8816 KOps/s $\color{#35bf28}+0.06\%$
test_add_pytree 92.5220μs 34.8481μs 28.6960 KOps/s 28.0266 KOps/s $\color{#35bf28}+2.39\%$
test_add_td 0.3266ms 57.9810μs 17.2470 KOps/s 17.7635 KOps/s $\color{#d91a1a}-2.91\%$
test_compile_add_one_nested[tensordict-compile] 0.1645ms 63.5835μs 15.7273 KOps/s 15.5877 KOps/s $\color{#35bf28}+0.90\%$
test_compile_add_one_nested[tensordict-eager] 0.4413ms 0.1740ms 5.7477 KOps/s 5.6206 KOps/s $\color{#35bf28}+2.26\%$
test_compile_add_one_nested[pytree-compile] 0.1416ms 45.1231μs 22.1616 KOps/s 21.4329 KOps/s $\color{#35bf28}+3.40\%$
test_compile_add_one_nested[pytree-eager] 0.2394ms 0.1183ms 8.4560 KOps/s 8.3681 KOps/s $\color{#35bf28}+1.05\%$
test_compile_copy_nested[tensordict-compile] 0.1085ms 26.1726μs 38.2079 KOps/s 38.8174 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_copy_nested[tensordict-eager] 0.1367ms 57.9499μs 17.2563 KOps/s 16.7387 KOps/s $\color{#35bf28}+3.09\%$
test_compile_copy_nested[pytree-compile] 0.1831ms 77.1306μs 12.9650 KOps/s 12.7638 KOps/s $\color{#35bf28}+1.58\%$
test_compile_copy_nested[pytree-eager] 0.1215ms 66.9230μs 14.9425 KOps/s 14.8448 KOps/s $\color{#35bf28}+0.66\%$
test_compile_add_one_flat[tensordict-compile] 0.2218ms 0.1053ms 9.4927 KOps/s 9.3134 KOps/s $\color{#35bf28}+1.93\%$
test_compile_add_one_flat[tensordict-eager] 0.4940ms 0.2180ms 4.5877 KOps/s 4.6372 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_add_one_flat[tensorclass-compile] 0.1384ms 46.1396μs 21.6734 KOps/s 21.1178 KOps/s $\color{#35bf28}+2.63\%$
test_compile_add_one_flat[tensorclass-eager] 0.6635ms 66.4241μs 15.0548 KOps/s 14.4377 KOps/s $\color{#35bf28}+4.27\%$
test_compile_add_one_flat[pytree-compile] 0.1981ms 0.1024ms 9.7657 KOps/s 9.6768 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_flat[pytree-eager] 0.4147ms 0.2061ms 4.8525 KOps/s 4.9699 KOps/s $\color{#d91a1a}-2.36\%$
test_compile_add_self_flat[tensordict-eager] 0.3971ms 0.2339ms 4.2752 KOps/s 4.2538 KOps/s $\color{#35bf28}+0.50\%$
test_compile_add_self_flat[tensordict-compile] 0.2760ms 0.1082ms 9.2407 KOps/s 9.2989 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_add_self_flat[tensorclass-eager] 0.2897ms 62.2503μs 16.0642 KOps/s 16.0095 KOps/s $\color{#35bf28}+0.34\%$
test_compile_add_self_flat[tensorclass-compile] 0.1523ms 47.4329μs 21.0824 KOps/s 21.1063 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_self_flat[pytree-eager] 0.3305ms 0.1673ms 5.9790 KOps/s 6.3688 KOps/s $\textbf{\color{#d91a1a}-6.12\%}$
test_compile_add_self_flat[pytree-compile] 0.2175ms 0.1023ms 9.7791 KOps/s 9.5918 KOps/s $\color{#35bf28}+1.95\%$
test_compile_copy_flat[tensordict-compile] 89.1460μs 21.1665μs 47.2446 KOps/s 46.7870 KOps/s $\color{#35bf28}+0.98\%$
test_compile_copy_flat[tensordict-eager] 0.1544ms 66.6294μs 15.0084 KOps/s 15.1751 KOps/s $\color{#d91a1a}-1.10\%$
test_compile_copy_flat[pytree-compile] 0.1730ms 78.7057μs 12.7056 KOps/s 12.6411 KOps/s $\color{#35bf28}+0.51\%$
test_compile_copy_flat[pytree-eager] 0.1423ms 67.9216μs 14.7229 KOps/s 14.7415 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_assign_and_add[tensordict-compile] 0.4395ms 0.2098ms 4.7666 KOps/s 4.7734 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_assign_and_add[tensordict-eager] 2.2891ms 1.3541ms 738.5184 Ops/s 761.3921 Ops/s $\color{#d91a1a}-3.00\%$
test_compile_assign_and_add[pytree-compile] 0.3135ms 0.2038ms 4.9067 KOps/s 4.8570 KOps/s $\color{#35bf28}+1.02\%$
test_compile_assign_and_add[pytree-eager] 0.9670ms 0.7796ms 1.2828 KOps/s 1.2818 KOps/s $\color{#35bf28}+0.08\%$
test_compile_assign_and_add_stack[compile] 0.5957ms 0.4504ms 2.2202 KOps/s 2.1330 KOps/s $\color{#35bf28}+4.09\%$
test_compile_assign_and_add_stack[eager] 4.7833ms 2.7771ms 360.0829 Ops/s 345.8552 Ops/s $\color{#35bf28}+4.11\%$
test_compile_indexing[tensor-tensordict-compile] 0.1264ms 37.3369μs 26.7832 KOps/s 27.0557 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[tensor-tensordict-eager] 0.7948ms 34.2026μs 29.2376 KOps/s 28.9715 KOps/s $\color{#35bf28}+0.92\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1134ms 29.5477μs 33.8436 KOps/s 33.9424 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1196ms 23.0669μs 43.3522 KOps/s 43.4029 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_indexing[tensor-pytree-compile] 0.1426ms 30.2352μs 33.0740 KOps/s 32.6290 KOps/s $\color{#35bf28}+1.36\%$
test_compile_indexing[tensor-pytree-eager] 99.4250μs 23.2460μs 43.0182 KOps/s 42.8631 KOps/s $\color{#35bf28}+0.36\%$
test_compile_indexing[slice-tensordict-compile] 0.1338ms 52.5933μs 19.0138 KOps/s 18.8201 KOps/s $\color{#35bf28}+1.03\%$
test_compile_indexing[slice-tensordict-eager] 0.9647ms 20.8930μs 47.8630 KOps/s 48.1646 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_indexing[slice-tensorclass-compile] 0.1626ms 44.0632μs 22.6947 KOps/s 22.1041 KOps/s $\color{#35bf28}+2.67\%$
test_compile_indexing[slice-tensorclass-eager] 0.1364ms 19.2010μs 52.0806 KOps/s 51.2345 KOps/s $\color{#35bf28}+1.65\%$
test_compile_indexing[slice-pytree-compile] 0.1152ms 44.4662μs 22.4890 KOps/s 21.4747 KOps/s $\color{#35bf28}+4.72\%$
test_compile_indexing[slice-pytree-eager] 72.4350μs 18.6984μs 53.4806 KOps/s 52.2586 KOps/s $\color{#35bf28}+2.34\%$
test_compile_indexing[int-tensordict-compile] 0.1217ms 54.1201μs 18.4774 KOps/s 18.6430 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_indexing[int-tensordict-eager] 1.3089ms 20.6669μs 48.3866 KOps/s 48.5937 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_indexing[int-tensorclass-compile] 0.1160ms 45.0306μs 22.2071 KOps/s 21.5411 KOps/s $\color{#35bf28}+3.09\%$
test_compile_indexing[int-tensorclass-eager] 65.4720μs 18.8078μs 53.1695 KOps/s 52.0524 KOps/s $\color{#35bf28}+2.15\%$
test_compile_indexing[int-pytree-compile] 0.1277ms 44.9413μs 22.2512 KOps/s 21.5218 KOps/s $\color{#35bf28}+3.39\%$
test_compile_indexing[int-pytree-eager] 72.5150μs 18.8581μs 53.0275 KOps/s 52.3619 KOps/s $\color{#35bf28}+1.27\%$
test_mod_add[eager] 0.1334ms 35.4892μs 28.1776 KOps/s 26.7831 KOps/s $\textbf{\color{#35bf28}+5.21\%}$
test_mod_add[compile] 0.1236ms 48.1798μs 20.7556 KOps/s 20.0485 KOps/s $\color{#35bf28}+3.53\%$
test_mod_add[compile-overhead] 0.1171ms 48.1102μs 20.7856 KOps/s 19.7216 KOps/s $\textbf{\color{#35bf28}+5.40\%}$
test_mod_wrap[eager] 0.4154ms 0.2325ms 4.3011 KOps/s 4.2344 KOps/s $\color{#35bf28}+1.57\%$
test_mod_wrap[compile] 0.3272ms 0.2012ms 4.9694 KOps/s 4.6883 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_mod_wrap[compile-overhead] 0.4673ms 0.2049ms 4.8805 KOps/s 4.7393 KOps/s $\color{#35bf28}+2.98\%$
test_mod_wrap_and_backward[eager] 20.6115ms 12.7180ms 78.6290 Ops/s 63.4358 Ops/s $\textbf{\color{#35bf28}+23.95\%}$
test_mod_wrap_and_backward[compile] 15.4701ms 11.9163ms 83.9188 Ops/s 67.4303 Ops/s $\textbf{\color{#35bf28}+24.45\%}$
test_mod_wrap_and_backward[compile-overhead] 13.1598ms 11.4591ms 87.2671 Ops/s 62.5524 Ops/s $\textbf{\color{#35bf28}+39.51\%}$
test_seq_add[eager] 0.2544ms 0.1186ms 8.4283 KOps/s 8.2312 KOps/s $\color{#35bf28}+2.39\%$
test_seq_add[compile] 0.1479ms 63.2920μs 15.7998 KOps/s 15.4217 KOps/s $\color{#35bf28}+2.45\%$
test_seq_add[compile-overhead] 0.1335ms 61.5294μs 16.2524 KOps/s 15.9062 KOps/s $\color{#35bf28}+2.18\%$
test_seq_wrap[eager] 0.8460ms 0.4583ms 2.1818 KOps/s 2.1494 KOps/s $\color{#35bf28}+1.50\%$
test_seq_wrap[compile] 0.4603ms 0.2270ms 4.4054 KOps/s 4.1829 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_seq_wrap[compile-overhead] 0.4018ms 0.2274ms 4.3966 KOps/s 4.3484 KOps/s $\color{#35bf28}+1.11\%$
test_func_call_runtime[False-eager] 0.8139ms 0.5693ms 1.7566 KOps/s 1.8101 KOps/s $\color{#d91a1a}-2.96\%$
test_func_call_runtime[False-compile] 0.8203ms 0.4255ms 2.3504 KOps/s 2.3133 KOps/s $\color{#35bf28}+1.60\%$
test_func_call_runtime[False-compile-overhead] 0.6118ms 0.4264ms 2.3451 KOps/s 2.3104 KOps/s $\color{#35bf28}+1.50\%$
test_func_call_runtime[True-eager] 1.1107ms 0.7806ms 1.2811 KOps/s 1.2660 KOps/s $\color{#35bf28}+1.19\%$
test_func_call_runtime[True-compile] 0.9095ms 0.4702ms 2.1266 KOps/s 2.1134 KOps/s $\color{#35bf28}+0.62\%$
test_func_call_runtime[True-compile-overhead] 0.6376ms 0.4702ms 2.1266 KOps/s 2.1241 KOps/s $\color{#35bf28}+0.12\%$
test_func_call_cm_runtime[False-eager] 0.8438ms 0.5581ms 1.7919 KOps/s 1.7912 KOps/s $\color{#35bf28}+0.04\%$
test_func_call_cm_runtime[False-compile] 0.8682ms 0.4277ms 2.3379 KOps/s 2.3192 KOps/s $\color{#35bf28}+0.81\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5539ms 0.4233ms 2.3624 KOps/s 2.3227 KOps/s $\color{#35bf28}+1.71\%$
test_func_call_cm_runtime[True-eager] 1.9394ms 0.9408ms 1.0629 KOps/s 1.0765 KOps/s $\color{#d91a1a}-1.27\%$
test_func_call_cm_runtime[True-compile] 0.9015ms 0.4903ms 2.0394 KOps/s 2.0200 KOps/s $\color{#35bf28}+0.96\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6566ms 0.4927ms 2.0297 KOps/s 2.0144 KOps/s $\color{#35bf28}+0.76\%$
test_vmap_func_call_cm_runtime[eager] 2.8424ms 2.0582ms 485.8600 Ops/s 467.2149 Ops/s $\color{#35bf28}+3.99\%$
test_vmap_func_call_cm_runtime[compile] 0.7828ms 0.5289ms 1.8907 KOps/s 1.9225 KOps/s $\color{#d91a1a}-1.65\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.1823ms 0.5455ms 1.8331 KOps/s 1.9087 KOps/s $\color{#d91a1a}-3.96\%$
test_distributed 0.4005ms 0.1256ms 7.9644 KOps/s 7.6594 KOps/s $\color{#35bf28}+3.98\%$
test_tdmodule 0.1260ms 27.2754μs 36.6630 KOps/s 37.6900 KOps/s $\color{#d91a1a}-2.72\%$
test_tdmodule_dispatch 0.1084ms 50.4772μs 19.8109 KOps/s 20.2123 KOps/s $\color{#d91a1a}-1.99\%$
test_tdseq 86.1800μs 30.3149μs 32.9871 KOps/s 33.5381 KOps/s $\color{#d91a1a}-1.64\%$
test_tdseq_dispatch 0.1033ms 55.6276μs 17.9767 KOps/s 17.8085 KOps/s $\color{#35bf28}+0.94\%$
test_instantiation_functorch 2.2982ms 1.5886ms 629.4734 Ops/s 642.5866 Ops/s $\color{#d91a1a}-2.04\%$
test_exec_functorch 0.3045ms 0.1816ms 5.5062 KOps/s 5.4410 KOps/s $\color{#35bf28}+1.20\%$
test_exec_functional_call 0.3313ms 0.1769ms 5.6534 KOps/s 5.6079 KOps/s $\color{#35bf28}+0.81\%$
test_exec_td_decorator 0.5975ms 0.2361ms 4.2361 KOps/s 4.2452 KOps/s $\color{#d91a1a}-0.21\%$
test_vmap_mlp_speed_decorator[True-True] 1.1847ms 0.6767ms 1.4778 KOps/s 1.4899 KOps/s $\color{#d91a1a}-0.81\%$
test_vmap_mlp_speed_decorator[True-False] 1.2866ms 0.6860ms 1.4577 KOps/s 1.4911 KOps/s $\color{#d91a1a}-2.24\%$
test_vmap_mlp_speed_decorator[False-True] 0.9396ms 0.5402ms 1.8511 KOps/s 1.8656 KOps/s $\color{#d91a1a}-0.78\%$
test_vmap_mlp_speed_decorator[False-False] 0.8631ms 0.5427ms 1.8426 KOps/s 1.8673 KOps/s $\color{#d91a1a}-1.32\%$
test_to_module_speed[True] 2.4548ms 1.3548ms 738.1003 Ops/s 729.8391 Ops/s $\color{#35bf28}+1.13\%$
test_to_module_speed[False] 2.1937ms 1.3543ms 738.3897 Ops/s 762.0032 Ops/s $\color{#d91a1a}-3.10\%$
test_tc_init 88.9750μs 46.4037μs 21.5500 KOps/s 21.5807 KOps/s $\color{#d91a1a}-0.14\%$
test_tc_init_nested 0.1699ms 93.9529μs 10.6436 KOps/s 10.7354 KOps/s $\color{#d91a1a}-0.86\%$
test_tc_first_layer_tensor 31.1380μs 1.6020μs 624.2298 KOps/s 625.6424 KOps/s $\color{#d91a1a}-0.23\%$
test_tc_first_layer_nontensor 32.9610μs 4.6057μs 217.1206 KOps/s 211.5221 KOps/s $\color{#35bf28}+2.65\%$
test_tc_second_layer_tensor 23.8240μs 2.8577μs 349.9261 KOps/s 348.5311 KOps/s $\color{#35bf28}+0.40\%$
test_tc_second_layer_nontensor 55.2530μs 5.9729μs 167.4230 KOps/s 164.2193 KOps/s $\color{#35bf28}+1.95\%$
test_unbind 0.2749s 16.2281ms 61.6216 Ops/s 50.8657 Ops/s $\textbf{\color{#35bf28}+21.15\%}$
test_full_like 16.8880ms 11.5881ms 86.2951 Ops/s 71.2935 Ops/s $\textbf{\color{#35bf28}+21.04\%}$
test_zeros_like 7.8381ms 4.4552ms 224.4584 Ops/s 244.7646 Ops/s $\textbf{\color{#d91a1a}-8.30\%}$
test_ones_like 7.2662ms 5.0027ms 199.8922 Ops/s 129.4478 Ops/s $\textbf{\color{#35bf28}+54.42\%}$
test_clone 12.1308ms 8.1177ms 123.1879 Ops/s 93.0662 Ops/s $\textbf{\color{#35bf28}+32.37\%}$
test_squeeze 76.7230μs 12.2337μs 81.7413 KOps/s 81.3912 KOps/s $\color{#35bf28}+0.43\%$
test_unsqueeze 0.2018ms 93.6463μs 10.6785 KOps/s 10.7109 KOps/s $\color{#d91a1a}-0.30\%$
test_split 0.5395ms 0.1980ms 5.0517 KOps/s 5.0024 KOps/s $\color{#35bf28}+0.99\%$
test_permute 0.4062ms 0.2022ms 4.9460 KOps/s 4.9657 KOps/s $\color{#d91a1a}-0.40\%$
test_stack 39.6256ms 31.7117ms 31.5341 Ops/s 30.6699 Ops/s $\color{#35bf28}+2.82\%$
test_cat 45.3636ms 32.3152ms 30.9452 Ops/s 32.2363 Ops/s $\color{#d91a1a}-4.01\%$
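
The Change column in the table above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces a few rows under that assumption. The function names and the 5% threshold are hypothetical: the threshold is inferred from which rows are rendered in bold and from the Improved/Worsened totals, and is not stated anywhere in the bot comment.

```python
# Minimal sketch, assuming Change = (Ops - Ops on HEAD) / Ops on HEAD.
# The helper names and the 5% threshold are illustrative assumptions,
# not part of the benchmark bot's actual implementation.

def pct_change(ops: float, ops_head: float) -> float:
    """Relative change in throughput, in percent (positive = faster)."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row the way the summary line seems to count it."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) copied from the CPU table; both values in a
# pair share the same unit, so the ratio is unit-free.
rows = {
    "test_values": (967.8897, 867.8004),   # KOps/s
    "test_membership": (1.1193, 1.1797),   # MOps/s
    "test_items": (239.5237, 244.6243),    # KOps/s
}

for name, (ops, ops_head) in rows.items():
    change = pct_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table shows rounded throughput values, recomputing from the displayed numbers can differ from the reported percentage in the last digit for some rows.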


github-actions bot commented Jan 9, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}35$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.3967ms 11.1975μs 89.3057 KOps/s 78.2651 KOps/s $\textbf{\color{#35bf28}+14.11\%}$
test_plain_set_stack_nested 37.7610μs 11.5410μs 86.6476 KOps/s 77.0228 KOps/s $\textbf{\color{#35bf28}+12.50\%}$
test_plain_set_nested_inplace 0.3989ms 12.4017μs 80.6340 KOps/s 72.2581 KOps/s $\textbf{\color{#35bf28}+11.59\%}$
test_plain_set_stack_nested_inplace 55.0220μs 12.4905μs 80.0607 KOps/s 71.9302 KOps/s $\textbf{\color{#35bf28}+11.30\%}$
test_items 31.5310μs 2.9032μs 344.4514 KOps/s 336.7721 KOps/s $\color{#35bf28}+2.28\%$
test_items_nested 0.7522ms 0.3612ms 2.7687 KOps/s 2.7802 KOps/s $\color{#d91a1a}-0.41\%$
test_items_nested_locked 0.7613ms 0.3644ms 2.7443 KOps/s 2.7529 KOps/s $\color{#d91a1a}-0.31\%$
test_items_nested_leaf 88.6340μs 57.8702μs 17.2800 KOps/s 17.1509 KOps/s $\color{#35bf28}+0.75\%$
test_items_stack_nested 0.7623ms 0.3643ms 2.7448 KOps/s 2.8023 KOps/s $\color{#d91a1a}-2.05\%$
test_items_stack_nested_leaf 0.4678ms 58.9768μs 16.9558 KOps/s 16.5219 KOps/s $\color{#35bf28}+2.63\%$
test_items_stack_nested_locked 0.4991ms 0.3625ms 2.7589 KOps/s 2.7769 KOps/s $\color{#d91a1a}-0.65\%$
test_keys 35.4410μs 3.4949μs 286.1332 KOps/s 291.8173 KOps/s $\color{#d91a1a}-1.95\%$
test_keys_nested 0.1333ms 88.2676μs 11.3292 KOps/s 12.1831 KOps/s $\textbf{\color{#d91a1a}-7.01\%}$
test_keys_nested_locked 0.8146ms 93.6598μs 10.6769 KOps/s 11.3990 KOps/s $\textbf{\color{#d91a1a}-6.33\%}$
test_keys_nested_leaf 0.1272ms 78.3781μs 12.7587 KOps/s 13.7230 KOps/s $\textbf{\color{#d91a1a}-7.03\%}$
test_keys_stack_nested 0.1395ms 89.3664μs 11.1899 KOps/s 12.0634 KOps/s $\textbf{\color{#d91a1a}-7.24\%}$
test_keys_stack_nested_leaf 0.1295ms 79.9151μs 12.5133 KOps/s 13.5795 KOps/s $\textbf{\color{#d91a1a}-7.85\%}$
test_keys_stack_nested_locked 0.1611ms 96.0333μs 10.4131 KOps/s 11.3289 KOps/s $\textbf{\color{#d91a1a}-8.08\%}$
test_values 5.3103μs 0.8535μs 1.1716 MOps/s 1.1776 MOps/s $\color{#d91a1a}-0.51\%$
test_values_nested 67.6220μs 37.2980μs 26.8111 KOps/s 29.0247 KOps/s $\textbf{\color{#d91a1a}-7.63\%}$
test_values_nested_locked 86.6930μs 38.6941μs 25.8438 KOps/s 27.6203 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_values_nested_leaf 69.8530μs 41.4700μs 24.1138 KOps/s 25.3669 KOps/s $\color{#d91a1a}-4.94\%$
test_values_stack_nested 79.5630μs 38.0046μs 26.3126 KOps/s 28.5240 KOps/s $\textbf{\color{#d91a1a}-7.75\%}$
test_values_stack_nested_leaf 0.1143ms 41.8545μs 23.8923 KOps/s 25.4482 KOps/s $\textbf{\color{#d91a1a}-6.11\%}$
test_values_stack_nested_locked 74.3130μs 39.7158μs 25.1789 KOps/s 27.4248 KOps/s $\textbf{\color{#d91a1a}-8.19\%}$
test_membership 1.9226μs 0.5082μs 1.9677 MOps/s 1.9418 MOps/s $\color{#35bf28}+1.34\%$
test_membership_nested 18.2560μs 1.9526μs 512.1270 KOps/s 512.6203 KOps/s $\color{#d91a1a}-0.10\%$
test_membership_nested_leaf 18.8210μs 1.9836μs 504.1352 KOps/s 505.3926 KOps/s $\color{#d91a1a}-0.25\%$
test_membership_stacked_nested 32.2420μs 2.0388μs 490.4956 KOps/s 484.6061 KOps/s $\color{#35bf28}+1.22\%$
test_membership_stacked_nested_leaf 23.5200μs 2.0179μs 495.5566 KOps/s 483.9393 KOps/s $\color{#35bf28}+2.40\%$
test_membership_nested_last 39.5510μs 3.0685μs 325.8872 KOps/s 327.2809 KOps/s $\color{#d91a1a}-0.43\%$
test_membership_nested_leaf_last 37.5320μs 3.0846μs 324.1913 KOps/s 321.2412 KOps/s $\color{#35bf28}+0.92\%$
test_membership_stacked_nested_last 35.7920μs 3.0639μs 326.3836 KOps/s 120.9242 KOps/s $\textbf{\color{#35bf28}+169.91\%}$
test_membership_stacked_nested_leaf_last 42.5220μs 3.0847μs 324.1833 KOps/s 121.9812 KOps/s $\textbf{\color{#35bf28}+165.76\%}$
test_nested_getleaf 36.7910μs 6.0891μs 164.2267 KOps/s 162.3881 KOps/s $\color{#35bf28}+1.13\%$
test_nested_get 30.7620μs 5.7869μs 172.8046 KOps/s 171.9933 KOps/s $\color{#35bf28}+0.47\%$
test_stacked_getleaf 42.2510μs 6.1172μs 163.4733 KOps/s 163.7006 KOps/s $\color{#d91a1a}-0.14\%$
test_stacked_get 27.3010μs 5.7772μs 173.0931 KOps/s 172.8298 KOps/s $\color{#35bf28}+0.15\%$
test_nested_getitemleaf 32.2810μs 6.3617μs 157.1896 KOps/s 161.7074 KOps/s $\color{#d91a1a}-2.79\%$
test_nested_getitem 31.7010μs 6.0338μs 165.7322 KOps/s 166.7548 KOps/s $\color{#d91a1a}-0.61\%$
test_stacked_getitemleaf 40.6910μs 6.3833μs 156.6583 KOps/s 160.0527 KOps/s $\color{#d91a1a}-2.12\%$
test_stacked_getitem 31.2310μs 6.0743μs 164.6270 KOps/s 168.2540 KOps/s $\color{#d91a1a}-2.16\%$
test_lock_nested 0.6993ms 0.3733ms 2.6791 KOps/s 2.6434 KOps/s $\color{#35bf28}+1.35\%$
test_lock_stack_nested 0.3967ms 0.3454ms 2.8951 KOps/s 2.9426 KOps/s $\color{#d91a1a}-1.61\%$
test_unlock_nested 0.6251ms 0.3154ms 3.1709 KOps/s 3.1577 KOps/s $\color{#35bf28}+0.42\%$
test_unlock_stack_nested 0.3375ms 0.2862ms 3.4939 KOps/s 3.5999 KOps/s $\color{#d91a1a}-2.94\%$
test_flatten_speed 0.1423ms 73.0494μs 13.6894 KOps/s 13.4292 KOps/s $\color{#35bf28}+1.94\%$
test_unflatten_speed 0.4388ms 0.3158ms 3.1663 KOps/s 3.0967 KOps/s $\color{#35bf28}+2.25\%$
test_common_ops 1.5616ms 0.5806ms 1.7223 KOps/s 1.6022 KOps/s $\textbf{\color{#35bf28}+7.50\%}$
test_creation 0.1929ms 1.7389μs 575.0670 KOps/s 569.5233 KOps/s $\color{#35bf28}+0.97\%$
test_creation_empty 36.8420μs 6.9528μs 143.8261 KOps/s 108.5713 KOps/s $\textbf{\color{#35bf28}+32.47\%}$
test_creation_nested_1 33.1910μs 8.5789μs 116.5650 KOps/s 90.8440 KOps/s $\textbf{\color{#35bf28}+28.31\%}$
test_creation_nested_2 44.0520μs 11.3194μs 88.3438 KOps/s 72.5512 KOps/s $\textbf{\color{#35bf28}+21.77\%}$
test_clone 45.8620μs 10.9446μs 91.3691 KOps/s 97.7012 KOps/s $\textbf{\color{#d91a1a}-6.48\%}$
test_getitem[int] 1.8032ms 11.3495μs 88.1096 KOps/s 91.1449 KOps/s $\color{#d91a1a}-3.33\%$
test_getitem[slice_int] 0.1132ms 21.3129μs 46.9200 KOps/s 47.7795 KOps/s $\color{#d91a1a}-1.80\%$
test_getitem[range] 0.1368ms 37.2299μs 26.8601 KOps/s 25.8509 KOps/s $\color{#35bf28}+3.90\%$
test_getitem[tuple] 0.1078ms 18.7957μs 53.2035 KOps/s 54.0529 KOps/s $\color{#d91a1a}-1.57\%$
test_getitem[list] 0.1597ms 32.4661μs 30.8014 KOps/s 31.1136 KOps/s $\color{#d91a1a}-1.00\%$
test_setitem_dim[int] 37.2310μs 19.5029μs 51.2743 KOps/s 53.9591 KOps/s $\color{#d91a1a}-4.98\%$
test_setitem_dim[slice_int] 66.1730μs 38.4305μs 26.0210 KOps/s 25.9404 KOps/s $\color{#35bf28}+0.31\%$
test_setitem_dim[range] 81.8830μs 52.0890μs 19.1979 KOps/s 18.8058 KOps/s $\color{#35bf28}+2.09\%$
test_setitem_dim[tuple] 58.9920μs 32.1073μs 31.1456 KOps/s 32.5284 KOps/s $\color{#d91a1a}-4.25\%$
test_setitem 49.9520μs 14.3795μs 69.5432 KOps/s 66.7330 KOps/s $\color{#35bf28}+4.21\%$
test_set 49.4020μs 13.7543μs 72.7044 KOps/s 67.9367 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_set_shared 1.6608ms 0.1515ms 6.5992 KOps/s 6.6792 KOps/s $\color{#d91a1a}-1.20\%$
test_update 0.4668ms 15.9222μs 62.8056 KOps/s 55.3758 KOps/s $\textbf{\color{#35bf28}+13.42\%}$
test_update_nested 0.1184ms 21.3477μs 46.8435 KOps/s 42.4369 KOps/s $\textbf{\color{#35bf28}+10.38\%}$
test_update__nested 1.2862ms 25.4112μs 39.3527 KOps/s 40.5874 KOps/s $\color{#d91a1a}-3.04\%$
test_set_nested 0.1247ms 15.1475μs 66.0176 KOps/s 63.3865 KOps/s $\color{#35bf28}+4.15\%$
test_set_nested_new 0.1203ms 17.7960μs 56.1923 KOps/s 54.5334 KOps/s $\color{#35bf28}+3.04\%$
test_select 83.2130μs 29.3188μs 34.1078 KOps/s 33.1654 KOps/s $\color{#35bf28}+2.84\%$
test_select_nested 77.2130μs 44.2448μs 22.6015 KOps/s 22.5653 KOps/s $\color{#35bf28}+0.16\%$
test_exclude_nested 0.1141ms 62.9298μs 15.8907 KOps/s 15.9458 KOps/s $\color{#d91a1a}-0.35\%$
test_empty[True] 0.6891ms 0.2951ms 3.3887 KOps/s 3.4427 KOps/s $\color{#d91a1a}-1.57\%$
test_empty[False] 41.2455μs 0.8272μs 1.2089 MOps/s 1.2223 MOps/s $\color{#d91a1a}-1.10\%$
test_to 88.1340μs 57.0337μs 17.5335 KOps/s 17.6578 KOps/s $\color{#d91a1a}-0.70\%$
test_to_nonblocking 0.9618ms 48.9686μs 20.4213 KOps/s 21.1963 KOps/s $\color{#d91a1a}-3.66\%$
test_unbind_speed 0.6614ms 0.2410ms 4.1492 KOps/s 4.2059 KOps/s $\color{#d91a1a}-1.35\%$
test_unbind_speed_stack0 0.6464ms 0.2431ms 4.1134 KOps/s 4.2600 KOps/s $\color{#d91a1a}-3.44\%$
test_unbind_speed_stack1 94.2779ms 0.6710ms 1.4903 KOps/s 1.5041 KOps/s $\color{#d91a1a}-0.92\%$
test_split 95.5320ms 1.7071ms 585.7878 Ops/s 617.3459 Ops/s $\textbf{\color{#d91a1a}-5.11\%}$
test_chunk 95.9011ms 1.7130ms 583.7676 Ops/s 613.9080 Ops/s $\color{#d91a1a}-4.91\%$
test_consolidate[False-None] 98.0917ms 3.0344ms 329.5538 Ops/s 332.6700 Ops/s $\color{#d91a1a}-0.94\%$
test_consolidate[default-None] 1.9227ms 1.7687ms 565.3947 Ops/s 586.5856 Ops/s $\color{#d91a1a}-3.61\%$
test_consolidate[reduce-overhead-None] 1.8735ms 1.7783ms 562.3458 Ops/s 567.9303 Ops/s $\color{#d91a1a}-0.98\%$
test_consolidate_njt[False-None] 0.3028s 8.2237ms 121.5991 Ops/s 113.0538 Ops/s $\textbf{\color{#35bf28}+7.56\%}$
test_to[False-False-None] 1.8406ms 1.7265ms 579.2150 Ops/s 573.7178 Ops/s $\color{#35bf28}+0.96\%$
test_to[True-False-None] 1.4263ms 1.3247ms 754.9050 Ops/s 768.2462 Ops/s $\color{#d91a1a}-1.74\%$
test_to[within-False-None] 4.3801ms 4.1481ms 241.0715 Ops/s 241.4042 Ops/s $\color{#d91a1a}-0.14\%$
test_to[True-default-None] 5.2621ms 5.1317ms 194.8685 Ops/s 189.9998 Ops/s $\color{#35bf28}+2.56\%$
test_to_njt[False-False-None] 6.8793ms 6.7480ms 148.1914 Ops/s 144.2342 Ops/s $\color{#35bf28}+2.74\%$
test_to_njt[True-False-None] 5.7425ms 5.3315ms 187.5659 Ops/s 179.7203 Ops/s $\color{#35bf28}+4.37\%$
test_to_njt[within-False-None] 11.8247ms 11.7224ms 85.3067 Ops/s 81.6011 Ops/s $\color{#35bf28}+4.54\%$
test_creation[device0] 0.4638ms 79.6123μs 12.5609 KOps/s 12.4708 KOps/s $\color{#35bf28}+0.72\%$
test_creation_from_tensor 0.4706ms 83.0086μs 12.0469 KOps/s 11.8355 KOps/s $\color{#35bf28}+1.79\%$
test_add_one[memmap_tensor0] 0.4678ms 6.4886μs 154.1153 KOps/s 159.0490 KOps/s $\color{#d91a1a}-3.10\%$
test_contiguous[memmap_tensor0] 4.7337μs 0.4015μs 2.4906 MOps/s 2.5390 MOps/s $\color{#d91a1a}-1.91\%$
test_stack[memmap_tensor0] 42.4510μs 4.8773μs 205.0323 KOps/s 213.3569 KOps/s $\color{#d91a1a}-3.90\%$
test_memmaptd_index 2.0705ms 0.2613ms 3.8268 KOps/s 3.8734 KOps/s $\color{#d91a1a}-1.20\%$
test_memmaptd_index_astensor 0.8222ms 0.3213ms 3.1122 KOps/s 3.1130 KOps/s $\color{#d91a1a}-0.03\%$
test_memmaptd_index_op 1.4494ms 0.5584ms 1.7907 KOps/s 1.6831 KOps/s $\textbf{\color{#35bf28}+6.39\%}$
test_serialize_model 0.1322s 0.1311s 7.6292 Ops/s 7.6213 Ops/s $\color{#35bf28}+0.10\%$
test_serialize_model_pickle 1.3494s 1.2163s 0.8222 Ops/s 0.8236 Ops/s $\color{#d91a1a}-0.17\%$
test_serialize_weights 0.1340s 0.1308s 7.6472 Ops/s 7.6515 Ops/s $\color{#d91a1a}-0.06\%$
test_serialize_weights_returnearly 0.3443s 60.8175ms 16.4426 Ops/s 12.6522 Ops/s $\textbf{\color{#35bf28}+29.96\%}$
test_serialize_weights_pickle 1.3564s 1.2127s 0.8246 Ops/s 0.8211 Ops/s $\color{#35bf28}+0.42\%$
test_reshape_pytree 69.6830μs 22.1808μs 45.0841 KOps/s 45.7273 KOps/s $\color{#d91a1a}-1.41\%$
test_reshape_td 57.8620μs 26.6657μs 37.5014 KOps/s 36.6081 KOps/s $\color{#35bf28}+2.44\%$
test_view_pytree 53.4720μs 22.0353μs 45.3816 KOps/s 45.8446 KOps/s $\color{#d91a1a}-1.01\%$
test_view_td 60.2030μs 29.3418μs 34.0810 KOps/s 33.2689 KOps/s $\color{#35bf28}+2.44\%$
test_unbind_pytree 68.9120μs 28.4337μs 35.1695 KOps/s 36.0727 KOps/s $\color{#d91a1a}-2.50\%$
test_unbind_td 0.6059ms 36.5805μs 27.3370 KOps/s 27.5051 KOps/s $\color{#d91a1a}-0.61\%$
test_split_pytree 61.2520μs 30.5973μs 32.6826 KOps/s 32.6778 KOps/s $\color{#35bf28}+0.01\%$
test_split_td 0.7775ms 40.7569μs 24.5357 KOps/s 25.9045 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_add_pytree 63.4120μs 33.2455μs 30.0793 KOps/s 30.2455 KOps/s $\color{#d91a1a}-0.55\%$
test_add_td 0.1925ms 47.6821μs 20.9722 KOps/s 19.7487 KOps/s $\textbf{\color{#35bf28}+6.20\%}$
test_compile_add_one_nested[tensordict-compile] 0.1735ms 0.1179ms 8.4782 KOps/s 8.1308 KOps/s $\color{#35bf28}+4.27\%$
test_compile_add_one_nested[tensordict-eager] 0.2301ms 0.1305ms 7.6651 KOps/s 7.5263 KOps/s $\color{#35bf28}+1.84\%$
test_compile_add_one_nested[pytree-compile] 0.2043ms 94.3155μs 10.6027 KOps/s 10.5679 KOps/s $\color{#35bf28}+0.33\%$
test_compile_add_one_nested[pytree-eager] 1.0774ms 0.1509ms 6.6281 KOps/s 6.6773 KOps/s $\color{#d91a1a}-0.74\%$
test_compile_copy_nested[tensordict-compile] 63.7320μs 22.9187μs 43.6325 KOps/s 30.1951 KOps/s $\textbf{\color{#35bf28}+44.50\%}$
test_compile_copy_nested[tensordict-eager] 62.0520μs 29.6432μs 33.7346 KOps/s 33.6078 KOps/s $\color{#35bf28}+0.38\%$
test_compile_copy_nested[pytree-compile] 0.4431ms 64.0379μs 15.6158 KOps/s 15.2413 KOps/s $\color{#35bf28}+2.46\%$
test_compile_copy_nested[pytree-eager] 83.4730μs 48.7642μs 20.5069 KOps/s 20.2471 KOps/s $\color{#35bf28}+1.28\%$
test_compile_add_one_flat[tensordict-compile] 0.1827ms 0.1414ms 7.0723 KOps/s 7.1110 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_add_one_flat[tensordict-eager] 0.3189ms 0.2159ms 4.6319 KOps/s 4.6275 KOps/s $\color{#35bf28}+0.09\%$
test_compile_add_one_flat[tensorclass-compile] 0.1431ms 96.8478μs 10.3255 KOps/s 10.3239 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_one_flat[tensorclass-eager] 0.1131ms 55.3411μs 18.0697 KOps/s 18.1951 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_add_one_flat[pytree-compile] 0.2427ms 0.1351ms 7.4027 KOps/s 7.4688 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_add_one_flat[pytree-eager] 0.5303ms 0.4897ms 2.0422 KOps/s 2.0611 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_add_self_flat[tensordict-eager] 0.4118ms 0.2582ms 3.8729 KOps/s 3.8136 KOps/s $\color{#35bf28}+1.56\%$
test_compile_add_self_flat[tensordict-compile] 0.2006ms 0.1426ms 7.0104 KOps/s 7.1636 KOps/s $\color{#d91a1a}-2.14\%$
test_compile_add_self_flat[tensorclass-eager] 0.1606ms 66.4871μs 15.0405 KOps/s 15.0314 KOps/s $\color{#35bf28}+0.06\%$
test_compile_add_self_flat[tensorclass-compile] 0.1336ms 96.6467μs 10.3470 KOps/s 10.2373 KOps/s $\color{#35bf28}+1.07\%$
test_compile_add_self_flat[pytree-eager] 0.4801ms 0.4247ms 2.3546 KOps/s 2.4507 KOps/s $\color{#d91a1a}-3.92\%$
test_compile_add_self_flat[pytree-compile] 0.2045ms 0.1329ms 7.5217 KOps/s 7.4990 KOps/s $\color{#35bf28}+0.30\%$
test_compile_copy_flat[tensordict-compile] 79.7730μs 18.5820μs 53.8154 KOps/s 28.7793 KOps/s $\textbf{\color{#35bf28}+86.99\%}$
test_compile_copy_flat[tensordict-eager] 55.0720μs 30.7652μs 32.5043 KOps/s 31.6139 KOps/s $\color{#35bf28}+2.82\%$
test_compile_copy_flat[pytree-compile] 0.2140ms 70.5348μs 14.1774 KOps/s 14.3632 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_copy_flat[pytree-eager] 87.0230μs 52.0480μs 19.2130 KOps/s 19.6424 KOps/s $\color{#d91a1a}-2.19\%$
test_compile_assign_and_add[tensordict-compile] 1.6276ms 0.3904ms 2.5617 KOps/s 2.2024 KOps/s $\textbf{\color{#35bf28}+16.31\%}$
test_compile_assign_and_add[tensordict-eager] 2.8218ms 2.6039ms 384.0373 Ops/s 389.7622 Ops/s $\color{#d91a1a}-1.47\%$
test_compile_assign_and_add[pytree-compile] 1.6003ms 0.4358ms 2.2947 KOps/s 2.2689 KOps/s $\color{#35bf28}+1.14\%$
test_compile_assign_and_add[pytree-eager] 2.7618ms 2.6584ms 376.1629 Ops/s 384.9034 Ops/s $\color{#d91a1a}-2.27\%$
test_compile_indexing[tensor-tensordict-compile] 0.2187ms 0.1161ms 8.6140 KOps/s 8.9580 KOps/s $\color{#d91a1a}-3.84\%$
test_compile_indexing[tensor-tensordict-eager] 0.6093ms 80.4755μs 12.4261 KOps/s 12.9202 KOps/s $\color{#d91a1a}-3.82\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2374ms 0.1036ms 9.6555 KOps/s 9.7005 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1183ms 67.8409μs 14.7404 KOps/s 14.4759 KOps/s $\color{#35bf28}+1.83\%$
test_compile_indexing[tensor-pytree-compile] 0.1989ms 0.1102ms 9.0706 KOps/s 9.4812 KOps/s $\color{#d91a1a}-4.33\%$
test_compile_indexing[tensor-pytree-eager] 0.1602ms 69.3091μs 14.4281 KOps/s 14.3931 KOps/s $\color{#35bf28}+0.24\%$
test_compile_indexing[slice-tensordict-compile] 0.2506ms 0.1050ms 9.5254 KOps/s 9.7458 KOps/s $\color{#d91a1a}-2.26\%$
test_compile_indexing[slice-tensordict-eager] 0.1435ms 17.1472μs 58.3187 KOps/s 58.1011 KOps/s $\color{#35bf28}+0.37\%$
test_compile_indexing[slice-tensorclass-compile] 0.1546ms 97.6665μs 10.2389 KOps/s 10.4085 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_indexing[slice-tensorclass-eager] 0.1043ms 15.8510μs 63.0874 KOps/s 63.3007 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_indexing[slice-pytree-compile] 0.2015ms 98.6285μs 10.1391 KOps/s 10.2586 KOps/s $\color{#d91a1a}-1.16\%$
test_compile_indexing[slice-pytree-eager] 0.1075ms 15.9271μs 62.7862 KOps/s 64.3393 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_indexing[int-tensordict-compile] 0.1725ms 0.1031ms 9.7039 KOps/s 9.8823 KOps/s $\color{#d91a1a}-1.81\%$
test_compile_indexing[int-tensordict-eager] 0.6068ms 17.1647μs 58.2592 KOps/s 58.6559 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_indexing[int-tensorclass-compile] 0.2012ms 98.8809μs 10.1132 KOps/s 10.3095 KOps/s $\color{#d91a1a}-1.90\%$
test_compile_indexing[int-tensorclass-eager] 0.1049ms 15.9131μs 62.8413 KOps/s 63.8137 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_indexing[int-pytree-compile] 0.2155ms 99.9613μs 10.0039 KOps/s 10.3156 KOps/s $\color{#d91a1a}-3.02\%$
test_compile_indexing[int-pytree-eager] 0.1031ms 15.8108μs 63.2481 KOps/s 64.3232 KOps/s $\color{#d91a1a}-1.67\%$
test_mod_add[eager] 0.2080ms 35.4478μs 28.2105 KOps/s 24.4867 KOps/s $\textbf{\color{#35bf28}+15.21\%}$
test_mod_add[compile] 0.1849ms 77.8906μs 12.8385 KOps/s 12.2481 KOps/s $\color{#35bf28}+4.82\%$
test_mod_add[compile-overhead] 0.3253ms 0.1658ms 6.0324 KOps/s 5.6966 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_mod_wrap[eager] 0.3445ms 0.2382ms 4.1983 KOps/s 3.9099 KOps/s $\textbf{\color{#35bf28}+7.38\%}$
test_mod_wrap[compile] 0.3810ms 0.2825ms 3.5397 KOps/s 3.5261 KOps/s $\color{#35bf28}+0.39\%$
test_mod_wrap[compile-overhead] 6.9529ms 3.7510ms 266.5935 Ops/s 267.5136 Ops/s $\color{#d91a1a}-0.34\%$
test_mod_wrap_and_backward[eager] 1.6992ms 1.3422ms 745.0369 Ops/s 697.5581 Ops/s $\textbf{\color{#35bf28}+6.81\%}$
test_mod_wrap_and_backward[compile] 1.3851ms 1.2657ms 790.0522 Ops/s 728.7287 Ops/s $\textbf{\color{#35bf28}+8.42\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3551ms 0.9158ms 1.0919 KOps/s 953.7447 Ops/s $\textbf{\color{#35bf28}+14.49\%}$
test_seq_add[eager] 0.1756ms 0.1112ms 8.9920 KOps/s 8.5530 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_seq_add[compile] 0.1581ms 92.0681μs 10.8615 KOps/s 11.3565 KOps/s $\color{#d91a1a}-4.36\%$
test_seq_add[compile-overhead] 0.1773ms 0.1316ms 7.5962 KOps/s 7.8283 KOps/s $\color{#d91a1a}-2.96\%$
test_seq_wrap[eager] 0.8242ms 0.4206ms 2.3777 KOps/s 2.3767 KOps/s $\color{#35bf28}+0.04\%$
test_seq_wrap[compile] 0.7200ms 0.3086ms 3.2401 KOps/s 3.3353 KOps/s $\color{#d91a1a}-2.85\%$
test_seq_wrap[compile-overhead] 0.2697ms 0.2211ms 4.5227 KOps/s 4.4563 KOps/s $\color{#35bf28}+1.49\%$
test_func_call_runtime[False-eager] 0.8014ms 0.7068ms 1.4148 KOps/s 1.3922 KOps/s $\color{#35bf28}+1.62\%$
test_func_call_runtime[False-compile] 1.2232ms 0.7664ms 1.3048 KOps/s 1.3492 KOps/s $\color{#d91a1a}-3.29\%$
test_func_call_runtime[False-compile-overhead] 0.7659ms 0.3724ms 2.6856 KOps/s 2.7726 KOps/s $\color{#d91a1a}-3.14\%$
test_func_call_runtime[True-eager] 1.3210ms 0.9102ms 1.0987 KOps/s 1.1384 KOps/s $\color{#d91a1a}-3.49\%$
test_func_call_runtime[True-compile] 1.2279ms 0.7844ms 1.2748 KOps/s 1.3224 KOps/s $\color{#d91a1a}-3.60\%$
test_func_call_runtime[True-compile-overhead] 0.4424ms 0.3783ms 2.6433 KOps/s 2.6313 KOps/s $\color{#35bf28}+0.46\%$
test_func_call_cm_runtime[False-eager] 0.8108ms 0.7038ms 1.4208 KOps/s 1.3910 KOps/s $\color{#35bf28}+2.14\%$
test_func_call_cm_runtime[False-compile] 0.8147ms 0.7363ms 1.3582 KOps/s 1.3309 KOps/s $\color{#35bf28}+2.05\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4570ms 0.3606ms 2.7730 KOps/s 2.7316 KOps/s $\color{#35bf28}+1.52\%$
test_func_call_cm_runtime[True-eager] 1.1194ms 0.9728ms 1.0279 KOps/s 1.0098 KOps/s $\color{#35bf28}+1.80\%$
test_func_call_cm_runtime[True-compile] 0.9412ms 0.7792ms 1.2834 KOps/s 1.2571 KOps/s $\color{#35bf28}+2.09\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8208ms 0.4059ms 2.4636 KOps/s 2.4558 KOps/s $\color{#35bf28}+0.32\%$
test_vmap_func_call_cm_runtime[eager] 2.4641ms 2.0177ms 495.6225 Ops/s 489.2664 Ops/s $\color{#35bf28}+1.30\%$
test_vmap_func_call_cm_runtime[compile] 1.1960ms 0.7907ms 1.2647 KOps/s 1.2364 KOps/s $\color{#35bf28}+2.29\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8485ms 0.4088ms 2.4459 KOps/s 2.4424 KOps/s $\color{#35bf28}+0.14\%$
test_distributed 4.4408ms 0.1800ms 5.5556 KOps/s 8.3538 KOps/s $\textbf{\color{#d91a1a}-33.50\%}$
test_tdmodule 0.3045ms 18.9542μs 52.7588 KOps/s 47.5214 KOps/s $\textbf{\color{#35bf28}+11.02\%}$
test_tdmodule_dispatch 59.0930μs 33.4583μs 29.8880 KOps/s 27.0124 KOps/s $\textbf{\color{#35bf28}+10.65\%}$
test_tdseq 44.3310μs 19.5473μs 51.1579 KOps/s 47.1525 KOps/s $\textbf{\color{#35bf28}+8.49\%}$
test_tdseq_dispatch 60.8030μs 36.3552μs 27.5064 KOps/s 25.2633 KOps/s $\textbf{\color{#35bf28}+8.88\%}$
test_instantiation_functorch 1.6068ms 1.5333ms 652.1730 Ops/s 652.0140 Ops/s $\color{#35bf28}+0.02\%$
test_exec_functorch 0.1856ms 0.1429ms 6.9958 KOps/s 7.1209 KOps/s $\color{#d91a1a}-1.76\%$
test_exec_functional_call 0.1756ms 0.1334ms 7.4960 KOps/s 7.6157 KOps/s $\color{#d91a1a}-1.57\%$
test_exec_td_decorator 0.3836ms 0.1801ms 5.5535 KOps/s 5.6046 KOps/s $\color{#d91a1a}-0.91\%$
test_vmap_mlp_speed_decorator[True-True] 0.7861ms 0.6671ms 1.4991 KOps/s 1.4860 KOps/s $\color{#35bf28}+0.88\%$
test_vmap_mlp_speed_decorator[True-False] 0.7539ms 0.6609ms 1.5130 KOps/s 1.4933 KOps/s $\color{#35bf28}+1.32\%$
test_vmap_mlp_speed_decorator[False-True] 0.7023ms 0.5791ms 1.7268 KOps/s 1.7209 KOps/s $\color{#35bf28}+0.34\%$
test_vmap_mlp_speed_decorator[False-False] 0.7199ms 0.5848ms 1.7099 KOps/s 1.7162 KOps/s $\color{#d91a1a}-0.37\%$
test_vmap_transformer_speed_decorator[True-True] 19.0862ms 18.7659ms 53.2880 Ops/s 53.1251 Ops/s $\color{#35bf28}+0.31\%$
test_vmap_transformer_speed_decorator[True-False] 19.4739ms 18.7997ms 53.1925 Ops/s 53.1897 Ops/s $+0.01\%$
test_vmap_transformer_speed_decorator[False-True] 19.7307ms 18.8049ms 53.1775 Ops/s 53.6781 Ops/s $\color{#d91a1a}-0.93\%$
test_vmap_transformer_speed_decorator[False-False] 19.0227ms 18.6289ms 53.6800 Ops/s 53.5237 Ops/s $\color{#35bf28}+0.29\%$
test_to_module_speed[True] 1.0787ms 0.9736ms 1.0271 KOps/s 1.0308 KOps/s $\color{#d91a1a}-0.36\%$
test_to_module_speed[False] 1.5131ms 0.9391ms 1.0648 KOps/s 1.0372 KOps/s $\color{#35bf28}+2.66\%$
test_tc_init 62.9830μs 34.5004μs 28.9851 KOps/s 27.4256 KOps/s $\textbf{\color{#35bf28}+5.69\%}$
test_tc_init_nested 0.1051ms 71.9114μs 13.9060 KOps/s 13.9148 KOps/s $\color{#d91a1a}-0.06\%$
test_tc_first_layer_tensor 32.7220μs 0.7981μs 1.2530 MOps/s 1.2304 MOps/s $\color{#35bf28}+1.84\%$
test_tc_first_layer_nontensor 20.0810μs 2.2615μs 442.1940 KOps/s 451.0266 KOps/s $\color{#d91a1a}-1.96\%$
test_tc_second_layer_tensor 31.2510μs 1.4944μs 669.1583 KOps/s 705.8785 KOps/s $\textbf{\color{#d91a1a}-5.20\%}$
test_tc_second_layer_nontensor 33.6310μs 2.9997μs 333.3643 KOps/s 338.8064 KOps/s $\color{#d91a1a}-1.61\%$
test_unbind 0.2305s 10.0063ms 99.9367 Ops/s 143.7064 Ops/s $\textbf{\color{#d91a1a}-30.46\%}$
test_full_like 9.7356ms 9.1479ms 109.3142 Ops/s 109.1881 Ops/s $\color{#35bf28}+0.12\%$
test_zeros_like 6.6437ms 4.3602ms 229.3470 Ops/s 140.8856 Ops/s $\textbf{\color{#35bf28}+62.79\%}$
test_ones_like 9.2830ms 7.3040ms 136.9109 Ops/s 138.5971 Ops/s $\color{#d91a1a}-1.22\%$
test_clone 6.6458ms 6.4393ms 155.2975 Ops/s 109.4984 Ops/s $\textbf{\color{#35bf28}+41.83\%}$
test_squeeze 53.1620μs 9.4885μs 105.3907 KOps/s 104.6228 KOps/s $\color{#35bf28}+0.73\%$
test_unsqueeze 0.1299ms 72.3765μs 13.8166 KOps/s 13.6171 KOps/s $\color{#35bf28}+1.47\%$
test_split 0.3893ms 0.1590ms 6.2887 KOps/s 5.9441 KOps/s $\textbf{\color{#35bf28}+5.80\%}$
test_permute 0.2781ms 0.1723ms 5.8039 KOps/s 5.7556 KOps/s $\color{#35bf28}+0.84\%$
test_stack 50.7087ms 50.3082ms 19.8775 Ops/s 19.6853 Ops/s $\color{#35bf28}+0.98\%$
test_cat 50.8993ms 50.2622ms 19.8957 Ops/s 19.8337 Ops/s $\color{#35bf28}+0.31\%$
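
The Change column in the rows above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. A minimal sketch of that arithmetic follows, assuming this is how the benchmark bot derives it; the function name `relative_change` is illustrative and is not part of the benchmark tooling. The three sample rows are copied from the table.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to the HEAD throughput.

    Both arguments must use the same unit (Ops/s, KOps/s or MOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (name, Ops, Ops on Repo HEAD) copied from the benchmark rows above.
rows = [
    ("test_set", 72.7044, 67.9367),          # KOps/s
    ("test_distributed", 5.5556, 8.3538),    # KOps/s
    ("test_unbind", 99.9367, 143.7064),      # Ops/s
]

for name, ops_pr, ops_head in rows:
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

For these rows the result matches the reported values to two decimals (+7.02%, -33.50%, -30.46%). Because the benchmarks run on shared CI hardware, large swings such as `test_distributed` and `test_unbind` may reflect runner noise rather than an effect of this change.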

@vmoens vmoens added the enhancement New feature or request label Jan 9, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 9, 2025
ghstack-source-id: 611d21a58aabf24ec2e7843d637d5f20ccd04a3b
Pull Request resolved: #1170
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 9, 2025
ghstack-source-id: d3c5067beeb099d1ae080752bc6e218d543c7515
Pull Request resolved: #1170
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 9, 2025
ghstack-source-id: fa25726d61e913a725a71f1579eb06b09455e7c8
Pull Request resolved: #1170
@vmoens vmoens merged commit cbd7a68 into gh/vmoens/43/base Jan 9, 2025
7 of 15 checks passed
vmoens added a commit that referenced this pull request Jan 9, 2025
ghstack-source-id: fa25726d61e913a725a71f1579eb06b09455e7c8
Pull Request resolved: #1170
@vmoens vmoens deleted the gh/vmoens/43/head branch January 9, 2025 18:44