[Feature] UnbatchedTensor #1170
Merged
vmoens added a commit that referenced this pull request on Jan 9, 2025:
ghstack-source-id: 982fbef0214e38841dcce82c34116ae991473798
Pull Request resolved: #1170
facebook-github-bot added the CLA Signed label on Jan 9, 2025. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 53.3500μs | 20.5679μs | 48.6195 KOps/s | 48.6590 KOps/s | |
test_plain_set_stack_nested | 49.3620μs | 20.9773μs | 47.6706 KOps/s | 48.7385 KOps/s | |
test_plain_set_nested_inplace | 66.1530μs | 22.5404μs | 44.3649 KOps/s | 44.8090 KOps/s | |
test_plain_set_stack_nested_inplace | 66.2230μs | 22.3847μs | 44.6734 KOps/s | 44.3933 KOps/s | |
test_items | 57.0660μs | 4.1750μs | 239.5237 KOps/s | 244.6243 KOps/s | |
test_items_nested | 0.6230ms | 0.3936ms | 2.5410 KOps/s | 2.5421 KOps/s | |
test_items_nested_locked | 0.6748ms | 0.3912ms | 2.5565 KOps/s | 2.5507 KOps/s | |
test_items_nested_leaf | 0.1382ms | 76.9874μs | 12.9891 KOps/s | 13.0276 KOps/s | |
test_items_stack_nested | 0.5805ms | 0.3944ms | 2.5355 KOps/s | 2.5205 KOps/s | |
test_items_stack_nested_leaf | 0.1562ms | 76.8054μs | 13.0199 KOps/s | 12.6675 KOps/s | |
test_items_stack_nested_locked | 0.7431ms | 0.3955ms | 2.5285 KOps/s | 2.5083 KOps/s | |
test_keys | 43.9620μs | 3.5328μs | 283.0584 KOps/s | 280.6814 KOps/s | |
test_keys_nested | 0.2866ms | 0.1642ms | 6.0916 KOps/s | 6.2066 KOps/s | |
test_keys_nested_locked | 0.6924ms | 0.1699ms | 5.8851 KOps/s | 5.9376 KOps/s | |
test_keys_nested_leaf | 0.2551ms | 0.1434ms | 6.9742 KOps/s | 7.0904 KOps/s | |
test_keys_stack_nested | 0.2648ms | 0.1638ms | 6.1062 KOps/s | 5.9645 KOps/s | |
test_keys_stack_nested_leaf | 0.2252ms | 0.1436ms | 6.9626 KOps/s | 7.1610 KOps/s | |
test_keys_stack_nested_locked | 0.2604ms | 0.1702ms | 5.8759 KOps/s | 5.8804 KOps/s | |
test_values | 7.2536μs | 1.0332μs | 967.8897 KOps/s | 867.8004 KOps/s | |
test_values_nested | 0.1206ms | 62.3655μs | 16.0345 KOps/s | 16.3538 KOps/s | |
test_values_nested_locked | 0.1363ms | 62.2160μs | 16.0730 KOps/s | 16.4133 KOps/s | |
test_values_nested_leaf | 0.1253ms | 70.8958μs | 14.1052 KOps/s | 13.9567 KOps/s | |
test_values_stack_nested | 0.1399ms | 61.7020μs | 16.2069 KOps/s | 15.9368 KOps/s | |
test_values_stack_nested_leaf | 0.1708ms | 70.6482μs | 14.1546 KOps/s | 14.0110 KOps/s | |
test_values_stack_nested_locked | 0.1296ms | 61.8814μs | 16.1599 KOps/s | 15.8828 KOps/s | |
test_membership | 61.6250μs | 0.8934μs | 1.1193 MOps/s | 1.1797 MOps/s | |
test_membership_nested | 34.8750μs | 2.9366μs | 340.5331 KOps/s | 352.8445 KOps/s | |
test_membership_nested_leaf | 28.4130μs | 2.9306μs | 341.2244 KOps/s | 351.5709 KOps/s | |
test_membership_stacked_nested | 47.3480μs | 2.9280μs | 341.5308 KOps/s | 351.6117 KOps/s | |
test_membership_stacked_nested_leaf | 28.1930μs | 2.9121μs | 343.3942 KOps/s | 342.5905 KOps/s | |
test_membership_nested_last | 31.0180μs | 4.4214μs | 226.1727 KOps/s | 235.8693 KOps/s | |
test_membership_nested_leaf_last | 45.1450μs | 4.4435μs | 225.0501 KOps/s | 235.9800 KOps/s | |
test_membership_stacked_nested_last | 28.0430μs | 4.3594μs | 229.3895 KOps/s | 228.9228 KOps/s | |
test_membership_stacked_nested_leaf_last | 51.3560μs | 4.4027μs | 227.1329 KOps/s | 233.4638 KOps/s | |
test_nested_getleaf | 52.8290μs | 10.3560μs | 96.5624 KOps/s | 98.7977 KOps/s | |
test_nested_get | 52.0870μs | 9.9248μs | 100.7575 KOps/s | 101.4317 KOps/s | |
test_stacked_getleaf | 50.0130μs | 10.4114μs | 96.0490 KOps/s | 95.6232 KOps/s | |
test_stacked_get | 39.3230μs | 9.8581μs | 101.4392 KOps/s | 100.3051 KOps/s | |
test_nested_getitemleaf | 45.7550μs | 11.0635μs | 90.3872 KOps/s | 90.9636 KOps/s | |
test_nested_getitem | 41.4370μs | 10.5195μs | 95.0611 KOps/s | 97.7230 KOps/s | |
test_stacked_getitemleaf | 44.5130μs | 11.0863μs | 90.2012 KOps/s | 90.8823 KOps/s | |
test_stacked_getitem | 49.5530μs | 10.5590μs | 94.7055 KOps/s | 98.6709 KOps/s | |
test_lock_nested | 1.2966ms | 0.4756ms | 2.1025 KOps/s | 1.6123 KOps/s | |
test_lock_stack_nested | 0.8264ms | 0.4491ms | 2.2266 KOps/s | 2.0790 KOps/s | |
test_unlock_nested | 1.1007ms | 0.3931ms | 2.5439 KOps/s | 2.3286 KOps/s | |
test_unlock_stack_nested | 0.7082ms | 0.3634ms | 2.7518 KOps/s | 2.5794 KOps/s | |
test_flatten_speed | 0.1776ms | 99.1364μs | 10.0871 KOps/s | 9.9111 KOps/s | |
test_unflatten_speed | 0.9477ms | 0.5158ms | 1.9387 KOps/s | 1.9237 KOps/s | |
test_common_ops | 5.1514ms | 0.8284ms | 1.2072 KOps/s | 1.2146 KOps/s | |
test_creation | 25.8780μs | 2.4875μs | 402.0110 KOps/s | 407.4576 KOps/s | |
test_creation_empty | 59.3610μs | 12.1824μs | 82.0857 KOps/s | 85.3025 KOps/s | |
test_creation_nested_1 | 59.5610μs | 15.3004μs | 65.3580 KOps/s | 68.0419 KOps/s | |
test_creation_nested_2 | 74.2280μs | 19.6982μs | 50.7661 KOps/s | 51.6774 KOps/s | |
test_clone | 49.9430μs | 13.5858μs | 73.6064 KOps/s | 72.8769 KOps/s | |
test_getitem[int] | 1.4638ms | 13.0965μs | 76.3562 KOps/s | 77.4859 KOps/s | |
test_getitem[slice_int] | 0.1658ms | 25.0657μs | 39.8951 KOps/s | 40.2189 KOps/s | |
test_getitem[range] | 0.2215ms | 50.4593μs | 19.8179 KOps/s | 20.3555 KOps/s | |
test_getitem[tuple] | 0.1693ms | 20.2805μs | 49.3084 KOps/s | 48.6413 KOps/s | |
test_getitem[list] | 0.2109ms | 45.6506μs | 21.9055 KOps/s | 22.4210 KOps/s | |
test_setitem_dim[int] | 67.7960μs | 26.3557μs | 37.9425 KOps/s | 40.9427 KOps/s | |
test_setitem_dim[slice_int] | 0.1187ms | 52.4323μs | 19.0722 KOps/s | 19.3270 KOps/s | |
test_setitem_dim[range] | 0.1200ms | 74.6818μs | 13.3901 KOps/s | 13.8317 KOps/s | |
test_setitem_dim[tuple] | 85.3290μs | 41.4222μs | 24.1416 KOps/s | 25.1835 KOps/s | |
test_setitem | 77.9140μs | 21.2929μs | 46.9641 KOps/s | 47.0667 KOps/s | |
test_set | 68.4680μs | 20.6032μs | 48.5361 KOps/s | 47.7877 KOps/s | |
test_set_shared | 4.9751ms | 0.1829ms | 5.4689 KOps/s | 5.4115 KOps/s | |
test_update | 0.4708ms | 23.4809μs | 42.5878 KOps/s | 42.9975 KOps/s | |
test_update_nested | 0.3800ms | 33.4083μs | 29.9326 KOps/s | 29.5808 KOps/s | |
test_update__nested | 0.5539ms | 33.8538μs | 29.5388 KOps/s | 28.9596 KOps/s | |
test_set_nested | 0.4360ms | 22.5578μs | 44.3306 KOps/s | 43.4893 KOps/s | |
test_set_nested_new | 91.4000μs | 27.0916μs | 36.9119 KOps/s | 35.8379 KOps/s | |
test_select | 0.1015ms | 43.0460μs | 23.2309 KOps/s | 22.4102 KOps/s | |
test_select_nested | 0.1301ms | 61.9627μs | 16.1387 KOps/s | 15.8082 KOps/s | |
test_exclude_nested | 0.1500ms | 79.5998μs | 12.5628 KOps/s | 12.2784 KOps/s | |
test_empty[True] | 0.5412ms | 0.4034ms | 2.4789 KOps/s | 2.4479 KOps/s | |
test_empty[False] | 45.0540μs | 1.4953μs | 668.7813 KOps/s | 723.4456 KOps/s | |
test_unbind_speed | 0.4540ms | 0.2686ms | 3.7235 KOps/s | 3.7044 KOps/s | |
test_unbind_speed_stack0 | 0.4705ms | 0.2672ms | 3.7432 KOps/s | 3.7333 KOps/s | |
test_unbind_speed_stack1 | 0.1253s | 0.8557ms | 1.1687 KOps/s | 1.4989 KOps/s | |
test_split | 0.1270s | 1.8147ms | 551.0476 Ops/s | 499.2232 Ops/s | |
test_chunk | 1.9985ms | 1.6077ms | 621.9900 Ops/s | 499.0749 Ops/s | |
test_consolidate_njt[False-None] | 0.1331s | 9.7196ms | 102.8846 Ops/s | 108.8334 Ops/s | |
test_creation[device0] | 0.3445ms | 92.3694μs | 10.8261 KOps/s | 10.4468 KOps/s | |
test_creation_from_tensor | 5.0527ms | 95.6403μs | 10.4558 KOps/s | 10.3654 KOps/s | |
test_add_one[memmap_tensor0] | 0.8522ms | 5.0786μs | 196.9065 KOps/s | 197.5912 KOps/s | |
test_contiguous[memmap_tensor0] | 36.1970μs | 0.5136μs | 1.9469 MOps/s | 1.9449 MOps/s | |
test_stack[memmap_tensor0] | 0.2133ms | 3.6089μs | 277.0915 KOps/s | 286.0283 KOps/s | |
test_memmaptd_index | 1.1609ms | 0.2417ms | 4.1382 KOps/s | 3.9760 KOps/s | |
test_memmaptd_index_astensor | 0.6655ms | 0.3318ms | 3.0141 KOps/s | 2.9116 KOps/s | |
test_memmaptd_index_op | 1.3498ms | 0.6169ms | 1.6211 KOps/s | 1.5944 KOps/s | |
test_serialize_model | 0.1360s | 0.1278s | 7.8251 Ops/s | 7.8968 Ops/s | |
test_serialize_model_pickle | 0.5097s | 0.4042s | 2.4738 Ops/s | 2.4422 Ops/s | |
test_serialize_weights | 0.1302s | 0.1225s | 8.1616 Ops/s | 6.7889 Ops/s | |
test_serialize_weights_returnearly | 0.2831s | 0.1818s | 5.4995 Ops/s | 6.0130 Ops/s | |
test_serialize_weights_pickle | 0.4837s | 0.4070s | 2.4572 Ops/s | 2.5235 Ops/s | |
test_serialize_weights_filesystem | 0.1599s | 0.1530s | 6.5366 Ops/s | 6.2744 Ops/s | |
test_serialize_model_filesystem | 0.1687s | 0.1578s | 6.3371 Ops/s | 5.7230 Ops/s | |
test_reshape_pytree | 0.1283ms | 26.5931μs | 37.6037 KOps/s | 37.5755 KOps/s | |
test_reshape_td | 84.8080μs | 32.5948μs | 30.6798 KOps/s | 29.8670 KOps/s | |
test_view_pytree | 0.1138ms | 26.7303μs | 37.4107 KOps/s | 37.5031 KOps/s | |
test_view_td | 0.1136ms | 37.5066μs | 26.6619 KOps/s | 25.7395 KOps/s | |
test_unbind_pytree | 0.1174ms | 29.4598μs | 33.9445 KOps/s | 33.8915 KOps/s | |
test_unbind_td | 0.3818ms | 39.4114μs | 25.3733 KOps/s | 25.1346 KOps/s | |
test_split_pytree | 85.9500μs | 29.7720μs | 33.5886 KOps/s | 33.5579 KOps/s | |
test_split_td | 0.5631ms | 45.6722μs | 21.8952 KOps/s | 21.8816 KOps/s | |
test_add_pytree | 92.5220μs | 34.8481μs | 28.6960 KOps/s | 28.0266 KOps/s | |
test_add_td | 0.3266ms | 57.9810μs | 17.2470 KOps/s | 17.7635 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1645ms | 63.5835μs | 15.7273 KOps/s | 15.5877 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4413ms | 0.1740ms | 5.7477 KOps/s | 5.6206 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1416ms | 45.1231μs | 22.1616 KOps/s | 21.4329 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2394ms | 0.1183ms | 8.4560 KOps/s | 8.3681 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1085ms | 26.1726μs | 38.2079 KOps/s | 38.8174 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1367ms | 57.9499μs | 17.2563 KOps/s | 16.7387 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1831ms | 77.1306μs | 12.9650 KOps/s | 12.7638 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1215ms | 66.9230μs | 14.9425 KOps/s | 14.8448 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2218ms | 0.1053ms | 9.4927 KOps/s | 9.3134 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4940ms | 0.2180ms | 4.5877 KOps/s | 4.6372 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1384ms | 46.1396μs | 21.6734 KOps/s | 21.1178 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.6635ms | 66.4241μs | 15.0548 KOps/s | 14.4377 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1981ms | 0.1024ms | 9.7657 KOps/s | 9.6768 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4147ms | 0.2061ms | 4.8525 KOps/s | 4.9699 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3971ms | 0.2339ms | 4.2752 KOps/s | 4.2538 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2760ms | 0.1082ms | 9.2407 KOps/s | 9.2989 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2897ms | 62.2503μs | 16.0642 KOps/s | 16.0095 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1523ms | 47.4329μs | 21.0824 KOps/s | 21.1063 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3305ms | 0.1673ms | 5.9790 KOps/s | 6.3688 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2175ms | 0.1023ms | 9.7791 KOps/s | 9.5918 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 89.1460μs | 21.1665μs | 47.2446 KOps/s | 46.7870 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1544ms | 66.6294μs | 15.0084 KOps/s | 15.1751 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1730ms | 78.7057μs | 12.7056 KOps/s | 12.6411 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1423ms | 67.9216μs | 14.7229 KOps/s | 14.7415 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4395ms | 0.2098ms | 4.7666 KOps/s | 4.7734 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.2891ms | 1.3541ms | 738.5184 Ops/s | 761.3921 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3135ms | 0.2038ms | 4.9067 KOps/s | 4.8570 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9670ms | 0.7796ms | 1.2828 KOps/s | 1.2818 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5957ms | 0.4504ms | 2.2202 KOps/s | 2.1330 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.7833ms | 2.7771ms | 360.0829 Ops/s | 345.8552 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1264ms | 37.3369μs | 26.7832 KOps/s | 27.0557 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.7948ms | 34.2026μs | 29.2376 KOps/s | 28.9715 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1134ms | 29.5477μs | 33.8436 KOps/s | 33.9424 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1196ms | 23.0669μs | 43.3522 KOps/s | 43.4029 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1426ms | 30.2352μs | 33.0740 KOps/s | 32.6290 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 99.4250μs | 23.2460μs | 43.0182 KOps/s | 42.8631 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1338ms | 52.5933μs | 19.0138 KOps/s | 18.8201 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.9647ms | 20.8930μs | 47.8630 KOps/s | 48.1646 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1626ms | 44.0632μs | 22.6947 KOps/s | 22.1041 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1364ms | 19.2010μs | 52.0806 KOps/s | 51.2345 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1152ms | 44.4662μs | 22.4890 KOps/s | 21.4747 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 72.4350μs | 18.6984μs | 53.4806 KOps/s | 52.2586 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1217ms | 54.1201μs | 18.4774 KOps/s | 18.6430 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.3089ms | 20.6669μs | 48.3866 KOps/s | 48.5937 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1160ms | 45.0306μs | 22.2071 KOps/s | 21.5411 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 65.4720μs | 18.8078μs | 53.1695 KOps/s | 52.0524 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1277ms | 44.9413μs | 22.2512 KOps/s | 21.5218 KOps/s | |
test_compile_indexing[int-pytree-eager] | 72.5150μs | 18.8581μs | 53.0275 KOps/s | 52.3619 KOps/s | |
test_mod_add[eager] | 0.1334ms | 35.4892μs | 28.1776 KOps/s | 26.7831 KOps/s | |
test_mod_add[compile] | 0.1236ms | 48.1798μs | 20.7556 KOps/s | 20.0485 KOps/s | |
test_mod_add[compile-overhead] | 0.1171ms | 48.1102μs | 20.7856 KOps/s | 19.7216 KOps/s | |
test_mod_wrap[eager] | 0.4154ms | 0.2325ms | 4.3011 KOps/s | 4.2344 KOps/s | |
test_mod_wrap[compile] | 0.3272ms | 0.2012ms | 4.9694 KOps/s | 4.6883 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4673ms | 0.2049ms | 4.8805 KOps/s | 4.7393 KOps/s | |
test_mod_wrap_and_backward[eager] | 20.6115ms | 12.7180ms | 78.6290 Ops/s | 63.4358 Ops/s | |
test_mod_wrap_and_backward[compile] | 15.4701ms | 11.9163ms | 83.9188 Ops/s | 67.4303 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.1598ms | 11.4591ms | 87.2671 Ops/s | 62.5524 Ops/s | |
test_seq_add[eager] | 0.2544ms | 0.1186ms | 8.4283 KOps/s | 8.2312 KOps/s | |
test_seq_add[compile] | 0.1479ms | 63.2920μs | 15.7998 KOps/s | 15.4217 KOps/s | |
test_seq_add[compile-overhead] | 0.1335ms | 61.5294μs | 16.2524 KOps/s | 15.9062 KOps/s | |
test_seq_wrap[eager] | 0.8460ms | 0.4583ms | 2.1818 KOps/s | 2.1494 KOps/s | |
test_seq_wrap[compile] | 0.4603ms | 0.2270ms | 4.4054 KOps/s | 4.1829 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4018ms | 0.2274ms | 4.3966 KOps/s | 4.3484 KOps/s | |
test_func_call_runtime[False-eager] | 0.8139ms | 0.5693ms | 1.7566 KOps/s | 1.8101 KOps/s | |
test_func_call_runtime[False-compile] | 0.8203ms | 0.4255ms | 2.3504 KOps/s | 2.3133 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6118ms | 0.4264ms | 2.3451 KOps/s | 2.3104 KOps/s | |
test_func_call_runtime[True-eager] | 1.1107ms | 0.7806ms | 1.2811 KOps/s | 1.2660 KOps/s | |
test_func_call_runtime[True-compile] | 0.9095ms | 0.4702ms | 2.1266 KOps/s | 2.1134 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6376ms | 0.4702ms | 2.1266 KOps/s | 2.1241 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8438ms | 0.5581ms | 1.7919 KOps/s | 1.7912 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8682ms | 0.4277ms | 2.3379 KOps/s | 2.3192 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5539ms | 0.4233ms | 2.3624 KOps/s | 2.3227 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.9394ms | 0.9408ms | 1.0629 KOps/s | 1.0765 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9015ms | 0.4903ms | 2.0394 KOps/s | 2.0200 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6566ms | 0.4927ms | 2.0297 KOps/s | 2.0144 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.8424ms | 2.0582ms | 485.8600 Ops/s | 467.2149 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7828ms | 0.5289ms | 1.8907 KOps/s | 1.9225 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.1823ms | 0.5455ms | 1.8331 KOps/s | 1.9087 KOps/s | |
test_distributed | 0.4005ms | 0.1256ms | 7.9644 KOps/s | 7.6594 KOps/s | |
test_tdmodule | 0.1260ms | 27.2754μs | 36.6630 KOps/s | 37.6900 KOps/s | |
test_tdmodule_dispatch | 0.1084ms | 50.4772μs | 19.8109 KOps/s | 20.2123 KOps/s | |
test_tdseq | 86.1800μs | 30.3149μs | 32.9871 KOps/s | 33.5381 KOps/s | |
test_tdseq_dispatch | 0.1033ms | 55.6276μs | 17.9767 KOps/s | 17.8085 KOps/s | |
test_instantiation_functorch | 2.2982ms | 1.5886ms | 629.4734 Ops/s | 642.5866 Ops/s | |
test_exec_functorch | 0.3045ms | 0.1816ms | 5.5062 KOps/s | 5.4410 KOps/s | |
test_exec_functional_call | 0.3313ms | 0.1769ms | 5.6534 KOps/s | 5.6079 KOps/s | |
test_exec_td_decorator | 0.5975ms | 0.2361ms | 4.2361 KOps/s | 4.2452 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1847ms | 0.6767ms | 1.4778 KOps/s | 1.4899 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.2866ms | 0.6860ms | 1.4577 KOps/s | 1.4911 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9396ms | 0.5402ms | 1.8511 KOps/s | 1.8656 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8631ms | 0.5427ms | 1.8426 KOps/s | 1.8673 KOps/s | |
test_to_module_speed[True] | 2.4548ms | 1.3548ms | 738.1003 Ops/s | 729.8391 Ops/s | |
test_to_module_speed[False] | 2.1937ms | 1.3543ms | 738.3897 Ops/s | 762.0032 Ops/s | |
test_tc_init | 88.9750μs | 46.4037μs | 21.5500 KOps/s | 21.5807 KOps/s | |
test_tc_init_nested | 0.1699ms | 93.9529μs | 10.6436 KOps/s | 10.7354 KOps/s | |
test_tc_first_layer_tensor | 31.1380μs | 1.6020μs | 624.2298 KOps/s | 625.6424 KOps/s | |
test_tc_first_layer_nontensor | 32.9610μs | 4.6057μs | 217.1206 KOps/s | 211.5221 KOps/s | |
test_tc_second_layer_tensor | 23.8240μs | 2.8577μs | 349.9261 KOps/s | 348.5311 KOps/s | |
test_tc_second_layer_nontensor | 55.2530μs | 5.9729μs | 167.4230 KOps/s | 164.2193 KOps/s | |
test_unbind | 0.2749s | 16.2281ms | 61.6216 Ops/s | 50.8657 Ops/s | |
test_full_like | 16.8880ms | 11.5881ms | 86.2951 Ops/s | 71.2935 Ops/s | |
test_zeros_like | 7.8381ms | 4.4552ms | 224.4584 Ops/s | 244.7646 Ops/s | |
test_ones_like | 7.2662ms | 5.0027ms | 199.8922 Ops/s | 129.4478 Ops/s | |
test_clone | 12.1308ms | 8.1177ms | 123.1879 Ops/s | 93.0662 Ops/s | |
test_squeeze | 76.7230μs | 12.2337μs | 81.7413 KOps/s | 81.3912 KOps/s | |
test_unsqueeze | 0.2018ms | 93.6463μs | 10.6785 KOps/s | 10.7109 KOps/s | |
test_split | 0.5395ms | 0.1980ms | 5.0517 KOps/s | 5.0024 KOps/s | |
test_permute | 0.4062ms | 0.2022ms | 4.9460 KOps/s | 4.9657 KOps/s | |
test_stack | 39.6256ms | 31.7117ms | 31.5341 Ops/s | 30.6699 Ops/s | |
test_cat | 45.3636ms | 32.3152ms | 30.9452 Ops/s | 32.2363 Ops/s |
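
The Change column is empty in these tables, so the comparison between Ops and Ops on Repo HEAD has to be read off by hand. A minimal sketch of one way to do it, assuming the change is meant as the relative difference of the PR throughput against repo HEAD (the formula the benchmark bot actually uses is not shown here; `parse_ops` and `relative_change` are hypothetical helper names):

```python
# Multipliers for the throughput units that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '48.6195 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percent difference of the PR throughput relative to repo HEAD.

    Assumption: positive means the PR is faster than HEAD.
    """
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


# A few (Ops, Ops on Repo HEAD) pairs copied from the table above.
rows = {
    "test_values": ("967.8897 KOps/s", "867.8004 KOps/s"),
    "test_lock_nested": ("2.1025 KOps/s", "1.6123 KOps/s"),
    "test_unbind_speed_stack1": ("1.1687 KOps/s", "1.4989 KOps/s"),
}

for name, (pr, head) in rows.items():
    print(f"{name}: {relative_change(pr, head):+.1f}%")
```

Single-run differences of a few percent are likely within noise for microbenchmarks like these, so only the larger gaps are worth a second look.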
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.3967ms | 11.1975μs | 89.3057 KOps/s | 78.2651 KOps/s | |
test_plain_set_stack_nested | 37.7610μs | 11.5410μs | 86.6476 KOps/s | 77.0228 KOps/s | |
test_plain_set_nested_inplace | 0.3989ms | 12.4017μs | 80.6340 KOps/s | 72.2581 KOps/s | |
test_plain_set_stack_nested_inplace | 55.0220μs | 12.4905μs | 80.0607 KOps/s | 71.9302 KOps/s | |
test_items | 31.5310μs | 2.9032μs | 344.4514 KOps/s | 336.7721 KOps/s | |
test_items_nested | 0.7522ms | 0.3612ms | 2.7687 KOps/s | 2.7802 KOps/s | |
test_items_nested_locked | 0.7613ms | 0.3644ms | 2.7443 KOps/s | 2.7529 KOps/s | |
test_items_nested_leaf | 88.6340μs | 57.8702μs | 17.2800 KOps/s | 17.1509 KOps/s | |
test_items_stack_nested | 0.7623ms | 0.3643ms | 2.7448 KOps/s | 2.8023 KOps/s | |
test_items_stack_nested_leaf | 0.4678ms | 58.9768μs | 16.9558 KOps/s | 16.5219 KOps/s | |
test_items_stack_nested_locked | 0.4991ms | 0.3625ms | 2.7589 KOps/s | 2.7769 KOps/s | |
test_keys | 35.4410μs | 3.4949μs | 286.1332 KOps/s | 291.8173 KOps/s | |
test_keys_nested | 0.1333ms | 88.2676μs | 11.3292 KOps/s | 12.1831 KOps/s | |
test_keys_nested_locked | 0.8146ms | 93.6598μs | 10.6769 KOps/s | 11.3990 KOps/s | |
test_keys_nested_leaf | 0.1272ms | 78.3781μs | 12.7587 KOps/s | 13.7230 KOps/s | |
test_keys_stack_nested | 0.1395ms | 89.3664μs | 11.1899 KOps/s | 12.0634 KOps/s | |
test_keys_stack_nested_leaf | 0.1295ms | 79.9151μs | 12.5133 KOps/s | 13.5795 KOps/s | |
test_keys_stack_nested_locked | 0.1611ms | 96.0333μs | 10.4131 KOps/s | 11.3289 KOps/s | |
test_values | 5.3103μs | 0.8535μs | 1.1716 MOps/s | 1.1776 MOps/s | |
test_values_nested | 67.6220μs | 37.2980μs | 26.8111 KOps/s | 29.0247 KOps/s | |
test_values_nested_locked | 86.6930μs | 38.6941μs | 25.8438 KOps/s | 27.6203 KOps/s | |
test_values_nested_leaf | 69.8530μs | 41.4700μs | 24.1138 KOps/s | 25.3669 KOps/s | |
test_values_stack_nested | 79.5630μs | 38.0046μs | 26.3126 KOps/s | 28.5240 KOps/s | |
test_values_stack_nested_leaf | 0.1143ms | 41.8545μs | 23.8923 KOps/s | 25.4482 KOps/s | |
test_values_stack_nested_locked | 74.3130μs | 39.7158μs | 25.1789 KOps/s | 27.4248 KOps/s | |
test_membership | 1.9226μs | 0.5082μs | 1.9677 MOps/s | 1.9418 MOps/s | |
test_membership_nested | 18.2560μs | 1.9526μs | 512.1270 KOps/s | 512.6203 KOps/s | |
test_membership_nested_leaf | 18.8210μs | 1.9836μs | 504.1352 KOps/s | 505.3926 KOps/s | |
test_membership_stacked_nested | 32.2420μs | 2.0388μs | 490.4956 KOps/s | 484.6061 KOps/s | |
test_membership_stacked_nested_leaf | 23.5200μs | 2.0179μs | 495.5566 KOps/s | 483.9393 KOps/s | |
test_membership_nested_last | 39.5510μs | 3.0685μs | 325.8872 KOps/s | 327.2809 KOps/s | |
test_membership_nested_leaf_last | 37.5320μs | 3.0846μs | 324.1913 KOps/s | 321.2412 KOps/s | |
test_membership_stacked_nested_last | 35.7920μs | 3.0639μs | 326.3836 KOps/s | 120.9242 KOps/s | |
test_membership_stacked_nested_leaf_last | 42.5220μs | 3.0847μs | 324.1833 KOps/s | 121.9812 KOps/s | |
test_nested_getleaf | 36.7910μs | 6.0891μs | 164.2267 KOps/s | 162.3881 KOps/s | |
test_nested_get | 30.7620μs | 5.7869μs | 172.8046 KOps/s | 171.9933 KOps/s | |
test_stacked_getleaf | 42.2510μs | 6.1172μs | 163.4733 KOps/s | 163.7006 KOps/s | |
test_stacked_get | 27.3010μs | 5.7772μs | 173.0931 KOps/s | 172.8298 KOps/s | |
test_nested_getitemleaf | 32.2810μs | 6.3617μs | 157.1896 KOps/s | 161.7074 KOps/s | |
test_nested_getitem | 31.7010μs | 6.0338μs | 165.7322 KOps/s | 166.7548 KOps/s | |
test_stacked_getitemleaf | 40.6910μs | 6.3833μs | 156.6583 KOps/s | 160.0527 KOps/s | |
test_stacked_getitem | 31.2310μs | 6.0743μs | 164.6270 KOps/s | 168.2540 KOps/s | |
test_lock_nested | 0.6993ms | 0.3733ms | 2.6791 KOps/s | 2.6434 KOps/s | |
test_lock_stack_nested | 0.3967ms | 0.3454ms | 2.8951 KOps/s | 2.9426 KOps/s | |
test_unlock_nested | 0.6251ms | 0.3154ms | 3.1709 KOps/s | 3.1577 KOps/s | |
test_unlock_stack_nested | 0.3375ms | 0.2862ms | 3.4939 KOps/s | 3.5999 KOps/s | |
test_flatten_speed | 0.1423ms | 73.0494μs | 13.6894 KOps/s | 13.4292 KOps/s | |
test_unflatten_speed | 0.4388ms | 0.3158ms | 3.1663 KOps/s | 3.0967 KOps/s | |
test_common_ops | 1.5616ms | 0.5806ms | 1.7223 KOps/s | 1.6022 KOps/s | |
test_creation | 0.1929ms | 1.7389μs | 575.0670 KOps/s | 569.5233 KOps/s | |
test_creation_empty | 36.8420μs | 6.9528μs | 143.8261 KOps/s | 108.5713 KOps/s | |
test_creation_nested_1 | 33.1910μs | 8.5789μs | 116.5650 KOps/s | 90.8440 KOps/s | |
test_creation_nested_2 | 44.0520μs | 11.3194μs | 88.3438 KOps/s | 72.5512 KOps/s | |
test_clone | 45.8620μs | 10.9446μs | 91.3691 KOps/s | 97.7012 KOps/s | |
test_getitem[int] | 1.8032ms | 11.3495μs | 88.1096 KOps/s | 91.1449 KOps/s | |
test_getitem[slice_int] | 0.1132ms | 21.3129μs | 46.9200 KOps/s | 47.7795 KOps/s | |
test_getitem[range] | 0.1368ms | 37.2299μs | 26.8601 KOps/s | 25.8509 KOps/s | |
test_getitem[tuple] | 0.1078ms | 18.7957μs | 53.2035 KOps/s | 54.0529 KOps/s | |
test_getitem[list] | 0.1597ms | 32.4661μs | 30.8014 KOps/s | 31.1136 KOps/s | |
test_setitem_dim[int] | 37.2310μs | 19.5029μs | 51.2743 KOps/s | 53.9591 KOps/s | |
test_setitem_dim[slice_int] | 66.1730μs | 38.4305μs | 26.0210 KOps/s | 25.9404 KOps/s | |
test_setitem_dim[range] | 81.8830μs | 52.0890μs | 19.1979 KOps/s | 18.8058 KOps/s | |
test_setitem_dim[tuple] | 58.9920μs | 32.1073μs | 31.1456 KOps/s | 32.5284 KOps/s | |
test_setitem | 49.9520μs | 14.3795μs | 69.5432 KOps/s | 66.7330 KOps/s | |
test_set | 49.4020μs | 13.7543μs | 72.7044 KOps/s | 67.9367 KOps/s | |
test_set_shared | 1.6608ms | 0.1515ms | 6.5992 KOps/s | 6.6792 KOps/s | |
test_update | 0.4668ms | 15.9222μs | 62.8056 KOps/s | 55.3758 KOps/s | |
test_update_nested | 0.1184ms | 21.3477μs | 46.8435 KOps/s | 42.4369 KOps/s | |
test_update__nested | 1.2862ms | 25.4112μs | 39.3527 KOps/s | 40.5874 KOps/s | |
test_set_nested | 0.1247ms | 15.1475μs | 66.0176 KOps/s | 63.3865 KOps/s | |
test_set_nested_new | 0.1203ms | 17.7960μs | 56.1923 KOps/s | 54.5334 KOps/s | |
test_select | 83.2130μs | 29.3188μs | 34.1078 KOps/s | 33.1654 KOps/s | |
test_select_nested | 77.2130μs | 44.2448μs | 22.6015 KOps/s | 22.5653 KOps/s | |
test_exclude_nested | 0.1141ms | 62.9298μs | 15.8907 KOps/s | 15.9458 KOps/s | |
test_empty[True] | 0.6891ms | 0.2951ms | 3.3887 KOps/s | 3.4427 KOps/s | |
test_empty[False] | 41.2455μs | 0.8272μs | 1.2089 MOps/s | 1.2223 MOps/s | |
test_to | 88.1340μs | 57.0337μs | 17.5335 KOps/s | 17.6578 KOps/s | |
test_to_nonblocking | 0.9618ms | 48.9686μs | 20.4213 KOps/s | 21.1963 KOps/s | |
test_unbind_speed | 0.6614ms | 0.2410ms | 4.1492 KOps/s | 4.2059 KOps/s | |
test_unbind_speed_stack0 | 0.6464ms | 0.2431ms | 4.1134 KOps/s | 4.2600 KOps/s | |
test_unbind_speed_stack1 | 94.2779ms | 0.6710ms | 1.4903 KOps/s | 1.5041 KOps/s | |
test_split | 95.5320ms | 1.7071ms | 585.7878 Ops/s | 617.3459 Ops/s | |
test_chunk | 95.9011ms | 1.7130ms | 583.7676 Ops/s | 613.9080 Ops/s | |
test_consolidate[False-None] | 98.0917ms | 3.0344ms | 329.5538 Ops/s | 332.6700 Ops/s | |
test_consolidate[default-None] | 1.9227ms | 1.7687ms | 565.3947 Ops/s | 586.5856 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8735ms | 1.7783ms | 562.3458 Ops/s | 567.9303 Ops/s | |
test_consolidate_njt[False-None] | 0.3028s | 8.2237ms | 121.5991 Ops/s | 113.0538 Ops/s | |
test_to[False-False-None] | 1.8406ms | 1.7265ms | 579.2150 Ops/s | 573.7178 Ops/s | |
test_to[True-False-None] | 1.4263ms | 1.3247ms | 754.9050 Ops/s | 768.2462 Ops/s | |
test_to[within-False-None] | 4.3801ms | 4.1481ms | 241.0715 Ops/s | 241.4042 Ops/s | |
test_to[True-default-None] | 5.2621ms | 5.1317ms | 194.8685 Ops/s | 189.9998 Ops/s | |
test_to_njt[False-False-None] | 6.8793ms | 6.7480ms | 148.1914 Ops/s | 144.2342 Ops/s | |
test_to_njt[True-False-None] | 5.7425ms | 5.3315ms | 187.5659 Ops/s | 179.7203 Ops/s | |
test_to_njt[within-False-None] | 11.8247ms | 11.7224ms | 85.3067 Ops/s | 81.6011 Ops/s | |
test_creation[device0] | 0.4638ms | 79.6123μs | 12.5609 KOps/s | 12.4708 KOps/s | |
test_creation_from_tensor | 0.4706ms | 83.0086μs | 12.0469 KOps/s | 11.8355 KOps/s | |
test_add_one[memmap_tensor0] | 0.4678ms | 6.4886μs | 154.1153 KOps/s | 159.0490 KOps/s | |
test_contiguous[memmap_tensor0] | 4.7337μs | 0.4015μs | 2.4906 MOps/s | 2.5390 MOps/s | |
test_stack[memmap_tensor0] | 42.4510μs | 4.8773μs | 205.0323 KOps/s | 213.3569 KOps/s | |
test_memmaptd_index | 2.0705ms | 0.2613ms | 3.8268 KOps/s | 3.8734 KOps/s | |
test_memmaptd_index_astensor | 0.8222ms | 0.3213ms | 3.1122 KOps/s | 3.1130 KOps/s | |
test_memmaptd_index_op | 1.4494ms | 0.5584ms | 1.7907 KOps/s | 1.6831 KOps/s | |
test_serialize_model | 0.1322s | 0.1311s | 7.6292 Ops/s | 7.6213 Ops/s | |
test_serialize_model_pickle | 1.3494s | 1.2163s | 0.8222 Ops/s | 0.8236 Ops/s | |
test_serialize_weights | 0.1340s | 0.1308s | 7.6472 Ops/s | 7.6515 Ops/s | |
test_serialize_weights_returnearly | 0.3443s | 60.8175ms | 16.4426 Ops/s | 12.6522 Ops/s | |
test_serialize_weights_pickle | 1.3564s | 1.2127s | 0.8246 Ops/s | 0.8211 Ops/s | |
test_reshape_pytree | 69.6830μs | 22.1808μs | 45.0841 KOps/s | 45.7273 KOps/s | |
test_reshape_td | 57.8620μs | 26.6657μs | 37.5014 KOps/s | 36.6081 KOps/s | |
test_view_pytree | 53.4720μs | 22.0353μs | 45.3816 KOps/s | 45.8446 KOps/s | |
test_view_td | 60.2030μs | 29.3418μs | 34.0810 KOps/s | 33.2689 KOps/s | |
test_unbind_pytree | 68.9120μs | 28.4337μs | 35.1695 KOps/s | 36.0727 KOps/s | |
test_unbind_td | 0.6059ms | 36.5805μs | 27.3370 KOps/s | 27.5051 KOps/s | |
test_split_pytree | 61.2520μs | 30.5973μs | 32.6826 KOps/s | 32.6778 KOps/s | |
test_split_td | 0.7775ms | 40.7569μs | 24.5357 KOps/s | 25.9045 KOps/s | |
test_add_pytree | 63.4120μs | 33.2455μs | 30.0793 KOps/s | 30.2455 KOps/s | |
test_add_td | 0.1925ms | 47.6821μs | 20.9722 KOps/s | 19.7487 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1735ms | 0.1179ms | 8.4782 KOps/s | 8.1308 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2301ms | 0.1305ms | 7.6651 KOps/s | 7.5263 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2043ms | 94.3155μs | 10.6027 KOps/s | 10.5679 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.0774ms | 0.1509ms | 6.6281 KOps/s | 6.6773 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 63.7320μs | 22.9187μs | 43.6325 KOps/s | 30.1951 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 62.0520μs | 29.6432μs | 33.7346 KOps/s | 33.6078 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4431ms | 64.0379μs | 15.6158 KOps/s | 15.2413 KOps/s | |
test_compile_copy_nested[pytree-eager] | 83.4730μs | 48.7642μs | 20.5069 KOps/s | 20.2471 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1827ms | 0.1414ms | 7.0723 KOps/s | 7.1110 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3189ms | 0.2159ms | 4.6319 KOps/s | 4.6275 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1431ms | 96.8478μs | 10.3255 KOps/s | 10.3239 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1131ms | 55.3411μs | 18.0697 KOps/s | 18.1951 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2427ms | 0.1351ms | 7.4027 KOps/s | 7.4688 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5303ms | 0.4897ms | 2.0422 KOps/s | 2.0611 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4118ms | 0.2582ms | 3.8729 KOps/s | 3.8136 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2006ms | 0.1426ms | 7.0104 KOps/s | 7.1636 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1606ms | 66.4871μs | 15.0405 KOps/s | 15.0314 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1336ms | 96.6467μs | 10.3470 KOps/s | 10.2373 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4801ms | 0.4247ms | 2.3546 KOps/s | 2.4507 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2045ms | 0.1329ms | 7.5217 KOps/s | 7.4990 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 79.7730μs | 18.5820μs | 53.8154 KOps/s | 28.7793 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 55.0720μs | 30.7652μs | 32.5043 KOps/s | 31.6139 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2140ms | 70.5348μs | 14.1774 KOps/s | 14.3632 KOps/s | |
test_compile_copy_flat[pytree-eager] | 87.0230μs | 52.0480μs | 19.2130 KOps/s | 19.6424 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6276ms | 0.3904ms | 2.5617 KOps/s | 2.2024 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8218ms | 2.6039ms | 384.0373 Ops/s | 389.7622 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6003ms | 0.4358ms | 2.2947 KOps/s | 2.2689 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7618ms | 2.6584ms | 376.1629 Ops/s | 384.9034 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2187ms | 0.1161ms | 8.6140 KOps/s | 8.9580 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6093ms | 80.4755μs | 12.4261 KOps/s | 12.9202 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2374ms | 0.1036ms | 9.6555 KOps/s | 9.7005 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1183ms | 67.8409μs | 14.7404 KOps/s | 14.4759 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1989ms | 0.1102ms | 9.0706 KOps/s | 9.4812 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1602ms | 69.3091μs | 14.4281 KOps/s | 14.3931 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2506ms | 0.1050ms | 9.5254 KOps/s | 9.7458 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1435ms | 17.1472μs | 58.3187 KOps/s | 58.1011 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1546ms | 97.6665μs | 10.2389 KOps/s | 10.4085 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1043ms | 15.8510μs | 63.0874 KOps/s | 63.3007 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2015ms | 98.6285μs | 10.1391 KOps/s | 10.2586 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1075ms | 15.9271μs | 62.7862 KOps/s | 64.3393 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1725ms | 0.1031ms | 9.7039 KOps/s | 9.8823 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6068ms | 17.1647μs | 58.2592 KOps/s | 58.6559 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2012ms | 98.8809μs | 10.1132 KOps/s | 10.3095 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1049ms | 15.9131μs | 62.8413 KOps/s | 63.8137 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2155ms | 99.9613μs | 10.0039 KOps/s | 10.3156 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1031ms | 15.8108μs | 63.2481 KOps/s | 64.3232 KOps/s | |
test_mod_add[eager] | 0.2080ms | 35.4478μs | 28.2105 KOps/s | 24.4867 KOps/s | |
test_mod_add[compile] | 0.1849ms | 77.8906μs | 12.8385 KOps/s | 12.2481 KOps/s | |
test_mod_add[compile-overhead] | 0.3253ms | 0.1658ms | 6.0324 KOps/s | 5.6966 KOps/s | |
test_mod_wrap[eager] | 0.3445ms | 0.2382ms | 4.1983 KOps/s | 3.9099 KOps/s | |
test_mod_wrap[compile] | 0.3810ms | 0.2825ms | 3.5397 KOps/s | 3.5261 KOps/s | |
test_mod_wrap[compile-overhead] | 6.9529ms | 3.7510ms | 266.5935 Ops/s | 267.5136 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6992ms | 1.3422ms | 745.0369 Ops/s | 697.5581 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3851ms | 1.2657ms | 790.0522 Ops/s | 728.7287 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3551ms | 0.9158ms | 1.0919 KOps/s | 953.7447 Ops/s | |
test_seq_add[eager] | 0.1756ms | 0.1112ms | 8.9920 KOps/s | 8.5530 KOps/s | |
test_seq_add[compile] | 0.1581ms | 92.0681μs | 10.8615 KOps/s | 11.3565 KOps/s | |
test_seq_add[compile-overhead] | 0.1773ms | 0.1316ms | 7.5962 KOps/s | 7.8283 KOps/s | |
test_seq_wrap[eager] | 0.8242ms | 0.4206ms | 2.3777 KOps/s | 2.3767 KOps/s | |
test_seq_wrap[compile] | 0.7200ms | 0.3086ms | 3.2401 KOps/s | 3.3353 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2697ms | 0.2211ms | 4.5227 KOps/s | 4.4563 KOps/s | |
test_func_call_runtime[False-eager] | 0.8014ms | 0.7068ms | 1.4148 KOps/s | 1.3922 KOps/s | |
test_func_call_runtime[False-compile] | 1.2232ms | 0.7664ms | 1.3048 KOps/s | 1.3492 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7659ms | 0.3724ms | 2.6856 KOps/s | 2.7726 KOps/s | |
test_func_call_runtime[True-eager] | 1.3210ms | 0.9102ms | 1.0987 KOps/s | 1.1384 KOps/s | |
test_func_call_runtime[True-compile] | 1.2279ms | 0.7844ms | 1.2748 KOps/s | 1.3224 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4424ms | 0.3783ms | 2.6433 KOps/s | 2.6313 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8108ms | 0.7038ms | 1.4208 KOps/s | 1.3910 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8147ms | 0.7363ms | 1.3582 KOps/s | 1.3309 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4570ms | 0.3606ms | 2.7730 KOps/s | 2.7316 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1194ms | 0.9728ms | 1.0279 KOps/s | 1.0098 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9412ms | 0.7792ms | 1.2834 KOps/s | 1.2571 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8208ms | 0.4059ms | 2.4636 KOps/s | 2.4558 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4641ms | 2.0177ms | 495.6225 Ops/s | 489.2664 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.1960ms | 0.7907ms | 1.2647 KOps/s | 1.2364 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.8485ms | 0.4088ms | 2.4459 KOps/s | 2.4424 KOps/s | |
test_distributed | 4.4408ms | 0.1800ms | 5.5556 KOps/s | 8.3538 KOps/s | |
test_tdmodule | 0.3045ms | 18.9542μs | 52.7588 KOps/s | 47.5214 KOps/s | |
test_tdmodule_dispatch | 59.0930μs | 33.4583μs | 29.8880 KOps/s | 27.0124 KOps/s | |
test_tdseq | 44.3310μs | 19.5473μs | 51.1579 KOps/s | 47.1525 KOps/s | |
test_tdseq_dispatch | 60.8030μs | 36.3552μs | 27.5064 KOps/s | 25.2633 KOps/s | |
test_instantiation_functorch | 1.6068ms | 1.5333ms | 652.1730 Ops/s | 652.0140 Ops/s | |
test_exec_functorch | 0.1856ms | 0.1429ms | 6.9958 KOps/s | 7.1209 KOps/s | |
test_exec_functional_call | 0.1756ms | 0.1334ms | 7.4960 KOps/s | 7.6157 KOps/s | |
test_exec_td_decorator | 0.3836ms | 0.1801ms | 5.5535 KOps/s | 5.6046 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7861ms | 0.6671ms | 1.4991 KOps/s | 1.4860 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7539ms | 0.6609ms | 1.5130 KOps/s | 1.4933 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7023ms | 0.5791ms | 1.7268 KOps/s | 1.7209 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7199ms | 0.5848ms | 1.7099 KOps/s | 1.7162 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.0862ms | 18.7659ms | 53.2880 Ops/s | 53.1251 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.4739ms | 18.7997ms | 53.1925 Ops/s | 53.1897 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.7307ms | 18.8049ms | 53.1775 Ops/s | 53.6781 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.0227ms | 18.6289ms | 53.6800 Ops/s | 53.5237 Ops/s | |
test_to_module_speed[True] | 1.0787ms | 0.9736ms | 1.0271 KOps/s | 1.0308 KOps/s | |
test_to_module_speed[False] | 1.5131ms | 0.9391ms | 1.0648 KOps/s | 1.0372 KOps/s | |
test_tc_init | 62.9830μs | 34.5004μs | 28.9851 KOps/s | 27.4256 KOps/s | |
test_tc_init_nested | 0.1051ms | 71.9114μs | 13.9060 KOps/s | 13.9148 KOps/s | |
test_tc_first_layer_tensor | 32.7220μs | 0.7981μs | 1.2530 MOps/s | 1.2304 MOps/s | |
test_tc_first_layer_nontensor | 20.0810μs | 2.2615μs | 442.1940 KOps/s | 451.0266 KOps/s | |
test_tc_second_layer_tensor | 31.2510μs | 1.4944μs | 669.1583 KOps/s | 705.8785 KOps/s | |
test_tc_second_layer_nontensor | 33.6310μs | 2.9997μs | 333.3643 KOps/s | 338.8064 KOps/s | |
test_unbind | 0.2305s | 10.0063ms | 99.9367 Ops/s | 143.7064 Ops/s | |
test_full_like | 9.7356ms | 9.1479ms | 109.3142 Ops/s | 109.1881 Ops/s | |
test_zeros_like | 6.6437ms | 4.3602ms | 229.3470 Ops/s | 140.8856 Ops/s | |
test_ones_like | 9.2830ms | 7.3040ms | 136.9109 Ops/s | 138.5971 Ops/s | |
test_clone | 6.6458ms | 6.4393ms | 155.2975 Ops/s | 109.4984 Ops/s | |
test_squeeze | 53.1620μs | 9.4885μs | 105.3907 KOps/s | 104.6228 KOps/s | |
test_unsqueeze | 0.1299ms | 72.3765μs | 13.8166 KOps/s | 13.6171 KOps/s | |
test_split | 0.3893ms | 0.1590ms | 6.2887 KOps/s | 5.9441 KOps/s | |
test_permute | 0.2781ms | 0.1723ms | 5.8039 KOps/s | 5.7556 KOps/s | |
test_stack | 50.7087ms | 50.3082ms | 19.8775 Ops/s | 19.6853 Ops/s | |
test_cat | 50.8993ms | 50.2622ms | 19.8957 Ops/s | 19.8337 Ops/s |
vmoens added a commit that referenced this pull request (Jan 9, 2025)
ghstack-source-id: 611d21a58aabf24ec2e7843d637d5f20ccd04a3b
Pull Request resolved: #1170
vmoens added a commit that referenced this pull request (Jan 9, 2025)
ghstack-source-id: d3c5067beeb099d1ae080752bc6e218d543c7515
Pull Request resolved: #1170
vmoens added a commit that referenced this pull request (Jan 9, 2025)
ghstack-source-id: fa25726d61e913a725a71f1579eb06b09455e7c8
Pull Request resolved: #1170
Labels: CLA Signed (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed), enhancement (new feature or request)
Stack from ghstack (oldest at bottom):
__torch_function__
#1169