Releases · NVIDIA/warp
v1.4.1
Changelog
[1.4.1] - 2024-10-15
Fixed
- Fix `iter_reverse()` not working as expected for ranges with steps other than 1 (GH-311).
- Fix potential out-of-bounds memory access when a `wp.sparse.BsrMatrix` object is reused for storing matrices of different shapes.
- Fix robustness to very low desired tolerance in `wp.fem.utils.symmetric_eigenvalues_qr`.
- Fix invalid code generation error messages when nesting dynamic and static for-loops.
- Fix caching of kernels with static expressions.
- Fix `ModelBuilder.add_builder(builder)` to correctly update `articulation_start` and thereby `articulation_count` when `builder` contains more than one articulation.
- Re-introduced the `wp.rand*()`, `wp.sample*()`, and `wp.poisson()` onto the Python scope to revert a breaking change.
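The `iter_reverse()` fix concerns reversing ranges whose step is not 1. The plain-Python sketch below is not Warp's implementation; it only illustrates why such ranges are easy to get wrong: the last element of the forward range is generally not `stop - step`, so the reverse walk has to start from the element count. The helper name `reversed_range` is hypothetical.

```python
def reversed_range(start, stop, step):
    """Yield the elements of range(start, stop, step) in reverse order."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Number of elements in the forward range (zero if the range is empty).
    if step > 0:
        count = max(0, (stop - start + step - 1) // step)
    else:
        count = max(0, (start - stop - step - 1) // (-step))
    # The last forward element is start + (count - 1) * step, which differs
    # from stop - step whenever the span is not a multiple of the step.
    for k in range(count - 1, -1, -1):
        yield start + k * step


# range(0, 10, 3) visits 0, 3, 6, 9, so the reverse order starts at 9, not 7.
print(list(reversed_range(0, 10, 3)))
```

Comparing the result against Python's own `reversed(range(...))` for a few step sizes is a quick way to check the edge cases.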
v.1.4.0
CHANGELOG
[1.4.0] - 2024-10-01
Added
- Support for a new `wp.static(expr)` function that allows arbitrary Python expressions to be evaluated at the time of function/kernel definition (docs).
- Support for stream priorities to hint to the device that it should process pending work in high-priority streams over pending work in low-priority streams when possible (docs).
- Adaptive sparse grid geometry to `warp.fem` (docs).
- Support for defining `wp.kernel` and `wp.func` objects from within closures.
- Support for defining multiple versions of kernels, functions, and structs without manually assigning unique keys.
- Support for default argument values for user functions decorated with `wp.func`.
- Allow passing custom launch dimensions to `jax_kernel()` (GH-310).
- JAX interoperability examples for sharding and matrix multiplication (docs).
- Interoperability support for the PaddlePaddle ML framework (GH-318).
- Support `wp.mod()` for vector types (GH-282).
- Expose the modulo operator `%` to Python's runtime scalar and vector types.
- Support for fp64 `atomic_add`, `atomic_max`, and `atomic_min` (GH-284).
- Support for quaternion indexing (e.g. `q.w`).
- Support shadowing builtin functions (GH-308).
- Support for redefining function overloads.
- Add an ocean sample to the `omni.warp` extension.
- `warp.sim.VBDIntegrator` now supports body-particle collision.
- Add a contributing guide to the Sphinx docs.
- Add documentation for dynamic code generation (docs).
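The idea behind evaluating an expression "at the time of function/kernel definition" can be illustrated without Warp. The sketch below uses an ordinary Python closure and does not call `wp.static()`; the names `make_poly_eval` and `exp3` are hypothetical. The coefficient table is computed once, when the inner function is defined, and later calls only read the precomputed values.

```python
import math


def make_poly_eval(degree):
    """Build a truncated Taylor polynomial of exp(x) with precomputed coefficients."""
    # Evaluated exactly once, at definition time of `poly`.
    coeffs = tuple(1.0 / math.factorial(k) for k in range(degree + 1))

    def poly(x):
        # Only the baked-in constants are used at call time.
        return sum(c * x**k for k, c in enumerate(coeffs))

    return poly


exp3 = make_poly_eval(3)  # coefficients 1, 1, 1/2, 1/6 are fixed here
print(exp3(1.0))
```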
Changed
- `wp.sim.Model.edge_indices` now includes boundary edges.
- Unexposed `wp.rand*()`, `wp.sample*()`, and `wp.poisson()` from the Python scope.
- Skip unused functions in module code generation, improving performance.
- Avoid reloading modules if their content does not change, improving performance.
- `wp.Mesh.points` is now a property instead of a raw data member, its reference can be changed after the mesh is initialized.
- Improve error message when invalid objects are referenced in a Warp kernel.
- `if`/`else`/`elif` statements with constant conditions are resolved at compile time with no branches being inserted in the generated code.
- Include all non-hidden builtins in the stub file.
- Improve accuracy of symmetric eigenvalues routine in `warp.fem`.
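Resolving constant conditions ahead of time can be sketched with Python's standard `ast` module. This is not Warp's code generator; it is a minimal illustration, assuming Python 3.9 or newer for `ast.unparse`, of replacing an `if`/`elif`/`else` chain whose tests are literal constants with the branch that would be taken. The class and function names are hypothetical.

```python
import ast


class ConstantIfFolder(ast.NodeTransformer):
    """Replace `if` statements whose test is a literal constant with the taken branch."""

    def visit_If(self, node):
        # Fold nested statements first so inner constant branches are resolved too.
        self.generic_visit(node)
        if isinstance(node.test, ast.Constant):
            taken = node.body if node.test.value else node.orelse
            # Returning a list splices the statements into the parent body;
            # an empty branch becomes `pass` so the parent body stays valid.
            return taken or [ast.Pass()]
        return node


def fold_constant_ifs(source):
    """Return `source` with constant-condition branches removed."""
    tree = ConstantIfFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


src = """
def f(x):
    if False:
        y = 0
    elif True:
        y = x + 1
    else:
        y = x - 1
    return y
"""

folded = fold_constant_ifs(src)
print(folded)
```

Only the `y = x + 1` assignment and the `return` survive in the folded function body, so no branch remains in the emitted code.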
Fixed
- Fix for `wp.func` erroring out when defining a `Tuple` as a return type hint (GH-302).
- Fix array in-place op (`+=`, `-=`) adjoints to compute gradients correctly in the backwards pass.
- Fix vector, matrix in-place assignment adjoints to compute gradients correctly in the backwards pass, e.g.: `v[1] = x`.
- Fix a bug in which Python docstrings would be created as local function variables in generated code.
- Fix a bug with autograd array access validation in functions from different modules.
- Fix a rare crash during error reporting on some systems due to glibc mismatches.
- Handle `--num_tiles 1` in `example_render_opengl.py` (GH-306).
- Fix the computation of body contact forces in `FeatherstoneIntegrator` when bodies and particles collide.
- Fix bug in `FeatherstoneIntegrator` where `eval_rigid_jacobian` could give incorrect results or reach an infinite loop when the body and joint indices were not in the same order. Added `Model.joint_ancestor` to fix the indexing from a joint to its parent joint in the articulation.
- Fix wrong vertex index passed to `add_edges()` called from `ModelBuilder.add_cloth_mesh()` (GH-319).
- Add a workaround for uninitialized memory read warning in the `compute-sanitizer` initcheck tool when using `wp.Mesh`.
- Fix name clashes when Warp functions and structs are returned from Python functions multiple times.
- Fix name clashes between Warp functions and structs defined in different modules.
- Fix code generation errors when overloading generic kernels defined in a Python function.
- Fix issues with unrelated functions being treated as overloads (e.g., closures).
- Fix handling of `stream` argument in `array.__dlpack__()`.
- Fix a bug related to reloading CPU modules.
- Fix a crash when kernel functions are not found in CPU modules.
- Fix conditions not being evaluated as expected in `while` statements.
- Fix printing Boolean and 8-bit integer values.
- Fix array interface type strings used for Boolean and 8-bit integer values.
- Fix initialization error when setting struct members.
- Fix Warp not being initialized upon entering a `wp.Tape` context.
- Use `kDLBool` instead of `kDLUInt` for DLPack interop of Booleans.
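The in-place assignment case `v[1] = x` has a simple reverse-mode rule that can be written out by hand. The sketch below is plain Python rather than Warp's generated adjoint code, and the loss function (a sum of squares) is an assumption chosen only to have something to differentiate: the gradient arriving at slot 1 flows to `x`, and the overwritten input component receives no gradient.

```python
def forward(v, x):
    """Loss: sum of squares of v after the in-place assignment v[1] = x."""
    w = list(v)  # work on a copy so the caller's data is untouched
    w[1] = x     # in-place component assignment
    return sum(c * c for c in w)


def backward(v, x):
    """Hand-written reverse pass for `forward`; returns (adj_v, adj_x)."""
    w = list(v)
    w[1] = x
    adj_w = [2.0 * c for c in w]  # d(loss)/d(w) for the sum of squares
    # Adjoint of `w[1] = x`: slot 1's gradient goes to x, and the value
    # that was overwritten no longer influences the loss.
    adj_x = adj_w[1]
    adj_v = list(adj_w)
    adj_v[1] = 0.0
    return adj_v, adj_x


v, x = [1.0, 2.0, 3.0], 0.5
adj_v, adj_x = backward(v, x)

# Central finite difference in x as an independent check of adj_x.
h = 1e-6
fd_x = (forward(v, x + h) - forward(v, x - h)) / (2.0 * h)
print(adj_v, adj_x, fd_x)
```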
v1.3.3
[1.3.3] - 2024-09-04
- Bug fixes
  - Fix an aliasing issue with zero-copy array initialization from NumPy introduced in Warp 1.3.0.
  - Fix `wp.Volume.load_from_numpy()` behavior when `bg_value` is a sequence of values.
v1.3.2
[1.3.2] - 2024-08-30
- Bug fixes
  - Fix accuracy of 3x3 SVD `wp.svd3` with fp64 numbers (GH-281).
  - Fix module hashing when a kernel argument contained a struct array (GH-287).
  - Fix a bug in `wp.bvh_query_ray()` where the direction instead of the reciprocal direction was used (GH-288).
  - Fix errors when launching a CUDA graph after a module is reloaded. Modules that were used during graph capture will no longer be unloaded before the graph is released.
  - Fix a bug in `wp.sim.collide.triangle_closest_point_barycentric()` where the returned barycentric coordinates may be incorrect when the closest point lies on an edge.
  - Fix 32-bit overflow when array shape is specified using `np.int32`.
  - Fix handling of integer indices in the `input_output_mask` argument to `autograd.jacobian` and `autograd.jacobian_fd` (GH-289).
  - Fix `ModelBuilder.collapse_fixed_joints()` to correctly update the body centers of mass and the `ModelBuilder.articulation_start` array.
  - Fix precedence of closure constants over global constants.
  - Fix quadrature point indexing in `wp.fem.ExplicitQuadrature` (regression from 1.3.0).
- Documentation improvements
  - Add missing return types for built-in functions.
  - Clarify that atomic operations also return the previous value.
  - Clarify that `wp.bvh_query_aabb()` returns parts that overlap the bounding volume.
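The `wp.bvh_query_ray()` entry distinguishes a ray's direction from its reciprocal direction. The sketch below shows the textbook slab test for a ray against an axis-aligned box, where each plane distance is multiplied by the per-axis reciprocal `1 / d`. It is a generic illustration in plain Python, not Warp's BVH traversal, and the function name `ray_aabb_hit` is hypothetical.

```python
import math


def ray_aabb_hit(origin, direction, lower, upper):
    """Return the entry distance t >= 0 of a ray into an axis-aligned box, or None."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, lower, upper):
        if d == 0.0:
            # Ray is parallel to this slab: it must already lie between the planes.
            if o < lo or o > hi:
                return None
            continue
        rcp = 1.0 / d  # reciprocal direction for this axis
        t0, t1 = (lo - o) * rcp, (hi - o) * rcp
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


box_lower, box_upper = (2.0, -1.0, -1.0), (4.0, 1.0, 1.0)
# With direction (2, 0, 0) the box is entered at t = 1; multiplying by the
# direction instead of its reciprocal would give 4 for the same ray.
print(ray_aabb_hit((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), box_lower, box_upper))
```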
v1.3.1
[1.3.1] - 2024-07-27
- Remove `wp.synchronize()` from PyTorch autograd function example
- `Tape.check_kernel_array_access()` and `Tape.reset_array_read_flags()` are now private methods.
- Fix reporting unmatched argument types
v1.3.0
[1.3.0] - 2024-07-25
- Warp Core improvements
  - Update to CUDA 12.x by default (requires NVIDIA driver 525 or newer), please see README.md for commands to install CUDA 11.x binaries for older drivers
  - Add information to the module load print outs to indicate whether a module was compiled `(compiled)`, loaded from the cache `(cached)`, or was unable to be loaded `(error)`.
  - `wp.config.verbose = True` now also prints out a message upon the entry to a `wp.ScopedTimer`.
  - Add `wp.clear_kernel_cache()` to the public API. This is equivalent to `wp.build.clear_kernel_cache()`.
  - Add code-completion support for `wp.config` variables.
  - Remove usage of a static task (thread) index for CPU kernels to address multithreading concerns (GH-224)
  - Improve error messages for unsupported Python operations such as sequence construction in kernels
  - Update `wp.matmul()` CPU fallback to use dtype explicitly in `np.matmul()` call
  - Add support for PEP 563's `from __future__ import annotations` (GH-256).
  - Allow passing external arrays/tensors to `wp.launch()` directly via `__cuda_array_interface__` and `__array_interface__`, up to 2.5x faster conversion from PyTorch
  - Add faster Torch interop path using `return_ctype` argument to `wp.from_torch()`
  - Handle incompatible CUDA driver versions gracefully
  - Add `wp.abs()` and `wp.sign()` for vector types
  - Expose scalar arithmetic operators to Python's runtime (e.g.: `wp.float16(1.23) * wp.float16(2.34)`)
  - Add support for creating volumes with anisotropic transforms
  - Allow users to pass function arguments by keyword in a kernel using standard Python calling semantics
  - Add additional documentation and examples demonstrating `wp.copy()`, `wp.clone()`, and `array.assign()` differentiability
  - Add `__new__()` methods for all class `__del__()` methods to handle when a class instance is created but not instantiated before garbage collection
  - Implement the assignment operator for `wp.quat`
  - Make the geometry-related built-ins available only from within kernels
  - Rename the API-facing query types to remove their `_t` suffix: `wp.BVHQuery`, `wp.HashGridQuery`, `wp.MeshQueryAABB`, `wp.MeshQueryPoint`, and `wp.MeshQueryRay`
  - Add `wp.array(ptr=...)` to allow initializing arrays from pointer addresses inside of kernels (GH-206)
- `warp.autograd` improvements:
  - New `warp.autograd` module with utility functions `gradcheck()`, `jacobian()`, and `jacobian_fd()` for debugging kernel Jacobians (docs)
  - Add array overwrite detection, if `wp.config.verify_autograd_array_access` is true in-place operations on arrays on the Tape that could break gradient computation will be detected (docs)
  - Fix bug where modification of `@wp.func_replay` functions and native snippets would not trigger module recompilation
  - Add documentation for dynamic loop autograd limitations
- `warp.sim` improvements:
  - Improve memory usage and performance for rigid body contact handling when `self.rigid_mesh_contact_max` is zero (default behavior).
  - The `mask` argument to `wp.sim.eval_fk()` now accepts both integer and boolean arrays to mask articulations.
  - Fix handling of `ModelBuilder.joint_act` in `ModelBuilder.collapse_fixed_joints()` (affected floating-base systems)
  - Fix and improve implementation of `ModelBuilder.plot_articulation()` to visualize the articulation tree of a rigid-body mechanism
  - Fix ShapeInstancer `__new__()` method (missing instance return and `*args` parameter)
  - Fix handling of `upaxis` variable in `ModelBuilder` and the rendering thereof in `OpenGLRenderer`
- `warp.sparse` improvements:
  - Sparse matrix allocations (from `bsr_from_triplets()`, `bsr_axpy()`, etc.) can now be captured in CUDA graphs; exact number of non-zeros can be optionally requested asynchronously.
  - `bsr_assign()` now supports changing block shape (including CSR/BSR conversions)
  - Add Python operator overloads for common sparse matrix operations, e.g. `A += 0.5 * B`, `y = x @ C`
- `warp.fem` new features and fixes:
  - Support for variable number of nodes per element
  - Global `wp.fem.lookup()` operator now supports `wp.fem.Tetmesh` and `wp.fem.Trimesh2D` geometries
  - Simplified defining custom subdomains (`wp.fem.Subdomain`), free-slip boundary conditions
  - New field types: `wp.fem.UniformField`, `wp.fem.ImplicitField` and `wp.fem.NonconformingField`
  - New `streamlines`, `magnetostatics` and `nonconforming_contact` examples, updated `mixed_elasticity` to use a nonlinear model
  - Function spaces can now export VTK-compatible cells for visualization
  - Fixed edge cases with NanoVDB function spaces
  - Fixed differentiability of `wp.fem.PicQuadrature` w.r.t. positions and measures
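The kind of check that Jacobian debugging utilities perform can be sketched in plain Python: compare a hand-derived Jacobian against central finite differences. The code below does not use `warp.autograd`; the names `fd_jacobian` and `check_jacobian` are hypothetical stand-ins, and the test function is an assumption picked for illustration.

```python
def fd_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f; entry [i][j] approximates d f_i / d x_j."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(n_out):
            jac[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return jac


def check_jacobian(f, analytic_jacobian, x, atol=1e-5):
    """Return True if the analytic Jacobian matches finite differences within atol."""
    numeric = fd_jacobian(f, x)
    exact = analytic_jacobian(x)
    return all(
        abs(n - e) <= atol
        for row_n, row_e in zip(numeric, exact)
        for n, e in zip(row_n, row_e)
    )


def f(x):
    return [x[0] * x[1], x[0] ** 2]


def df(x):
    return [[x[1], x[0]], [2.0 * x[0], 0.0]]


print(check_jacobian(f, df, [1.5, -2.0]))
```

A deliberately wrong Jacobian (for example, one with the rows swapped) fails the same check, which is the usual way to confirm the comparison is doing real work.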
v1.2.2
[1.2.2] - 2024-07-04
- Support for NumPy >= 2.0
v1.2.1
[1.2.1] - 2024-06-14
- Fix generic function caching
- Fix Warp not being initialized when constructing arrays with `wp.array()`
- Fix `wp.is_mempool_access_supported()` not resolving the provided device arguments to `wp.context.Device`
v1.2.0
[1.2.0] - 2024-06-06
- Add a not-a-number floating-point constant that can be used as `wp.NAN` or `wp.nan`.
- Add `wp.isnan()`, `wp.isinf()`, and `wp.isfinite()` for scalars, vectors, matrices, etc.
- Improve kernel cache reuse by hashing just the local module constants. Previously, a module's hash was affected by all `wp.constant()` variables declared in a Warp program.
- Revised module compilation process to allow multiple processes to use the same kernel cache directory. Cached kernels will now be stored in a hash-specific subdirectory.
- Add runtime checks for `wp.MarchingCubes` on field dimensions and size
- Fix memory leak in `wp.Mesh` BVH (GH-225)
- Use C++17 when building the Warp library and user kernels
- Increase PTX target architecture up to `sm_75` (from `sm_70`), enabling Turing ISA features
- Extended NanoVDB support (see `warp.Volume`):
  - Add support for data-agnostic index grids, allocation at voxel granularity
  - New `wp.volume_lookup_index()`, `wp.volume_sample_index()` and generic `wp.volume_sample()`/`wp.volume_lookup()`/`wp.volume_store()` kernel-level functions
  - Zero-copy aliasing of in-memory grids, support for multi-grid buffers
  - Grid introspection and blind data access capabilities
  - `warp.fem` can now work directly on NanoVDB grids using `warp.fem.Nanogrid`
  - Fixed `wp.volume_sample_v()` and `wp.volume_store_*()` adjoints
  - Prevent `wp.volume_store()` from overwriting grid background values
- Improve validation of user-provided fields and values in `warp.fem`
- Support headless rendering of `wp.render.OpenGLRenderer` via `pyglet.options["headless"] = True`
- `wp.render.RegisteredGLBuffer` can fall back to CPU-bound copying if CUDA/OpenGL interop is not available
- Clarify terms for external contributions, please see CONTRIBUTING.md for details
- Improve performance of `wp.sparse.bsr_mm()` by ~5x on benchmark problems
- Fix for XPBD incorrectly indexing into the joint actuation `joint_act` arrays
- Fix for mass matrix gradients computation in `wp.sim.FeatherstoneIntegrator()`
- Fix for handling of `--msvc_path` in build scripts
- Fix for `wp.copy()` params to record dest and src offset parameters on `wp.Tape()`
- Fix for `wp.randn()` to ensure return values are finite
- Fix for slicing of arrays with gradients in kernels
- Fix for function overload caching, ensure module is rebuilt if any function overloads are modified
- Fix for handling of `bool` types in generic kernels
- Publish CUDA 12.5 binaries for Hopper support, see https://github.com/nvidia/warp?tab=readme-ov-file#installing for details
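The two kernel cache entries above (hashing only a module's local constants, and storing cached kernels in a hash-specific subdirectory) can be illustrated with the standard library. This is a sketch under assumptions, not Warp's hashing scheme: the function names, the digest length, and the directory naming pattern are all hypothetical.

```python
import hashlib
import os


def module_hash(source, local_constants):
    """Hash a module's source together with only the constants it references."""
    h = hashlib.sha256()
    h.update(source.encode("utf-8"))
    # Sort by name so the digest does not depend on dictionary ordering.
    for name in sorted(local_constants):
        h.update(f"{name}={local_constants[name]!r}".encode("utf-8"))
    return h.hexdigest()[:7]


def cache_dir_for(cache_root, module_name, source, local_constants):
    """Return a per-hash subdirectory so different builds never share files."""
    digest = module_hash(source, local_constants)
    return os.path.join(cache_root, f"{module_name}_{digest}")


kernel_source = "def scale(x): return GAIN * x"
dir_a = cache_dir_for("cache", "my_module", kernel_source, {"GAIN": 2.0})
dir_b = cache_dir_for("cache", "my_module", kernel_source, {"GAIN": 3.0})
print(dir_a, dir_b)
```

Because constants declared elsewhere in the program never enter the digest, changing them leaves the directory name unchanged, while a change to a referenced constant selects a different subdirectory.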
[1.1.1] - 2024-05-24
- `wp.init()` is no longer required to be called explicitly and will be performed on first call to the API
- Speed up `omni.warp.core`'s startup time
v1.1.0
[1.1.0] - 2024-05-09
- Support returning a value from `@wp.func_native` CUDA functions using type hints
- Improved differentiability of the `wp.sim.FeatherstoneIntegrator`
- Fix gradient propagation for rigid body contacts in `wp.sim.collide()`
- Added support for event-based timing, see `wp.ScopedTimer()`
- Added Tape visualization and debugging functions, see `wp.Tape.visualize()`
- Support constructing Warp arrays from objects that define the `__cuda_array_interface__` attribute
- Support copying a struct to another device, use `struct.to(device)` to migrate struct arrays
- Allow rigid shapes to not have any collisions with other shapes in `wp.sim.Model`
- Change default test behavior to test redundant GPUs (up to 2x)
- Test each example in an individual subprocess
- Polish and optimize various examples and tests
- Allow non-contiguous point arrays to be passed to `wp.HashGrid.build()`
- Upgrade LLVM to 18.1.3 for from-source builds and Linux x86-64 builds
- Build DLL source code as C++17 and require GCC 9.4 as a minimum
- Array clone, assign, and copy are now differentiable
- Use `Ruff` for formatting and linting
- Various documentation improvements (infinity, math constants, etc.)
- Improve URDF importer, handle joint armature
- Allow builtins.bool to be used in Warp data structures
- Use external gradient arrays in backward passes when passed to `wp.launch()`
- Add Conjugate Residual linear solver, see `wp.optim.linear.cr()`
- Fix propagation of gradients on aliased copy of variables in kernels
- Facilitate debugging and speed up `import warp` by eliminating raising any exceptions
- Improve support for nested vec/mat assignments in structs
- Recommend Python 3.9 or higher, which is required for JAX and soon PyTorch.
- Support gradient propagation for indexing sliced multi-dimensional arrays, i.e. `a[i][j]` vs. `a[i, j]`
- Provide an informative message if setting DLL C-types failed, instructing to try rebuilding the library
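For readers unfamiliar with the Conjugate Residual method named above, the following is a minimal dense, pure-Python sketch of the textbook iteration for a symmetric system. It does not call `wp.optim.linear.cr()` and makes no claim about that function's signature or internals; the helper names and the 2x2 test system are assumptions for illustration.

```python
def matvec(A, v):
    """Dense matrix-vector product with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_residual(A, b, tol=1e-10, max_iters=100):
    """Solve A x = b for a symmetric positive definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the zero initial guess
    p = list(r)
    Ar = matvec(A, r)
    Ap = list(Ar)
    rAr = dot(r, Ar)
    for _ in range(max_iters):
        if dot(r, r) ** 0.5 <= tol:
            break
        alpha = rAr / dot(Ap, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        Ar = matvec(A, r)
        rAr_new = dot(r, Ar)
        beta = rAr_new / rAr
        # Update the search direction and its image under A without
        # an extra matrix-vector product.
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        Ap = [ari + beta * api for ari, api in zip(Ar, Ap)]
        rAr = rAr_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_residual(A, b)
print(solution)  # close to [1/11, 7/11]
```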
[1.0.3] - 2024-04-17
- Add a `support_level` entry to the configuration file of the extensions