Speed / accuracy comparison against torch.linalg.solve #1

Open
lezcano opened this issue Aug 16, 2024 · 3 comments

Comments


lezcano commented Aug 16, 2024

I would expect this Python implementation to be quite a bit slower than linalg.solve. If that is the case, the main application would be using it with custom operators, which PyTorch doesn't currently support natively (see pytorch/pytorch#28341)
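For concreteness, a rough sketch of what such a comparison could look like is below. SciPy's `cg` merely stands in for an iterative solver (the repo's own entry points aren't referenced in this thread), and the matrix size, dtype, and conditioning are arbitrary choices, so treat this only as a template for the speed/accuracy measurement, not as a result.

```python
import time
import numpy as np
import torch
from scipy.sparse.linalg import cg

n = 4096
A = torch.randn(n, n, dtype=torch.float64)
A = A @ A.T + n * torch.eye(n, dtype=torch.float64)   # well-conditioned SPD matrix
b = torch.randn(n, dtype=torch.float64)

t0 = time.perf_counter()
x_direct = torch.linalg.solve(A, b)
t_direct = time.perf_counter() - t0

t0 = time.perf_counter()
x_iter, info = cg(A.numpy(), b.numpy())               # info == 0 means CG converged
t_iter = time.perf_counter() - t0

res_direct = torch.linalg.norm(A @ x_direct - b).item()
res_iter = np.linalg.norm(A.numpy() @ x_iter - b.numpy())
print(f"solve: {t_direct:.3f}s, residual {res_direct:.2e}")
print(f"cg:    {t_iter:.3f}s, residual {res_iter:.2e}")
```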

devzhk (Owner) commented Aug 16, 2024

I'd like to elaborate on the more fundamental reasons why CG and GMRES are needed and where they may be useful.

  • Limitations of torch.linalg.solve: it is a dense direct solver, so it needs the matrix stored explicitly and only handles square, invertible systems; it cannot work with operators that are available only through matrix-vector products. It is also impractical for large-scale systems because of its $\mathcal{O}(n^2)$ memory and $\mathcal{O}(n^3)$ time complexity.
  • CG is the method of choice for large-scale symmetric positive definite systems. It touches the matrix only through matrix-vector products and keeps just a handful of length-$n$ vectors in memory, i.e. $\mathcal{O}(n)$ storage beyond the operator itself. Among Krylov methods its convergence is optimal for SPD systems (it minimizes the error in the $A$-norm at every iteration), even without preconditioning. See this paper for more details (a minimal PyTorch sketch follows this list).
  • GMRES is a more general iterative solver that can handle non-symmetric and indefinite systems, at the cost of storing the Krylov basis, so its memory grows as $\mathcal{O}(nm)$, where $n$ is the problem dimension and $m$ is the number of iterations, unless it is restarted. See this paper for more details.
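To make the memory and matrix-free points concrete, here is a minimal, unpreconditioned CG sketch in PyTorch. It illustrates the textbook algorithm rather than this repo's implementation, and the `matvec`/`tol`/`maxiter` names are placeholder choices.

```python
import torch

def cg(matvec, b, x0=None, tol=1e-6, maxiter=None):
    """Solve A x = b where A is symmetric positive definite and is given
    only through the callable `matvec(v) -> A @ v`."""
    x = torch.zeros_like(b) if x0 is None else x0.clone()
    r = b - matvec(x)                      # residual
    p = r.clone()                          # search direction
    rs_old = torch.dot(r, r)
    maxiter = b.numel() if maxiter is None else maxiter
    for _ in range(maxiter):
        if torch.sqrt(rs_old) < tol:       # early stopping on the residual norm
            break
        Ap = matvec(p)
        alpha = rs_old / torch.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = torch.dot(r, r)
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x
```

Because the matrix enters only through `matvec`, the same code works whether the operator is an explicit tensor (`lambda v: M @ v`) or something that is never materialized, which is the custom-operator use case mentioned in the first comment; loosening `tol` or capping `maxiter` gives the accuracy-speed tradeoff discussed below.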

In summary,

  • Both CG and GMRES can solve a broader class of problems than torch.linalg.solve can handle.
  • As iterative algorithms, they offer the flexibility to control the tolerance and perform early stopping. This provides an explicit accuracy-speed tradeoff, which is particularly valuable for large-scale systems.
  • They may open up more research possibilities in designing algorithms for neural networks. As a proof of concept, we used CG for a new optimizer in our previous ICML paper Implicit competitive regularization in GANs.

devzhk (Owner) commented Aug 16, 2024

The specific implementation can always be optimized later; the more important question is whether incorporating these new algorithms makes sense from a strategic, long-term perspective for the development of PyTorch.

lezcano (Author) commented Aug 17, 2024

For generic systems we have lstsq. For symmetric indefinite systems we have https://pytorch.org/docs/stable/generated/torch.linalg.ldl_factor.html.
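For reference, minimal usage of these two existing solvers might look like the sketch below; the shapes and dtypes are arbitrary, and `ldl_solve` is the companion routine to `ldl_factor` in torch.linalg.

```python
import torch

# Generic (possibly rectangular) system: least-squares solution via lstsq.
A = torch.randn(100, 60, dtype=torch.float64)
b = torch.randn(100, 1, dtype=torch.float64)
x_lstsq = torch.linalg.lstsq(A, b).solution

# Symmetric indefinite system: LDL^T factorization followed by a solve.
S = torch.randn(60, 60, dtype=torch.float64)
S = S + S.T                                  # symmetric, not necessarily positive definite
c = torch.randn(60, 1, dtype=torch.float64)
LD, pivots = torch.linalg.ldl_factor(S)
x_ldl = torch.linalg.ldl_solve(LD, pivots, c)
```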

That being said, I do agree that these solvers could potentially be of interest, but before adding them to PyTorch core, we would need to:

  • Find out whether there are efficient implementations of these in cusolver/magma/blas. This is why I mentioned benchmarks in the OP.
  • Discuss what would be a good API to expose them.
  • Discuss whether these belong in PyTorch core or a third party library that also supports the concept of LinearOperator.
