Speed / accuracy comparison against torch.linalg.solve #1

Open
lezcano opened this issue Aug 16, 2024 · 3 comments

Comments


lezcano commented Aug 16, 2024

I would expect this Python implementation to be quite a bit slower than linalg.solve. If that is the case, the main application would be using it with custom operators, which PyTorch doesn't currently support natively (see pytorch/pytorch#28341)
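For concreteness, a rough sketch of what such a comparison could look like is below. SciPy's `cg` merely stands in for an iterative solver (the repo's own entry points aren't referenced in this thread), and the matrix size, dtype, and conditioning are arbitrary choices, so treat this only as a template for the speed/accuracy measurement, not as a result.

```python
import time
import numpy as np
import torch
from scipy.sparse.linalg import cg

n = 4096
A = torch.randn(n, n, dtype=torch.float64)
A = A @ A.T + n * torch.eye(n, dtype=torch.float64)   # well-conditioned SPD matrix
b = torch.randn(n, dtype=torch.float64)

t0 = time.perf_counter()
x_direct = torch.linalg.solve(A, b)
t_direct = time.perf_counter() - t0

t0 = time.perf_counter()
x_iter, info = cg(A.numpy(), b.numpy())               # info == 0 means CG converged
t_iter = time.perf_counter() - t0

res_direct = torch.linalg.norm(A @ x_direct - b).item()
res_iter = np.linalg.norm(A.numpy() @ x_iter - b.numpy())
print(f"solve: {t_direct:.3f}s, residual {res_direct:.2e}")
print(f"cg:    {t_iter:.3f}s, residual {res_iter:.2e}")
```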

devzhk (Owner) commented Aug 16, 2024

I'd like to elaborate on the more fundamental reasons why CG and GMRES are needed and where they may be useful.

  • Limitations of torch.linalg.solve: it is a dense direct solver, so it needs the matrix stored explicitly and only handles square, invertible systems; it cannot work with operators that are available only through matrix-vector products. It is also impractical for large-scale systems because of its $\mathcal{O}(n^2)$ memory and $\mathcal{O}(n^3)$ time complexity.
  • CG is the method of choice for large-scale symmetric positive definite systems. It touches the matrix only through matrix-vector products and keeps just a handful of length-$n$ vectors in memory, i.e. $\mathcal{O}(n)$ storage beyond the operator itself. Among Krylov methods its convergence is optimal for SPD systems (it minimizes the error in the $A$-norm at every iteration), even without preconditioning. See this paper for more details (a minimal PyTorch sketch follows this list).
  • GMRES is a more general iterative solver that can handle non-symmetric and indefinite systems, at the cost of storing the Krylov basis, so its memory grows as $\mathcal{O}(nm)$, where $n$ is the problem dimension and $m$ is the number of iterations, unless it is restarted. See this paper for more details.
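To make the memory and matrix-free points concrete, here is a minimal, unpreconditioned CG sketch in PyTorch. It illustrates the textbook algorithm rather than this repo's implementation, and the `matvec`/`tol`/`maxiter` names are placeholder choices.

```python
import torch

def cg(matvec, b, x0=None, tol=1e-6, maxiter=None):
    """Solve A x = b where A is symmetric positive definite and is given
    only through the callable `matvec(v) -> A @ v`."""
    x = torch.zeros_like(b) if x0 is None else x0.clone()
    r = b - matvec(x)                      # residual
    p = r.clone()                          # search direction
    rs_old = torch.dot(r, r)
    maxiter = b.numel() if maxiter is None else maxiter
    for _ in range(maxiter):
        if torch.sqrt(rs_old) < tol:       # early stopping on the residual norm
            break
        Ap = matvec(p)
        alpha = rs_old / torch.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = torch.dot(r, r)
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x
```

Because the matrix enters only through `matvec`, the same code works whether the operator is an explicit tensor (`lambda v: M @ v`) or something that is never materialized, which is the custom-operator use case mentioned in the first comment; loosening `tol` or capping `maxiter` gives the accuracy-speed tradeoff discussed below.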

In summary,

  • Both CG and GMRES can solve a broader class of problems than torch.linalg.solve can handle.
  • As iterative algorithms, they offer the flexibility to control the tolerance and perform early stopping. This provides an explicit accuracy-speed tradeoff, which is particularly valuable for large-scale systems.
  • They may open up more research possibilities in designing algorithms for neural networks. As a proof of concept, we used CG for a new optimizer in our previous ICML paper Implicit competitive regularization in GANs.

devzhk (Owner) commented Aug 16, 2024

The specific implementation can always be optimized later; the more important question is whether incorporating these new algorithms makes sense from a strategic, long-term perspective for the development of PyTorch.

lezcano (Author) commented Aug 17, 2024

For generic systems we have lstsq. For symmetric indefinite systems we have https://pytorch.org/docs/stable/generated/torch.linalg.ldl_factor.html.
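For reference, minimal usage of these two existing solvers might look like the sketch below; the shapes and dtypes are arbitrary, and `ldl_solve` is the companion routine to `ldl_factor` in torch.linalg.

```python
import torch

# Generic (possibly rectangular) system: least-squares solution via lstsq.
A = torch.randn(100, 60, dtype=torch.float64)
b = torch.randn(100, 1, dtype=torch.float64)
x_lstsq = torch.linalg.lstsq(A, b).solution

# Symmetric indefinite system: LDL^T factorization followed by a solve.
S = torch.randn(60, 60, dtype=torch.float64)
S = S + S.T                                  # symmetric, not necessarily positive definite
c = torch.randn(60, 1, dtype=torch.float64)
LD, pivots = torch.linalg.ldl_factor(S)
x_ldl = torch.linalg.ldl_solve(LD, pivots, c)
```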

That being said, I do agree that these solvers could potentially be of interest, but before adding them to PyTorch core, we would need to:

  • Find out whether there are efficient implementations of these in cusolver/magma/blas. This is why I mentioned benchmarks in the OP.
  • Discuss what would be a good API to expose them.
  • Discuss whether these belong in PyTorch core or a third party library that also supports the concept of LinearOperator.
