torch.linalg.svd
torch.linalg.svd(A, full_matrices=True, *, driver=None, out=None)
Computes the singular value decomposition (SVD) of a matrix.
Letting $\mathbb{K}$ be $\mathbb{R}$ or $\mathbb{C}$, the full SVD of a matrix $A \in \mathbb{K}^{m \times n}$, if $k = \min(m, n)$, is defined as

$$A = U \operatorname{diag}(S) V^{\mathrm{H}} \qquad U \in \mathbb{K}^{m \times m},\; S \in \mathbb{R}^{k},\; V \in \mathbb{K}^{n \times n}$$

where $\operatorname{diag}(S) \in \mathbb{K}^{m \times n}$, and $V^{\mathrm{H}}$ is the conjugate transpose when $V$ is complex, and the transpose when $V$ is real-valued. The matrices $U$, $V$ (and thus $V^{\mathrm{H}}$) are orthogonal in the real case, and unitary in the complex case.
When $m > n$ (resp. $m < n$) we can drop the last $m - n$ (resp. $n - m$) columns of $U$ (resp. $V$) to form the reduced SVD:

$$A = U \operatorname{diag}(S) V^{\mathrm{H}} \qquad U \in \mathbb{K}^{m \times k},\; S \in \mathbb{R}^{k},\; V \in \mathbb{K}^{n \times k}$$

where $\operatorname{diag}(S) \in \mathbb{K}^{k \times k}$. In this case, $U$ and $V$ also have orthonormal columns.
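As a quick illustration (a sketch, not part of the upstream examples), in the reduced SVD of a tall matrix, U is no longer square but its columns remain orthonormal:

>>> A = torch.randn(5, 3)
>>> U, S, Vh = torch.linalg.svd(A, full_matrices=False)
>>> U.shape, Vh.shape
(torch.Size([5, 3]), torch.Size([3, 3]))
>>> torch.allclose(U.mT @ U, torch.eye(3), atol=1e-5)
True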
Supports input of float, double, cfloat and cdouble dtypes. Also supports batches of matrices, and if A is a batch of matrices then the output has the same batch dimensions.

The returned decomposition is a named tuple (U, S, Vh) which corresponds to $U$, $S$, $V^{\mathrm{H}}$ above.

The singular values are returned in descending order.
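Both properties can be checked directly (a sketch, not from the upstream examples):

>>> A = torch.randn(4, 4)
>>> U, S, Vh = torch.linalg.svd(A)
>>> torch.linalg.svd(A).S.equal(S)  # fields of the named tuple are accessible by name
True
>>> torch.all(S[:-1] >= S[1:])      # singular values come sorted in descending order
tensor(True)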
The parameter full_matrices chooses between the full (default) and reduced SVD.

The driver kwarg may be used in CUDA with a cuSOLVER backend to choose the algorithm used to compute the SVD. The choice of a driver is a trade-off between accuracy and speed. See the sketch after this list for example usage.

- If A is well-conditioned (its condition number is not too large), or you do not mind some precision loss:
  - For a general matrix: 'gesvdj' (Jacobi method)
  - If A is tall or wide (m >> n or m << n): 'gesvda' (Approximate method)
- If A is not well-conditioned or precision is relevant: 'gesvd' (QR based)

By default (driver=None), we call 'gesvdj' and, if it fails, we fall back to 'gesvd'.
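For instance, a driver may be selected explicitly on a CUDA tensor (a sketch; it assumes a CUDA build of PyTorch with the cuSOLVER backend):

>>> A = torch.randn(100, 100, device='cuda')
>>> U, S, Vh = torch.linalg.svd(A, driver='gesvdj')  # Jacobi method: fast, some precision loss
>>> U, S, Vh = torch.linalg.svd(A, driver='gesvd')   # QR based: slower, more accurate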
Differences with numpy.linalg.svd:

- Unlike numpy.linalg.svd, this function always returns a tuple of three tensors and it does not support the compute_uv argument. Please use torch.linalg.svdvals(), which computes only the singular values, instead of compute_uv=False.
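In other words, the PyTorch equivalent of numpy's compute_uv=False is (a minimal sketch):

>>> A = torch.randn(5, 3)
>>> S = torch.linalg.svdvals(A)  # singular values only
>>> torch.allclose(S, torch.linalg.svd(A).S)
True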
Note
When full_matrices=True, the gradients with respect to U[…, :, min(m, n):] and Vh[…, min(m, n):, :] will be ignored, as those vectors can be arbitrary bases of the corresponding subspaces.

Warning
The returned tensors U and Vh are not unique, nor are they continuous with respect to A. Due to this lack of uniqueness, different hardware and software may compute different singular vectors.

This non-uniqueness is caused by the fact that multiplying any pair of singular vectors by $-1$ in the real case or by $e^{i\phi}, \phi \in \mathbb{R}$ in the complex case produces another two valid singular vectors of the matrix. For this reason, the loss function shall not depend on this $e^{i\phi}$ quantity, as it is not well-defined. This is checked for complex inputs when computing the gradients of this function. As such, when inputs are complex and are on a CUDA device, the computation of the gradients of this function synchronizes that device with the CPU.
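To make the non-uniqueness concrete (a sketch, not from the upstream docs), flipping the sign of a matched left/right singular-vector pair yields an equally valid decomposition:

>>> A = torch.randn(4, 4)
>>> U, S, Vh = torch.linalg.svd(A)
>>> U2, Vh2 = U.clone(), Vh.clone()
>>> U2[:, 0] *= -1; Vh2[0, :] *= -1   # flip one matched pair of singular vectors
>>> torch.allclose(U2 @ torch.diag(S) @ Vh2, A, atol=1e-5)
True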
Warning

Gradients computed using U or Vh will only be finite when A does not have repeated singular values. If A is rectangular, additionally, zero must also not be one of its singular values. Furthermore, if the distance between any two singular values is close to zero, the gradient will be numerically unstable, as it depends on the singular values $\sigma_i$ through the computation of $\frac{1}{\min_{i \neq j} \sigma_i^2 - \sigma_j^2}$. In the rectangular case, the gradient will also be numerically unstable when A has small singular values, as it also depends on the computation of $\frac{1}{\sigma_i}$.
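As a sketch (assuming a generic random matrix, whose singular values are almost surely distinct and nonzero), gradients through the singular values pass a numerical gradient check in double precision:

>>> A = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
>>> torch.autograd.gradcheck(lambda A: torch.linalg.svd(A).S, (A,))
True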
See also

- torch.linalg.svdvals() computes only the singular values. Unlike torch.linalg.svd(), the gradients of svdvals() are always numerically stable.
- torch.linalg.eig() for a function that computes another type of spectral decomposition of a matrix. The eigendecomposition works just on square matrices.
- torch.linalg.eigh() for a (faster) function that computes the eigenvalue decomposition for Hermitian and symmetric matrices.
- torch.linalg.qr() for another (much faster) decomposition that works on general matrices.

Parameters

- A (Tensor) – tensor of shape (*, m, n) where * is zero or more batch dimensions.
- full_matrices (bool, optional) – controls whether to compute the full or reduced SVD, and consequently, the shape of the returned tensors U and Vh. Default: True.

Keyword Arguments

- driver (str, optional) – name of the cuSOLVER method to be used. This keyword argument only works on CUDA inputs. Available options are: None, 'gesvd', 'gesvdj', and 'gesvda'. Default: None.
- out (tuple, optional) – output tuple of three tensors. Ignored if None.

Returns

A named tuple (U, S, Vh) which corresponds to $U$, $S$, $V^{\mathrm{H}}$ above.

S will always be real-valued, even when A is complex. It will also be ordered in descending order.

U and Vh will have the same dtype as A. The left / right singular vectors will be given by the columns of U and the rows of Vh respectively.
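For instance (a sketch, not part of the upstream examples), with a complex input:

>>> A = torch.randn(3, 3, dtype=torch.cfloat)
>>> U, S, Vh = torch.linalg.svd(A)
>>> S.dtype, U.dtype, Vh.dtype
(torch.float32, torch.complex64, torch.complex64)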
Examples:
>>> A = torch.randn(5, 3)
>>> U, S, Vh = torch.linalg.svd(A, full_matrices=False)
>>> U.shape, S.shape, Vh.shape
(torch.Size([5, 3]), torch.Size([3]), torch.Size([3, 3]))
>>> torch.dist(A, U @ torch.diag(S) @ Vh)
tensor(1.0486e-06)

>>> U, S, Vh = torch.linalg.svd(A)
>>> U.shape, S.shape, Vh.shape
(torch.Size([5, 5]), torch.Size([3]), torch.Size([3, 3]))
>>> torch.dist(A, U[:, :3] @ torch.diag(S) @ Vh)
tensor(1.0486e-06)

>>> A = torch.randn(7, 5, 3)
>>> U, S, Vh = torch.linalg.svd(A, full_matrices=False)
>>> torch.dist(A, U @ torch.diag_embed(S) @ Vh)
tensor(3.0957e-06)