I was wondering if anyone with experience with PyTorch rounding errors, or with the PositiveDefinite check on 2×2 covariance matrices, could help me understand this issue better.
The error was:
Exception has occurred: ValueError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
Expected parameter covariance_matrix (Tensor of shape (117, 100, 20, 2, 2)) of distribution MultivariateNormal(loc: torch.Size([117, 100, 20, 2]), covariance_matrix: torch.Size([117, 100, 20, 2, 2])) to satisfy the constraint PositiveDefinite(), but found invalid values:…
The covariance matrix is constructed as follows and passed into the MultivariateNormal class. Basically, from sigma_x, sigma_y, and rho_xy, I build the standard textbook 2×2 covariance matrix [[σx², ρσxσy], [ρσxσy, σy²]]. In exact arithmetic, a matrix constructed this way is positive-definite whenever σx, σy > 0 and |ρ| < 1, which the exp and tanh below guarantee.
# Extract the raw mixture parameters and map them to valid ranges.
sigma_x_hat = Gaussian_mixture_params_reshaped[:, :, :, 3]
sigma_x = torch.exp(sigma_x_hat) * math.sqrt(temperature)  # exp keeps sigma_x > 0
sigma_y_hat = Gaussian_mixture_params_reshaped[:, :, :, 4]
sigma_y = torch.exp(sigma_y_hat) * math.sqrt(temperature)
rho_xy_hat = Gaussian_mixture_params_reshaped[:, :, :, 5]
rho_xy = torch.tanh(rho_xy_hat)  # tanh keeps the correlation in (-1, 1)

# Assemble the batched 2x2 covariance matrices.
covar_matrix_for_all_Gaussians = torch.zeros(size=(seq_len, batch_size, self.num_Gaussians, 2, 2))
covar_matrix_for_all_Gaussians = covar_matrix_for_all_Gaussians.to(current_device)
covar_matrix_for_all_Gaussians[:, :, :, 0, 0] = sigma_x**2
covar_matrix_for_all_Gaussians[:, :, :, 1, 1] = sigma_y**2
covar = rho_xy * sigma_x * sigma_y
covar_matrix_for_all_Gaussians[:, :, :, 0, 1] = covar
covar_matrix_for_all_Gaussians[:, :, :, 1, 0] = covar

Gaussian_distributions = torch.distributions.MultivariateNormal(loc=mu_for_all_Gaussians, covariance_matrix=covar_matrix_for_all_Gaussians)
After this, covar_matrix_for_all_Gaussians has shape (117, 100, 20, 2, 2), where the last two dimensions hold one 2×2 covariance matrix for each of the 20 Gaussian distributions, at each of the 117 time steps, for each of the 100 sequences in the batch.
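As an aside: the construction is positive-definite in exact arithmetic, but floating point can break it. A tiny standalone demonstration (my own, not from the model): in float32, tanh saturates to exactly 1.0 for large inputs, which makes the analytically-PD covariance exactly singular:

import torch

rho = float(torch.tanh(torch.tensor(12.0)))   # rounds to exactly 1.0 in float32
cov = torch.tensor([[1.0, rho], [rho, 1.0]])  # analytically PD only if |rho| < 1
print(torch.linalg.cholesky_ex(cov).info)     # tensor(2, dtype=torch.int32): not PD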
But this covar_matrix_for_all_Gaussians tensor failed the PositiveDefinite check (within torch.distributions.constraints):
class _PositiveDefinite(_Symmetric):
    """
    Constrain to positive-definite matrices.
    """

    def check(self, value):
        sym_check = super().check(value)
        if not sym_check.all():
            return sym_check
        return torch.linalg.cholesky_ex(value).info.eq(0)
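For reference, torch.linalg.cholesky_ex returns the factor plus an info tensor; a nonzero info entry reports the order of the leading minor that broke the factorization, which is how this check flags individual matrices in a batch. A tiny self-contained sketch with made-up values:

import torch

cov = torch.eye(2).repeat(4, 1, 1)               # four 2x2 identities, all PD
cov[2] = torch.tensor([[1.0, 2.0], [2.0, 1.0]])  # make one matrix indefinite
L, info = torch.linalg.cholesky_ex(cov)
print(info)        # tensor([0, 0, 2, 0], dtype=torch.int32) -- nonzero marks the failure
print(info.eq(0))  # tensor([ True,  True, False,  True]) -- what the constraint returns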
When I stepped into the MultivariateNormal class being initialized and inspected some values in the debugging console, I was very confused.
Below, “self” refers to the MultivariateNormal object being initialized, so self.covariance_matrix is the [117, 100, 20, 2, 2] tensor constructed above.
The PositiveDefinite check failed on the entire tensor. But when I find out which specific index fails (here, the 2×2 matrix at index [114, 29, 1]) and perform the same check on that 2×2 matrix alone, the check succeeds. So it seems the batched processing somehow leads to this error? (Perhaps the batched Cholesky on CUDA dispatches to a different kernel than the single-matrix call, so a matrix at the edge of float precision can fail one and pass the other?)
self.covariance_matrix.shape
>> torch.Size([117, 100, 20, 2, 2])
torch.distributions.constraints._PositiveDefinite().check(self.covariance_matrix).all() # the PositiveDefinite check on the entire tensor failed
>> tensor(False, device='cuda:0')
valid_or_not = torch.distributions.constraints._PositiveDefinite().check(self.covariance_matrix) # let’s find out which specific indices give the failure
valid_or_not.shape
>> torch.Size([117, 100, 20])
invalid_indices = torch.nonzero(~valid_or_not)
invalid_indices
>> tensor([[114, 29, 1]], device='cuda:0')  # shape: torch.Size([1, 3])
valid_or_not[114, 29, 1]
>> tensor(False, device='cuda:0')
torch.distributions.constraints._PositiveDefinite().check(self.covariance_matrix)[114, 29, 1] # indeed, accessing the result of the check at these specific indices gives “False”
>> tensor(False, device='cuda:0')
self.covariance_matrix[114, 29, 1]
>> tensor([[0.0023, 0.0067],
           [0.0067, 0.0194]], device='cuda:0', grad_fn=<SelectBackward0>)
torch.distributions.constraints._PositiveDefinite().check(self.covariance_matrix[114, 29, 1]) # but when the check is performed on that specific 2x2 matrix, then the check succeeds!
>> tensor(True, device='cuda:0')
torch.distributions.constraints._PositiveDefinite().check(self.covariance_matrix)[114, 29, 1] # again, if we perform the check on the entire tensor, then access the result at index [114, 29, 1], we get a failure
>> tensor(False, device='cuda:0')
Later, when I computed the eigenvalues of this specific 2×2 matrix at index [114, 29, 1], I got one eigenvalue extremely close to 0. So perhaps numerical rounding leads to a matrix that is positive-definite in exact arithmetic being judged not PositiveDefinite? (Here sx2, sy2, and rosxsy are the unrounded entries σx², σy², and ρσxσy of that matrix, so this solves the characteristic polynomial λ² − (sx2 + sy2)λ + (sx2·sy2 − rosxsy²) = 0.)
solve_quadratic(1, -sx2-sy2, sx2*sy2*(1-(rosxsy / math.sqrt(sx2 * sy2))**2))
>> (0.02172636674661413, 3.172062317674529e-10)
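The same thing can be checked with torch.linalg.eigvalsh on the extracted 2×2 matrix. A sketch using the 4-decimal values printed above (which are themselves rounded, so the tiny eigenvalue may even come out slightly negative here):

import torch

m = torch.tensor([[0.0023, 0.0067],
                  [0.0067, 0.0194]], dtype=torch.float64)
print(torch.linalg.eigvalsh(m))  # one eigenvalue ~2.2e-2, the other within rounding error of 0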
Any insights are appreciated. Thanks! I don’t know a good way to add some epsilon to my constructed 2×2 covariance matrices to ensure that they come out positive-definite. The only solution I have right now is to modify the MultivariateNormal class’s code to relax the check from PositiveDefinite to PositiveSemidefinite. What do you think? (The two workarounds I am considering are sketched below.)
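For concreteness, here is the kind of epsilon/jitter fix I had in mind, plus an alternative that sidesteps the Cholesky-based check entirely by passing scale_tril (the analytic Cholesky factor of a 2×2 covariance). This is only an untested sketch; eps is a guess that would need tuning:

import torch

# Option 1: add a small jitter to the diagonal so near-singular matrices
# become safely positive-definite.
eps = 1e-6  # assumption: would need tuning for the scale/dtype at hand
covar_jittered = covar_matrix_for_all_Gaussians + eps * torch.eye(2, device=current_device)

# Option 2: pass scale_tril instead of covariance_matrix. For
# sigma_x, sigma_y > 0 and |rho| < 1, the lower-triangular factor
#   L = [[sigma_x,        0                         ],
#        [rho * sigma_y,  sigma_y * sqrt(1 - rho**2)]]
# satisfies L @ L.T == covariance, so MultivariateNormal never has to run
# (or constraint-check) a Cholesky factorization itself.
rho_safe = rho_xy.clamp(-0.9999, 0.9999)  # tanh can saturate to exactly +-1 in float
scale_tril = torch.zeros_like(covar_matrix_for_all_Gaussians)
scale_tril[..., 0, 0] = sigma_x
scale_tril[..., 1, 0] = rho_safe * sigma_y
scale_tril[..., 1, 1] = sigma_y * torch.sqrt(1 - rho_safe**2)
Gaussian_distributions = torch.distributions.MultivariateNormal(
    loc=mu_for_all_Gaussians, scale_tril=scale_tril)

Option 2 seems preferable to me, since it avoids both the eps hyperparameter and the batched factorization, but I’d welcome opinions.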
Steps taken:
Please see my debugging console outputs above.