I have a torch Tensor of shape [k, l, m], with k being the batch size and [l, m] forming a spectrogram. Now, I would like to apply both a time mask and a frequency mask to each spectrogram. Currently, I have to loop over all entries with the following snippet of code:
import torchaudio

local_frequency_mask = torchaudio.transforms.FrequencyMasking(freq_mask_param=12)
local_time_mask = torchaudio.transforms.TimeMasking(time_mask_param=12)

for block_no in range(test_data.shape[0]):
    cur_block = test_data[block_no, :, :]
    # each block is masked with its own mean value
    cur_mask_val = cur_block.mean().item()
    for _ in range(number_of_frequency_masks):
        cur_block = local_frequency_mask(cur_block, cur_mask_val)
    for _ in range(number_of_time_masks):
        cur_block = local_time_mask(cur_block, cur_mask_val)
    test_data[block_no, :, :] = cur_block
Is there any way I could skip the loop and apply those transforms directly to my 3D tensor, especially since every block has a different mean() value? According to the source code/documentation, the value for cur_mask_val has to be a float, so it can't be a tensor.
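For illustration, this is roughly the batched call I would like to be able to write; it is purely hypothetical, since mask_value currently must be a single float and cannot be a per-example tensor:

# hypothetical batched usage (not supported today: mask_value must be a float,
# not a tensor of per-block means)
per_block_means = test_data.mean(dim=(1, 2))                # shape [k], one mean per block
masked = local_frequency_mask(test_data, per_block_means)   # would require tensor support
masked = local_time_mask(masked, per_block_means)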