Question: Write a function that converts an audio object (a torch tensor with its sample rate) into an augmented mel spectrogram (a torch tensor)
The input parameters are an audio object, the resample rate, the masking density
shared by both kinds of mask, the number of frequency masks, and the number of time masks
Use the resample rate newsr to adjust the sample rate of the audio object
No need to change the audio object to mono or stereo channels
Construct the mel spectrogram torch tensor based on the adjusted audio object
Use maxmaskpct, nfreqmasks, and ntimemasks to create the frequency
masks and the time masks on the mel spectrogram
Return the augmented mel spectrogram torch tensor
As randomness is involved in masking, your result may be different from the
specification. However, the shape of the augmented mel spectrogram should be the
same.
This problem covers audio signal processing.
