Question: Write a function to process the audio object (torch tensor with its sample rate) to the augmented mel spectrogram (torch tensor)

Write a function that converts an audio object (a torch tensor together with its sample rate) into an augmented mel spectrogram (a torch tensor). (15%)
The input parameters are the audio object, the re-sample rate new_sr, the masking density max_mask_pct (shared by both kinds of mask), the number of frequency masks n_freq_masks, and the number of time masks n_time_masks.
Use the re-sample rate new_sr to adjust the sample rate of the audio object.
There is no need to convert the audio object to mono or stereo; leave the channels as they are.
Construct the mel spectrogram (torch tensor) from the re-sampled audio object.
Use max_mask_pct, n_freq_masks and n_time_masks to create the frequency masks and the time masks on the mel spectrogram.
Return the augmented mel spectrogram (torch tensor).
Because masking is random, your result may differ from the specification. However, the shape of the augmented mel spectrogram should be the same.
This problem covers audio signal processing.
