

Question:

Implement compute_spectrogram to generate the spectrogram from an audio signal.

Use audio_utils.local_peaks() to detect local peaks.

Apply thresholding to select the largest peaks.

Build a database of peak pairs using the specified constraints and store them using a hash table.

Implement the song identification function, comparing the fingerprint of a recorded or loaded clip against the database.

Create a command-line interface to control the program's behavior.

Test your implementation using the provided unit tests.
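As a starting point for the first task, here is a minimal sketch of what compute_spectrogram could look like. The question does not give the function's signature, so the parameters `nperseg` and `noverlap`, the log scaling, and the 8 kHz test rate are all assumptions made for illustration. It relies only on `scipy.signal.spectrogram`.

```python
import numpy as np
from scipy import signal


def compute_spectrogram(audio, f_s, nperseg=512, noverlap=256):
    """Return (freqs, times, log_spectrogram) for a mono signal.

    The window length, overlap, and dB scaling are illustrative choices,
    not values taken from the assignment.
    """
    freqs, times, sxx = signal.spectrogram(
        audio, fs=f_s, nperseg=nperseg, noverlap=noverlap
    )
    # Convert power to decibels; the small constant avoids log(0).
    log_sxx = 10.0 * np.log10(sxx + 1e-10)
    return freqs, times, log_sxx


if __name__ == "__main__":
    f_s = 8000  # assumed sampling rate for the demo
    t = np.arange(0, 2.0, 1.0 / f_s)
    tone = np.sin(2 * np.pi * 1000.0 * t)
    freqs, times, spec = compute_spectrogram(tone, f_s)
    # Print the frequency bin with the highest average energy.
    print(freqs[spec.mean(axis=1).argmax()])
```

The returned array has one row per frequency bin and one column per time frame, which is the 2-D layout that a peak detector over a gs x gs neighborhood expects.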

audio_utils:

import numpy as np
import pyaudio
from scipy import ndimage, signal
from scipy import io as spio

CHUNK = 1024
FORMAT = pyaudio.paInt16
RATE = 48000
CHANNELS = 1


def record_audio(duration, f_s):
    """
    Sets up the audio stream and records audio for a given duration.

    Parameters
    ----------
    duration : float
        duration of the recording in seconds
    f_s : int
        sampling rate in Hz

    Returns
    -------
    ndarray
        mono audio signal sampled at f_s Hz and normalized to have a zero mean
    """
    p = pyaudio.PyAudio()
    # print(p.get_default_input_device_info())
    stream = p.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=RATE,
        input=True,
        frames_per_buffer=CHUNK,
    )
    print("* recording")
    frames = []
    for _ in range(0, int(RATE / CHUNK * duration)):
        data = stream.read(CHUNK)
        frames.append(np.frombuffer(data, dtype=np.int16))
    print("* done recording")
    stream.stop_stream()
    stream.close()
    p.terminate()
    audio = np.concatenate(frames).astype(float)
    audio = _resample(audio, RATE, f_s)
    audio -= np.mean(audio)
    return audio


def wav_processing(wav_file, f_s):
    """
    Loads a wav file and resamples it to f_s Hz.

    Parameters
    ----------
    wav_file : str
        path to the wave file
    f_s : int
        desired sampling rate in Hz

    Returns
    -------
    ndarray
        audio signal sampled at f_s Hz and normalized to have a zero mean
    """
    # Load the audio file
    audio_all = spio.wavfile.read(wav_file)
    f_s_orig = audio_all[0]
    audio = audio_all[1].astype(float)
    # combine channels
    if len(audio.shape) > 1:
        audio = np.mean(audio, axis=1)
    # remove the mean
    audio -= np.mean(audio)
    # resample the audio file
    audio = _resample(audio, f_s_orig, f_s)
    return audio


def local_peaks(spectrum, gs):
    """
    Finds the location of the local peaks in a gs x gs neighborhood.

    Parameters
    ----------
    spectrum : ndarray
        2D array, the spectrogram of the signal
    gs : int
        neighborhood size

    Returns
    -------
    ndarray
        boolean array, the local peaks
    """
    # create a footprint for the maximum filter
    footprint = np.ones((gs, gs))
    # apply the maximum filter to the spectrum
    max_spectrum = ndimage.maximum_filter(spectrum, footprint=footprint)
    # the local peaks are where the original spectrum is equal to the filtered spectrum
    local_peak = spectrum == max_spectrum
    return local_peak


def _resample(x_t, f_s_old, f_s_new):
    # resample the audio file to 8 kHz
    denom = np.gcd(f_s_old, f_s_new)
    L = f_s_old // denom
    M = f_s_new // denom
    x_t = signal.resample_poly(x_t, M, L)
    return x_t


if __name__ == "__main__":
    f_s = 1000
    duration = 10
    audio_test = record_audio(duration, f_s)
    spio.wavfile.write("test.wav", f_s, audio_test.astype(np.int16))
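For the peak detection and thresholding tasks, one possible approach is sketched below. The question does not state the threshold rule, so keeping a fixed fraction of the strongest local peaks (`keep_fraction`) is an assumption, and `select_peaks` is a hypothetical helper name. The local_peaks logic from the listing is repeated here only so the sketch runs on its own.

```python
import numpy as np
from scipy import ndimage


def local_peaks(spectrum, gs):
    """Same logic as audio_utils.local_peaks, copied so this sketch is standalone."""
    footprint = np.ones((gs, gs))
    return spectrum == ndimage.maximum_filter(spectrum, footprint=footprint)


def select_peaks(spectrum, gs=9, keep_fraction=0.1):
    """Return (freq_idx, time_idx) arrays of the strongest local peaks.

    keep_fraction is an assumed thresholding rule: only local peaks at or
    above the (1 - keep_fraction) quantile of all local peak values are kept.
    """
    mask = local_peaks(spectrum, gs)
    peak_values = spectrum[mask]
    threshold = np.quantile(peak_values, 1.0 - keep_fraction)
    strong = mask & (spectrum >= threshold)
    freq_idx, time_idx = np.nonzero(strong)
    return freq_idx, time_idx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec = rng.random((64, 100))
    spec[10, 20] = 50.0  # two artificial strong peaks
    spec[40, 70] = 60.0
    f_idx, t_idx = select_peaks(spec)
    print(list(zip(f_idx.tolist(), t_idx.tolist())))
```

A quantile-based threshold adapts to the overall loudness of a clip; a fixed absolute threshold would be simpler but more sensitive to recording level.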

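The peak-pair database and the identification step could be sketched as follows. The "specified constraints" are not included in the question, so the limits `max_dt`, `max_df`, and `fan_out`, the key layout `(f1, f2, dt)`, and the function names are all assumptions. A plain Python dict serves as the hash table, and a clip is matched by voting on the time offset between matching pairs.

```python
from collections import Counter, defaultdict


def make_pairs(peaks, max_dt=50, max_df=30, fan_out=5):
    """Yield ((f1, f2, dt), t1) for each anchor peak and nearby later peaks.

    peaks is a list of (time_idx, freq_idx) tuples. The pairing limits are
    assumed values, not the assignment's constraints.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are too far away
            if dt > 0 and abs(f2 - f1) <= max_df:
                yield (f1, f2, dt), t1
                paired += 1
                if paired == fan_out:
                    break


def add_song(database, song_id, peaks):
    """Store every peak pair of a song under its hash key."""
    for key, t1 in make_pairs(peaks):
        database[key].append((song_id, t1))


def identify(database, clip_peaks):
    """Return (song_id, votes) for the most consistent (song, offset) match."""
    votes = Counter()
    for key, t_clip in make_pairs(clip_peaks):
        for song_id, t_song in database.get(key, ()):
            votes[(song_id, t_song - t_clip)] += 1
    if not votes:
        return None, 0
    (song_id, _offset), count = votes.most_common(1)[0]
    return song_id, count


if __name__ == "__main__":
    # Synthetic peak lists stand in for real spectrogram peaks.
    song_a = [(t, (7 * t) % 97) for t in range(0, 400, 5)]
    song_b = [(t, (11 * t + 3) % 89) for t in range(0, 400, 5)]
    database = defaultdict(list)
    add_song(database, "song_a", song_a)
    add_song(database, "song_b", song_b)
    clip = [(t - 100, f) for t, f in song_a if 100 <= t < 200]
    print(identify(database, clip))
```

Voting on the offset `t_song - t_clip` rather than on raw key matches is what lets a short excerpt from the middle of a song line up with the stored fingerprint.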
