Question:

import os
import pickle
import numpy as np
import librosa
import soundfile as sf
import sounddevice as sd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif

# Paths to audio files
audio_files = {
    "Ifrah": "C:/software test/voices/Ifra5.wav",
    "Sharonne": "C:/software test/voices/Sharonne1.wav",
    "Talha": "C:/software test/voices/Talha_converted.wav"
}

# Define a function to load an audio file and extract features
def extract_features(audio_path):
    y, sr = librosa.load(audio_path)
    return extract_features_from_array(y, sr)

# Extract features from an audio array
def extract_features_from_array(y, sr):
    mfccs = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T, axis=0)
    chroma = np.mean(librosa.feature.chroma_stft(y=y, sr=sr).T, axis=0)
    spectral_contrast = np.mean(librosa.feature.spectral_contrast(y=y, sr=sr).T, axis=0)
    zero_crossings = np.mean(librosa.feature.zero_crossing_rate(y=y).T, axis=0)
    spectral_rolloff = np.mean(librosa.feature.spectral_rolloff(y=y, sr=sr).T, axis=0)
    rms = np.mean(librosa.feature.rms(y=y).T, axis=0)
    mel_spectrogram = np.mean(librosa.feature.melspectrogram(y=y, sr=sr).T, axis=0)
    spectral_bandwidth = np.mean(librosa.feature.spectral_bandwidth(y=y, sr=sr).T, axis=0)
    mfcc_delta = np.mean(librosa.feature.delta(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)).T, axis=0)
    features = np.hstack([mfccs, chroma, spectral_contrast, zero_crossings,
                          spectral_rolloff, rms, mel_spectrogram,
                          spectral_bandwidth, mfcc_delta])
    return features

# Augment audio data
def augment_audio(y, sr):
    noise = np.random.normal(0, 0.005, len(y))  # Reduced noise level
    y_noisy = y + noise
    y_pitched = librosa.effects.pitch_shift(y, sr=sr, n_steps=1)  # Smaller pitch shift
    y_speed = librosa.effects.time_stretch(y.astype('float32'), rate=1.05)  # Less aggressive time stretch
    return [y_noisy, y_pitched, y_speed]

# Function to record and save audio for a speaker
def record_and_save():
    duration = 10  # seconds
    print("Please record your audio for speaker identification...")
    recording = sd.rec(int(duration * 16000), samplerate=16000, channels=1, dtype='float32')
    sd.wait()
    filename = "speaker_recorded.wav"
    sf.write(filename, recording, 16000)
    print(f"Recording saved as {filename}")
    return filename

# Extract features from uploaded files
data = []
labels = []
for label, audio_path in audio_files.items():
    y, sr = librosa.load(audio_path)
    # Original features
    features = extract_features(audio_path)
    data.append(features)
    labels.append(label)
    # Augmented features
    augmented_audios = augment_audio(y, sr)
    for aug_y in augmented_audios:
        aug_features = extract_features_from_array(aug_y, sr)
        data.append(aug_features)
        labels.append(label)

# Convert data and labels to NumPy arrays
data = np.array(data)
labels = np.array(labels)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.3, random_state=42)

# Normalize features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Feature selection to reduce dimensionality and remove noise
selector = SelectKBest(score_func=f_classif, k=50)  # Increased k to retain more features
X_train = selector.fit_transform(X_train, y_train)
X_test = selector.transform(X_test)

# Train a Random Forest Classifier with GridSearch for hyperparameter tuning
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20],
    'min_samples_split': [5, 10],  # Increased to prevent overfitting
    'min_samples_leaf': [2, 4]     # Increased to prevent overfitting
}
kfold = StratifiedKFold(n_splits=3)
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=kfold, refit=True, verbose=3)
grid.fit(X_train, y_train)

# Use the best model from grid search
rf_classifier = grid.best_estimator_

# Train a Gradient Boosting Classifier for comparison
gb_classifier = GradientBoostingClassifier(random_state=42)
gb_classifier.fit(X_train, y_train)

# Predict on the test set with Random Forest
y_pred_rf = rf_classifier.predict(X_test)
# Predict on the test set with Gradient Boosting
y_pred_gb = gb_classifier.predict(X_test)

# Evaluate the Random Forest model
conf_matrix_rf = confusion_matrix(y_test, y_pred_rf)
class_report_rf = classification_report(y_test, y_pred_rf)
print("Random Forest Confusion Matrix:")
print(conf_matrix_rf)
print("\nRandom Forest Classification Report:")
print(class_report_rf)

# Evaluate the Gradient Boosting model
conf_matrix_gb = confusion_matrix(y_test, y_pred_gb)
class_report_gb = classification_report(y_test, y_pred_gb)
print("Gradient Boosting Confusion Matrix:")
print(conf_matrix_gb)
print("\nGradient Boosting Classification Report:")
print(class_report_gb)

# Save the trained models and scaler
with open("rf_classifier_model.pkl", "wb") as model_file:
    pickle.dump(rf_classifier, model_file)
with open("gb_classifier_model.pkl", "wb") as model_file:
    pickle.dump(gb_classifier, model_file)
with open("scaler.pkl", "wb") as scaler_file:
    pickle.dump(scaler, scaler_file)
# Save the feature selector
with open("selector.pkl", "wb") as selector_file:
    pickle.dump(selector, selector_file)

# Function to record and predict the speaker
def record_and_predict():
    # Record audio
    filename = record_and_save()
    # Extract features from the recorded audio
    features = extract_features(filename).reshape(1, -1)
    # Load scaler, selector, and model for prediction
    with open("scaler.pkl", "rb") as scaler_file:
        scaler = pickle.load(scaler_file)
    features = scaler.transform(features)
    with open("selector.pkl", "rb") as selector_file:
        selector = pickle.load(selector_file)
    features = selector.transform(features)
    with open("rf_classifier_model.pkl", "rb") as model_file:
        model = pickle.load(model_file)
    prediction = model.predict(features)
    print(f"The speaker is predicted to be: {prediction[0]}\n")

# Record and predict speakers until the user chooses to stop
def main():
    while True:
        record_and_predict()
        cont = input("Do you want to identify another speaker? (yes/no): ").strip().lower()
        if cont != 'yes':
            break

# Run the main function
if __name__ == "__main__":
    main()
Can you correct this code so that the machine correctly identifies the person who records the audio? Using the audio files we put in the code to teach the machine our voices, it must let the user record a clip and identify whether the speaker is Ifrah, Sharonne, or Talha.
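The main reason a script like this misidentifies recorded speakers is data scarcity rather than the classifiers: each speaker is represented by a single file, so after augmentation there are only four samples per class, the unstratified 70/30 split can leave a speaker out of the training set entirely, and StratifiedKFold with n_splits=3 can fail outright with so few samples per class. Below is a minimal sketch of one way to rebuild the data pipeline, assuming the audio_files dictionary, extract_features_from_array, and augment_audio defined above; the segment_signal helper and the 2-second/1-second window sizes are illustrative choices, not part of the original code.

import numpy as np
import librosa
from sklearn.model_selection import train_test_split

# Hypothetical helper: slice a long enrollment recording into overlapping
# fixed-length segments so each speaker contributes many training samples.
def segment_signal(y, sr, seg_seconds=2.0, hop_seconds=1.0):
    seg = int(seg_seconds * sr)
    hop = int(hop_seconds * sr)
    if len(y) <= seg:
        return [y]
    return [y[i:i + seg] for i in range(0, len(y) - seg + 1, hop)]

data, labels = [], []
for label, audio_path in audio_files.items():
    # Load at 16 kHz so enrollment audio matches the rate used by record_and_save()
    y, sr = librosa.load(audio_path, sr=16000)
    for seg_y in segment_signal(y, sr):
        data.append(extract_features_from_array(seg_y, sr))
        labels.append(label)
        # Augment each segment rather than the whole file
        for aug_y in augment_audio(seg_y, sr):
            data.append(extract_features_from_array(aug_y, sr))
            labels.append(label)

data = np.array(data)
labels = np.array(labels)

# stratify= guarantees every speaker appears in both the training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    data, labels, test_size=0.3, random_state=42, stratify=labels)

With this change the rest of the pipeline (scaling, SelectKBest, grid search, model saving) can stay as written. For consistency, record_and_predict should then load the recorded clip at the same rate, e.g. y, sr = librosa.load(filename, sr=16000) followed by extract_features_from_array(y, sr); the default librosa.load resampling to 22,050 Hz would also work, but only if it is applied identically at training and prediction time.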
