Question: Hello. I m getting a couple of errors below when I try to create base line RNN model and second RNN model. Any sugguestions on

Hello. I

m getting a couple of errors below when I try to create base line RNN model and second RNN model. Any sugguestions on how to fix?

You are a data scientist in an AI company. You are given a dataset of restaurant reviews. This is the sentiment

140

dataset. It contains

1, 600, 000

tweets with six columns:

1 .

target: the polarity of the tweet

(0 =

negative,

2 =

neutral,

4 =

positive

)

2 .

ids: The id of the tweet

(

for example,

2087)

3 .

date: the date of the tweet

(

for example, Sat May

16 23

58

44

UTC

2009)

4 .

flag: The query

(

lyx

) .

If there is no query, then this value is NO

_

QUERY.

5 .

user: the user that tweeted

(

for example, robotickilldozr

)

6 .

text: the text of the tweet

(

for example,

"

Lyx is cool"

)

Data Source: Sentiment

140

DatasetLinks to an external site.

The target is the polarity of the tweet. The features are the text. You are asked to perform sentiment analysis using deep learning by using the attached Jupyter Notebook and writing a Python script, and running all the cells. You only need to submit a JupyterNotebook.

1 .

Download the dataset that is about

81

MB from Kaggle into the local disk and unzip it

.

2 .

Clean and preprocess the text data and split into training and test dataset.

3 .

Build a baseline RNN model using embedding layer and GRU on the training dataset and evaluate it on the test dataset.

4 .

Build a second RNN model using embedding layer and LSTM and evaluate it on the test dataset.

5 .

Build a third RNN model using embedding layer and GRU and LSTM and evaluate it on the test dataset.

6 .

Which model do you recommend for the model in Q

3,

4,

and Q

5 ?

Justify your answer.

import zipfile

# Unzip the downloaded file

with zipfile.ZipFile

('

path

_

_

downloaded

_

file.zip',

'

')

as zip

_

ref:

zip

_

ref.extractall

('

path

_

_

extract

_

dataset'

)

import pandas as pd

from sklearn.model

_

selection import train

_

test

_

split

import re

from tensorflow.keras.preprocessing.text import Tokenizer

from tensorflow.keras.preprocessing.sequence import pad

_

sequences

# Load the dataset into a pandas DataFrame

data

_

path

=

'path

_

_

extract

_

dataset

/

training

. 1600000 .

processed.noemoticon.csv

'

columns

= ['

target

',

'ids', 'date', 'flag', 'user', 'text'

]

=

.

read

_

csv

(

data

_

path, encoding

=

'latin

- 1',

header

=

None, names

=

columns

)

# Clean the text by removing unnecessary characters

['

text

'] =

['

text

'] .

apply

(

lambda x: re

.

sub

(

' [^\

\

]','',

str

(

) .

lower

()))

# Split the data into training and test datasets

train

_

,

test

_

=

train

_

test

_

split

(

,

test

_

size

= 0.2,

random

_

state

= 42)

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Embedding, GRU, Dense

# Tokenize the text data

tokenizer

=

Tokenizer

()

tokenizer.fit

_

_

texts

(

train

_

['

text

'])

vocab

_

size

=

len

(

tokenizer

.

word

_

index

) + 1

# Convert text data to sequences

train

_

sequences

=

tokenizer.texts

_

_

sequences

(

train

_

['

text

'])

test

_

sequences

=

tokenizer.texts

_

_

sequences

(

test

_

['

text

'])

# Pad sequences to a fixed length

max

_

len

= 100

train

_

data

=

pad

_

sequences

(

train

_

sequences, maxlen

=

max

_

len, padding

=

'post'

)

test

_

data

=

pad

_

sequences

(

test

_

sequences, maxlen

=

max

_

len, padding

=

'post'

)

# Build the baseline RNN model

model

1 =

Sequential

()

model

1 .

add

(

Embedding

(

vocab

_

size,

100,

input

_

length

=

max

_

len

))

model

1 .

add

(

GRU

(64))

model

1 .

add

(

Dense

(1,

activation

=

'sigmoid'

))

model

1 .

compile

(

optimizer

=

'adam', loss

=

'binary

_

crossentropy', metrics

= ['

accuracy

'])

# Train the model

model

1 .

fit

(

train

_

data, train

_

['

target

'],

validation

_

data

= (

test

_

data, test

_

['

target

']),

epochs

= 5)

Error

* * * * *

above

-

ValueError: Unrecognized keyword arguments passed to Embedding:

{'

input

_

lenght':

100}

from tensorflow.keras.layers import LSTM

# Build the second RNN model

model

2 =

Sequential

()

model

2 .

add

(

Embedding

(

vocab

_

size,

100,

input

_

length

=

max

_

len

))

model

2 .

add

(

LSTM

(64))

model

2 .

add

(

Dense

(1,

activation

=

'sigmoid'

))

model

2 .

compile

(

optimizer

=

'adam', loss

=

'binary

_

crossentropy', metrics

= ['

accuracy

'])

# Train the model

model

2 .

fit

(

train

_

data, train

_

['

target

'],

validation

_

data

= (

test

_

data, test

_

['

target

']),

epochs

= 5)

ERROR ABOVE

* * *

ValueError: Unrecognized keyword arguments passed to Embedding:

{'

input

_

length':

100}

# Evaluate the models on the test dataset

_,

acc

1 =

model

1 .

evaluate

(

test

_

data, test

_

['

target

'])

_,

acc

2 =

model

2 .

evaluate

(

test

_

data, test

_

['

target

'])

_,

acc

3 =

model

3 .

evaluate

(

test

_

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!

Hello could you please answer the following questions and give steps as to how you came to these answers and how to solve them for the correct answer, I'm having difficulty with these review...

Question: Listen to the podcast and identify 3 topics from chapters 1, 2, 3, or 5. Demonstrates that students can identify and connect topics discussed in the course with topics discussed in the...

look into the transcription of the videos that we coded for the Dominance project and identify if there are any errors (grammar and spelling and etc): 1673015005057.mp4 Baseline 181 Baseline181 Hi,...

Please analyze and contrast the Introduction and Literature Review of the articles below. Paul Krugman by Jason Briggeman30 Paul Krugman (1953-) was raised on Long Island "in a safe middle-class...

Why Hospitals Don't Learn from Failures: ORGANIZATIONAL AND PSYCHOLOGICAL DYNAMICS THAT INHIBIT SYSTEM CHANGE Anita L. Tucker Amy C. Edmondson T he importance of hospitals learning from their...

Module Case Study Information A Module Case Study is a critical analysis and evaluation of a specific case or subject. For this course a Module Case Study must: Be two pages in length, double-spaced....

Overview Nineteen67apparel is an apparel e-commerce company in London Ontario. This town is heavily populated with university students and young professionals, the total population of university...

Steve Raucci Please Read Please Summarize and use theses in summary. How would you best describe Steves management style? What was the most surprising thing about the case? What would have done if...

What are the biggest ah-ha! moments from Oracy Development? 6 English-Language Oracy Development Learning Outcomes After reading this chapter, you should be able to ... . Describe the basics of...

Instructions: Answer ALL questions Part A Read the case below and answer question 1 to question 3. Traeger's CEO on Cleaning Up a Toxic Culture By Jeremy Andrus One moming in October of 2014 I pulled...

A layer of granite 1 m thick is present in a sequence of sedimentary rocks, parallel to the bedding in the sedimentary rocks (Fig. 3.8). Is it a dike, a sill, or a stock?

Describe how to estimate the parameters of the model where t is a serially uncorrelated, homoscedastic, classical disturbance. X y = a + = yL k 2 -8 + 1-6L +8,

8. Record the following transactions of Fashion Park in a general journal. Fashion Park must charge 9 percent sales tax on all sales. DATE 20X1 April 2 3 4 6 30 TRANSACTIONS Sold merchandise for...

Bay City Company's fixed budget performance report for July follows. The $587,000 budgeted total expenses include $400,000 variable expenses and $187,000 fixed expenses. Actual expenses include...