

Question:

(a) Please identify the appropriate data transformation methods for the following situations. Give a brief description about your answers:

[4] Consider a dataset containing information about student performance in two subjects: Math and English. The Math scores range from 35 to 200 (mean = 88, standard deviation = 18), while the English scores range from 84 to 112 (mean = 98.6, standard deviation = 0.95). For each feature, apply normalization (transformed data has $x' \in [0, 1]$) and calculate the new mean and new standard deviation of the normalized feature. Compare their means and standard deviations. Then, for each feature, apply standardization, show the range of the transformed data, and compare their ranges.
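The arithmetic for this part can be sketched in a few lines of Python. This is a minimal sketch, not an official solution: it assumes "normalization" means min-max scaling, x' = (x - min) / (max - min), and "standardization" means the z-score, z = (x - mean) / std. The `Feature` class and the helper names are hypothetical, introduced only for illustration. Because both transformations are linear, the new mean, standard deviation, and range follow directly from the summary statistics given in the question.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Summary statistics of one feature, as stated in the question."""
    lo: float
    hi: float
    mean: float
    std: float


def minmax_summary(f: Feature) -> tuple[float, float]:
    """Mean and std after min-max scaling x' = (x - lo) / (hi - lo)."""
    span = f.hi - f.lo
    # A linear map shifts the mean and rescales the std by 1 / span.
    return (f.mean - f.lo) / span, f.std / span


def zscore_range(f: Feature) -> tuple[float, float]:
    """Smallest and largest value after z = (x - mean) / std."""
    return (f.lo - f.mean) / f.std, (f.hi - f.mean) / f.std


math_scores = Feature(lo=35, hi=200, mean=88, std=18)
english_scores = Feature(lo=84, hi=112, mean=98.6, std=0.95)

for name, f in [("Math", math_scores), ("English", english_scores)]:
    new_mean, new_std = minmax_summary(f)
    z_lo, z_hi = zscore_range(f)
    print(f"{name}: normalized mean={new_mean:.4f}, std={new_std:.4f}; "
          f"standardized range=[{z_lo:.3f}, {z_hi:.3f}]")
```

Under these assumptions, min-max scaling puts both features in [0, 1] but leaves them with different means and standard deviations, while standardization gives both mean 0 and standard deviation 1 but leaves them with different ranges.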

[4] During the design of an artificial neural network, we sometimes need to transform a variable $x$ that has a range of $(-\infty, \infty)$ to an open set $z \in (-1, 1)$. Note that $z$ monotonically increases as $x$ increases in this transformation. Please specify a proper function for such a transformation.
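One candidate that satisfies these requirements is the hyperbolic tangent, z = tanh(x), which is strictly increasing and maps the real line onto (-1, 1). The sketch below is only an illustration of that choice; the function name `squash` is hypothetical, and other functions (for example a rescaled logistic sigmoid, 2 / (1 + e^(-x)) - 1) would also qualify.

```python
import math


def squash(x: float) -> float:
    """Map a real number into (-1, 1) using the hyperbolic tangent."""
    return math.tanh(x)


# Evaluate on an increasing grid to illustrate monotonicity and the bounds.
xs = [-5.0, -1.0, 0.0, 1.0, 5.0]
zs = [squash(x) for x in xs]
for x, z in zip(xs, zs):
    print(f"x={x:+.1f} -> z={z:+.6f}")
```

Mathematically the bounds -1 and 1 are never reached. In floating-point arithmetic, however, `math.tanh` rounds to exactly 1.0 or -1.0 for inputs of large magnitude, so the interval is open only in the mathematical sense.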

(b) In natural language processing (NLP), there are diverse ways to represent words, such as one-hot encoding, bag of words, TF-IDF, and distributed word representations. In one-hot encoding, a bit vector whose length is the size of the vocabulary is created, where only the associated word bit is on (i.e., 1) while all other bits are off (i.e., 0). Here is a toy example: suppose there is a 5-dimensional feature vector to represent a vocabulary of five words: [king, queen, man, woman, power]. In this case, 'king' is encoded into [1, 0, 0, 0, 0], 'queen' is encoded into [0, 1, 0, 0, 0], etc. Due to the nature of this representation, the feature vector encodes the vocabulary of a sentence where all words are equally distant.

On the other hand, in distributed word vectors, a real-valued vector whose length is defined by some common properties of words is created; each word can then be represented as a linear combination of the defined properties. Using the toy example above, given a 3-dimensional feature vector of [man, woman, power] as the common properties, words such as 'king', 'queen', 'man', and 'woman' could be encoded into [0.98, 0.1, 0.8], [0, 0.99, 0.85], [0.9, 0, 0.5], and [0, 0.97, 0.5], respectively. In this case, if you subtract the vector of 'man' from the vector of 'king' and add the vector of 'woman', you will get a vector close to the vector of 'queen'.
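The two claims in this passage (equal distances under one-hot encoding, and king - man + woman landing near queen under the distributed vectors) can be checked numerically. The sketch below uses the toy vectors from the question; measuring closeness by Euclidean distance is an assumption, since the question does not name a metric, and the helper names are hypothetical.

```python
import math
from itertools import combinations

vocabulary = ["king", "queen", "man", "woman", "power"]


def one_hot(word: str) -> list[int]:
    """Bit vector with a single 1 at the word's vocabulary index."""
    return [1 if w == word else 0 for w in vocabulary]


# Every pair of distinct one-hot vectors differs in exactly two positions,
# so all pairwise Euclidean distances are identical.
one_hot_distances = {
    math.dist(one_hot(a), one_hot(b)) for a, b in combinations(vocabulary, 2)
}

# Distributed vectors over the properties [man, woman, power].
embeddings = {
    "king": [0.98, 0.1, 0.8],
    "queen": [0.0, 0.99, 0.85],
    "man": [0.9, 0.0, 0.5],
    "woman": [0.0, 0.97, 0.5],
}

# king - man + woman, computed component by component.
result = [
    k - m + w
    for k, m, w in zip(embeddings["king"], embeddings["man"], embeddings["woman"])
]

# The word whose vector lies closest to the result (Euclidean distance).
nearest = min(embeddings, key=lambda word: math.dist(result, embeddings[word]))

print("distinct one-hot distances:", one_hot_distances)
print("king - man + woman =", [round(v, 2) for v in result])
print("nearest word:", nearest)
```

With these toy values the one-hot set contains a single distance, which is why one-hot encoding carries no notion of similarity between words, whereas the distributed vectors place the result of the analogy closest to 'queen'.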

[4] What is a major advantage/disadvantage of one-hot encoding as compared to distributed word vectors? Briefly justify your answer.

[4] What is a major advantage/disadvantage of distributed word vectors as compared to one-hot encoding? Briefly justify your answer.
