Consider a fully connected autoencoder (each hidden node is connected to all inputs and all outputs)...

Fantastic news! We've Found the answer you've been seeking!

Question:

Transcribed Image Text:

Consider a fully connected autoencoder (each hidden node is connected to all inputs and all outputs) with two dimensional binary input and one hidden layer with softplus activation function. At iteration t, the weights are shown in the following autoencoder architecture along with the input vector (x1=0, x2=1). All bias values are zero. Assume learning rate = 0.25 and momentum constant = 0.75. Also assume at t-1, w1= -0.5, w2-0.5, w3-0.5 and w4= -0.5. w3=1 w4-0 * wl=0 w2=1 1. What activation function will you choose at the output node? 2. What loss function will you choose for training this autoencoder? What will be the value of loss function at iteration t? 3. What will be the weights w4 and w2 in iteration t+1 assuming backpropagation with ordinary gradient descent is used? 4. What will be the weights w4 and w2 in iteration t+1 assuming backpropagation with momentum based gradient descent is used? B. Calculate the KL divergence KLD(p(x,y) || q(x,y)) of two continuous distributions p(x,y) and g(x,y) in bits where p(x, y) = 1 24 m (x-3)² 2+9 q(x, y) = — — — e 2 (y-4)² 2-16 12²-2² Consider a fully connected autoencoder (each hidden node is connected to all inputs and all outputs) with two dimensional binary input and one hidden layer with softplus activation function. At iteration t, the weights are shown in the following autoencoder architecture along with the input vector (x1=0, x2=1). All bias values are zero. Assume learning rate = 0.25 and momentum constant = 0.75. Also assume at t-1, w1= -0.5, w2-0.5, w3-0.5 and w4= -0.5. w3=1 w4-0 * wl=0 w2=1 1. What activation function will you choose at the output node? 2. What loss function will you choose for training this autoencoder? What will be the value of loss function at iteration t? 3. What will be the weights w4 and w2 in iteration t+1 assuming backpropagation with ordinary gradient descent is used? 4. What will be the weights w4 and w2 in iteration t+1 assuming backpropagation with momentum based gradient descent is used? B. Calculate the KL divergence KLD(p(x,y) || q(x,y)) of two continuous distributions p(x,y) and g(x,y) in bits where p(x, y) = 1 24 m (x-3)² 2+9 q(x, y) = — — — e 2 (y-4)² 2-16 12²-2²

Related Book For answer-question

answer-question

Microeconomics An Intuitive Approach with Calculus

Microeconomics An Intuitive Approach with Calculus

ISBN: 978-0538453257

1st edition

Authors: Thomas Nechyba

See More Books

Posted Date: Dec 02, 2022 07:15 AM

See More Questions