Question:

Consider a neural network in which a vectored node $v$ feeds into two distinct vectored nodes $h_1$ and $h_2$ computing different functions. The functions computed at the nodes are $h_1 = \mathrm{ReLU}(W_1 v)$ and $h_2 = \mathrm{sigmoid}(W_2 v)$. We do not know anything about the values of the variables in other parts of the network, but we know that $h_1 = [2, -1, 3]^T$ and $h_2 = [0.2, 0.5, 0.3]^T$, which are connected to the node $v = [2, 3, 5, 1]^T$. Furthermore, the loss gradients are $\frac{\partial L}{\partial h_1} = [-2, 1, 4]^T$ and $\frac{\partial L}{\partial h_2} = [1, 3, -2]^T$, respectively. Show that the backpropagated loss gradient $\frac{\partial L}{\partial v}$ can be computed in terms of $W_1$ and $W_2$ as follows:

$$
\frac{\partial L}{\partial v} = W_1^T \begin{bmatrix} -2 \\ 0 \\ 4 \end{bmatrix} + W_2^T \begin{bmatrix} 0.16 \\ 0.75 \\ -0.42 \end{bmatrix}
$$

What are the sizes of $W_1$, $W_2$, and $\frac{\partial L}{\partial v}$?
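The two bracketed vectors in the target expression are the given loss gradients multiplied elementwise by the local activation derivatives. A minimal NumPy sketch that checks this numerically (note: the question lists $h_1 = [2, -1, 3]^T$ even though ReLU outputs are normally non-negative; following the stated answer, the derivative at the $-1$ entry is treated as 0):

```python
import numpy as np

# Given values from the question
h1 = np.array([2.0, -1.0, 3.0])      # ReLU outputs
h2 = np.array([0.2, 0.5, 0.3])       # sigmoid outputs
dL_dh1 = np.array([-2.0, 1.0, 4.0])  # dL/dh1
dL_dh2 = np.array([1.0, 3.0, -2.0])  # dL/dh2

# Local derivatives: ReLU' is an indicator on positive outputs;
# the sigmoid derivative can be written in terms of its output as h2*(1-h2).
relu_grad = (h1 > 0).astype(float)
sig_grad = h2 * (1.0 - h2)

# Elementwise products that multiply W1^T and W2^T in the answer
g1 = dL_dh1 * relu_grad
g2 = dL_dh2 * sig_grad
print(g1)  # [-2.  0.  4.]
print(g2)  # [ 0.16  0.75 -0.42]
```

This reproduces the vectors $[-2, 0, 4]^T$ and $[0.16, 0.75, -0.42]^T$ that appear in the expression above.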

Step by Step Solution

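One way to obtain the stated result is via the vector chain rule, with the sigmoid derivative rewritten in terms of its output. A sketch of the derivation:

```latex
\frac{\partial L}{\partial v}
  = W_1^T\left(\frac{\partial L}{\partial h_1} \odot \mathrm{ReLU}'(W_1 v)\right)
  + W_2^T\left(\frac{\partial L}{\partial h_2} \odot h_2 \odot (1 - h_2)\right)
```

Here $\odot$ denotes the elementwise product. Taking $\mathrm{ReLU}'$ to be 1 where the output is positive and 0 otherwise gives the derivative pattern $[1, 0, 1]^T$, so the first elementwise product is $[-2 \cdot 1,\; 1 \cdot 0,\; 4 \cdot 1]^T = [-2, 0, 4]^T$. The second is $[1 \cdot 0.2 \cdot 0.8,\; 3 \cdot 0.5 \cdot 0.5,\; -2 \cdot 0.3 \cdot 0.7]^T = [0.16, 0.75, -0.42]^T$, which yields the expression in the question.

For the sizes: since $h_1, h_2 \in \mathbb{R}^3$ and $v \in \mathbb{R}^4$, both $W_1$ and $W_2$ must be $3 \times 4$ matrices, and $\frac{\partial L}{\partial v}$ is a $4 \times 1$ vector.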
