E Considering the following neural network with inputs (x1, x2), outputs (21, 22, 23), and parameters...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
E Considering the following neural network with inputs (x1, x2), outputs (21, 22, 23), and parameters = (a, b, c, d, e, f, i, j, k, l, m, n, o, p, q), 91 a b (2) = ( ))+(}). 92 == (h) = 21 22 = 23 C ReLU(91)), ReLU(92) i j k l 6)-(490-8 m n + P h2 A. (15 points) For a minibatch containing a single training sample (x1, x2, y = 2), apply softmax and write down the cross-entropy loss function J(6) as a ӘЈ ӘЈ ӘЈ function of (21, 22, 23). Compute as functions of (21, 22, 23). მმმ3 B. (20 points) Base on A., apply backprop to compute ӘЈ ӘЈ ӘЈ ӘЈ ӘЈ ӘЈ On do' Op' Oq Əh₁' Əh₂ , ᎧᎫ ᎧᎫ ᎧᎫ ᎧᎫ ᎧᎫ ai' aj ak' al' m' C. (20 points) Base on B., apply backprop to compute дЈ af ӘЈ ӘЈ ӘЈ ӘЈ ӘЈ да дь деда де ӘЈ дл Explain why you don't need to compute and Əx1 дх (Hint: use the step function u(x) as the derivative of ReLU(x).) D. (15 points) For the learning rate e, show the equation to apply the simple SGD algorithm to update 0 for this minibatch. E Considering the following neural network with inputs (x1, x2), outputs (21, 22, 23), and parameters = (a, b, c, d, e, f, i, j, k, l, m, n, o, p, q), 91 a b (2) = ( ))+(}). 92 == (h) = 21 22 = 23 C ReLU(91)), ReLU(92) i j k l 6)-(490-8 m n + P h2 A. (15 points) For a minibatch containing a single training sample (x1, x2, y = 2), apply softmax and write down the cross-entropy loss function J(6) as a ӘЈ ӘЈ ӘЈ function of (21, 22, 23). Compute as functions of (21, 22, 23). მმმ3 B. (20 points) Base on A., apply backprop to compute ӘЈ ӘЈ ӘЈ ӘЈ ӘЈ ӘЈ On do' Op' Oq Əh₁' Əh₂ , ᎧᎫ ᎧᎫ ᎧᎫ ᎧᎫ ᎧᎫ ai' aj ak' al' m' C. (20 points) Base on B., apply backprop to compute дЈ af ӘЈ ӘЈ ӘЈ ӘЈ ӘЈ да дь деда де ӘЈ дл Explain why you don't need to compute and Əx1 дх (Hint: use the step function u(x) as the derivative of ReLU(x).) D. (15 points) For the learning rate e, show the equation to apply the simple SGD algorithm to update 0 for this minibatch.
Expert Answer:
Related Book For
Logic And Computer Design Fundamentals
ISBN: 9780133760637
5th Edition
Authors: M. Morris Mano, Charles Kime, Tom Martin
Posted Date:
Students also viewed these programming questions
-
if we choose statistic as our keyword, our cipher would be determined as follows: method i. write the word statistic without the repeated letters. then complete the cipher with the unused alphabet...
-
Write a Python program to demonstrate multithreading. Your code shall first create a file for writing called "synch.txt" (autograder will be looking for this file). Then it will create two separate...
-
Data Corporation has four employees and provides group term life insurance coverage for all four employees. Coverage is nondiscriminatory and is as follows: a. How much may Data Corporation deduct...
-
For selected years from 1960 to 2018, total U.S. personal income I (in billions of dollars) can be approximated by the formula I = 492.4(1.070)t where t is the number of years past 1960. (a) What...
-
In a Kademlia network, the size of the identifier space is 1024. What is the height of the binary tree (the distance between the root and each leaf)? What is the number of leaves? What is the number...
-
In the months leading up to the 2016 election, Christopher Steele, a former British intelligence agent, was hired by a Washington, D.C., research firm to investigate whether then-candidate Donald...
-
An analysis of the accounts of Roberts Company reveals the following manufacturing cost data for the month ended June 30, 2014. Costs incurred: raw materials purchases $54,000, direct labor $47,000,...
-
Lab: Structured Query Language (SQL) Queries with Data Manipulation Language (DML) Assignment Instructions LAB: STRUCTURED QUERY LANGUAGE (SQL) QUERIES WITH DATA MANIPULATION LANGUAGE (DML)...
-
Orion Controls is a leading manufacturer of industrial valve systems, and Nathan Armstrong, head of Marketing, had been contacted by Andre Gide, EVP of Avion Chemical to place an order for 50 of...
-
1.You write a short put option giving the purchaser the right to sell 100 shares of Rothbard Corporation for a premium of $1,300. The strike price of the option is $20 and the final stock price is...
-
amount of Elaine's Canada caregiver tax credit, if any, for 2021. Determine the
-
A mobile crane is lifting a large concrete block W at a constuction site, as shown in the figure below. The crane body has a weight of W = 8500 lb and the crane boom has a weight of W = 1000 lb,...
-
A toy manufacturer makes its own wind-up motors, which are then put into its toys. While the toy manufacturing process is continuous, the motors are intermittent flow. Data on the manufacture of the...
-
Ahmed Company has 8.5 million shares of common stock outstanding, 250,000 shares of 5 percent preferred stock outstanding, and 135,000 7.5 percent semiannual bonds outstanding, par value $5,000 each....
-
Assuming the expectations theory is the correct theory of the term structure, calculate the interest rates in the term structure for maturities of one to four years, and plot the resulting yield...
-
Q2: Why is evaluating training important for an organization?
-
The diameter of a sphere is 18 in. Find the largest volume of regular pyramid of altitude 15 in. that can be cut from the sphere if the pyramid is (a) square, (b) pentagonal, (c) hexagonal, and (d)...
-
The 8-bit ASCII word Bye is to be transmitted to a device address 39 and endpoint 2. List the Output and Data 0 packets and the Handshake packet for a Stall for this transmission prior to NRZI...
-
Repeat Problem 9-28 with A = 10100100 and B = 10101001. Problem 9-28 The program in a computer compares two signed 2s complement numbers A and B by performing subtraction A - B and updating the...
-
A set of waveforms applied to two D lip- lops is shown in Figure 4-56. These waveforms are applied to the lip- lops shown along with the values of their timing parameters. Figure 4-56 (a) List the...
-
In a hypothesis test the p value is 0.043. This means that we can find statistical significance at: (1) both the 0.05 and 0.01 levels (2) the 0.05 but not at the 0.01 level (3) the 0.01 but not at...
-
In testing the null hypothesis that p = 0:3 against the alternative that p 0:3, the probability of a type II error is ______ when the true p = 0:4 than when p = 0:6. (1) the same (2) smaller (3)...
-
An article states there is no significant evidence that median income increased. The implied null hypothesis is: (1) Median income increased. (2) Median income changed. (3) Median income did not...
Study smarter with the SolutionInn App