Part III Artificial Neural Networks (60 points) In this problem, we consider a bi-dimensional dataset of...
Question:
Part III Artificial Neural Networks (60 points)

In this problem, we consider a bi-dimensional dataset of random points belonging to two classes. The first class of points lies inside the circle centered at 0 with radius 1, whereas the second class lies inside a ring between radius 1.5 and radius 2.5. All points are perturbed with Gaussian noise of zero mean and standard deviation 0.1. A Bernoulli(p = 0.6) process selects 40% of the points from the first class (red points) and 60% of the points from the second class (blue points). The 4000 points of the training dataset are plotted in Figure 5. We also generated a validation dataset of 800 points. The two datasets can be loaded into TensorFlow from the files training_circular_bidim_dataset.csv and test_circular_bidim_dataset.csv.

[Figure 5: A circular bi-dimensional dataset with 2 classes (2 categories/2 clusters).]

The binary classification is performed via a 2-6-3-1 dense neural network depicted in Figure 6. The input has 2 dimensions, the model has 2 hidden layers with tanh activation functions, and the output has a single unit with a sigmoid. The output sigmoid is well suited to our binary classification problem.

[Figure 6: A 2-6-3-1 model to make the binary classification for our dataset.]

(a) Suppose the model input is \(x = (x_1, x_2)^t = (0.5, 0.5)^t\). The 6 x 2 weight matrix of the first hidden layer is set to

\[
W_1 = \begin{pmatrix}
 0.50 & -0.05 \\
-0.25 &  0.25 \\
 0.04 &  0.75 \\
 0.04 & -0.06 \\
 0.64 &  0.04 \\
-0.64 &  0.08
\end{pmatrix} \tag{7}
\]

The 3 x 6 weight matrix of the second hidden layer is set to

\[
W_2 = \begin{pmatrix}
0.50 &  1.00 & -0.05 &  0.40 & -0.06 &  1.00 \\
1.00 & -0.25 &  0.25 &  0.02 &  0.01 & -0.40 \\
0.04 &  0.75 &  1.00 & -0.50 &  0.01 &  0.64
\end{pmatrix} \tag{8}
\]

Finally, the weights of the output layer are set to \(W_3 = (2.0, 0.5, -1.0)\). We denote by \(h_1 = (h_{11}, h_{12}, \ldots, h_{16})^t\) the output of the first hidden layer after the tanh activation. We also denote by \(h_2 = (h_{21}, h_{22}, h_{23})^t\) the output of the second hidden layer after the tanh activation. The model output is denoted by the letter \(o\).

For simplicity, it is assumed that the model units have no bias in question (a). Using the linear-algebra equations of forward propagation, without a bias, we calculate \(h_1 = \tanh(W_1 x)\), \(h_2 = \tanh(W_2 h_1)\), and the value of the output \(o = \sigma(W_3 h_2)\), where \(\tanh\) and \(\sigma\) are the tanh and the sigmoid functions respectively. We get \(h_1 = (0.22127, 0.0, 0.37566, -0.00999, 0.32748, -0.27290)^t\), \(h_2 = (-0.20188, 0.40317, 0.21473)^t\), and \(o = 0.39725\).

Based on the values computed by the above forward propagation, compute the gradient \(\partial o / \partial w_{11}\) of the output \(o\) with respect to the weight parameter \(w_{11}\) between the first input and the first unit of the first hidden layer. Use the rules of backpropagation.

(b) If Xavier initialization is used for the 12 parameters in \(W_1\) with a Gaussian distribution, what should be the variance of the distribution?

(c) Build the 2-6-3-1 neural network of Figure 6 in the Jupyter notebook template received with the final exam PDF file. Set the activation functions to tanh for the hidden layers. The output activation should be a sigmoid. Write the model construction via TensorFlow/Keras. Save your Jupyter notebook before sending it back to the instructor by email. After model.summary(), the neural network should look like this:

Model: "sequential"
Layer (type)       Output Shape    Param #
dense (Dense)      (None, 6)       18
dense_1 (Dense)    (None, 3)       21
dense_2 (Dense)    (None, 1)       4
Total params: 43
Trainable params: 43
Non-trainable params: 0

Set the epochs to 100, the batch size to 32, the learning rate to 0.0002, and the optimizer to Adam. Run model.compile with the correct arguments and then launch the training with model.fit(train_points, train_labels, epochs=epochs). The training accuracy should be above 99%. The validation accuracy is most likely 100%.

(d) Replace the tanh activation of the hidden layers with ReLU. Also try sigmoids instead of tanh. What are the accuracy results after changing the activation functions?
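As a sanity check on the forward-propagation values quoted in the question, the NumPy sketch below recomputes \(h_1\), \(h_2\), and \(o\) from the stated weights. It then applies the chain rule to obtain the gradient requested in part (a) and compares it with a finite-difference estimate. The variable names (`delta_h2`, `grad_w11`, and so on) are illustrative and do not come from the exam. The Xavier variance at the end assumes the common Glorot-normal form 2 / (fan_in + fan_out), which may differ from the convention used in the course.

```python
import numpy as np

# Weights and input exactly as stated in the problem (no biases in part (a)).
W1 = np.array([[ 0.50, -0.05],
               [-0.25,  0.25],
               [ 0.04,  0.75],
               [ 0.04, -0.06],
               [ 0.64,  0.04],
               [-0.64,  0.08]])
W2 = np.array([[0.50,  1.00, -0.05,  0.40, -0.06,  1.00],
               [1.00, -0.25,  0.25,  0.02,  0.01, -0.40],
               [0.04,  0.75,  1.00, -0.50,  0.01,  0.64]])
W3 = np.array([2.0, 0.5, -1.0])
x = np.array([0.5, 0.5])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(first_layer_weights):
    """Forward pass h1 = tanh(W1 x), h2 = tanh(W2 h1), o = sigmoid(W3 h2)."""
    h1 = np.tanh(first_layer_weights @ x)
    h2 = np.tanh(W2 @ h1)
    o = sigmoid(W3 @ h2)
    return h1, h2, o


h1, h2, o = forward(W1)

# Backpropagation for d o / d w11 (row 1, column 1 of W1).
delta_o = o * (1.0 - o)                      # derivative of the sigmoid
delta_h2 = delta_o * W3 * (1.0 - h2 ** 2)    # through the second tanh layer
delta_h11 = (W2[:, 0] @ delta_h2) * (1.0 - h1[0] ** 2)  # first hidden unit only
grad_w11 = delta_h11 * x[0]

# Central finite-difference estimate of the same derivative, as a cross-check.
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
grad_fd = (forward(W1_plus)[2] - forward(W1_minus)[2]) / (2 * eps)

# Part (b), assuming the Glorot-normal convention: Var = 2 / (fan_in + fan_out).
fan_in, fan_out = W1.shape[1], W1.shape[0]
xavier_var = 2.0 / (fan_in + fan_out)

print("h1 =", np.round(h1, 5))
print("h2 =", np.round(h2, 5))
print("o  =", round(float(o), 5))
print("d o / d w11 (backprop)    =", grad_w11)
print("d o / d w11 (finite diff) =", grad_fd)
print("Xavier variance (assumed Glorot normal) =", xavier_var)
```

Under these assumptions the analytic gradient comes out at approximately 0.153 and agrees with the finite-difference estimate, and the assumed Xavier variance for the 6 x 2 matrix is 2 / (2 + 6) = 0.25.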
Expert Answer:
The classes are distributed in a circular pattern with the first class inside a circle with radius 1 ...
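For part (c) of the question, a minimal Keras sketch of the 2-6-3-1 network might look like the following. It is one possible construction under stated assumptions, not the notebook template referred to in the exam. The helper names `build_model`, `load_points`, and `train` are hypothetical, and the CSV layout (one header row, then two coordinate columns followed by a 0/1 label column) is an assumption, since the question does not describe the files. The binary cross-entropy loss is likewise an assumed choice that matches a single sigmoid output.

```python
import numpy as np
import tensorflow as tf


def build_model(activation="tanh"):
    """2-6-3-1 dense network; pass "relu" or "sigmoid" to try part (d)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(2,)),
        tf.keras.layers.Dense(6, activation=activation),   # 2*6 + 6 = 18 params
        tf.keras.layers.Dense(3, activation=activation),   # 6*3 + 3 = 21 params
        tf.keras.layers.Dense(1, activation="sigmoid"),    # 3*1 + 1 = 4 params
    ])


def load_points(path):
    """Hypothetical loader: assumes a header row and columns x1, x2, label."""
    data = np.loadtxt(path, delimiter=",", skiprows=1)
    return data[:, :2], data[:, 2]


def train(model, train_csv, val_csv, epochs=100, batch_size=32, learning_rate=0.0002):
    """Compile with Adam and fit using the hyperparameters given in the question."""
    train_points, train_labels = load_points(train_csv)
    val_points, val_labels = load_points(val_csv)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",   # assumed loss for a sigmoid output
        metrics=["accuracy"],
    )
    return model.fit(
        train_points, train_labels,
        epochs=epochs, batch_size=batch_size,
        validation_data=(val_points, val_labels),
    )


model = build_model("tanh")
model.summary()
```

Calling `train(model, "training_circular_bidim_dataset.csv", "test_circular_bidim_dataset.csv")` would then run the training, provided the files match the assumed layout. For part (d), `build_model("relu")` and `build_model("sigmoid")` give the two variants to compare; no accuracy figures are claimed here because they depend on the actual data and the random initialization.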