Question:

An artificial neural network (ANN), especially in its modern form, is the underpinning technology
that has enabled modern AI systems. A simple ANN model, denoted by $F_{ANN}(x, W)$, can be
formulated as follows:

$$F_{ANN}(x, W) = \sum_{j=1}^{m} \sigma(w_j^\top x), \quad \text{for any } x \in \mathbb{R}^d \tag{1}$$

where $W = (w_j : j = 1, \dots, m)$ with $w_j \in \mathbb{R}^d$ are the fitting parameters of the ANN model and
$\sigma : \mathbb{R} \to \mathbb{R}$ is called an activation function. We let $m = 20$ for this project (20 activation functions).
Common choices of the activation function are the linear function, $\sigma(x) := x$ for all $x \in \mathbb{R}$, or
the ReLU function, $\sigma(x) := \max\{0, x\}$ for all $x \in \mathbb{R}$.
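The model in (1) can be sketched in a few lines of Python with NumPy. This is only an illustration: the function names `f_ann`, `relu`, and `identity`, the choice to store the weights as rows of an `(m, d)` array, and the sample numbers are all assumptions made here, not part of the assignment.

```python
import numpy as np

def relu(z):
    """ReLU activation, max{0, z}, applied elementwise."""
    return np.maximum(0.0, z)

def identity(z):
    """Linear activation, sigma(z) = z."""
    return z

def f_ann(x, W, sigma=identity):
    """Evaluate F_ANN(x, W) = sum_j sigma(w_j^T x).

    x: array of shape (d,); W: array of shape (m, d), row j holding w_j.
    """
    return float(np.sum(sigma(W @ x)))

# Tiny check with m = 2 units and d = 2 inputs (illustrative values only).
W = np.array([[1.0, -1.0], [0.5, 2.0]])
x = np.array([2.0, 1.0])
linear_out = f_ann(x, W)        # (2 - 1) + (1 + 2) = 4.0
relu_out = f_ann(x, -W, relu)   # both pre-activations are negative, so 0.0
```

With the linear activation the output depends on the weights only through their sum, since the sum over j of w_j^T x equals (sum over j of w_j)^T x.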
We often train an NN model to fit the available training data by solving the following optimization problem:

$$\min_{W := (w_j)} \; \frac{1}{2N} \sum_{i=1}^{N} \left(F_{ANN}(x_i, W) - y_i\right)^2 + \lambda \sum_{j=1}^{m} \|w_j\|_1 \tag{2}$$

where $(x_i, y_i)$, $i = 1, \dots, N$, is a sequence of training data inputs (or designs/predictors) $x_i$ and
labels (or response variables) $y_i$, and $\lambda \ge 0$ is the regularization parameter. Here, $\|v\|_1$ for any vector $v = (v_k)$ is the 1-norm of $v$; namely,
$\|v\|_1 = \sum_k |v_k|$.
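To make the training objective concrete, here is a minimal sketch that evaluates it for a given weight array. The name `objective`, the argument `lam` for the regularization weight, and the toy data are assumptions for illustration; the project supplies its own training files.

```python
import numpy as np

def objective(W, X, y, lam, sigma=lambda z: z):
    """Objective (2): squared error / (2N) plus lam * sum_j ||w_j||_1.

    W: (m, d) weights, X: (N, d) inputs, y: (N,) labels.
    """
    preds = sigma(X @ W.T).sum(axis=1)      # F_ANN(x_i, W) for every i
    data_fit = np.sum((preds - y) ** 2) / (2 * len(y))
    l1_penalty = lam * np.sum(np.abs(W))    # sum of the 1-norms of the rows
    return data_fit + l1_penalty

# Illustrative data only.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
W = np.array([[1.0, -1.0], [0.0, 0.0]])
value = objective(W, X, y, lam=0.01)  # perfect fit, so only the penalty 0.01 * 2 remains
```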
While simple, this ANN can be a powerful tool for many applications such as handwriting
recognition.
1.2 Problem Statements
You are asked to implement solutions to the following problems in a programming language of your choice.
Training data will be provided in separate file(s). An example implementation will be written
in both MATLAB and Python.
Question for Student 1: Assume that $\sigma(x) = x$ and that the regularization parameter is $\lambda = 0.01$. Train the ANN by solving (2)
using a gradient descent algorithm with a constant step size. Note that $\|\cdot\|_1$ is not differentiable,
so some transformation/derivation is needed to represent this problem equivalently in a
differentiable form.
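One common way to regain differentiability, which may or may not be the transformation the assignment has in mind, is to split each weight into nonnegative parts, W = P - Q with P, Q >= 0, and replace the 1-norm by the linear term sum(P + Q). The objective is then smooth and the only constraints are sign constraints, which a projected gradient step handles by clipping at zero. The sketch below assumes this reformulation; the function name `train_l1_split`, the step size, the iteration count, and the synthetic data are all hypothetical choices.

```python
import numpy as np

def train_l1_split(X, y, lam=0.01, m=20, step=1e-3, iters=5000):
    """Projected gradient descent on the split form W = P - Q with P, Q >= 0.

    Linear activation only: F_ANN(x, W) = (sum_j w_j)^T x.
    """
    N, d = X.shape
    P = np.zeros((m, d))  # nonnegative "positive parts" of the weights
    Q = np.zeros((m, d))  # nonnegative "negative parts" of the weights
    for _ in range(iters):
        s = (P - Q).sum(axis=0)        # sum_j w_j, the only thing the fit sees
        g = X.T @ (X @ s - y) / N      # gradient of the data-fit term w.r.t. each w_j
        # The penalty lam * sum(P + Q) is linear, so its gradient is just lam.
        P = np.maximum(0.0, P - step * (g + lam))   # constant step, then project
        Q = np.maximum(0.0, Q - step * (-g + lam))
    return P - Q

# Synthetic stand-in for the provided training files (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
s_true = np.array([1.0, -2.0, 0.5])
y = X @ s_true
W = train_l1_split(X, y)
```

The constant step has to be small relative to the curvature of the data-fit term, which grows with m because every unit receives the same gradient; the value used here was picked for this synthetic data and would need tuning on the real training set.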
Question for Student 2: Assume that $\sigma(x) = (\max\{0, x\})^3$ and that the regularization parameter is $\lambda = 0$. Train the ANN by
solving (2) using a gradient descent algorithm with a constant step size.
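For the cubed ReLU the activation is continuously differentiable, with derivative 3 * max{0, z}^2, so plain gradient descent applies once the chain rule is written out. The following is a sketch under assumptions: the name `train_cubic_relu`, the random initialization scale, the step size, and the synthetic data are hypothetical. A random start is used because W = 0 has zero gradient for this activation.

```python
import numpy as np

def train_cubic_relu(X, y, m=20, step=0.005, iters=3000, seed=0):
    """Gradient descent for (2) with sigma(z) = max{0, z}^3 and no penalty."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W = 0.3 * rng.normal(size=(m, d))   # random start: W = 0 is a stationary point
    losses = []
    for _ in range(iters):
        Z = np.maximum(0.0, X @ W.T)            # (N, m) clipped pre-activations
        resid = (Z ** 3).sum(axis=1) - y        # F_ANN(x_i, W) - y_i
        losses.append(np.sum(resid ** 2) / (2 * N))
        # Chain rule: d sigma / dz = 3 * max{0, z}^2, so row j of the gradient
        # is (1/N) * sum_i resid_i * 3 * Z_ij^2 * x_i.
        grad = ((3 * Z ** 2) * resid[:, None]).T @ X / N   # (m, d)
        W = W - step * grad                     # constant step size
    return W, losses

# Synthetic stand-in for the provided training files (hypothetical data).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(40, 2))
y = 0.5 * np.maximum(0.0, X[:, 0]) ** 3
W, losses = train_cubic_relu(X, y)
```

The problem is nonconvex, so the result depends on the starting point and the step size; tracking `losses` is a simple way to check that the chosen constant step is not too large.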
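Question for Student 3: Consider a modified formulation:

$$\min_{W := (w_j)} \; \frac{1}{2N} \sum_{i=1}^{N} \left(F_{ANN}(x_i, W) - y_i\right)^2 + \lambda \sum_{j=1}^{m} \|w_j\|_2^2 \tag{3}$$

Assume that $\sigma(x) = x$ and that the regularization parameter is $\lambda = 0.3$. Train the ANN by solving (3) using Newton's method.
Here, $\|v\|_2$ for any vector $v = (v_k)$ denotes the 2-norm of $v$; namely, $\|v\|_2 = \sqrt{\sum_k v_k^2}$.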
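With the linear activation, objective (3) is a strictly convex quadratic in the stacked weights, so the Hessian is constant and a Newton step from any starting point lands on the minimizer (up to rounding). The sketch below stacks W row by row, so block (j, k) of the Hessian is X^T X / N plus 2 * lam * I on the diagonal blocks. The name `newton_ridge` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def newton_ridge(X, y, lam=0.3, m=20, iters=3):
    """Newton's method for (3) with sigma(z) = z; W is stacked row by row."""
    N, d = X.shape
    A = X.T @ X / N
    # Every (j, k) block of the data-fit Hessian equals A because each unit
    # sees the same input; the squared 2-norm penalty adds 2 * lam * I.
    H = np.kron(np.ones((m, m)), A) + 2 * lam * np.eye(m * d)
    W = np.zeros((m, d))
    for _ in range(iters):
        s = W.sum(axis=0)                         # sum_j w_j
        g_fit = X.T @ (X @ s - y) / N             # same for every unit
        grad = (g_fit[None, :] + 2 * lam * W).ravel()
        W = W - np.linalg.solve(H, grad).reshape(m, d)   # Newton step
    return W

# Synthetic stand-in for the provided training files (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
lam, m = 0.3, 20
W = newton_ridge(X, y, lam=lam, m=m)

# Cross-check: by symmetry all rows coincide, and their sum s solves
# (A + (2 * lam / m) * I) s = X^T y / N.
A = X.T @ X / len(y)
s_closed = np.linalg.solve(A + (2 * lam / m) * np.eye(3), X.T @ y / len(y))
```

More than one iteration is run only to show that further Newton steps leave the solution unchanged.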
Question for Student 4: Assume that $\sigma(x) = x$. Train the ANN by solving (2) using the grid
search method.
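A grid over all m * d weights is not practical, but with the linear activation the fit depends only on s, the sum of the w_j, and by the triangle inequality the smallest 1-norm penalty for a given s is lam times the 1-norm of s (attained by putting s in one unit and zeros elsewhere). Under that observation a grid search can run over s alone when d is small. This is one possible reading of the task, not necessarily the intended one; the name `grid_search`, the grid bounds, the value of `lam`, and the data are hypothetical.

```python
import itertools
import numpy as np

def grid_search(X, y, lam, m=20, lo=-2.0, hi=2.0, points=41):
    """Exhaustive search over s = sum_j w_j on a regular grid (small d only)."""
    N, d = X.shape
    axis = np.linspace(lo, hi, points)
    best_s, best_val = None, np.inf
    for cand in itertools.product(axis, repeat=d):
        s = np.array(cand)
        val = np.sum((X @ s - y) ** 2) / (2 * N) + lam * np.sum(np.abs(s))
        if val < best_val:
            best_s, best_val = s, val
    W = np.zeros((m, d))
    W[0] = best_s          # one unit carries s; the rest stay at zero
    return W, best_s, best_val

# Synthetic stand-in for the provided training files (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.0, -0.5])
W_grid, best_s, best_val = grid_search(X, y, lam=0.01)
```

The cost grows as `points ** d`, so this only works for a handful of input dimensions; the accuracy is limited by the grid spacing.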
Question for Student 5: Assume that $\sigma(x) = x$. Train the ANN by solving (2) using stochastic
approximation that operates under the assumption that each iteration of the algorithm can
only access one pair of sample data.
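A stochastic approximation scheme of this kind can be sketched as a stochastic subgradient method: each iteration draws one pair (x_i, y_i), forms the gradient of that single squared-error term, adds a subgradient of the 1-norm penalty, and takes a step with a diminishing step size. The name `sgd_train`, the value of `lam`, the step-size schedule `step0 / sqrt(t)`, and the data below are assumptions; using `sign(W)` as the subgradient is one simple choice among several.

```python
import numpy as np

def sgd_train(X, y, lam=0.01, m=20, iters=20000, step0=0.01, seed=0):
    """Stochastic subgradient method for (2) with sigma(z) = z.

    Each iteration touches exactly one training pair (x_i, y_i).
    """
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W = np.zeros((m, d))
    for t in range(1, iters + 1):
        i = rng.integers(N)                       # sample one data pair
        resid = X[i] @ W.sum(axis=0) - y[i]       # F_ANN(x_i, W) - y_i
        # Single-sample gradient of the fit term (the same for every unit)
        # plus a subgradient of lam * sum_j ||w_j||_1.
        subgrad = resid * X[i][None, :] + lam * np.sign(W)
        W -= (step0 / np.sqrt(t)) * subgrad       # diminishing step size
    return W

# Synthetic stand-in for the provided training files (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
s_true = np.array([1.0, -2.0, 0.5])
y = X @ s_true
W = sgd_train(X, y)
```

A diminishing step is used because a single-sample gradient is a noisy estimate of the full gradient; with a constant step the iterates would typically hover around the solution instead of settling.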
