Part 1: Optimization (25 points)
In this part, you'll get some familiarity with function optimization.
Q1 (25 points) First we start with optimization.py, which defines a quadratic function of two variables:
$y = (x_1 - 1)^2 + 8(x_2 - 1)^2$
This file contains a manual implementation of SGD for this function. Run:
python optimization.py --lr 1
to optimize the quadratic with a learning rate of 1. However, the code will crash, since the gradient hasn't
been implemented.
a) Implement the gradient of the provided quadratic function in quadratic_grad. sgd_test_quadratic
will then call this function inside an SGD loop and show a visualization of the learning process. Note: you
should not use PyTorch for this part!
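Differentiating y with respect to each coordinate gives 2(x_1 - 1) and 16(x_2 - 1). Below is a minimal sketch of such a gradient function; the array-based signature is an assumption and may differ from the one expected in optimization.py:

import numpy as np

def quadratic_grad(x):
    """Gradient of y = (x1 - 1)^2 + 8*(x2 - 1)^2, with x = [x1, x2].

    The signature here is assumed; adapt it to match optimization.py.
    """
    return np.array([
        2.0 * (x[0] - 1.0),   # dy/dx1
        16.0 * (x[1] - 1.0),  # dy/dx2
    ])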
b) When initializing at the origin, what is the best step size to use? Set your step size so that SGD gets
within a distance of 0.1 of the optimum in as few iterations as possible. Several answers are possible.
Hardcode this value into your code.
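For intuition on choosing the step size, note that the update decouples across the two coordinates. A short sketch of the standard gradient-descent contraction argument (not something the assignment asks you to write up) is:

% Applying x <- x - eta * grad(y) coordinate-wise:
\[
x_1^{(t+1)} - 1 = (1 - 2\eta)\bigl(x_1^{(t)} - 1\bigr), \qquad
x_2^{(t+1)} - 1 = (1 - 16\eta)\bigl(x_2^{(t)} - 1\bigr),
\]
% so the distance to the optimum (1, 1) shrinks each iteration by at best the slower factor
\[
\rho(\eta) = \max\bigl(\lvert 1 - 2\eta\rvert,\; \lvert 1 - 16\eta\rvert\bigr).
\]
% Balancing the two factors, 1 - 2*eta = 16*eta - 1, gives eta = 1/9.

Balancing the two contraction factors is one reasonable way to pick a step size, though, as the problem notes, several answers are acceptable.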
Exploration (optional): What is the tipping point of the step size parameter, where step sizes larger than
that cause SGD to diverge rather than find the optimum?
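One way to explore the tipping point empirically (a standalone sketch, independent of the provided optimization.py) is to run the same update by hand for a few candidate step sizes and watch whether the iterate approaches or runs away from (1, 1):

import numpy as np

def distance_after(lr, num_iters=50):
    """Run the hand-rolled gradient-descent update from the origin and
    return the final distance to the optimum (1, 1)."""
    x = np.zeros(2)
    for _ in range(num_iters):
        grad = np.array([2.0 * (x[0] - 1.0), 16.0 * (x[1] - 1.0)])
        x = x - lr * grad
    return float(np.linalg.norm(x - np.array([1.0, 1.0])))

# From the contraction factors above, trouble is expected once |1 - 16*lr| > 1,
# so probe step sizes around lr = 1/8.
for lr in (0.12, 0.125, 0.13):
    print(f"lr = {lr}: distance to optimum after 50 steps = {distance_after(lr):.4g}")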
