Question:
(a) What is gradient descent? What functions can it be used on? What does it do with these functions?
(b) Will gradient descent always work? Why or why not?
(c) What is a "hyperparameter" of a learning algorithm?
2. Using the second derivative test, classify each critical point as a local minimum, local maximum, or saddle point for the following functions.
(a) f(x) = 3x^4 + 4x^3 + 12x^2 + 5
(b) g(x, y) = xy(1 - x - y)
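A minimal Python sketch for checking hand answers to 2(b) with the two-variable second derivative test. It reads the function as g(x, y) = xy(1 - x - y), which is an assumption about the intended signs. The partial derivatives are worked out by hand and hard-coded, and the helper names (`classify`, `hessian_g`, `gradient_g`) are made up for this illustration.

```python
from fractions import Fraction

def classify(gxx, gyy, gxy):
    """Second derivative test from the Hessian entries at a critical point."""
    d = gxx * gyy - gxy ** 2  # determinant of the Hessian
    if d > 0:
        return "local minimum" if gxx > 0 else "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"

def gradient_g(x, y):
    """First partials of g(x, y) = x*y*(1 - x - y) (assumed form)."""
    return y * (1 - 2 * x - y), x * (1 - x - 2 * y)

def hessian_g(x, y):
    """Second partials (g_xx, g_yy, g_xy) of the same assumed g."""
    return -2 * y, -2 * x, 1 - 2 * x - 2 * y

# Candidate critical points found by solving g_x = 0 and g_y = 0 by hand.
critical_points = [(0, 0), (1, 0), (0, 1), (Fraction(1, 3), Fraction(1, 3))]

results = {}
for p in critical_points:
    assert gradient_g(*p) == (0, 0)  # confirm the gradient vanishes here
    results[p] = classify(*hessian_g(*p))
    print(p, results[p])
```

Under that reading of g, the three points on the axes come out as saddle points and (1/3, 1/3) as a local maximum. Exact fractions are used so the gradient check is not affected by rounding.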
4. Consider the function:
f(x, y) = x^2 + 2y^2 - 6x + 4y + 18
(a) Sketch the level curves of f(x, y) on the domain -10 ≤ x ≤ 10, -10 ≤ y ≤ 10.
(b) Calculate the gradient ∇f.
(c) Using the origin as the starting point and a learning rate of 0.05, calculate the first ten points generated by gradient descent. Plot those ten points on your graph from part (a). [Using Python is suggested for this part.]
(d) To what point in the plane do the ten points from part (c) appear to converge (if any)?
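Since part (c) suggests Python, here is a minimal sketch of the iteration. It assumes the function is f(x, y) = x^2 + 2y^2 - 6x + 4y + 18, so that the hand-computed gradient is (2x - 6, 4y + 4), and it takes "the first ten points" to mean the ten points generated after the starting point. The function names are illustrative, and plotting is left out to keep the block dependency-free.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = x**2 + 2*y**2 - 6*x + 4*y + 18 (assumed signs)."""
    return 2 * x - 6, 4 * y + 4

def gradient_descent(start, learning_rate, steps):
    """Return the points visited after the start, one per update step."""
    x, y = start
    points = []
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        # Move against the gradient, scaled by the learning rate.
        x, y = x - learning_rate * gx, y - learning_rate * gy
        points.append((x, y))
    return points

points = gradient_descent(start=(0.0, 0.0), learning_rate=0.05, steps=10)
for i, (x, y) in enumerate(points, start=1):
    print(f"step {i:2d}: ({x:.4f}, {y:.4f})")
```

With these assumptions the x-coordinates move toward 3 and the y-coordinates toward -1, which is where the gradient is zero. The resulting list of pairs can be passed to any plotting tool to overlay on the level curves from part (a).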
5. Using Lagrange multipliers, find the absolute minimum of f(x, y) = x^2 + 2y^2 on the circle x^2 + y^2 = 1. Why is the method of gradient descent difficult when optimizing an objective function on a constraint curve (like in this problem)?
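One way to sanity-check a Lagrange-multiplier answer for problem 5 is to sample the constraint curve directly. The sketch below is a numerical cross-check, not the requested method: it parametrizes the circle as (cos t, sin t) and takes the smallest sampled value of f(x, y) = x^2 + 2y^2. The sample count `n` is an arbitrary choice.

```python
import math

def f(x, y):
    return x ** 2 + 2 * y ** 2

n = 3600  # number of sample angles (arbitrary choice)
samples = []
for k in range(n):
    t = 2 * math.pi * k / n
    x, y = math.cos(t), math.sin(t)  # every sample lies on x^2 + y^2 = 1
    samples.append((f(x, y), x, y))

best_value, best_x, best_y = min(samples)
print("smallest sampled value:", best_value, "at", (best_x, best_y))
```

The parametrization keeps every sample on the circle by construction. A plain gradient descent step x - rate * ∇f generally leaves the circle, which is the difficulty the second half of the question is pointing at.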
7. A farm co-op has 6000 acres available on which to plant corn and soybeans. The following table summarizes each crop's requirements for fertilizer/herbicide, harvesting labor hours, and the available amounts of these resources.
Resource | Corn | Soybeans | Available |
Fertilizer/herbicide | 9 gallons/acre | 3 gallons/acre | 40500 gallons |
Harvesting labor | 3/4 hour/acre | 1 hour/acre | 5250 hours |
The co-op's resources of land, fertilizer/herbicide, and harvest time are limited (constrained). If profits per acre are $240 for corn and $160 for soybeans, how many acres of each crop should the co-op plant to maximize profit? What is the maximum profit? Set up and solve the linear programming problem associated to this word problem.
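The setup can be checked with a short Python sketch that enumerates the corner points of the feasible region and evaluates the profit at each one. This relies on the standard fact that a linear objective over a bounded feasible polygon attains its maximum at a vertex. The variable names (x for acres of corn, y for acres of soybeans) and the helper functions are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is a*x + b*y <= c, with x = acres of corn, y = acres of soybeans.
constraints = [
    (1, 1, 6000),               # land (acres)
    (9, 3, 40500),              # fertilizer/herbicide (gallons)
    (Fraction(3, 4), 1, 5250),  # harvesting labor (hours)
    (-1, 0, 0),                 # x >= 0
    (0, -1, 0),                 # y >= 0
]

def profit(x, y):
    return 240 * x + 160 * y

def intersection(c1, c2):
    """Point where both boundary lines hold with equality (Cramer's rule)."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None  # parallel boundary lines never meet
    x = Fraction(r1 * b2 - r2 * b1) / det
    y = Fraction(a1 * r2 - a2 * r1) / det
    return x, y

def feasible(point):
    x, y = point
    return all(a * x + b * y <= c for a, b, c in constraints)

# Keep only the pairwise intersections that satisfy every constraint.
vertices = []
for c1, c2 in combinations(constraints, 2):
    p = intersection(c1, c2)
    if p is not None and feasible(p):
        vertices.append(p)

best = max(vertices, key=lambda p: profit(*p))
print("corn acres:", best[0], "soybean acres:", best[1], "profit:", profit(*best))
```

Exact fractions avoid rounding trouble at the feasibility check. For larger problems a dedicated LP solver would be the usual choice, but vertex enumeration mirrors the graphical method typically used for two-variable problems.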
11. Draw a card at random from a standard deck of cards. The sample space S is the collection of 52 cards. Assume that the probability function assigns 1/52 to each of the 52 outcomes. Let
A = {x : x is a jack, queen, or king}
B = {x : x is a 9, 10, or jack and x is red}
C = {x : x is a club}
D = {x : x is not a club}
Find the following probabilities:
(a) P(A)
(b) P(A ∩ B)
(c) P(A ∪ B)
(d) P(C ∪ D)
(e) P(C ∩ D)
(f) P(not B)
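Because every outcome has probability 1/52, each probability is a count divided by 52, which makes the answers easy to check by brute force. The sketch below builds the deck as (rank, suit) pairs and represents the events as Python sets. The rank and suit labels are illustrative, and the blanks in parts (b) and (e) are read as intersections.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))  # the 52 equally likely outcomes

red_suits = {"diamonds", "hearts"}
A = {c for c in deck if c[0] in {"J", "Q", "K"}}
B = {c for c in deck if c[0] in {"9", "10", "J"} and c[1] in red_suits}
C = {c for c in deck if c[1] == "clubs"}
D = deck - C  # not a club

def prob(event):
    """Probability of an event under the uniform assignment of 1/52 per card."""
    return Fraction(len(event), len(deck))

answers = {
    "P(A)": prob(A),
    "P(A and B)": prob(A & B),
    "P(A or B)": prob(A | B),
    "P(C or D)": prob(C | D),
    "P(C and D)": prob(C & D),
    "P(not B)": prob(deck - B),
}
for name, value in answers.items():
    print(name, "=", value)
```

Set intersection (`&`), union (`|`), and difference (`-`) correspond directly to "and", "or", and "not" for events, so the same pattern works for any other event defined on the deck.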
Step by Step Solution
1. Gradient Descent
(a) Gradient descent is an optimization algorithm used to find the minimum of a function ...
