Question: Question 3 . 1 We will first need to implement the softmax function, which is used as the predictor function by the softmax regression learner.

Question 3.1
We will first need to implement the softmax function, which is used as the predictor function by the softmax regression learner. In class we learned that the softmax function is defined as:
y(z)=y(z0,...,zK1)=ezyK1q=0ezq
wherez in RK,Kis the number of classes andy in Y.
If for some y, zy is large, then ezy will be very large and can cause an overflow error. To avoid this, we can rewrite the softmax function as:
y(z)=y(z0,...,zK1)=ezyK1q=0ezq=ezyK1q=0ezqemaxlzlemaxlzl=ezymaxlzlK1q=0ezqmaxlzl.
This way the exponential function will never take as input a number larger than 0, which will prevent overflow errors.
For similar reasons as we have discussed in Question 2.1, it is often preferred to use vectors or matrices as input and output for the softmax function. In particular, we can store all K different softmax outputs into a single vector and define a vector softmax function as:
(z)=(0(z),...,K1(z)) in [0,1]K.
We would also like to compute the softmax function for multiple data points at once. That means the input to thesoftmaxfunction will be a matrixZwithnrows andKcolumns, where each row represents a data point and each column represents a class.
In your implementation of the softmax function you should assume the input is a matrix represented as a numpy arrary. You should then return a matrix of the same shape where each row is the vector softmax function applied to the corresponding row of the input matrix.
You may find the functions: np.exp, np.sum, and np.max useful for this implementation.
Complete the implementation of softmax.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!