Question: Question 3 . 1 We will first need to implement the softmax function, which is used as the predictor function by the softmax regression learner.

Question

3.1

We will first need to implement the softmax function, which is used as the predictor function by the softmax regression learner. In class we learned that the softmax function is defined as:

(

) =

(

0, . . .,

1) =

ezy

1

= 0

ezq

wherez in RK

,

Kis the number of classes andy in Y

.

If for some y

,

zy is large, then ezy will be very large and can cause an overflow error. To avoid this, we can rewrite the softmax function as:

(

) =

(

0, . . .,

1) =

ezy

1

= 0

ezq

=

ezy

1

= 0

ezq

maxlzle

maxlzl

=

ezy

maxlzl

1

= 0

ezq

maxlzl

.

This way the exponential function will never take as input a number larger than

0,

which will prevent overflow errors.

For similar reasons as we have discussed in Question

2.1,

it is often preferred to use vectors or matrices as input and output for the softmax function. In particular, we can store all K different softmax outputs into a single vector and define a vector softmax function as:

(

) = (

0 (

), . . .,

1 (

))

[0, 1]

.

We would also like to compute the softmax function for multiple data points at once. That means the input to thesoftmaxfunction will be a matrixZwithnrows andKcolumns, where each row represents a data point and each column represents a class.

In your implementation of the softmax function you should assume the input is a matrix represented as a numpy arrary. You should then return a matrix of the same shape where each row is the vector softmax function applied to the corresponding row of the input matrix.

You may find the functions: np

.

exp, np

.

sum, and np

.

max useful for this implementation.

Complete the implementation of softmax.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!

Motion Detection We can use our averagerators to perform motion detection in a sequence of images captured by a webcam. The idea is this. Each image will be represented as a H x W x 3 3-d array; H...

CH A P TER 3 Learning and Motivation Chapter Learning Outcomes After reading this chapter, you should be able to: NEL define learning and describe learning outcomes describe the three stages of...

Help me answer the following activities below and follow the given pages. Will rate you helpful, thank you. A. 0.66 B. 0.67 C. 0.68 D. 0.69 7. What is the equation of the line of best fit or...

This question involves the use of AGGREGATE linear PYTHOIN regression on the Auto data set. (a) Perform a simple linear regression with mpg as the response and horsepower as the predictor. Describe...

MATHEMATICS FOR MACHINE LEARNING Marc Peter Deisenroth A. Aldo Faisal Cheng Soon Ong Contents Foreword 1 Part I Mathematical Foundations 9 1 Introduction and Motivation 11 1.1 Finding Words for...

Machine Learning - doing neural networks This is all to be written in Python Introduction In Part 1 of this assignment you will implement a basic neural net in numpy. You are not to use any libraries...

this is the questionplease help me thanksits a project about renovation one level of a building. project management case . This is a renovation project On the basis of group work as a personal...

ML in a nutshell Optimization, and machine learning, are intimately connected. At a very coarse level, ML works as follows. First, you come up somehow with a very complicated model y = M(x, 0), which...

import numpy as np from scipy.optimize import minimize from scipy.io import loadmat from numpy.linalg import det, inv from math import sqrt, pi import scipy.io import matplotlib.pyplot as plt import...

DETAILS What is the maximum segment length of a 100Base-FX netdwork, The last character ('X', etc) refers to the line code method used. Line code is a patdtern of voltage, current or photons used to...

NAMELAB is a San Francisco-based company that uses linguistic analysis and computers to invent catchy names for new products. The company is credited with the invention of Acura, Compaq, Sentra, and...

Alphas balance-of-payments data for 2008 are shown below. All figures are in billions of dollars. What are the (a) Balance on goods, (b) Balance on goods and services, (c) Balance on current account,...

An investor is consibering the acquastion of a "diaressed property" which is on Northiake Bank's REO list. The property is avaliabie for $ 2 0 3 , 4 0 0 and the investor estimates that he can borrow...

A company is working on a project with 9 work packages. The duration of each work package is as follows: The value of each work package and the Gantt chart that were proposed during the planning phase