Question 4: In logistic regression, we assume Y = (Y_1, ..., Y_n)^T are a collection of n binary observations. For each Y_i we observe a vector of predictors x_i = (x_{i1}, ..., x_{ip})^T. We assume

P(Y_i = 1) = p_i = exp(x_i^T β) / (1 + exp(x_i^T β)),

where β = (β_1, ..., β_p)^T is the vector of regression coefficients.
a. Formulate the overall log-likelihood ℓ(β) of the dataset.
b. Derive the first derivative ∂ℓ/∂β.
c. Derive the second derivative ∂²ℓ/∂β∂β^T.
d. Suppose β̂ = … and we have a new observation x_new = … . Predict the probability of success for this new observation.
e. Suppose … ; calculate the first derivative vector ∂ℓ/∂β.
f. Suppose … ; calculate the second-order derivative matrix ∂²ℓ/∂β∂β^T.
g. If instead of using Newton's method, we decide to use the stochastic gradient ascent algorithm, which updates the parameter in the following manner:

β̂_new = β̂_old + γ · (∂ℓ/∂β)|_{β = β̂_old},

please update the estimate for β. Use the learning rate γ = … .
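Parts (a)-(c) ask for the log-likelihood, its gradient, and its Hessian. Since the question's actual numbers are not shown here, the following is a minimal NumPy sketch of those three quantities, evaluated on hypothetical data chosen purely for illustration:

```python
import numpy as np

def log_likelihood(beta, X, y):
    """Part (a): l(beta) = sum_i [ y_i * x_i^T beta - log(1 + exp(x_i^T beta)) ]."""
    eta = X @ beta
    return float(y @ eta - np.sum(np.log1p(np.exp(eta))))

def gradient(beta, X, y):
    """Part (b): dl/dbeta = X^T (y - p), where p_i = exp(eta_i) / (1 + exp(eta_i))."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (y - p)

def hessian(beta, X, y):
    """Part (c): d^2 l / (dbeta dbeta^T) = -X^T W X, with W = diag(p_i (1 - p_i))."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return -(X.T * (p * (1 - p))) @ X

# Hypothetical data (n = 4 observations, p = 2 predictors), not from the question
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]])
y = np.array([1.0, 0.0, 1.0, 0.0])
beta = np.zeros(2)

print(log_likelihood(beta, X, y))  # at beta = 0 every p_i = 0.5, so l = -n*log(2)
print(gradient(beta, X, y))        # X^T (y - 0.5)
print(hessian(beta, X, y))         # -0.25 * X^T X at beta = 0
```

At β = 0 the fitted probabilities are all 1/2, which makes each quantity easy to check by hand against the derived formulas.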
You may notice that because we are maximizing the objective function (the log-likelihood), we use the stochastic gradient ascent algorithm, which is very similar to the gradient descent algorithm for minimization that we learned a few weeks ago. "Ascent" means "going up" and "descent" means "going down". The two methods differ only in the sign of the second term: for maximization we add it, and for minimization we subtract it.
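Because the question's current estimate β̂, the sampled observation, and the learning rate γ are not shown here, the update rule in part (g) can only be sketched with hypothetical values. For a single observation (x_i, y_i), the per-observation gradient is ∂ℓ_i/∂β = (y_i - p_i) x_i, so one ascent step is:

```python
import numpy as np

def sga_update(beta_hat, x_i, y_i, gamma):
    """One stochastic gradient ascent step on a single observation:
    beta_new = beta_old + gamma * (y_i - p_i) * x_i  (note the + sign for ascent)."""
    p_i = 1.0 / (1.0 + np.exp(-(x_i @ beta_hat)))
    return beta_hat + gamma * (y_i - p_i) * x_i

# Hypothetical current estimate, observation, and learning rate (not the question's)
beta_hat = np.array([0.0, 0.0])
x_i = np.array([1.0, 2.0])
y_i = 1.0
gamma = 0.1

print(sga_update(beta_hat, x_i, y_i, gamma))  # p_i = 0.5, so the step is 0.1 * 0.5 * x_i
```

Flipping the `+` to a `-` in the return line gives exactly the gradient descent update for minimization, which is the sign difference the passage above describes.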