
Question:

We are given a data matrix X = [x⁽¹⁾, . . . , x⁽ᵐ⁾], with x⁽ⁱ⁾ ∈ Rⁿ, i = 1, . . . , m. We assume that the data is centered: x⁽¹⁾ + . . . + x⁽ᵐ⁾ = 0. An (empirical) estimate of the covariance matrix is

Σ = (1/m) ∑_{i=1}^{m} x⁽ⁱ⁾ (x⁽ⁱ⁾)^T.
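The empirical covariance estimate can be sketched in a few lines of NumPy. This is only an illustration: the random data, the sizes n = 4 and m = 200, and the variable names are assumptions made for the example, not part of the exercise.

```python
import numpy as np

# Hypothetical data: n = 4 features, m = 200 observations stored as columns.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 200))

# Center the data so the columns sum to zero, as the problem assumes.
X = X - X.mean(axis=1, keepdims=True)

m = X.shape[1]
# Empirical covariance (1/m) * sum_i x_i x_i^T, written as one matrix product.
Sigma = (X @ X.T) / m

# Same quantity accumulated one outer product at a time, for comparison.
Sigma_loop = sum(np.outer(X[:, i], X[:, i]) for i in range(m)) / m
```

Writing the sum of outer products as X X^T / m gives the same matrix without an explicit loop.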

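In practice, one often finds that the above estimate of the covariance matrix is noisy. One way to remove noise is to approximate the covariance matrix as Σ ≈ λI + FF^T, where F is an n × k matrix containing the so-called “factor loadings,” with k ≪ n the number of factors, and λ ≥ 0 is the idiosyncratic noise variance. The stochastic model that corresponds to this setup is

x = Ff + σe,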

where x is the (random) vector of centered observations, (f, e) is a random variable with zero mean and unit covariance matrix, and σ = √λ is the standard deviation of the idiosyncratic noise component σe. The interpretation of the stochastic model is that the observations are a combination of a small number k of factors, plus a noise part that affects each dimension independently. To fit F, λ to data, we seek to solve

min_{F, λ ≥ 0} ‖Σ − λI − FF^T‖_F.   (13.35)
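As a quick sanity check on the model, the covariance of x = Ff + σe with unit-covariance (f, e) is FF^T + λI, so the true pair (F, λ) should give zero misfit. The sketch below assumes the fit criterion is the Frobenius-norm distance between Σ and λI + FF^T; the helper name `factor_misfit`, the sizes, and the random loadings are hypothetical choices for illustration.

```python
import numpy as np

def factor_misfit(Sigma, F, lam):
    """Frobenius-norm distance between Sigma and lam * I + F F^T."""
    n = Sigma.shape[0]
    return np.linalg.norm(Sigma - lam * np.eye(n) - F @ F.T, ord="fro")

# Hypothetical ground truth: n = 5 dimensions, k = 2 factors, noise variance 0.3.
rng = np.random.default_rng(1)
n, k, lam_true = 5, 2, 0.3
F_true = rng.standard_normal((n, k))

# Covariance of x = F f + sigma e when (f, e) has zero mean and unit covariance.
Sigma_model = F_true @ F_true.T + lam_true * np.eye(n)

misfit_exact = factor_misfit(Sigma_model, F_true, lam_true)  # zero up to rounding
misfit_wrong = factor_misfit(Sigma_model, F_true, 0.0)       # noise term ignored
```

Dropping the noise term leaves a residual of λI, whose Frobenius norm is λ√n.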

1. Assume λ is known and less than λ_k (the k-th largest eigenvalue of the empirical covariance matrix Σ). Express an optimal F as a function of λ, which we denote F(λ). In other words: you are asked to solve for F, with fixed λ.

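2. Show that the error E(λ) = ‖Σ − λI − F(λ)F(λ)^T‖_F, with F(λ) the matrix you found in the previous part, can be written as

E(λ)² = ∑_{i=k+1}^{n} (λ_i − λ)²,

where λ_1 ≥ . . . ≥ λ_n are the eigenvalues of Σ. Find a closed-form expression for the optimal λ that minimizes the error, and summarize your solution to the estimation problem (13.35).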

3. Assume that we wish to estimate the risk (as measured by variance) involved in a specific direction in data space. Recall from Example 4.2 that, given a unit-norm n-vector w, the variance along direction w is w^T Σ w. Show that the rank-k approximation to Σ results in an under-estimate of the directional risk, as compared with using Σ. How about the approximation based on the factor model above? Discuss.
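The three parts can be explored numerically with a small sketch. It rests on assumptions that are not stated on this page: the candidate F(λ) is taken to be the top-k eigenvectors of Σ scaled by √(λ_i − λ), the candidate optimal λ is taken to be the mean of the n − k smallest eigenvalues, and the fit criterion is taken to be the Frobenius norm. The covariance matrix, sizes, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 2
A = rng.standard_normal((n, 3 * n))
Sigma = A @ A.T / A.shape[1]          # hypothetical covariance matrix

# Eigenvalues sorted in decreasing order, with matching eigenvectors.
evals, U = np.linalg.eigh(Sigma)
evals, U = evals[::-1], U[:, ::-1]

def fit_F(lam):
    """Assumed F(lam): top-k eigenvectors scaled by sqrt(lambda_i - lam).

    Only valid for lam below the k-th largest eigenvalue.
    """
    return U[:, :k] * np.sqrt(evals[:k] - lam)

def error(lam):
    """Frobenius-norm misfit of lam * I + F(lam) F(lam)^T against Sigma."""
    F = fit_F(lam)
    return np.linalg.norm(Sigma - lam * np.eye(n) - F @ F.T, ord="fro")

# Assumed optimal lam: mean of the n - k smallest eigenvalues.
lam_star = evals[k:].mean()
F_star = fit_F(lam_star)

Sigma_k = (U[:, :k] * evals[:k]) @ U[:, :k].T         # plain rank-k truncation
Sigma_fm = lam_star * np.eye(n) + F_star @ F_star.T   # factor-model estimate

def directional_variance(S, W):
    """w^T S w for every unit-norm column w of W."""
    return (W * (S @ W)).sum(axis=0)

# Random unit-norm directions, one per column.
W = rng.standard_normal((n, 1000))
W /= np.linalg.norm(W, axis=0)

var_full = directional_variance(Sigma, W)
var_k = directional_variance(Sigma_k, W)     # never exceeds var_full
var_fm = directional_variance(Sigma_fm, W)   # may fall on either side

# Two specific directions: the (k+1)-th and the last eigenvector of Sigma.
w_hi, w_lo = U[:, [k]], U[:, [-1]]
gap_hi = directional_variance(Sigma_fm - Sigma, w_hi)[0]  # lam_star - lambda_{k+1}
gap_lo = directional_variance(Sigma_fm - Sigma, w_lo)[0]  # lam_star - lambda_n
```

Under these assumptions, Σ minus its rank-k truncation is positive semidefinite, so the truncation can only under-estimate directional variance. The factor-model estimate differs from Σ by a matrix whose eigenvalues are λ − λ_i for i > k, so with λ set to the mean of the trailing eigenvalues it preserves the trace and can over- or under-estimate depending on the direction w.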


Step by Step Solution

1. Since λ < λ_k, we see that the above is positive semidefinite. It can be written as FF^T with ... 2. Fro...
