Q 1-1. Find the summary statistics (mean, quartiles, min/max, etc) of each of the variables. Answer:
Q 1-2. Find all of the pairwise correlations in the data set. Can you find significant correlations between variables? Answer:
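One way to approach Q 1-1 and Q 1-2 in pandas is sketched below. The data frame is a synthetic stand-in, since the real data set is not included here; its column names (Lag1, Lag2, Volume, Today, Direction) are assumptions chosen to match the variables named in the questions. `scipy.stats.pearsonr` is used to attach a p-value to one of the correlations.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical stand-in data; replace with the real data set.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Lag1": rng.normal(size=n),
    "Lag2": rng.normal(size=n),
    "Volume": rng.uniform(0.5, 3.0, size=n),
    "Today": rng.normal(size=n),
})
df["Direction"] = np.where(df["Today"] > 0, "Up", "Down")

# Q 1-1: count, mean, std, min, quartiles, and max of each numeric column
summary = df.describe()
print(summary)

# Q 1-2: pairwise Pearson correlations between the numeric columns
corr = df.select_dtypes("number").corr()
print(corr.round(3))

# Significance of a single pair: correlation coefficient and two-sided p-value
r, p = pearsonr(df["Lag1"], df["Today"])
print(f"Lag1 vs Today: r = {r:.3f}, p = {p:.3f}")
```

A correlation is usually called significant when its p-value falls below a chosen level such as 0.05; with many pairs tested at once, some small p-values are expected by chance alone.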
Step 2: Linear Regression
Q 2-1. Try to fit a linear regression model in order to predict Today using other variables (excluding Direction). Explain the regression result. How can you interpret coefficients, their p-values, and R-squared?

    import statsmodels.api as sm

    ## Use OLS function to fit a linear regression model
    ## X is a DataFrame (or numpy array) containing exogenous (independent) variables.
    ## Y is a Series (or numpy array) of dependent variable.
    model = sm.OLS(Y, X).fit()
    predictions = model.predict(X)
    print_model = model.summary()

Answer:
Q 2-2. Try to fit a linear regression model in order to predict Direction using other variables (excluding Today). Note that you need to convert the factor data Direction into numeric data type. Explain the regression result. How can you interpret coefficients, their p-values, and adjusted R-squared? Answer:
Q 2-3. Try to fit a linear regression model in order to predict Direction using Lag1 and Lag2. Explain the regression result. How can you interpret coefficients, their p-values, and adjusted R-squared? Answer:
Step 3: Logistic Regression
Q 3-1. Try to fit a logistic regression model in order to predict Direction using other variables (excluding Today). Explain the regression result. How can you interpret coefficients and their p-values? Answer:
Q 3-2. Try to fit a logistic regression model in order to predict Direction using Lag1 and Lag2. Explain the regression result. How can you interpret coefficients and their p-values? Answer:
Q 3-3. Predict the probability that the market will go up, given values of the predictors. Then, in order to make a prediction as to whether the market will go up or down on a particular day, we must convert these predicted probabilities into class labels, Up or Down. So create a new vector of predictions based on whether the predicted probability of a market increase is greater than or less than 0.5. Then, tabulate the prediction vector to determine how many observations were correctly or incorrectly classified. What does this table imply about predictions of the logistic regression model? Answer:
Q 3-4. The previous result in Q 3-3 is misleading because we trained and tested the model on the same set of 1,250 observations. In order to better assess the accuracy of the logistic regression, we can fit the model using part of the data, and then examine how well it predicts the held-out data. Let's use data before 2005 for training and compute predictions for 2005. Now fit a logistic regression model using the training data set, using the subset argument. Obtain predicted probabilities on the validation data set, that is, for days in 2005. Tabulate the prediction vector to determine how many observations were correctly or incorrectly classified. How can you interpret the prediction accuracy? Answer:
Q 3-5. Repeat Q 3-4, varying the threshold (other than 0.5) to determine predictions based on whether the predicted probability of a market increase is greater than or less than the threshold. Answer: