Question:
1. Classification involves predicting the value of a continuous variable.
True
False
2. K-Nearest Neighbors is a simple procedure that predicts the class of an observation by assigning the majority class for a set of observations with the most similar characteristics (i.e., those with the closest predictor values).
True
False
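The procedure described in item 2 can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and a simple majority vote; the function name `knn_predict` and the toy data are hypothetical and not part of the question.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training observation (Euclidean, by assumption).
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort by distance so the closest observations come first.
    distances.sort(key=lambda pair: pair[0])
    # Keep the labels of the k closest observations and take the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated groups of predictor values.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
print(prediction)
```

An odd k is a common choice for two-class problems because it avoids tied votes.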
3. The validation set approach is a simple way to estimate the test error associated with fitting a particular statistical learning method on a set of observations. The process of selecting observations for the training and test/validation sets MUST be random.
True
False
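As a rough sketch of the validation set approach in item 3, the observations can be shuffled and then divided into a training portion and a held-out portion. The helper `validation_split`, the 30% hold-out fraction, and the seed are illustrative assumptions.

```python
import random


def validation_split(observations, test_fraction=0.3, seed=0):
    """Randomly split observations into (training, validation) lists."""
    rng = random.Random(seed)      # fixed seed only so the example is reproducible
    shuffled = observations[:]     # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_validation = int(round(len(shuffled) * test_fraction))
    # The first n_validation shuffled items are held out; the rest are used for fitting.
    return shuffled[n_validation:], shuffled[:n_validation]


data = list(range(10))
train, validation = validation_split(data, test_fraction=0.3, seed=42)
print(len(train), len(validation))
```

The model would then be fit on `train` only, and its error measured on `validation`.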
4. Though its performance is often inferior to many contemporary machine learning algorithms, a key benefit of logistic regression is that it is easier to explain than output from more sophisticated models.
True
False
5. The coefficients in logistic and linear regression output are interpreted the same way.
True
False
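For reference when considering item 5, the two model forms can be written side by side. The notation below is standard textbook notation and is not taken from the question itself: x is a single predictor, p(x) is the probability of the positive class, and \beta_0, \beta_1 are the fitted coefficients.

```latex
\begin{align*}
\text{Linear regression:}\quad & E[Y \mid x] = \beta_0 + \beta_1 x \\
\text{Logistic regression:}\quad & \log\frac{p(x)}{1 - p(x)} = \beta_0 + \beta_1 x
\end{align*}
```

In the linear model, \beta_1 is the change in the expected response for a one-unit increase in x. In the logistic model, \beta_1 is the change in the log-odds, so a one-unit increase in x multiplies the odds by e^{\beta_1}.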
6. When training a model, the goal is to achieve the highest accuracy possible on the training set.
True
False
7. Cross-validation techniques help achieve an optimal bias-variance tradeoff.
True
False
8. Which of the following are scenarios where classification techniques should be applied (select all that apply)?
a. Predicting which customers are likely to respond to a promotion
b. Forecasting profit margins for next quarter
c. Calculating the probability of each employee being promoted within the next year
d. Estimating sales for a new product line
e. All of the above warrant classification techniques.
9. A confusion matrix is a classification tool used to evaluate which of the following (select all that apply)?
a. True Positive Rates (Sensitivity)
b. True Negative Rates (Specificity)
c. False Positive Rates
d. False Negative Rates
e. Overall Accuracy Rates
f. A and B Only
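To make the quantities named in item 9 concrete, the sketch below tallies the four cells of a binary confusion matrix and derives each rate from them. The function names and the toy labels are hypothetical; class 1 is assumed to be the positive class.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1   # predicted positive, actually positive
        elif p == positive:
            fp += 1   # predicted positive, actually negative
        elif a == positive:
            fn += 1   # predicted negative, actually positive
        else:
            tn += 1   # predicted negative, actually negative
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Derive the standard rates from the four confusion-matrix cells."""
    return {
        "sensitivity": tp / (tp + fn),           # true positive rate
        "specificity": tn / (tn + fp),           # true negative rate
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

counts = confusion_counts(actual, predicted)
summary = rates(*counts)
print(counts, summary)
```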
10. Which of the following are true statements regarding k-fold cross-validation (select all that apply)?
a. k-fold CV involves randomly separating observations into groups (or folds).
b. k represents the number of groups into which the observations are partitioned.
c. k-1 folds are used for model training; the remaining fold is used to evaluate model performance.
d. k is generally set to the number of observations in the dataset.
e. k-fold CV is a more rigorous method of evaluating model performance as compared with the validation set approach.
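The mechanics of k-fold cross-validation referred to in item 10 can be sketched as an index generator: shuffle the observation indices, partition them into k folds, and hold out each fold once. The function `k_fold_indices`, the choice of k = 5, and the seed are assumptions made for illustration.

```python
import random


def k_fold_indices(n_observations, k, seed=0):
    """Yield (training_indices, held_out_indices) once for each of the k folds."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)           # random assignment to folds
    folds = [indices[i::k] for i in range(k)]      # k roughly equal-sized groups
    for i in range(k):
        held_out = folds[i]
        # All folds except fold i are combined for training.
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, held_out


splits = list(k_fold_indices(10, k=5, seed=1))
for training, held_out in splits:
    print(sorted(training), sorted(held_out))
```

In practice the model is refit on each training set, scored on the matching held-out fold, and the k scores are averaged.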
