Question:
- Know the methods of adding a third variable to a 2D scatter plot, whether categorical or numeric.
- When faced with a 'compressed' plot, know how to expand the visible information.
- Be able to interpret a missing value bar chart (what it shows and what it doesn't show).
- Be able to interpret and apply categorical variable encoding for linear regression.
- Know the strengths of Python libraries regarding exploratory vs. predictive models.
- Know how to interpret a linear regression explanatory model summary output with categorical variables.
- Understand how ridge regression or the lasso can affect the bias-variance tradeoff.
- Understand the Python code for RidgeCV and LassoCV and how they build models to select the best value of lambda (you may want to consult the scikit-learn API documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LassoCV.html?highlight=lassocv#sklearn.linear_model.LassoCV ; https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeCV.html?highlight=ridgecv#sklearn.linear_model.RidgeCV ).
- Be able to interpret PCA components from a scatter plot.
- Know how PCA can be used for tasks other than as a data processing stage for additional machine learning models.
- Know the conditions under which PCA will perform well or poorly, both in terms of the data and the eigenvalue decomposition.
- Understand how to interpret a box plot, line graph, scatter plot, and histogram regarding distributions, relationships, and extreme values.
- Understand the different relationships that can be interpreted from a scatter plot (namely correlation, linearity, heteroscedasticity, and extreme values).
- Understand model selection criteria, namely Adjusted R-squared, AIC, and BIC.
- Know the difference between the minimum number of samples required for fitting a model vs. the number of samples required for testing the fitted model.
- Understand the advantages and disadvantages of 10-fold cross validation vs. LOOCV.
- Know how to visually interpret overfitting and underfitting from an error plot.
- Be able to recognize the equations that describe different cross validation schemes.
- Understand the relationship between PCA components and the bias-variance tradeoff.
- Understand the linear relationship between the components in a PCA model.
- Know the data preprocessing that may be required before building a PCA model.
- Be able to interpret proportion of variance and cumulative variance from a table.
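For the RidgeCV and LassoCV item in the list above, here is a minimal sketch of how the two estimators pick a penalty strength by cross validation. The synthetic data, the alpha grid, and the variable names are all assumptions made for illustration. Note that scikit-learn calls the penalty `alpha`, which plays the role of lambda in most textbooks.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration):
# 200 samples, 20 features, only 5 of which carry signal.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

# Candidate penalty strengths (lambda in textbooks, alpha in scikit-learn).
alphas = np.logspace(-3, 3, 50)

# RidgeCV: with the default cv=None it uses an efficient form of
# leave-one-out cross validation to score every candidate alpha.
ridge = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas))
ridge.fit(X, y)
ridge_alpha = ridge[-1].alpha_

# LassoCV: fits a path of models over the alpha grid on each of the
# 10 folds and keeps the alpha with the lowest mean validation error.
lasso = make_pipeline(
    StandardScaler(), LassoCV(alphas=alphas, cv=10, max_iter=10000)
)
lasso.fit(X, y)
lasso_alpha = lasso[-1].alpha_

# The lasso can shrink coefficients exactly to zero; ridge only shrinks them.
n_nonzero = int(np.sum(lasso[-1].coef_ != 0))

print(f"RidgeCV selected alpha: {ridge_alpha:.4g}")
print(f"LassoCV selected alpha: {lasso_alpha:.4g}")
print(f"Non-zero lasso coefficients: {n_nonzero} of {X.shape[1]}")
```

A larger selected alpha means more shrinkage, which usually adds bias and reduces variance. Standardizing inside the pipeline matters because both penalties depend on the scale of each feature.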
Answer as many as possible. If there is not enough time to address every item, I am OK with only some of them being answered. Thanks!
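For the PCA items in the list above (preprocessing, proportion of variance, cumulative variance, and the relationship between components), the sketch below may be a useful starting point. It assumes the iris dataset bundled with scikit-learn purely as example data. The table layout is illustrative and is not taken from any particular course material.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Example data (an assumption for illustration): 150 samples, 4 numeric features.
X = load_iris().data

# PCA is sensitive to feature scale, so center and scale each feature first.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA()
scores = pca.fit_transform(X_scaled)

# Proportion of variance explained by each component, and the running total.
ratios = pca.explained_variance_ratio_
cumulative = np.cumsum(ratios)

print("Component  Proportion  Cumulative")
for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i:<8} {r:>10.4f}  {c:>10.4f}")

# The component directions are orthonormal, so this product should be
# (numerically) the identity matrix.
gram = pca.components_ @ pca.components_.T

# The component scores are uncorrelated: their covariance matrix is diagonal.
score_cov = np.cov(scores, rowvar=False)
```

Reading the table: the cumulative column shows how many components are needed to retain a chosen share of the total variance, which is one common way to decide how many components to keep.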
