Problem 1: effect of sample size

Generate training datasets with nObs=25, 100 and 500 observations such that two variables are associated with the outcome as parameterized above and three are not associated, and the average difference between the two classes is the same as above (i.e., in the notation from the above code, nClassVars=2, nNoiseVars=3 and deltaClass=1). Obtain random forest, LDA and KNN test error rates on a much larger test dataset (say, 10K observations, for greater stability of the results) simulated from the same model. Describe the differences between the methods and across the sample sizes used here.

The example below illustrates the main ideas on a 3D dataset with two of the three attributes associated with the outcome:

# How many observations:
nObs <- 1000
# How many predictors are associated with outcome:
nClassVars <- 2
# How many predictors are not:
nNoiseVars <- 1
# To modulate average difference between two classes' predictor values:
deltaClass <- 1

# Simulate training and test datasets with an interaction
# between attribute levels associated with the outcome:
xyzTrain <- matrix(rnorm(nObs*(nClassVars+nNoiseVars)),nrow=nObs,ncol=nClassVars+nNoiseVars)
xyzTest <- matrix(rnorm(10*nObs*(nClassVars+nNoiseVars)),nrow=10*nObs,ncol=nClassVars+nNoiseVars)
classTrain <- 1
classTest <- 1
for ( iTmp in 1:nClassVars ) {
  # shift each outcome-associated predictor by +/- deltaClass at random;
  # the class is determined by the sign of the product of the shifts:
  deltaTrain <- sample(deltaClass*c(-1,1),nObs,replace=TRUE)
  xyzTrain[,iTmp] <- xyzTrain[,iTmp] + deltaTrain
  classTrain <- classTrain * deltaTrain
  deltaTest <- sample(deltaClass*c(-1,1),10*nObs,replace=TRUE)
  xyzTest[,iTmp] <- xyzTest[,iTmp] + deltaTest
  classTest <- classTest * deltaTest
}
classTrain <- factor(classTrain > 0)
# convert the test outcome the same way so it is comparable to predictions:
classTest <- factor(classTest > 0)
table(classTrain)

# plot resulting attribute levels colored by outcome:
pairs(xyzTrain,col=as.numeric(classTrain))
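One possible way to set up the comparison asked for in Problem 1 is sketched below. It is only an illustration, not the expected solution: the helpers `simulateData` and `testErrors` are hypothetical names introduced here, `k = 5` for KNN is an arbitrary choice, and the `randomForest` package is assumed to be installed (`MASS` and `class` ship with standard R distributions).

```r
library(MASS)          # lda()
library(class)         # knn()
library(randomForest)  # randomForest(); assumed to be installed

# Hypothetical helper: simulate predictors and a two-level outcome
# following the same scheme as the example above.
simulateData <- function(nObs, nClassVars = 2, nNoiseVars = 3, deltaClass = 1) {
  nVars <- nClassVars + nNoiseVars
  x <- matrix(rnorm(nObs * nVars), nrow = nObs, ncol = nVars)
  cls <- rep(1, nObs)
  for (iTmp in seq_len(nClassVars)) {
    delta <- sample(deltaClass * c(-1, 1), nObs, replace = TRUE)
    x[, iTmp] <- x[, iTmp] + delta
    cls <- cls * delta
  }
  colnames(x) <- paste0("X", seq_len(nVars))
  # fix the level set so predictions and truth are always comparable
  list(x = x, y = factor(cls > 0, levels = c(FALSE, TRUE)))
}

# Hypothetical helper: fit each method on the training data and
# return its misclassification rate on the test data.
testErrors <- function(train, test, k = 5) {
  dfTrain <- data.frame(train$x, y = train$y)
  dfTest <- data.frame(test$x)
  ldaPred <- predict(lda(y ~ ., data = dfTrain), newdata = dfTest)$class
  knnPred <- knn(train$x, test$x, cl = train$y, k = k)
  rfPred <- predict(randomForest(train$x, train$y), newdata = test$x)
  c(LDA = mean(ldaPred != test$y),
    KNN = mean(knnPred != test$y),
    RF = mean(rfPred != test$y))
}

set.seed(1)                      # arbitrary seed, for reproducibility only
sampleSizes <- c(25, 100, 500)
testDat <- simulateData(10000)   # one large test set shared by all fits
errRates <- t(sapply(sampleSizes,
                     function(n) testErrors(simulateData(n), testDat)))
rownames(errRates) <- paste0("nObs=", sampleSizes)
print(round(errRates, 3))
```

When reading the resulting table, note that the outcome depends on the sign of the product of the two shifts, so both classes have the same mean for every predictor. A linear boundary such as LDA's therefore should not be expected to do much better than chance at any sample size, whereas KNN and random forest can in principle pick up the interaction as nObs grows. A single simulated training set per sample size is noisy, particularly at nObs=25, so repeating the simulation several times and averaging would give a more stable picture.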

