# Question

Periodically, software engineers must provide estimates of their effort in developing new software. In the Journal of Empirical Software Engineering (Vol. 9, 2004), multiple regression was used to predict the accuracy of these effort estimates.

The dependent variable, defined as the relative error in estimating effort,

y = (Actual effort-Estimated effort)/(Actual effort)

Was determined for each in a sample of n = 49 software development tasks. Eight independent variables were evaluated as potential predictors of relative error using stepwise regression. Each of these was formulated as a dummy variable, as shown in the table.

Company role of estimator: x1 = 1 if developer, 0 if project leader

Task complexity: x2 = 1 if low, 0 if medium/high

Contract type: x3 = 1 if fixed price, 0 if hourly rate

Customer importance: x4 = 1 if high, 0 if low/medium

Customer priority: x5 = 1 if time of delivery, 0 if cost or quality

Level of knowledge: x6 = 1 if high, 0 if low/medium

Participation: x7 = 1 if estimator participates in work, 0 if not

Previous accuracy: x8 = 1 if more than 20% accurate, 0 if less than 20% accurate

a. In step 1 of the stepwise regression, how many different one-variable models are fitted to the data?

b. In step 1, the variable x1 is selected as the “best” one-variable predictor. How is this determined?

c. In step 2 of the stepwise regression, how many different two-variable models (where x 1 is one of the variables) are fitted to the data?

d. The only two variables selected for entry into the stepwise regression model were x1 and x8. The step wise regression yielded the following prediction equation:

y = .12 - .28x1 + .27x8

Give a practical interpretation of the b estimates multiplied by x1 and x8.

e. Why should a researcher be wary of using the model, part d, as the final model for predicting effort (y)?

The dependent variable, defined as the relative error in estimating effort,

y = (Actual effort-Estimated effort)/(Actual effort)

Was determined for each in a sample of n = 49 software development tasks. Eight independent variables were evaluated as potential predictors of relative error using stepwise regression. Each of these was formulated as a dummy variable, as shown in the table.

Company role of estimator: x1 = 1 if developer, 0 if project leader

Task complexity: x2 = 1 if low, 0 if medium/high

Contract type: x3 = 1 if fixed price, 0 if hourly rate

Customer importance: x4 = 1 if high, 0 if low/medium

Customer priority: x5 = 1 if time of delivery, 0 if cost or quality

Level of knowledge: x6 = 1 if high, 0 if low/medium

Participation: x7 = 1 if estimator participates in work, 0 if not

Previous accuracy: x8 = 1 if more than 20% accurate, 0 if less than 20% accurate

a. In step 1 of the stepwise regression, how many different one-variable models are fitted to the data?

b. In step 1, the variable x1 is selected as the “best” one-variable predictor. How is this determined?

c. In step 2 of the stepwise regression, how many different two-variable models (where x 1 is one of the variables) are fitted to the data?

d. The only two variables selected for entry into the stepwise regression model were x1 and x8. The step wise regression yielded the following prediction equation:

y = .12 - .28x1 + .27x8

Give a practical interpretation of the b estimates multiplied by x1 and x8.

e. Why should a researcher be wary of using the model, part d, as the final model for predicting effort (y)?

## Answer to relevant Questions

Industrial engineers at the University of Florida used regression modeling as a tool to reduce the time and cost associated with developing new metallic alloys (Modeling and Simulation in Materials Science and Engineering , ...The Journal of Organizational Culture, Communications and Conflict (July 2007) published a study on women in upper management positions at U.S. firms. Monthly data (n = 252 months) were collected for several variables in an ...a. Write a first-order model relating E (y) to two quantitative independent variables x1 and x2. b. Write a complete second-order model. Fluorocarbon plasmas are used in the production of semiconductor materials. In the Journal of Applied Physics (Dec. 1, 2000), electrical engineers at Nagoya University (Japan) studied the kinetics of fluorocarbon plasmas in ...A multinomial experiment with k = 4 cells and n = 400 produced the data shown in the accompanying table. Do these data provide sufficient evidence to contradict the null hypothesis that p1 = .2, p2 = .4, p3 = .1, and p4 = ...Post your question

0