Business Analytics Communicating With Numbers 1st Edition Sanjiv Jaggia, Alison Kelly, Kevin Lertwachara, Leida Chen - Solutions
Kenzi Williams is the Director of Human Resources of a high-tech company. In order to manage the high turnover rate of the IT professionals in her company, she has developed a predictive model for identifying software engineers who are more likely to leave the company within the first year. If the
Answer the following questions using the accompanying data set that lists the actual class memberships and predicted Class 0 (nontarget class) probabilities for 10 observations. a. Compute the misclassification rate, accuracy rate, sensitivity, precision, and specificity using the cutoff value of
Monstermash, an online game app development company, has built a predictive model to identify gamers who are likely to make in-app purchases. The model classifies gamers who are likely to make in-app purchases in Class 1 and gamers who are unlikely to make in-app purchases in Class 0. Applying the
Use the data in the preceding exercise. a. What is the lift that the classification model provides if 5 observations are selected by the model compared to randomly selecting 5 observations? b. What is the lift that the classification model provides if 8 observations are selected by the model compared
A real estate company has built two predictive models for estimating the selling price of a house. Using a small test data set of 10 observations, it tries to assess how the prediction models would perform on a new data set. The following table lists a portion of the actual prices and predicted
Mary Grant, a marketing manager of an online retailer, has built two predictive models for estimating the annual spending of new customers. Applying the models to the 100 observations in the validation data set generates a table that lists the customers’ actual spending and predicted spending. A
The following table displays the weights for computing the principal components and the standardized data (z-scores) for three observations. a. Compute the first principal component score for observation 1. b. Compute the second principal component score for observation 2. c. Compute the third
Perform principal component analysis on the accompanying data set. a. Use the data with the covariance matrix to compute the first two principal components. What are the weights for computing the first principal component scores? What percent of the total variability is accounted for by the first
Perform principal component analysis on the accompanying data set. a. Use the data with the covariance matrix and choose Smallest # components explaining at least: 85% of variance. How many principal components were created? What percent of total variance is accounted for by the calculated principal
The following Analytic Solver results were generated by a principal component analysis. a. How many original variables are in the data set? b. What percent of variance does the third principal component account for? c. How many principal components need to be retained in order to account for at least
Perform principal component analysis on the accompanying data set. a. Use the data with the covariance matrix to compute the seven principal components. How many principal components do you need to retain in order to account for 80% of the total variance? What are the weights for computing the first
The following R results were generated by a principal component analysis. a. How many original variables are in the data set? b. What percent of variance does the third principal component account for? c. How many principal components need to be retained in order to account for at least 85% of the
Standardize the accompanying data set, and then perform principal component analysis. a. What percent of total variance is accounted for by the first three principal components? b. What are the weights for computing the first principal component scores? c. What is the first principal component score
Standardize the accompanying data set, and then perform principal component analysis. a. How many principal components do you need to account for at least 80% of the total variance? b. What are the weights for computing the first principal component scores? c. What is the second principal component
The accompanying data set contains some economic indicators for 11 African countries collected by the World Bank in 2015. The economic indicators include annual % growth in agriculture (Agriculture), annual % growth in exports (Exports), annual % growth in final consumption (Final consumption),
Perform principal component analysis on the accompanying data set. a. Use standardized data to compute the principal components. Select the number of principal components that account for at least 80% of the total variance. How many principal components are displayed by Analytic Solver? b. What are
Beza Gordon-Smith is a high school senior in northern California who loves watching football. She keeps track of football results and statistics of the quarterbacks of each high school team. The accompanying table shows a portion of the data that Beza has recorded, with the following variables: the
Ben Derby is a scout for a college baseball team. He attends many high school games and practices each week in order to evaluate potential players to recruit during each college recruitment season. He also keeps detailed records about each prospective player. His college team is in particular need
Internet addiction has been found to be a widespread problem among university students. A small liberal arts college in Colorado conducted a survey of Internet addiction among its students using the Internet Addiction Test (IAT) developed by Dr. Kimberly Young. The IAT contains 20 questions that
Jenny, a first-year Nutrition Studies student at Hillside College, is conducting research on various common food items and their nutritional facts. She compiled a data set that contains the nutrition facts on 30 common food items. Jenny feels that many of the nutritional facts of these food items
Since 2014, the United Nations has conducted annual studies that measure the level of happiness among its member countries. Experts in social science and psychology are commissioned to collect relevant data and define measurements related to happiness. Happiness measurements are based on survey
In tennis, how well a player serves and returns serves often determines the outcome of the game. Coaches and players track these numbers and work tirelessly to make improvements. The accompanying table shows a sample of data that includes local youth tennis players. The relevant variables include the
Investors usually consider a variety of information to make investment decisions. The accompanying table displays a sample of large publicly traded corporations and their financial information. Relevant information includes stock price (Price), dividend as a percentage of share price (Dividend),
Every year, millions of high school students apply and vie for acceptance to a college of their choice. For many students and their parents, this requires years of preparation, especially for those wishing to attend a top-ranked college. In high schools, students usually work with college advisors
The accompanying file contains 60 observations with the binary target variable y along with the predictor variables x1 and x2. a. Perform KNN analysis. What is the optimal value of k? b. Report the accuracy, specificity, sensitivity, and precision rates for the test data set (for Analytic
The accompanying file contains 111 observations with the binary target variable y along with the predictor variables x1, x2, x3, and x4. a. Perform KNN analysis on the data set. What is the optimal value of k? b. Report the accuracy, specificity, sensitivity, and precision rates for the
The accompanying file contains 80 observations with the binary target variable y along with the predictor variables x1, x2, x3, and x4. a. Perform KNN analysis on the data set. What is the optimal value of k? b. What is the misclassification rate for the optimal k? c. Report the
The accompanying file contains 200 observations with the binary target variable y along with the predictor variables x1, x2, x3, x4, and x5. a. Perform KNN analysis on the data set. What is the optimal value of k? b. What is the misclassification rate for the optimal k? c. Report the
The accompanying file contains 1,000 observations with the binary target variable y along with the predictor variables x1, x2, and x3. a. Perform KNN analysis on the data set. What is the optimal value of k? b. What is the misclassification rate for the optimal k? c. Report the
The accompanying file contains 2,000 observations with the binary target variable y along with the predictor variables x1, x2, and x3. a. Perform KNN analysis on the data set. What is the optimal value of k? b. Report the accuracy, specificity, sensitivity, and precision rates and the AUC value
The accompanying file contains 400 observations with the binary response variable y along with the predictor variables x1, x2, x3, and x4. a. Perform KNN analysis on the data set. What is the optimal value of k? b. Report the accuracy, specificity, sensitivity, and precision rates and the AUC value
A social media marketing company is conducting consumer research to see how the income level and age might correspond to whether or not consumers respond positively to a social media campaign. Aliyah Turner, a new college intern, is assigned to collect data from the past marketing campaigns. She
Universities often rely on a high school student’s grade point average (GPA) and scores on the SAT or ACT for the college admission decisions. Consider the data for 120 applicants on college admission (Admit equals 1 if admitted, 0 otherwise) along with the student’s GPA and SAT scores. A
Law enforcement agencies monitor social media sites on a regular basis, as a way to identify and assess potential crimes and terrorism activities. For example, certain keywords on Facebook pages are tracked, and the data are compiled into a data mining model to determine whether or not the Facebook
The Chartered Financial Analyst (CFA) designation is the de facto professional certification for the financial industry. Employers encourage their prospective employees to complete the CFA exam. Daniella Campos, an HR manager at SolidRock Investment, is reviewing 10 job applications. Given the low
Daniel Lara, a human resources manager at a large tech consulting firm, has been reading about using analytics to predict the success of new employees. With the fast-changing nature of the tech industry, some employees have had difficulties staying current in their field and have missed the
Peter Derby works as a cyber security analyst at a private equity firm. His colleagues at the firm have been inundated by a large number of spam e-mails. Peter has been asked to implement a spam detection system on the company’s e-mail server. He reviewed a sample of 500 spam and legitimate
Online retailers often use a recommendation system to suggest new products to consumers. Consumers are compared to others with similar characteristics such as past purchases, age, income, and education level. A data set, such as the one shown in the accompanying table, is often used as part of a
In recent years, medical research has incorporated the use of data analytics to find new ways to detect heart disease in its early stage. Medical doctors are particularly interested in accurately identifying high-risk patients so that preventive care and intervention can be administered in a timely
College admission is a competitive process where, among other things, the SAT and high school GPA scores of students are evaluated to make an admission decision. The accompanying data set contains the admission decision (Decision; Admit/Deny), SAT score, Female (Yes/No), and high school GPA (HSGPA)
The following table contains three variables and five observations with some missing values. a. Handle the missing values using the omission strategy. How many observations remain in the data set and have complete cases? b. Handle the missing values using the simple mean imputation strategy. How many
The accompanying data set contains four variables, x1, x2, x3, and x4. a. Subset the data set to include only observations that have a date on or after May 1, 1975, for x3. How many observations are in the subset data? b. Split the data set based on the binary values for x4. What are the average
The accompanying data set contains five variables, x1, x2, x3, x4, and x5. There are missing values in the data set. a. Which variables have missing values? b. Which observations have missing values? c. How many missing values are in the data set? d. Handle the missing values using the omission
The accompanying data set contains five variables, x1, x2, x3, x4, and x5. There are missing values in the data set. Handle the missing values using the simple mean imputation strategy for numerical variables and the predominant category strategy for categorical variables. a. How many missing values
The accompanying data set contains five variables, x1, x2, x3, x4, and x5. a. Are there missing values for x1? If so, impute the missing values using the mean value of x1. After imputation, what is the mean of x1? b. Are there missing values for x2? If so, impute the missing values using the mean
The accompanying data set contains five variables, x1, x2, x3, x4, and x5. There are missing values in the data set. a. Which variables have missing values? b. Which observations have missing values? c. How many missing values are in the data set? d. Handle the missing values using the omission
The accompanying data set contains four variables, x1, x2, x3, and x4. There are missing values in the data set. a. Subset the data set to include only x1, x2, and x3. b. Which variables have missing values? c. Which observations have missing values? d. How many missing values are in the data set? e.
The accompanying data set contains seven variables, x1, x2, x3, x4, x5, x6, and x7. There are missing values in the data set. a. Remove variables x2, x6, and x7 from the data set. Which of the remaining variables have missing values? b. Which observations have missing values? c. How many missing
The US Census Bureau records the population for the 50 states each year. The accompanying table shows a portion of these data for the years 2010 to 2018. a. Create two subsets of the state population data: one with 2018 population greater than or equal to 5 million and one with 2018 population less
Jerry Stevenson is the manager of a travel agency. He wants to build a model that can predict whether or not a customer will travel within the next year. He has compiled a data set that contains the following variables: whether the individual has a college degree (College), whether the individual
Refer to the previous exercise for a description of the problem and data set. a. Based on his past experience, Jerry knows that whether the individual has a credit card or not has nothing to do with his or her travel plans and would like to remove this variable. Remove the variable CreditCard from
Denise Lau is an avid football fan and religiously follows every game of the National Football League. During the 2017 season, she meticulously keeps a record of how each quarterback has played throughout the season. Denise is making a presentation at the local NFL fan club about these
Refer to the previous exercise for a description of the data set. Denise feels that, for her presentation, it would remove some biases if the player names and team names are suppressed. Remove these variables from the data set. a. Denise also wants to remove outlier cases where the players have less
Ian Stevens is a human resource analyst working for the city of Seattle. He is performing a compensation analysis of city employees. The accompanying data set contains three variables: Department, Job Title, and Hourly Rate (in $). A few hourly rates are missing in the data. a. Split the data
Refer to the previous exercise for a description of the problem and data set. The financial analyst wants to find out if there are any missing values in the data set. a. Are there any missing values in the data set? If there are, which variables have missing values? Which observations have
Investors usually consider a variety of information to make investment decisions. The accompanying table displays a sample of large publicly traded corporations and their financial information. Relevant information includes stock price (Price), dividend as a percentage of share price (Dividend),
The accompanying table contains a portion of data from the National Longitudinal Survey (NLS), which follows over 12,000 individuals in the United States over time. Variables in this analysis include the following information on individuals: Urban (1 if lives in urban area, 0 otherwise), Siblings
New Age Solar sells and installs solar panels for residential homes. The company’s sales representatives contact and pay a personal visit to potential customers to present the benefits of installing solar panels. This high-touch approach works well as the customers feel that they receive personal
Being able to predict machine failures before they happen can save millions of dollars for manufacturing companies. Manufacturers want to be able to perform preventive maintenance or repairs in advance to minimize machine downtime and often install electronic sensors to monitor the machines and
The accompanying data set contains three predictor variables (x1, x2, and x3) and the target variable (y). Partition the data in the Exercise_9.19_Data worksheet to develop a naïve Bayes classification model where “Y” denotes the positive or success class for y. Score the five new observations
The accompanying data set contains three predictor variables (x1, x2, and x3) and the target variable (y). Partition the data in the Exercise_9.18_Data worksheet to develop a naïve Bayes classification model where “Yes” denotes the positive or success class for y. Score the 10 new observations
The accompanying data set contains three predictor variables (x1, x2, and x3) and the target variable (y). Partition the data to develop a naïve Bayes classification model where “1” denotes the positive or success class for y. a. Report the accuracy, sensitivity, and specificity rates for
The accompanying data set contains three predictor variables (x1, x2, and x3) and the target variable (y). Partition the data to develop a naïve Bayes classification model where “1” denotes the positive or success class for y. a. Report the accuracy, sensitivity, specificity, and
The accompanying data set contains two predictor variables (x1 and x2) and the target variable (y). Partition the data to develop a naïve Bayes classification model where “1” denotes the positive or success class for y. a. Report the accuracy, sensitivity, specificity, and precision rates
The accompanying data set contains three predictor variables (x1, x2, and x3) and the target variable (y). a. Bin predictor variables x1, x2, and x3. For Analytic Solver, choose the Equal count option and three bins for each of the three variables. For R, bin x1 into [0, 6), [6, 14), and [14,
The accompanying data set contains three predictor variables (x1, x2, and x3) and the target variable (y). a. Bin predictor variables x1 and x2. For Analytic Solver, choose the Equal count option and three bins for each of the three variables. For R, bin x1 into [0, 60), [60, 400), and [400,
The accompanying data set contains three predictor variables (x1, x2, and x3) and the target variable (y). a. Bin predictor variables x1, x2, and x3. For Analytic Solver, choose the Equal interval option and 2 bins for each of the three variables. For R, bin x1 into [0, 125) and [125, 250); x2
The accompanying data set contains four predictor variables (x1, x2, x3, and x4) and the target variable (y). a. Bin predictor variables x1, x2, x3, and x4. For Analytic Solver, choose the Equal interval option and 2 bins for each of the four variables. For R, bin x1 into [0, 40000) and
Every year, hundreds of thousands of international students apply to graduate programs in the United States. Two of the most important admissions criteria are undergraduate GPAs and TOEFL scores. An English language preparation school in Santiago, Chile, wants to examine the acceptance records of
Admission to medical schools in the United States is highly competitive. The acceptance rate to the top medical schools could be as low as 2% or 3%. With such a low acceptance rate, medical school admissions consulting has become a growing business in many cities. In order to better serve his
Jerry Stevenson is the manager of a travel agency. He wants to build a model that can predict customers’ annual spending on travel products. He has compiled a data set that contains the following variables: whether the individual has a college degree (College), whether the individual has credit
An online retailer is offering a new line of running shoes. The retailer plans to send out an e-mail with a discount offer to some of its existing customers and wants to know if it can use data mining analysis to predict whether or not a customer might respond to its e-mail offer. The retailer
A home improvement retail store is offering its customers store-branded credit cards that come with a deep discount when used to purchase in-store home improvement products. To maintain the profitability of this marketing campaign, the store manager would like to make these offers only to the
Forbes magazine published an article that studied career accomplishments and factors that might contribute to career success (August 30, 2018). It turns out that career success has less to do with talents and is not necessarily influenced by test scores or IQ scores. Rather, “grit,” or a
Predicting whether or not an entering freshman student will drop out of college has been a challenge for many higher education institutions. Nelson Touré, a senior student success adviser at an Ivy League university, has been asked to investigate possible indicators that might allow the university
A community center is launching a campaign to recruit local residents to help maintain a protected nature preserve area that encompasses extensive walking trails, bird watching blinds, wild flowers, and animals. The community center wants to send out a mail invitation to selected residents and
Nora Jackson owns a number of vacation homes on a beach. She works with a consortium of rental home owners to gather a data set to build a classification model to predict the likelihood of potential customers renting a beachfront home during holidays. A portion of the data set is shown in the
Insurance companies use a number of factors to help determine the premium amount for car insurance coverage. Discounts or a lower premium may be given based on factors including credit scores, history of at-fault accidents, age, and sex. Consider the insurance discount data set from 200 existing
A mobile gaming company wants to study a group of its existing customers about their in-game purchases. A data set, a portion of which is shown in the accompanying table, is extracted and includes how old the customer is (Age), Sex (1 if female, 0 otherwise), the amount of weekly play time in hours
Michelle McGrath is a college student working to complete an undergraduate research project to fulfill her psychology degree requirements. She is interested in how physical and behavioral factors might be used to predict an individual’s risk of having depression. After receiving an approval from
Credit card fraud is becoming a serious problem for the financial industry and can pose a considerable cost to banks, credit card issuers, and consumers. Fraud detection using data mining techniques has become an indispensable tool for banks and credit card companies to combat fraudulent
Refer to Exercise 9.16 for the description of a solar panel company called New Age Solar and the Solar_Data worksheet. a. Bin the Age and Income variables in the Solar_Data worksheet as follows. For Analytic Solver, choose the Equal count option and two bins for each of the two variables. For
As millions of people in the U.S. are crippled by student loan debt and high unemployment, policymakers are raising the question of whether college is even a good investment. Richard Clancy, a sociology graduate student, is interested in developing a model for predicting an individual’s income
The classification tree below relates type of wine (A, B, or C) to alcohol content, flavonoids, malic acid, and magnesium. Classify each of the following wines of unknown class. a. Wine with alcohol = 13.3, flavonoids = 1.95, malic acid = 1.03, and magnesium = 114 b. Wine with alcohol = 11.8,
The accompanying data set contains two predictor variables, age and income, and one binary target variable, newspaper subscription (subscribe), indicating whether or not the person subscribes to a newspaper. A media company wants to create a decision tree for predicting whether or not a person will
The accompanying data set contains two predictor variables, average annual number of sunny days (days) and average annual precipitation (precipitation), and one numeric target variable, average annual crop yield in bushels per acre (yield). An agricultural researcher wants to create a decision tree
Refer to the previous exercise for a description of the problem and data set. Build a default classification tree to predict whether a customer has plans to travel within the next year. Display the default classification tree. a. How many leaf nodes are in the tree? What are the predictor
After a data set was partitioned, the first partition contains 43 cases that belong to Class 1 and 12 cases that belong to Class 0, and the second partition contains 24 cases that belong to Class 1 and 121 cases that belong to Class 0. a. Compute the Gini impurity index for the root
Refer to the previous exercise for a description of the problem and data set. Create a classification tree model for predicting whether the community member is likely to enroll in summer courses (ContinueEdu). Select the best-pruned tree for scoring and display the full-grown, best-pruned, and
A data set was partitioned using the split value of 45.5 for age. The age < 45.5 partition contains 22 patients with a diabetes diagnosis and 178 patients without a diabetes diagnosis, and the age ≥ 45.5 partition contains 48 patients with a diabetes diagnosis and 152 patients without a
Jerry Stevenson is the manager of a travel agency. He wants to build a model that can predict whether or not a customer will travel within the next year. He has compiled a data set that contains the following variables: whether the individual has a college degree (College), whether the individual
The following data set in the Church_Data worksheet is used to classify individuals as likely or unlikely to attend church using five predictor variables: years of education (Educ), annual income (Income in $), age, sex (F = female, M = male), and marital status (Married, Y = yes, N = no). The
Use the accompanying data set to answer the following questions. a. Which split value for age would best separate the newspaper subscribers from nonsubscribers based on the Gini impurity index? b. Which split value for income would best separate the newspaper subscribers from
Samantha Brown is Director of Continuing Education of a major university. The Continuing Education department offers a wide range of five-week courses to the community during the summer. Samantha would like to find out which community members are more likely to enroll in these summer courses. She
Sunnyville Bank wants to identify customers who may be interested in its new mobile banking app. The worksheet called Mobile_Banking_Data contains 500 customer records collected from a previous marketing campaign for the bank’s mobile banking app. Each observation in the data set contains the
The accompanying data set in the Exercise_10.7_Data worksheet contains four predictor variables (x1 to x4) and one binary target variable (y). Select the best-pruned tree for scoring and display the full-grown, best-pruned, and minimum error trees. a. What is the minimum error in the prune log for
Refer to the previous exercise for a description of the problem and data set. Build a default classification tree to predict whether a customer will download the mobile banking app. Display the default classification tree. a. How many leaf nodes are in the tree? What are the predictor variable
The accompanying data set in the Exercise_10.8_Data worksheet contains four predictor variables (x1 to x4) and one binary target variable (y). Select the best-pruned tree for scoring and display the full-grown, best-pruned, and minimum error trees. a. What is the minimum error in the prune log
The accompanying data set contains five predictor variables (x1 to x5) and one binary target variable (y). Follow the instructions below to create classification trees using the Exercise_10.9_Data worksheet. a. Use the rpart function to build a default classification tree. Display the default
Dereck Anderson is an institutional researcher at a major university. The university has set a goal to increase the number of students who graduate within four years by 20% in five years. Dereck is asked by his boss to create a model that would flag any student who has a high likelihood of not
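Several of the exercises above ask for the Gini impurity index of a partitioned data set. As a rough illustration only, the sketch below uses the counts from the exercise whose first partition holds 43 Class 1 and 12 Class 0 cases and whose second partition holds 24 Class 1 and 121 Class 0 cases. It assumes the common definition, one minus the sum of squared class proportions, and a size-weighted average for the split; the function names gini and weighted_gini are hypothetical helpers, not taken from the textbook, Analytic Solver, or R.

```python
from typing import Sequence


def gini(counts: Sequence[int]) -> float:
    """Gini impurity for one node, given the number of cases in each class.

    Assumes the definition 1 - sum(p_k ** 2), where p_k is the share of class k.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("a node must contain at least one case")
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_gini(partitions: Sequence[Sequence[int]]) -> float:
    """Impurity of a split: each partition's Gini weighted by its share of cases."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * gini(p) for p in partitions)


# Counts are (Class 1, Class 0), as stated in the exercise text.
first = (43, 12)
second = (24, 121)

# The root node is the data before the split, so its counts are the column sums.
root = tuple(a + b for a, b in zip(first, second))

root_gini = gini(root)
split_gini = weighted_gini([first, second])

print(f"root: {root_gini:.4f}")
print(f"first partition: {gini(first):.4f}")
print(f"second partition: {gini(second):.4f}")
print(f"split (weighted): {split_gini:.4f}")
```

A split is usually judged by how far the weighted value falls below the root value; the same two helpers apply to the age 45.5 diabetes exercise by swapping in its counts.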