(a) You are given a data set on cancer detection. After building a classification model which...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
(a) You are given a data set on cancer detection. After building a classification model which achieves an accuracy of 90%, would you be satisfied with your model performance? What can you do about it? (b) In the context of k-NN classifier for multi-class classification, consider the case where k = 3 and the three nearest neighbours of a query have three different class labels. How would you assign the class label to the query example in this case? (c) In unsupervised learning, if a ground truth about a dataset is unknown, how can we determine the most useful number of clusters to be? (d) For a supervised classification problem, how can you determine which features are the most important? (e) Your machine learning application requires that the client be provided an explanation for the learning decision. Assuming that a a deep learning model and a decision tree model achieve similar accuracy for your task, which one would you prefer to use and why? (f) k-NN and kmeans clustering both rely crucially on the distance measure used. What is the difference between these two learning techniques? (g) A company has built a classifier that gets 100% accuracy on training data. When they deployed this model on client side it has been found that the model is highly inaccurate. What might have gone wrong? (h) Sometimes when building a machine learning model we might prefer to have fewer features rather than many feature. Give three reasons why this might be the case. (i) After spending several hours, you are anxious to build a high accuracy model. You built 5 boosting models, but neither of these models performed better than benchmark score. Finally, you decided to combine those models as ensemble models are known to provide high accuracy. If your accuracy still doesn't improve, what could potentially be wrong with your ensemble model? (i) For a given feature, the minimum and maximum value in the training data is 100 and 1000, respectively. The minimum and the maximum value of the feature in the test data is 50 and 950, respectively. What is the correct way to do min-max normalisation of this feature for a test instance with value in order to have a fair validation? A. (100)/(1000 - 100) B. (250)/(950-50) C. (50)/(1000 - 50) D. (100)/(1000-50) [2] [2] [2] [2] [2] [2] [2] [2] [2] [2] (a) You are given a data set on cancer detection. After building a classification model which achieves an accuracy of 90%, would you be satisfied with your model performance? What can you do about it? (b) In the context of k-NN classifier for multi-class classification, consider the case where k = 3 and the three nearest neighbours of a query have three different class labels. How would you assign the class label to the query example in this case? (c) In unsupervised learning, if a ground truth about a dataset is unknown, how can we determine the most useful number of clusters to be? (d) For a supervised classification problem, how can you determine which features are the most important? (e) Your machine learning application requires that the client be provided an explanation for the learning decision. Assuming that a a deep learning model and a decision tree model achieve similar accuracy for your task, which one would you prefer to use and why? (f) k-NN and kmeans clustering both rely crucially on the distance measure used. What is the difference between these two learning techniques? (g) A company has built a classifier that gets 100% accuracy on training data. When they deployed this model on client side it has been found that the model is highly inaccurate. What might have gone wrong? (h) Sometimes when building a machine learning model we might prefer to have fewer features rather than many feature. Give three reasons why this might be the case. (i) After spending several hours, you are anxious to build a high accuracy model. You built 5 boosting models, but neither of these models performed better than benchmark score. Finally, you decided to combine those models as ensemble models are known to provide high accuracy. If your accuracy still doesn't improve, what could potentially be wrong with your ensemble model? (i) For a given feature, the minimum and maximum value in the training data is 100 and 1000, respectively. The minimum and the maximum value of the feature in the test data is 50 and 950, respectively. What is the correct way to do min-max normalisation of this feature for a test instance with value in order to have a fair validation? A. (100)/(1000 - 100) B. (250)/(950-50) C. (50)/(1000 - 50) D. (100)/(1000-50) [2] [2] [2] [2] [2] [2] [2] [2] [2] [2]
Expert Answer:
Answer rating: 100% (QA)
a Achieving an accuracy of 90 in a cancer detection model is a good starting point but it might not be sufficient depending on the specific requirements and consequences of misclassification In medica... View the full answer
Related Book For
Business Intelligence And Analytics Systems For Decision Support
ISBN: 9781292009209
10th Global Edition
Authors: Efraim Turban, Ramesh Sharda, Dursun Delen, Pearson Education Limited, Dennis G. Zill
Posted Date:
Students also viewed these programming questions
-
PART A - MATERIAL PROPERTIES IN UNIAXIAL TENSION Objective To demonstrate the behaviour of brittle and ductile materials in uniaxial tension. Introduction to the Tension Test The term tension test is...
-
Planning is one of the most important management functions in any business. A front office managers first step in planning should involve determine the departments goals. Planning also includes...
-
Managing Scope Changes Case Study Scope changes on a project can occur regardless of how well the project is planned or executed. Scope changes can be the result of something that was omitted during...
-
Write a method to take an integer array as a parameter and return how many elements having the same digits in each element in the array for example if we have array 1 1 , 4 4 , 1 4 , 2 3 , 1 2 , 5 6...
-
Before the 2013 season, the Los Angeles Angels signed outfielder Josh Hamilton to a contract that would pay him an immediate $10 million signing bonus and the following amounts: $15 million for the...
-
Write (+ and (- by the appropriate atoms and draw a dipole moment vector for any of the following molecules that are polar: (a) HF (b) IBr (c) Br2 (d) F2
-
Motion Auto has the following information for the years ending December 31,2010 and 2009: Requirements 1. Compute the rate of inventory turnover for Motion Auto for the years ended December 31, 2010...
-
Convert the temperatures in parts (a) and (b) and temperature intervals in parts (c) and (d): (a) T = 85F to R, C, K (b) T = 10C to K, F, R (c) T = 85C to K, F, R (d) T = 150R to F, C, K
-
Jasmine Company manufactures both pesticide and liquid fertilizer, with each product manufac- tured in separate departments. Three support departments support the production departments: Power,...
-
Construct annual incremental operating cash flow statements. Shrieves Casting Company is considering adding a new line to its product mix, and the capital budgeting analysis is being conducted by...
-
On December 31, Year 3, ABC Corporation, a public company, issues ten-year convertible bonds with a face value of $2,000,000 and a stated rate of 8%, paid semi- annually on June 30 and December 31....
-
Find the transfer function H(z) = x(n) Y(z) X(z) in Z domain for below given system. f(n) g(n) y(n)
-
High Desert Potteryworks makes a variety of pottery products that it sells to retailers. The company uses a job - order costing system in which departmental predetermined overhead rates are used to...
-
If debts to equity ratio equal to 1 0 0 % so debts to assets ratio equal ?
-
An urn has 1 7 balls that are identical except that 5 are white, 6 are red, and 6 are blue. What is the probability that all are white if 3 are selected randomly without replacement?
-
Consider the following game where there are two players. Player 1 first chooses UP or DOWN. The move of player 1 is not observable by player 2. Player I's move is followed by a chance move that...
-
Evaluate the limit using L'Hpital's Rule. (27x)2/3 + 16x lim x-00 (27x)/3-27x (Use symbolic notation and fractions where needed.) (27x)2/3 + 16x lim x+00 (27x)/3-27x = Question Source: Rorawski 4e...
-
Archangel Corporation prepared the following variance report. Instructions Fill in the appropriate amounts or letters for the question marks in the report. ARCHANGEL CORPORATION Variance...
-
BP Lubricants established the BIGS program following recent merger activity to deliver globally consistent and transparent management information. As well as timely business intelligence, BIGS...
-
What are authoritative pages, hubs, and hyperlinkinduced topic search (HITS)? Discuss the differences between citations in research articles and hyperlinks on Web pages.
-
Go to your librarys online resources. Learn how to download attributes of a collection of literature (journal articles) in a specific topic. Download and process the data using a methodology similar...
-
Assume you are working on a project to improve customer service. Create a Pareto chart based on the information in the following table. Use the Pareto chart template from the companion website or...
-
Brainstorm two different quality related problems that you are aware of at your college or organization. Then review the charts found in the section of this chapter on the seven basic tools of...
-
Updating processes, policies, procedures, and knowledge bases is part of __________. A. updating organizational process assets B. archival C. updating project documents D. lessons learned
Study smarter with the SolutionInn App