(a) You are given a data set on cancer detection. After building a classification model which...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
(a) You are given a data set on cancer detection. After building a classification model which achieves an accuracy of 90%, would you be satisfied with your model performance? What can you do about it? (b) In the context of k-NN classifier for multi-class classification, consider the case where k = 3 and the three nearest neighbours of a query have three different class labels. How would you assign the class label to the query example in this case? (c) In unsupervised learning, if a ground truth about a dataset is unknown, how can we determine the most useful number of clusters to be? (d) For a supervised classification problem, how can you determine which features are the most important? (e) Your machine learning application requires that the client be provided an explanation for the learning decision. Assuming that a a deep learning model and a decision tree model achieve similar accuracy for your task, which one would you prefer to use and why? (f) k-NN and kmeans clustering both rely crucially on the distance measure used. What is the difference between these two learning techniques? (g) A company has built a classifier that gets 100% accuracy on training data. When they deployed this model on client side it has been found that the model is highly inaccurate. What might have gone wrong? (h) Sometimes when building a machine learning model we might prefer to have fewer features rather than many feature. Give three reasons why this might be the case. (i) After spending several hours, you are anxious to build a high accuracy model. You built 5 boosting models, but neither of these models performed better than benchmark score. Finally, you decided to combine those models as ensemble models are known to provide high accuracy. If your accuracy still doesn't improve, what could potentially be wrong with your ensemble model? (i) For a given feature, the minimum and maximum value in the training data is 100 and 1000, respectively. The minimum and the maximum value of the feature in the test data is 50 and 950, respectively. What is the correct way to do min-max normalisation of this feature for a test instance with value in order to have a fair validation? A. (100)/(1000 - 100) B. (250)/(950-50) C. (50)/(1000 - 50) D. (100)/(1000-50) [2] [2] [2] [2] [2] [2] [2] [2] [2] [2] (a) You are given a data set on cancer detection. After building a classification model which achieves an accuracy of 90%, would you be satisfied with your model performance? What can you do about it? (b) In the context of k-NN classifier for multi-class classification, consider the case where k = 3 and the three nearest neighbours of a query have three different class labels. How would you assign the class label to the query example in this case? (c) In unsupervised learning, if a ground truth about a dataset is unknown, how can we determine the most useful number of clusters to be? (d) For a supervised classification problem, how can you determine which features are the most important? (e) Your machine learning application requires that the client be provided an explanation for the learning decision. Assuming that a a deep learning model and a decision tree model achieve similar accuracy for your task, which one would you prefer to use and why? (f) k-NN and kmeans clustering both rely crucially on the distance measure used. What is the difference between these two learning techniques? (g) A company has built a classifier that gets 100% accuracy on training data. When they deployed this model on client side it has been found that the model is highly inaccurate. What might have gone wrong? (h) Sometimes when building a machine learning model we might prefer to have fewer features rather than many feature. Give three reasons why this might be the case. (i) After spending several hours, you are anxious to build a high accuracy model. You built 5 boosting models, but neither of these models performed better than benchmark score. Finally, you decided to combine those models as ensemble models are known to provide high accuracy. If your accuracy still doesn't improve, what could potentially be wrong with your ensemble model? (i) For a given feature, the minimum and maximum value in the training data is 100 and 1000, respectively. The minimum and the maximum value of the feature in the test data is 50 and 950, respectively. What is the correct way to do min-max normalisation of this feature for a test instance with value in order to have a fair validation? A. (100)/(1000 - 100) B. (250)/(950-50) C. (50)/(1000 - 50) D. (100)/(1000-50) [2] [2] [2] [2] [2] [2] [2] [2] [2] [2]
Expert Answer:
Answer rating: 100% (QA)
a Achieving an accuracy of 90 in a cancer detection model is a good starting point but it might not be sufficient depending on the specific requirements and consequences of misclassification In medica... View the full answer
Related Book For
Business Intelligence And Analytics Systems For Decision Support
ISBN: 9781292009209
10th Global Edition
Authors: Efraim Turban, Ramesh Sharda, Dursun Delen, Pearson Education Limited, Dennis G. Zill
Posted Date:
Students also viewed these programming questions
-
PART A - MATERIAL PROPERTIES IN UNIAXIAL TENSION Objective To demonstrate the behaviour of brittle and ductile materials in uniaxial tension. Introduction to the Tension Test The term tension test is...
-
Planning is one of the most important management functions in any business. A front office managers first step in planning should involve determine the departments goals. Planning also includes...
-
Managing Scope Changes Case Study Scope changes on a project can occur regardless of how well the project is planned or executed. Scope changes can be the result of something that was omitted during...
-
Write a method to take an integer array as a parameter and return how many elements having the same digits in each element in the array for example if we have array 1 1 , 4 4 , 1 4 , 2 3 , 1 2 , 5 6...
-
Stand with your heels and back against a wall and try to bend over and touch your toes. You'll find that you have to stand away from the wall to do so without toppling over. Compare the minimum...
-
Newton Labs leased chronometers from Brookline Instruments on January 1, 2018. Brookline Instruments manufactured the chronometers at a cost of $200,000. The chronometers have a fair value of...
-
Shauna Washington started a business to sell art supplies and related curricula to home school families. The business grew quickly, with sales doubling three times during a five-year period. At that...
-
Prepare a flexible budget for 20,000, 22,000, and 24,000 units of output, using the information that follows. Variable costs: Direct materials .......... $1.00 per unit Direct labor ................
-
Carol got a tip from a friend to buy stock in a new Internet start-up. She purchased the stock for $20,000 and by the end of the year the stock was worth $100,000. How much must Carol report on her...
-
A. Ray and Maria Gomez have been married for 3 years. Ray is a propane salesman for Palm Oil Corporation and Maria works as a city clerk for the City of McAllen. Rays birthdate is February 21, 1990...
-
You purchase Wachovia stock on September 25th, 2022 for $10,000. Wachovia is completely liquidated (bankrupt) on June 15th, 2023. How does this affect your 2023 tax return? $10,000 short-term capital...
-
Write- a paper, however, I need to plan outline of it first but I have no clue how to even start it. For some background information, the paper will be about a conspiracy theory that has evidence...
-
Question 1 (Essay Worth 10 points) (06.02 HC) Let= 11 12 Part A: Determine tane using the sum formula. Show all necessary work in the calculation. (5 points) Part B: Determine cos e using the...
-
Conveying Bad News Scenario In groups, compose a bad news email to Emily Coroner. Remember to follow the format for an indirect approach as discussed in class and to write it professionally. Your...
-
Medhurst LLC, a manufacturer, had beginning finished goods inventory of $16,000. The cost of goods manufactured for the period was $117,000. Ending finished goods inventory was $45,000. What was the...
-
2. (25 points) For the pin connected crank-slider shown, all measurements are in inches. 0 is the input. Y a. Plot 03 vs. 0. Make sure all angles are in degrees. b. Plot the distance, , vs. 0, and...
-
What is the total crashing cost? The following is a table of activities associated with a project at Rafay Ishfaq's software firm in Chicago, their durations, what activities each must precede and...
-
Archangel Corporation prepared the following variance report. Instructions Fill in the appropriate amounts or letters for the question marks in the report. ARCHANGEL CORPORATION Variance...
-
BP Lubricants established the BIGS program following recent merger activity to deliver globally consistent and transparent management information. As well as timely business intelligence, BIGS...
-
What are authoritative pages, hubs, and hyperlinkinduced topic search (HITS)? Discuss the differences between citations in research articles and hyperlinks on Web pages.
-
Go to your librarys online resources. Learn how to download attributes of a collection of literature (journal articles) in a specific topic. Download and process the data using a methodology similar...
-
The Forensic Accounting Box of this chapter noted that forensic accounting techniques are being used to help pick stocks for investing. How might identifying accounting concerns be useful for...
-
In the corporate social responsibility highlight regarding Starbucks on page 528 in this chapter, it was stated that Starbucks believes in measuring and monitoring the company's CSR progress. The...
-
The fiscal year 2017 annual report of General Mills, Inc. is available on this book's Website. Required a. How many shares of common stock is General Mills authorized to issue? How many common shares...
Study smarter with the SolutionInn App