Question 2: Given the following dataset: ID 1. 2. 3. 4. 5. 6. 7. Age '40-49...
Question:
Transcribed Image Text:
Question 2: Given the following dataset:

[Table of 28 patient records; the row-by-row alignment did not survive transcription. The columns, and the values that appear in each, are listed below.]
ID: 1 to 28
Age: '30-39', '40-49', '50-59', '60-69'; one entry reads '60-169'
Menopause: 'premeno', 'ge40', 'lt40'
Tumor-size: '0-4', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '40-44'
Nodes: '0-2', '3-5', '6-8', '15-17'
Node-caps: 'yes', 'no'; one entry is nan
Degree-of-malignance: 1, 2, 3 (some entries quoted, some not)
Breast: 'left', 'right'
Breast-quad: 'left_up', 'left_low', 'right_up', 'right_low', 'central'
Irradiation: 'yes', 'no'
Recurrence: 'recurrence-events', 'no-recurrence-events'
A stray unquoted token, red, also appears among the Menopause and Tumor-size values.

The dataset contains data on patients who have breast cancer. It records their age, whether they have gone through menopause, how big their tumor is, how many nodes they have, whether their node-caps are positive or negative, how malignant their tumor is, which breast is affected, which quadrant of the breast is affected, whether they have received irradiation treatment, and whether their cancer has recurred.

1. Provide a description of this data set.
2. Provide a brief statistical description of each feature.
3. We would like to use this dataset to predict the risk of recurrence in a new patient based on this data. Formulate the problem and explain how it can be solved.
4. Identify the issues in this data set.
5. List and explain the tasks that should be performed on this data set prior to its use for the prediction task.
Expert Answer:
The dataset contains information about breast cancer patients, including their age, menopausal status, tumor size, number of nodes, node-caps status, malign...
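As a rough illustration of tasks 2 and 4 in the question, a per-feature frequency summary combined with a check against expected categories could be sketched as follows. This is only a sketch under stated assumptions: the rows in `ROWS` are hypothetical placeholders rather than the 28 original records, the column keys are illustrative names, and the sets in `EXPECTED` are assumed from the values visible in the transcription.

```python
from collections import Counter

# Hypothetical sample rows (NOT the original 28 records); values are illustrative only.
ROWS = [
    {"age": "40-49", "menopause": "premeno", "node_caps": "yes",
     "deg_malig": "2", "recurrence": "recurrence-events"},
    {"age": "50-59", "menopause": "ge40", "node_caps": "no",
     "deg_malig": "1", "recurrence": "no-recurrence-events"},
    {"age": "60-169", "menopause": "ge40", "node_caps": None,
     "deg_malig": "3", "recurrence": "no-recurrence-events"},
    {"age": "50-59", "menopause": "premeno", "node_caps": "no",
     "deg_malig": "2", "recurrence": "no-recurrence-events"},
]

# Assumed valid categories per feature (an assumption, not taken from an official schema).
EXPECTED = {
    "age": {"30-39", "40-49", "50-59", "60-69"},
    "menopause": {"premeno", "ge40", "lt40"},
    "node_caps": {"yes", "no"},
    "deg_malig": {"1", "2", "3"},
    "recurrence": {"recurrence-events", "no-recurrence-events"},
}


def summarize(rows, feature):
    """Return (value counts, mode, number of missing entries) for one categorical feature."""
    values = [row.get(feature) for row in rows]
    present = [v for v in values if v is not None]
    counts = Counter(present)
    mode = counts.most_common(1)[0][0] if counts else None
    return counts, mode, len(values) - len(present)


def find_issues(rows, expected):
    """List (row index, feature, value) for every missing or out-of-range entry."""
    issues = []
    for i, row in enumerate(rows):
        for feature, allowed in expected.items():
            value = row.get(feature)
            if value is None or value not in allowed:
                issues.append((i, feature, value))
    return issues


if __name__ == "__main__":
    for feature in EXPECTED:
        counts, mode, missing = summarize(ROWS, feature)
        print(feature, dict(counts), "mode:", mode, "missing:", missing)
    print(find_issues(ROWS, EXPECTED))
```

Because every feature here is categorical or a binned range, counts and modes are the natural summary statistics. The flagged entries (an impossible age bin, a missing node-caps value) are the kind of item that would need correction or imputation before any classifier is trained on the recurrence label.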