Part one: Cluster analysis

In this section we will only use the Synthea observations data. Start by using the code below to clear the R global environment, load the observations data, make a data object with a small number of variables and remove all missing values, remove patients who did not have COVID-19, truncate the variable names, and load the package factoextra, which we use for k-means cluster analysis. As always, you will need to set your working directory.

    rm(list=ls())
    obs …
    data …

    table(cluster=clust$cluster, dead=data$dead)

           dead
    cluster    0    1
          1 1415    4
          2  310  273

Question 5 (difficult - 2 marks)
Some of the pathology variables were not related to cluster membership (i.e., the mean levels were very similar across clusters). Choose the two pathology tests which were the least related to cluster membership and run your chosen cluster model again without them (just for your chosen cluster model, i.e., k=2 or 3, not for all 6 possible models). Save this new model in an object called clustB. Inspect the mean levels of the pathology variables of this new model (like Q.3) and compare with death (like Q.4). Has removing these variables changed our results? What does this tell you about including variables not related to the clusters in our analysis?

IMPORTANT NOTE: The cluster numbers may change in your new model (i.e., 1 might become 2 and so on), because the clusters are just arbitrarily labelled with a number, but you will know which cluster is which from the relationships with pathology tests and death.

Part Two: Network analysis

In this section we will mostly use the Synthea conditions data. Start by using the code below to clear the R global environment of any data you still have from the last section, load the conditions data, make a data object with a smaller number of variables, remove patients who did not have COVID-19, truncate the variable names, and load the package qgraph, which we use for network analysis. NOTE: if you have closed R since the last activity you will need to set your working directory again.

    rm(list=ls())
    obs … data … =50),] old …

    net.young <- qgraph(…, labels=substr(colnames(young),1,5), layout="spring")

[Network plot; legible node labels include Joint, Chill, Hypox]

Question 8 (medium - 1 mark)
Compare the two networks you plotted in the above question. Do you note any important differences? If you do think there are differences, list just the two biggest that you notice.

Question 9 (easy - 2 marks)
Generate the plots of node strength and expected influence for both networks and include them, along with the code used to generate them, in the box below.
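To make clear what the two centrality measures in Question 9 represent, here is a minimal base-R sketch that computes them by hand. It assumes a small symmetric weight matrix whose values are made up for illustration. The node name Fever is also hypothetical, and none of this comes from the Synthea data. Node strength is taken as the sum of the absolute edge weights attached to a node, and one-step expected influence as the sum of the signed weights, so the two only differ when a node has negative edges.

```r
# Hypothetical 4-node weighted network (symmetric, zero diagonal).
# All weights are invented for illustration; they are not Synthea results.
nodes <- c("Joint", "Chill", "Hypox", "Fever")
w <- matrix(c( 0.0,  0.4, -0.2,  0.0,
               0.4,  0.0,  0.3,  0.1,
              -0.2,  0.3,  0.0, -0.5,
               0.0,  0.1, -0.5,  0.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(nodes, nodes))

# Strength: sum of absolute edge weights attached to each node.
strength <- rowSums(abs(w))

# One-step expected influence: sum of signed edge weights, so
# negative edges reduce the total instead of adding to it.
expected_influence <- rowSums(w)

# Side-by-side comparison of the two measures per node.
print(round(cbind(strength, expected_influence), 2))
```

In the assignment itself the plots would normally come from qgraph rather than hand calculation. One likely route is `centralityPlot()` with `include = c("Strength", "ExpectedInfluence")`, passing both fitted networks as a named list; the name of the second network object is not legible in the question, so any name used for it would be an assumption.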
