Question: Calculate and visualize some basic properties of gene expression levels in the ALL dataset. Remember that the function exprs returns a matrix of gene expression levels (~13K genes) for each patient (>100 patients) in the sample.
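Use R to generate the results. To access the ALL dataset in R, install the ALL package from Bioconductor with biocLite("ALL") (on current Bioconductor releases, BiocManager::install("ALL")).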
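A minimal sketch of loading the data, assuming the ALL package has been installed from Bioconductor. The variable names `have_all` and `x` are illustrative and not part of the assignment; the installation lines are left commented out because they need an internet connection.

```r
# One-time installation (uncomment to run).
# install.packages("BiocManager")
# BiocManager::install("ALL")

# Check whether the ALL data package is available before loading it.
have_all <- requireNamespace("ALL", quietly = TRUE)

if (have_all) {
  library(Biobase)             # provides exprs()
  data(ALL, package = "ALL")   # loads the ExpressionSet object named ALL
  x <- exprs(ALL)              # rows are genes (probe sets), columns are patients
  print(dim(x))
  print(x[1:3, 1:3])
} else {
  message("Package 'ALL' is not installed; see the commented lines above.")
}
```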
a. Calculate average gene expression level for each patient. Plot them using the following visualizations:
i. Histogram
ii. Boxplot
iii. Stripchart
iv. Stem-and-leaf
v. Dotchart
vi. Sorted in ascending order
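One possible way to approach part (a) is sketched below. It assumes the expression matrix has genes in rows and patients in columns, so `colMeans` gives one average per patient. The names `x` and `patient_means` are illustrative, and the simulated matrix is only a stand-in so the sketch runs when the ALL package is not installed.

```r
# Load the real data if available; otherwise use a simulated stand-in (assumption).
if (requireNamespace("ALL", quietly = TRUE)) {
  library(Biobase)
  data(ALL, package = "ALL")
  x <- exprs(ALL)
} else {
  set.seed(1)
  x <- matrix(rnorm(200 * 12, mean = 5.6, sd = 1.8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("patient", 1:12)))
}

# One average expression level per patient (per column).
patient_means <- colMeans(x)

hist(patient_means, main = "Mean expression per patient", xlab = "Mean expression")
boxplot(patient_means, horizontal = TRUE, xlab = "Mean expression")
stripchart(patient_means, method = "jitter", xlab = "Mean expression")
stem(patient_means)                        # text output on the console
dotchart(patient_means, cex = 0.5, xlab = "Mean expression")
plot(sort(patient_means), xlab = "Rank", ylab = "Mean expression",
     main = "Patient means in ascending order")
```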
b. Calculate average gene expression levels for each gene. Plot them using:
i. Histogram
ii. Boxplot
iii. Sorted in ascending order
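Part (b) can be sketched the same way, using `rowMeans` for one average per gene. Again, `x` and `gene_means` are illustrative names, the simulated matrix is only a fallback for machines without the ALL package, and the choice of 50 histogram bins is arbitrary.

```r
# Load the real data if available; otherwise use a simulated stand-in (assumption).
if (requireNamespace("ALL", quietly = TRUE)) {
  library(Biobase)
  data(ALL, package = "ALL")
  x <- exprs(ALL)
} else {
  set.seed(1)
  x <- matrix(rnorm(200 * 12, mean = 5.6, sd = 1.8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("patient", 1:12)))
}

# One average expression level per gene (per row).
gene_means <- rowMeans(x)

hist(gene_means, breaks = 50, main = "Mean expression per gene",
     xlab = "Mean expression")
boxplot(gene_means, horizontal = TRUE, xlab = "Mean expression")
plot(sort(gene_means), pch = ".", xlab = "Rank", ylab = "Mean expression",
     main = "Gene means in ascending order")
```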
c. For the average gene expression per patient, calculate the following measures of center and spread, and explain the results obtained:
i. Mean
ii. Median
iii. Standard deviation (sd)
iv. Median absolute deviation (mad)
v. Interquartile range
vi. Five number summary (read the docs for, and use fivenum() function)
vii. Five number summary using quantile function
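The measures listed above map directly onto base R functions. The sketch below is one way to compute them for the per-patient averages; `x`, `m`, and `measures` are illustrative names, and the simulated matrix is only a fallback when the ALL package is not installed.

```r
# Load the real data if available; otherwise use a simulated stand-in (assumption).
if (requireNamespace("ALL", quietly = TRUE)) {
  library(Biobase)
  data(ALL, package = "ALL")
  x <- exprs(ALL)
} else {
  set.seed(1)
  x <- matrix(rnorm(200 * 12, mean = 5.6, sd = 1.8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("patient", 1:12)))
}

m <- colMeans(x)   # average expression per patient

measures <- list(
  mean     = mean(m),
  median   = median(m),
  sd       = sd(m),
  mad      = mad(m),       # scaled by 1.4826 by default, comparable to sd for normal data
  IQR      = IQR(m),
  fivenum  = fivenum(m),   # min, lower hinge, median, upper hinge, max
  quantile = quantile(m)   # 0%, 25%, 50%, 75%, 100% (default type 7)
)
print(measures)
```

Note that `fivenum` reports Tukey's hinges while `quantile` interpolates by default, so the two quartile estimates can differ slightly; the minimum, median, and maximum agree.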
d. For the average gene expression per gene, calculate the same measures of center and spread as above, and explain the results obtained.
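For the per-gene averages, the same measures can be wrapped in a small helper so they are computed in one call. The helper name `center_spread` is hypothetical, as are `x` and `gene_means`; the simulated matrix is only a fallback when the ALL package is not installed.

```r
# Load the real data if available; otherwise use a simulated stand-in (assumption).
if (requireNamespace("ALL", quietly = TRUE)) {
  library(Biobase)
  data(ALL, package = "ALL")
  x <- exprs(ALL)
} else {
  set.seed(1)
  x <- matrix(rnorm(200 * 12, mean = 5.6, sd = 1.8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("patient", 1:12)))
}

# Hypothetical helper: single-number measures of center and spread.
center_spread <- function(v) {
  c(mean = mean(v), median = median(v), sd = sd(v), mad = mad(v), IQR = IQR(v))
}

gene_means <- rowMeans(x)   # average expression per gene

print(center_spread(gene_means))
print(fivenum(gene_means))    # five number summary based on hinges
print(quantile(gene_means))   # five number summary based on quantiles
```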
e. Find the genes with the highest and the lowest expression levels across patients, and record their names and characteristic expression levels. For each of them, calculate the above measures of center and spread across patients. Plot the distributions of their expression levels across patients using all the visualization techniques practiced above.
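A sketch of part (e) follows. It assumes "highest" and "lowest" mean the genes with the largest and smallest average expression across patients, which is one reasonable reading of the question but not the only one. The names `x`, `gene_means`, `hi`, `lo`, and `stats` are illustrative, and the simulated matrix is only a fallback when the ALL package is not installed.

```r
# Load the real data if available; otherwise use a simulated stand-in (assumption).
if (requireNamespace("ALL", quietly = TRUE)) {
  library(Biobase)
  data(ALL, package = "ALL")
  x <- exprs(ALL)
} else {
  set.seed(1)
  x <- matrix(rnorm(200 * 12, mean = 5.6, sd = 1.8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("patient", 1:12)))
}

gene_means <- rowMeans(x)
hi <- names(which.max(gene_means))   # gene with the highest mean expression
lo <- names(which.min(gene_means))   # gene with the lowest mean expression
cat("Highest:", hi, " Lowest:", lo, "\n")

# Measures of center and spread across patients, one column per gene.
stats <- sapply(c(highest = hi, lowest = lo), function(g) {
  v <- x[g, ]
  c(mean = mean(v), median = median(v), sd = sd(v), mad = mad(v), IQR = IQR(v))
})
print(stats)
print(fivenum(x[hi, ])); print(quantile(x[hi, ]))
print(fivenum(x[lo, ])); print(quantile(x[lo, ]))

# The same set of plots for each of the two genes.
for (g in c(hi, lo)) {
  v <- x[g, ]
  hist(v, main = g, xlab = "Expression")
  boxplot(v, horizontal = TRUE, main = g, xlab = "Expression")
  stripchart(v, method = "jitter", main = g, xlab = "Expression")
  stem(v)
  dotchart(v, cex = 0.5, main = g, xlab = "Expression")
  plot(sort(v), main = g, xlab = "Rank", ylab = "Expression")
}
```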
f. Qualitatively compare the distributions of mean gene expression levels per gene and per patient. Describe the results of this comparison in plain English.
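To support the qualitative comparison, the two sets of averages can be drawn on a common scale. The sketch below is one way to do that; `x`, `patient_means`, `gene_means`, and `spread` are illustrative names, and the simulated matrix is only a fallback when the ALL package is not installed.

```r
# Load the real data if available; otherwise use a simulated stand-in (assumption).
if (requireNamespace("ALL", quietly = TRUE)) {
  library(Biobase)
  data(ALL, package = "ALL")
  x <- exprs(ALL)
} else {
  set.seed(1)
  x <- matrix(rnorm(200 * 12, mean = 5.6, sd = 1.8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("patient", 1:12)))
}

patient_means <- colMeans(x)
gene_means    <- rowMeans(x)

# Histograms side by side on a shared x-axis, so the widths are comparable.
rng <- range(patient_means, gene_means)
op <- par(mfrow = c(1, 2))
hist(patient_means, xlim = rng, main = "Per patient", xlab = "Mean expression")
hist(gene_means, xlim = rng, breaks = 50, main = "Per gene", xlab = "Mean expression")
par(op)

# Boxplots on one axis, plus a numeric summary of the spread of each set.
boxplot(list(per_patient = patient_means, per_gene = gene_means),
        ylab = "Mean expression")
spread <- c(sd_per_patient = sd(patient_means), sd_per_gene = sd(gene_means))
print(spread)
```

Comparing the two standard deviations and the shapes of the histograms gives a starting point for the plain-English description the question asks for.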