Data structures and subsetting in R (35 points in total)
For the following questions, you may work out and run the R commands in RStudio, either in an R script file or in the console. Copy and paste the original R commands, together with their output, into the space following each question in this file.
1.1 Create three vectors as follows using the c( ) function. (12 points in total, 4 points each)
a. The first vector is a numeric vector containing the following five real numbers. Assign the vector to a new variable named num.vector. Type the variable name and run it to get the vector content.
1.2, 5, 7, 10, 3.5
b. The second vector is a character vector containing the following five character strings. Assign the vector to a new variable named char.vector. Type the variable name and run it to get the vector content.
"one point two", "five", "seven", "ten", "three point five"
c. The third vector is a character vector containing the following five character strings. Assign the vector to a new variable named char2.vector. Type the variable name and run it to get the vector content.
"small number", "small number", "big number", "big number", "small number"
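For part 1.1, a minimal sketch of the three c( ) calls might look as follows. The values are copied from the lists above; typing a variable name on its own line prints its content.

```r
# a. Numeric vector of five real numbers
num.vector <- c(1.2, 5, 7, 10, 3.5)
num.vector

# b. Character vector spelling out the same numbers
char.vector <- c("one point two", "five", "seven", "ten", "three point five")
char.vector

# c. Character vector labelling each number as small or big
char2.vector <- c("small number", "small number", "big number",
                  "big number", "small number")
char2.vector
```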
1.2 Combine the three vectors into a data frame using the data.frame( ) function and assign it to the new variable named numbers. Type the variable name and run it to get the data frame content. (8 points)
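One possible way to build the data frame for 1.2 is sketched below. The vectors are recreated so the snippet runs on its own. When unnamed vectors are passed to data.frame( ), the column names are taken from the variable names.

```r
# Recreate the three vectors from 1.1
num.vector <- c(1.2, 5, 7, 10, 3.5)
char.vector <- c("one point two", "five", "seven", "ten", "three point five")
char2.vector <- c("small number", "small number", "big number",
                  "big number", "small number")

# Combine them column-wise into a data frame with 5 rows and 3 columns
numbers <- data.frame(num.vector, char.vector, char2.vector)
numbers
```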
1.3 Select elements of the data frame numbers using brackets [ ] and indexes / variable names. (10 points in total)
a. Select every element in the first column of the data frame and display the result. (2 points)
b. Select every element between the second row (inclusive) and the fifth row (inclusive) of the data frame and display the result. (4 points)
c. Select the element at the intersection of the fifth row and the third column of the data frame and display the result. (4 points)
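The three selections in 1.3 could be sketched like this. Inside the brackets, the part before the comma picks rows and the part after it picks columns; leaving one side empty selects everything along that dimension. The result names first.column, rows.2.to.5 and cell.5.3 are illustrative only and not part of the assignment.

```r
# Recreate the data frame from 1.2
numbers <- data.frame(
  num.vector = c(1.2, 5, 7, 10, 3.5),
  char.vector = c("one point two", "five", "seven", "ten", "three point five"),
  char2.vector = c("small number", "small number", "big number",
                   "big number", "small number")
)

# a. Every element in the first column (numbers[, "num.vector"] is equivalent)
first.column <- numbers[, 1]
first.column

# b. Rows 2 through 5 (inclusive), all columns
rows.2.to.5 <- numbers[2:5, ]
rows.2.to.5

# c. The element at row 5, column 3
cell.5.3 <- numbers[5, 3]
cell.5.3
```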
1.4 Run the structure function str( ) on the data frame numbers. What does the output tell you about each of the three variables in the data frame? Which variables are factors (i.e., categorical variables)? (5 points)
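A sketch for 1.4 follows. Whether the two character columns are reported as factors depends on the R version: since R 4.0.0, data.frame( ) keeps strings as character (chr) by default, while earlier versions converted them to factors. The second data frame, numbers.factors, is a hypothetical name used here only to show the factor case explicitly.

```r
num.vector <- c(1.2, 5, 7, 10, 3.5)
char.vector <- c("one point two", "five", "seven", "ten", "three point five")
char2.vector <- c("small number", "small number", "big number",
                  "big number", "small number")

# Default behaviour of the installed R version
numbers <- data.frame(num.vector, char.vector, char2.vector)
str(numbers)

# Explicitly request factors for the string columns (illustrative only)
numbers.factors <- data.frame(num.vector, char.vector, char2.vector,
                              stringsAsFactors = TRUE)
str(numbers.factors)

# The distinct categories of the third variable
levels(numbers.factors$char2.vector)
```

For each variable, str( ) reports its name, its type (num, chr or Factor) and its first few values, along with the number of observations and variables in the data frame.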
R tables and graphs (35 points in total)
Use the accompanying 2012Networks.CSV data file for Assignment 2 to complete the following questions.
| Network |
| --- |
| CBS |
| CBS |
| ABC |
| CBS |
| NBC |
| CBS |
| NBC |
| NBC |
| NBC |
| CBS |
| NBC |
| NBC |
| ABC |
| CBS |
| CBS |
| FOX |
| NBC |
| ABC |
| ABC |
| ABC |
| CBS |
| ABC |
| NBC |
| NBC |
| CBS |
Nielsen Media Research provided the list of the 25 top-rated single shows in television history (The World Almanac, 2012). The data in 2012Networks.CSV show the television network that produced each of these 25 top-rated shows.
2.1 Import/read the provided 2012Networks.CSV data into a variable named networks in RStudio using either the read.csv( ) or read.table( ) function. Remember to specify the header and sep arguments correctly in the function. Run the variable networks to show the imported result in the console. Copy and paste both the R commands and the imported result into the following space. (5 points)
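The import in 2.1 might be sketched as below, assuming the file is comma-separated with a single header row named Network, as the table above suggests. So that the snippet runs without the course file, it first writes the 25 listed values to a temporary CSV; network.values and csv.path are illustrative names. With the real file in the working directory, the path would simply be "2012Networks.CSV".

```r
# Stand-in for the course file: the 25 values from the table above
network.values <- c("CBS", "CBS", "ABC", "CBS", "NBC", "CBS", "NBC", "NBC",
                    "NBC", "CBS", "NBC", "NBC", "ABC", "CBS", "CBS", "FOX",
                    "NBC", "ABC", "ABC", "ABC", "CBS", "ABC", "NBC", "NBC",
                    "CBS")
csv.path <- tempfile(fileext = ".csv")
writeLines(c("Network", network.values), csv.path)

# header = TRUE: the first line holds the column name
# sep = ",":     fields are separated by commas
networks <- read.csv(csv.path, header = TRUE, sep = ",")
networks
```

The read.table( ) function accepts the same two arguments; its default is header = FALSE, so header = TRUE must be given explicitly there.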
2.2 Check the data structure of the variable networks using the str( ) function. Show the R command and result in the following space. What is the data structure for this networks variable? And what is the data structure for the Network column/variable in the networks variable? (5 points)
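For 2.2, a sketch of the structure check is shown below. The data frame is rebuilt with rep( ) as a stand-in that has the same counts as the listed table (ABC 6, CBS 9, FOX 1, NBC 9), not the original row order. The type reported for the Network column depends on the R version and on the stringsAsFactors setting used at import: chr by default since R 4.0.0, Factor in earlier versions.

```r
# Stand-in for the imported data, with the same counts as the table above
networks <- data.frame(
  Network = rep(c("ABC", "CBS", "FOX", "NBC"), times = c(6, 9, 1, 9))
)

# Structure of the whole object and of its single column
str(networks)
class(networks)
class(networks$Network)
```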
2.3 Construct a frequency distribution using the table( ) function and assign it to a variable named freq.networks. Show the R commands and table result in the following space. (5 points)
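A minimal sketch for 2.3, again using a stand-in data frame with the same counts as the listed table rather than the imported file:

```r
# Stand-in for the imported data, with the same counts as the table above
networks <- data.frame(
  Network = rep(c("ABC", "CBS", "FOX", "NBC"), times = c(6, 9, 1, 9))
)

# Count how many of the 25 shows each network produced
freq.networks <- table(networks$Network)
freq.networks
```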
2.4 Construct a percent (not relative) frequency distribution using the prop.table( ) function and assign it to a variable named percent.freq.networks. Show the R commands and table result in the following space. (5 points)
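For 2.4, note that prop.table( ) returns relative frequencies (proportions that sum to 1), so one way to obtain percent frequencies is to multiply the result by 100. The sketch below uses the same stand-in data as before.

```r
# Stand-in for the imported data, with the same counts as the table above
networks <- data.frame(
  Network = rep(c("ABC", "CBS", "FOX", "NBC"), times = c(6, 9, 1, 9))
)
freq.networks <- table(networks$Network)

# Relative frequencies scaled to percentages
percent.freq.networks <- prop.table(freq.networks) * 100
percent.freq.networks
```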
2.5 Construct a bar chart for the frequency distribution using the barplot( ) function. In the bar chart, set the main title to "Bar chart of TV networks", the x-axis label to "Network", the y-axis label to "Frequency", and the bar color to green. Show the R command and bar chart in the following space. (10 points)
(Hint: to save the bar chart image, go to the Plots tab in the bottom-right pane of RStudio and choose Export > Save as Image. Choose PNG or JPEG as the image format, give a file name, choose a directory/folder to save the image in, then click the Save button. Then copy and paste the saved image into the following space.)
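The bar chart in 2.5 could be drawn roughly as follows, using stand-in data with the same counts as the listed table. The name bar.midpoints is illustrative; barplot( ) returns the x positions of the bars, and capturing them is optional.

```r
# Stand-in for the imported data, with the same counts as the table above
networks <- data.frame(
  Network = rep(c("ABC", "CBS", "FOX", "NBC"), times = c(6, 9, 1, 9))
)
freq.networks <- table(networks$Network)

# main = title, xlab / ylab = axis labels, col = fill color of the bars
bar.midpoints <- barplot(freq.networks,
                         main = "Bar chart of TV networks",
                         xlab = "Network",
                         ylab = "Frequency",
                         col = "green")
```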
2.6 Which network or networks have done the best in terms of presenting top-rated television shows? Compare the performance of ABC, CBS, and NBC. (5 points)
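As a starting point for 2.6, the frequency table can be sorted and the most frequent networks picked out. In the listed table, CBS and NBC each account for 9 of the 25 shows, ABC for 6 and FOX for 1. The name top.networks is illustrative only.

```r
# Stand-in for the imported data, with the same counts as the table above
networks <- data.frame(
  Network = rep(c("ABC", "CBS", "FOX", "NBC"), times = c(6, 9, 1, 9))
)
freq.networks <- table(networks$Network)

# Networks ordered from most to fewest top-rated shows
sort(freq.networks, decreasing = TRUE)

# Network(s) with the highest count; ties are all returned
top.networks <- names(freq.networks)[freq.networks == max(freq.networks)]
top.networks
```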