Question: ST 351 Data Analysis 2 (15 points) Sampling

Objectives of this lab write-up:
Use R Studio to take a random sample
Compare the results of a simple random sample with a self-selected sample

Shel Silverstein is an American writer who is recognized for his unique cartoon style, songs and children's books. Born to a Jewish family, Chicago is where he calls home. He was once expelled from the University of Illinois, and was then attending the Chicago Academy of Fine Arts when he was drafted into the United States Army. He served in Japan and Korea. His poem Sarah Cynthia Sylvia Stout Would Not Take The Garbage Out will be the subject of this assignment.

1. (2 points) The entire text of the poem is given to the right. For now, your task is to select a sample of 10 words. The goal is to use the sample mean from the self-selected 10 words to help estimate the mean length of words in this entire poem. Do your best to randomly select ten words on your own (with your eyeballs). In a table that you create in your Word document, please record the words you selected as well as the number of letters in each word. We will call this variable length. The table you create in your Word document should have a format similar to this:

Word     Length (number of letters)
The      3
Awful    5
etc...

2. (2 points) In R Studio, enter the length of your 10 words as a data vector and calculate the mean length of words from your sample.

SARAH CYNTHIA SYLVIA STOUT WOULD NOT TAKE THE GARBAGE OUT
Sarah Cynthia Sylvia Stout Would not take the garbage out! She'd scour the pots and scrape the pans, Candy the yams and spice the hams, And though her daddy would scream and shout, She simply would not take the garbage out. And so it piled up to the ceilings: Coffee grounds, potato peelings, Brown bananas, rotten peas, Chunks of sour cottage cheese.
It filled the can, it covered the floor, It cracked the window and blocked the door With bacon rinds and chicken bones, Drippy ends of ice cream cones, Prune pits, peach pits, orange peel, Gloppy glumps of cold oatmeal, Pizza crusts and withered greens, Soggy beans and tangerines, Crusts of black burned buttered toast, Gristly bits of beefy roasts. . . The garbage rolled on down the hall, It raised the roof, it broke the wall. . . Greasy napkins, cookie crumbs, Globs of gooey bubble gum, Cellophane from green baloney, Rubbery blubbery macaroni, Peanut butter, caked and dry, Curdled milk and crusts of pie, Moldy melons, dried-up mustard, Eggshells mixed with lemon custard, Cold french fries and rancid meat, Yellow lumps of Cream of Wheat. At last the garbage reached so high That it finally touched the sky. And all the neighbors moved away, And none of her friends would come to play. And finally Sarah Cynthia Stout said, "OK, I'll take the garbage out!" But then, of course, it was too late. . . The garbage reached across the state, From New York to the Golden Gate. And there, in the garbage she did hate, Poor Sarah met an awful fate, That I cannot now relate Because the hour is much too late. But children, remember Sarah Stout And always take the garbage out!

Recall that one way to enter data is to create a vector and save it under an object name by running something like this:

    mysample <- c(1,2,3,4,5)

where you would replace the 1,2,3,4,5 with your word lengths. Please note that you will be asked for all of the code you used at the end of this assignment. Please be sure you are typing in code using the Script window.

What is the mean length of words for your sample?

3. (3 points) Do you feel you can obtain a truly representative sample of population data by self-selecting cases from that population without using some random number generator? You may explain your answer in the context of this problem or you could be more general.

Let's see how different the mean from a simple random sample of 10 words is from the mean of your self-selected sample. At this point, find the SCSSGARBAGE data set in Canvas within the Data Analysis 2 assignment instructions. Save this data set to your computer as a .csv file and import it into R Studio. This data set contains the words and the length of each word in the poem. We are going to consider the list of words in the entire poem as our population of interest. In this example the population size is 288 words.

4. (2 points) Using code found in Lab Activity 2, obtain a simple random sample of 10 word lengths. Record the lengths of the 10 words here. We want to see the lengths of the 10 words in some way. For example, I want to see something like this:

    > sample(SCSSGARBAGE$length, 10)
     [1] 5 4 6 7 7 4 3 5 3 3
    > srsample <- c(5,4,6,7,7,4,3,5,3,3)

5. (2 points) What is the mean of your simple random sample?

6. (3 points) Since we have the entire population of all words in the poem, we know the population mean length of words is 4.52 letters. Of the two methods used to obtain a sample of 10 words, the self-selected sample or the simple random sample, which method do you think does the better job at selecting a representative sample of word lengths and, therefore, produces a mean that is typically closer to the population mean? Explain WHY you chose the method you did. You may consider discussing how close each sample mean is to the population mean. Use your answers to previous questions to help support your explanation.
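One way to probe question 6 is a small simulation. The sketch below is not part of the assignment and does not use the SCSSGARBAGE data: pop_lengths is a hypothetical population of 288 word lengths, and the "eyeball" sampler assumes, purely for illustration, that a word's chance of being picked grows with its length. Under those assumptions it draws many samples of 10 each way and compares the long-run average of the sample means with the population mean.

```r
# Hypothetical population of 288 word lengths; NOT the SCSSGARBAGE data.
pop_lengths <- rep(1:9, times = c(4, 30, 60, 70, 50, 35, 22, 12, 5))
pop_mean <- mean(pop_lengths)

set.seed(351)   # arbitrary seed so the run is repeatable
n_reps <- 2000  # number of repeated samples of each kind

# Simple random samples: every word is equally likely to be chosen.
srs_means <- replicate(n_reps, mean(sample(pop_lengths, 10)))

# Assumed model of "eyeball" selection: longer words are more likely
# to catch the eye, so selection probability is proportional to length.
eyeball_means <- replicate(
  n_reps,
  mean(sample(pop_lengths, 10, prob = pop_lengths))
)

# Compare the long-run average of each kind of sample mean
# with the population mean.
c(population = pop_mean,
  srs        = mean(srs_means),
  eyeball    = mean(eyeball_means))
```

If the length-biased model is anywhere near how people actually pick words, the simple random sample means should center on the population mean while the eyeball means drift above it. A single sample of 10 from either method can still land far from the population mean by chance.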

7. (1 point) Even though it isn't much, please copy and paste all of the code used in this assignment at this point.
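As a rough illustration, the code collected for this step might look something like the script below. Several details are assumptions rather than anything the assignment specifies: the file name SCSSGARBAGE.csv and its location, the word column name, the stand-in data frame used when the file is not found, and the optional set.seed call. The placeholder lengths in mysample should be replaced with your own ten word lengths.

```r
# Question 2: self-selected sample (placeholder values; use your own lengths).
mysample <- c(1, 2, 3, 4, 5)
mean(mysample)

# Import the population of word lengths.
# The file name and working-directory location are assumptions.
csv_path <- "SCSSGARBAGE.csv"
if (file.exists(csv_path)) {
  SCSSGARBAGE <- read.csv(csv_path)
} else {
  # Hypothetical stand-in so the script still runs without the file:
  # a handful of words from the poem, not the full 288-word data set.
  words <- c("Sarah", "Cynthia", "Sylvia", "Stout", "would", "not",
             "take", "the", "garbage", "out", "and", "though",
             "her", "daddy", "would", "scream", "and", "shout")
  SCSSGARBAGE <- data.frame(word = words, length = nchar(words))
}

# Question 4: simple random sample of 10 word lengths.
set.seed(351)  # optional; makes the same sample come up on every run
srsample <- sample(SCSSGARBAGE$length, 10)
srsample

# Question 5: mean of the simple random sample.
mean(srsample)
```

Without set.seed, each run of sample() returns a different set of 10 lengths, which is why the assignment asks you to record the values you actually obtained.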
