Written Homework 1 Math& 146 The data we are working on are listed in this file,...
Question:
Written Homework 1
Math& 146

The data we are working on are listed in this file, but the most efficient way is to get them from the two CSV files that come with this assignment. To avoid compatibility issues, it's recommended that you attach screenshots of the graphics you produced.

Descriptive Statistics

1. Categorical Variables

A political poll of 1024 voters in a 5-candidate race results in the following outcomes:

Candidate            A.J.   H.L.   I.D.   R.S.   L.L.
Voters supporting    223    158    350    101    192

Type the data in a spreadsheet and

1. Calculate the relative frequencies
2. Have the program draw a bar chart, either in terms of frequencies or relative frequencies
3. Sort the data in descending order of voter support (such a small data set can even be sorted manually)
4. Have the program draw a bar chart of the sorted data set: the bars will decrease in height from left to right, forming a Pareto Chart.

1. One Sample

Consider this sorted sample. You can download the data in the file hw1ln.csv

Sample Data
0.5751 0.5713 0.5182 0.4759 0.4679 0.4280 0.3843
0.3734 0.3100 0.2319 0.2293 0.1777 0.1776 0.1740
0.1678 0.1163 0.1152 0.1019 0.0795 0.0397 0.0268

Spreadsheet

Using a spreadsheet's statistics tools, provide the usual descriptive summaries:
Sample mean
Sample variance and standard deviation
Median
Quartiles
A histogram describing the data
Optional: a boxplot describing the data

"By Hand"

Use the following summaries
Sum of data: 5.7415
Sum of squares: 2.2102
Count: 21
to find the mean and the standard deviation. Use the sorted data to determine the median and the quartiles.

Note: The data extracted from the CSV file has many more significant digits than the data and summaries reported here, so your "by hand" results should differ slightly from the spreadsheet output.

Notes on Quartiles

Quartiles are defined in many ways for not particularly relevant reasons, and your spreadsheet will offer a default choice if you use its QUARTILE function.
Please compare it to the "naive" result in the second part. Also, if you use QUARTILE.EXC, you will probably see different numbers. As discussed in the Supplemental Lectures, quartiles (and the median) for data sets are not uniquely defined as a specific number, but rather, in almost all cases, could be properly defined as intervals - except it's almost never done that way. The choice of a specific number within those intervals should not make any difference (if it did, these summaries would probably be useless for getting a feeling for the data set). Whichever rule you choose to pick a number within the appropriate intervals (the midpoint, or any other) will do the job. Of course, if the "rule" points to a single data point, rather than a "bracket", that point would be the answer.

2. Regression

Download into a spreadsheet the pair of data sets available in the file hw1rg.csv. Listed here are only the first pair, so we can agree on which one is the "first" ("X", explanatory variable) and which is the "second" ("Y" data set, response variable).

For each, find the equation of the regression line, the correlation, R², and the residuals. Draw a scatterplot of the regression data, and of the residuals vs. the "X" variable. Does the linear model look reasonable?

X 0.84013
Y -2.664

This assignment is to be worked on a spreadsheet. Most of the data to be worked on can be downloaded as two CSV files (a first question is entirely contained in the assignment file). The first file contains a data set for which you will compute some descriptive statistics. The second CSV file contains a data set to be analyzed for regression and correlation, including data and residual scatterplots. Please note that all values come with a lot of decimal places. Depending on how your spreadsheet opens the files, some entries might look like "#####", as they don't fit in the cell size. If that happens, widen the column until the data displays correctly.
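For readers who want to see what the spreadsheet's regression tools compute in the Regression question, here is a minimal Python sketch using the standard least-squares formulas. It is not the assignment's method (the assignment asks for a spreadsheet). The function name fit_line is hypothetical, and the data are only the first five pairs of the 1st data set listed further down, read as alternating X, Y values, which is an assumption about the flattened listing.

```python
import math

# First five (X, Y) pairs of the 1st data set, read as alternating X, Y values.
# Assumption: this pairing matches the original two-column layout.
x = [1.2817007580861017, 2.376420365944906, 1.1430520557315158,
     2.5279376196992893, 0.3706438137008703]
y = [3.0597973489405845, 8.470453664107234, 5.130411840828759,
     10.099898424968508, 2.1811696427517964]


def fit_line(x, y):
    """Return (slope, intercept, r) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation
    return slope, intercept, r


slope, intercept, r = fit_line(x, y)
r_squared = r ** 2
# Residual = observed Y minus the value predicted by the fitted line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"y = {intercept:.4f} + {slope:.4f} x, r = {r:.4f}, R^2 = {r_squared:.4f}")
print("residuals:", [round(e, 4) for e in residuals])
```

Plotting the residuals against x, as the question asks, is the usual check on the linear model: a curved or fanning pattern suggests the straight line is not a good description.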
You will need to use your spreadsheet for this assignment. This makes the solution tools a "black box" (if you would like to see where the regression numbers come from, this file summarizes the technique). All data sets are simulated, which allows us to work with data that has a known relation to theoretical models. As you know, linear regression models are supposed to work with normal data, and many descriptive statistics tools are also closely tied to normal models. Looking at your results, what do you think about the appropriateness of the tools you used?

0.04363738914431711 0.06795000502983731 0.13643510180289242 0.23779799548759892 0.40066527259163304 0.9015514426012862 0.9335852176237341 0.9567433976816082 1.0123019287479766 1.134059869806964 1.3216439022859263 1.4418534805150827 1.5550276204810307 2.205730267186768 2.97017111134899 4.295794719451372 6.348799156061714 6.589062277393796 7.663569706649609 10.152895275039718

1st Data Set "X" "y"
1.2817007580861017 3.0597973489405845 2.376420365944906 8.470453664107234 1.1430520557315158 5.130411840828759 2.5279376196992893 10.099898424968508 0.3706438137008703 2.1811696427517964 2.959539544774655 11.362072274164017 2.1889962093387356 7.7133101850327614 1.600063809150629 6.394981585200603 0.18720187822811907 2.4961552684736072 2.5507085597056287 10.209254533406241 2.5531874122017273 9.11863609963924 1.5266606493550596 6.105852870396662 2.1858079142261433 8.094174021733737 1.6860545458918421 7.1609840010571535 1.6074066771243272 6.700339839245226 0.6545422641464371 3.3224091024931623 1.7866979232982245 8.283569406783386 0.9697379102353828 4.506592356667577 1.3861601201231428 5.77233535584033 8.536753564897758 3.6264757386895754 12.133267545652172 2.33273802900899 1.254581485064353 5.550740338210396 2.274971975431862 7.475398301044472 1.1870475640999327 4.531852342783179 0.524840878909667 4.613427106302201 2.6870009767885796 10.926537613968403 2.0249072100424015 6.07601655688056 2.085277201385929 8.051819184034017
1.7159116167514785 6.898197920347267 2.4488566844034367 7.335798533610307 1.8914100309895967 7.006132563076974 1.888897644379434 7.388854655259849 0.5517429650585941 4.039522774177144

2nd Data Set
"X" 0.5359984460968357 0.9786281632642645 0.03496490225612253 1.7634732785487395 4.6868985569442465 1.087519159352847 0.87303410934061 1.505400863825309 0.8411937523252973 -0.3086601282325898 -0.2991316389608016 3.0985611301045854 2.714064758522485
"Y" -0.8320134502440834 3.4493974579270588 73.11589567275712 0.3859873525978551 -0.07176557925012905 -1.0003666060541945 0.5533178083379825 -0.8144599852537133 0.3546853040982446 -0.953497361311398 21.20890933811937 0.9177138238230635 -0.2130915207843057 1.4491385649548576 2.874500680838455 2.8474292378004527 34.26760773922827 0.2744411896960006 -0.9782504822974398 1.3174664527647149 1.9603191146505723 2.0140657166579614 10.059287359310044 1.6339795645711006 3.706554501828173 -0.12683279558854066 -1.002035720700646 0.5145437699968703 0.35388938193593245 -0.9528176641574884 1.218445861930585 0.91359327119902 -0.156109510836771 0.8250942538761956 -0.381027708958766 1.7994181510682272 6.251725817411769 0.6376123669966351 -0.7317346223074606 1.284743711554868 1.5529042811464948 1.1247898984545943 1.1689205865436842 0.8071043294336717 -0.8531292723618508 1.4658605484408405 2.828464405949893 1.0954892826254223 3.5541443854693955 0.9900939273608875 0.009796675406737128 1.854428645570465 5.897620311162609
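The "By Hand" part of the One Sample question can be sketched in Python as a cross-check on the spreadsheet. This is an illustrative sketch under stated assumptions, not the course's official solution. It applies the usual shortcut formula for the sample variance to the quoted sums, reads the "naive" quartiles as the medians of the halves below and above the median (an assumed interpretation), and compares them with the "inclusive" and "exclusive" methods of Python's statistics.quantiles, which are assumed here to play the role of the spreadsheet's two quartile variants.

```python
import math
import statistics

# Summaries quoted in the "By Hand" part of the assignment.
n = 21
sum_x = 5.7415
sum_sq = 2.2102

mean = sum_x / n
# Shortcut formula for the sample variance: (sum of squares - n * mean^2) / (n - 1).
variance = (sum_sq - n * mean ** 2) / (n - 1)
sd = math.sqrt(variance)

# The sorted sample from the assignment, rearranged in ascending order.
data = sorted([
    0.5751, 0.5713, 0.5182, 0.4759, 0.4679, 0.4280, 0.3843,
    0.3734, 0.3100, 0.2319, 0.2293, 0.1777, 0.1776, 0.1740,
    0.1678, 0.1163, 0.1152, 0.1019, 0.0795, 0.0397, 0.0268,
])

# With 21 values the median is the 11th value in sorted order.
median = statistics.median(data)

# "Naive" quartiles (assumed reading): medians of the 10 values below
# and the 10 values above the median, with the median itself excluded.
lower, upper = data[: n // 2], data[n // 2 + 1:]
q1_naive = statistics.median(lower)
q3_naive = statistics.median(upper)

# Two other common conventions; each call returns [Q1, median, Q3].
q_inc = statistics.quantiles(data, n=4, method="inclusive")
q_exc = statistics.quantiles(data, n=4, method="exclusive")

print(f"mean = {mean:.4f}, variance = {variance:.4f}, sd = {sd:.4f}")
print(f"median = {median}, naive Q1 = {q1_naive:.5f}, naive Q3 = {q3_naive:.5f}")
print("inclusive:", q_inc)
print("exclusive:", q_exc)
```

Whichever convention is used, each quartile should land between the same pair of neighbouring data points, which is the point made in the Notes on Quartiles.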