Statistics And Analysis Of Scientific Data 2nd Edition Massimiliano Bonamente - Solutions
3.3 A coin is tossed ten times. Find (a) the probability of obtaining 5 heads up and 5 tails up; (b) the probability of having the first 5 tosses show heads up, and the final 5 tosses show tails up; (c) the probability of obtaining at least 7 heads up.
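A minimal sketch of the three quantities asked for in Problem 3.3, assuming a fair coin (p = 0.5, which the problem implies but does not state) and independent tosses. The helper name `binomial_pmf` is illustrative and does not come from the textbook.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # assumption: fair coin

# (a) exactly 5 heads and 5 tails, in any order
prob_five_heads = binomial_pmf(5, n, p)

# (b) one specific ordered sequence (HHHHHTTTTT): no binomial coefficient
prob_specific_sequence = p**5 * (1 - p) ** 5

# (c) at least 7 heads: sum the upper tail k = 7..10
prob_at_least_seven = sum(binomial_pmf(k, n, p) for k in range(7, n + 1))

print(prob_five_heads, prob_specific_sequence, prob_at_least_seven)
```

Parts (a) and (b) differ only by the binomial coefficient: (a) counts every ordering of 5 heads and 5 tails, while (b) fixes a single ordering.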
3.4 In a given course, it is known that 7.3% of students fail. (a) What is the expected number of failures in a class of 32 students? (b) What is the probability that 5 or more students fail?
3.5 The frequency of twins in the European population is about 12 in every 1000 maternities. Calculate the probability that there are no twins in 200 births, using (a) the binomial distribution, and (b) the Poisson distribution.
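The comparison in Problem 3.5 can be sketched numerically as follows, assuming each birth is an independent trial with twin probability 12/1000. The variable names are illustrative only.

```python
import math

p_twins = 12 / 1000  # probability of twins per maternity, from the problem
n_births = 200

# (a) binomial: probability of zero successes in n trials is (1 - p)^n
p_none_binomial = (1 - p_twins) ** n_births

# (b) Poisson with mean n * p: probability of zero events is exp(-mean)
poisson_mean = n_births * p_twins
p_none_poisson = math.exp(-poisson_mean)

print(p_none_binomial, p_none_poisson)
```

The two values are close because p is small and n is fairly large, which is the regime where the Poisson distribution approximates the binomial.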
3.6 Given the distribution of a Poisson variable N, $P(n) = \frac{\mu^n}{n!} e^{-\mu}$, show that the mean is given by $\mu$ and that the variance is also given by $\mu$.
3.7 Consider Mendel’s experiment of Table 1.1 at page 9 and refer to the “Long vs. short stem” data. (a) Determine the parent distribution for the number of dominants. (b) Calculate the uncertainty in the measurement of the number of plants that display the dominant character. (c) Determine the
3.8 For Mendel’s experimental data in Table 1.1 at page 9, consider the overall fraction of plants that display the dominant character, for all seven experiments combined. (a) Determine the parent distribution of the overall fraction X of plants with dominant character and its expected value. (b)
4.1 Consider the data from Thomson’s experiment of Tube 1, from page 23. (a) Calculate the mean and standard deviation of the measurements of v. (b) Use the results from Problem 2.3, in which the mean and standard deviation of W/Q and I were calculated, to calculate the approximate values of mean
4.2 Calculate the mean, variance, and moment generating function $M(t)$ for a uniform random variable in the range 0–1.
4.3 Consider two uniform independent random variables X, Y in the range −1 to 1. (a) Determine the distribution function, mean and variance, and the moment generating function of the variables. (b) We speculate that the sum of the two random variables is distributed like a “triangular”
4.4 Using a computer language of your choice, simulate the sum of $N = 100$ uniform variables in the range 0–1, and show that the sampling distribution of the sum of the variables is approximately described by a Gaussian distribution with mean equal to the mean of the N uniform variables and
4.5 Consider the J.J. Thomson experiment of page 23. (a) Calculate the sample mean and the standard deviation of m/e for Tube 1. (b) Calculate the approximate mean and standard deviation of m/e from the mean and standard deviation of W/Q and I, according to the equation $\frac{m}{e} = \frac{I^2}{2} \frac{Q}{W}$. Assume that W/Q
4.6 Use the data provided in Example 4.11. Calculate the probability of a positive detection of source counts S in the first time period (where there are $N_1 = 50$ total counts and $B = 20$ background counts), and the probability that the source emitted 10 source counts. You will need to assume that
4.7 Consider the data in the Thomson experiment for Tube 1 and the fact that the variables W/Q and I are related to the variable v via the relationship $v = \frac{2W}{QI}$. Calculate the sample mean and variance of v from the direct measurements of this variable, and then using the measurements of W/Q and I
4.8 Provide a general expression for the error propagation formula when three independent random variables are present, to generalize (4.24) that is valid for two variables.
5.1 Using the definition of weighted sample mean as in (5.8), derive its variance and show that it is given by (5.9).
5.2 Using the data from Mendel’s experiment (Table 1.1), calculate the standard deviation in the measurement of each of the seven fractions of dominants, and the weighted mean and standard deviation of the seven fractions. Compare your result from a direct calculation of the overall fraction of
5.3 The Mendel experiment of Table 1.1 can be described as a number n of measurements of $n_i$, the number of plants that display the dominant character, out of a total of $N_i$ plants. The experiment is described by a binomial distribution with probability $p = 0.75$ for the plant to display the dominant
5.4 Consider a decaying radioactive source observed in a time interval of duration $T = 15$ s; N is the number of total counts, and B is the number of background counts (assumed to be measured independently of the total counts): $N = 19$ counts, $B = 14$ counts. The goal is to determine the probability of
5.5 For the Thomson experiment of Table 2.1 (tube 1) and Table 2.2 (tube 2), calculate: (a) the 90% central confidence intervals for the variable v; (b) the 90% upper and lower limits, assuming that the variable is Gaussian.
5.6 Consider a Poisson variable X of mean $\mu$. (a) We want to set 90% confidence upper limits to the value of the parent mean, assuming that one measurement of the variable yielded the result of $N = 1$. Following the classical approach, find the equation that determines the exact 90% upper limit to the
5.7 The data provided in Table 2.3 from Pearson’s experiment on biometric data describes the cumulative distribution function of heights from a sample of 1,079 couples. Calculate the $2\sigma$ upper limit to the fraction of couples in which both mother and father are taller than 68 in.
5.8 Use the data presented in Example 5.7, in which there is a non-detection of a source in the presence of a background of $B \simeq 9.8$. Determine the Poisson upper limit to the source count at the 99% confidence level and compare this upper limit with that obtained in the case of a zero background
6.1 Calculate the linear average and the weighted mean of the quantity “Ratio” in Table 6.1.
6.2 Consider the 25 measurements of “Ratio” in Table 6.1. Assume that an additional uncertainty of ±0.1 is to be added linearly to the statistical error of each measurement reported in the table. Show that the addition of this source of uncertainty results in a weighted mean of 0.95 ± 0.04.
6.3 Given two measurements x1 and x2 with values in the neighborhood of 1.0, show that the logarithm of the average of the measurements is approximately equal to the average of the logarithms of the measurements.
6.4 Given two measurements x1 and x2 with values in the neighborhood of a positive number A, show that the logarithm of the average of the measurements is approximately equal to the average of the logarithms of the measurements.
6.5 For the data in Table 6.1, calculate the linear average, weighted average and median of each quantity (Radius, Energy Method 1, Energy Method 2 and Ratio). You may assume that the error of each measurement is the average of the asymmetric errors of each measurement reported in the table.
6.6 Table 6.1 contains the measurement of the thermal energy of certain sources using two independent methods labeled as method #1 and method #2. For each source, the measurement is made at a given radius, which varies from source to source. The error bars indicate the 68%, or $1\sigma$, confidence
7.1 Five students score 70, 75, 65, 70, and 65 on a test. Determine whether the scores are compatible with the following hypotheses: (a) the mean is $\mu = 75$; (b) the mean is $\mu = 75$ and the standard deviation is $\sigma = 5$. Test both hypotheses at the 95% or 68% confidence levels, assuming that the scores
7.2 Prove that the mean and variance of the F distribution are given by the following relationships,
7.3 Using the same data as Problem (7.1), test whether the sample variance is consistent with a parent variance of $\sigma^2 = 25$, at the 95% level.
7.4 Using the J.J. Thomson experiment data of page 23, measure the ratio of the sample variances of the m/e measurements in Air for Tube 1 and Tube 2. Determine if the null hypothesis that the two measurements are drawn from the same distribution can be rejected at the 90% confidence level. State
7.5 Consider a dataset (10, 12, 15, 11, 13, 16, 12, 10, 18, 13), and calculate the ratio of the sample variance of the first two measurements with that of the last eight. In particular, determine at what confidence level for the null hypothesis both subsets are consistent with the same variance.
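The first step of Problem 7.5, the ratio of the two sample variances, can be sketched as below. This covers only the ratio: turning it into a confidence level requires the cumulative F distribution with (1, 7) degrees of freedom, which the Python standard library does not provide, so that step is left to a statistics table or a numerical library. Variable names are illustrative.

```python
from statistics import variance

# Dataset from the problem statement
data = [10.0, 12.0, 15.0, 11.0, 13.0, 16.0, 12.0, 10.0, 18.0, 13.0]

first_two = data[:2]
last_eight = data[2:]

# statistics.variance is the unbiased sample variance (divides by n - 1)
var_first = variance(first_two)
var_last = variance(last_eight)

# Ratio of sample variances, distributed as F with (1, 7) degrees of
# freedom under the null hypothesis of a common parent variance
f_ratio = var_first / var_last

print(var_first, var_last, f_ratio)
```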
7.6 Six measurements of the length of a wooden block gave the following measurements: 20.3, 20.4, 19.8, 20.4, 19.9, and 20.7 cm. (a) Estimate the mean and the standard error of the length of the block; (b) Assume that the block is known to be of length 20 cm. Establish if the measurements are
7.7 Consider Mendel’s experimental data in Table 1.1 shown at page 9. (a) Consider the data that pertain to the case of “Long vs. short stem.” Write an expression for the probability of making that measurement, assuming Mendel’s hypothesis of independent assortment. You do not need to
7.8 Consider Mendel’s experimental data in Table 1.1 shown at page 9. Considering all seven measurements, calculate the probability that the mean fraction of dominant characters agrees with the expectation of 0.75. For this purpose, you may use the t statistic.
7.9 Starting with (7.36), complete the derivation of (7.34).
7.10 Show that the t distribution, $f_T(t) = \frac{1}{\sqrt{f\pi}} \frac{\Gamma((f+1)/2)}{\Gamma(f/2)} \left(1 + \frac{t^2}{f}\right)^{-\frac{1}{2}(f+1)}$, becomes a standard Gaussian in the limit of large f. You can make use of the asymptotic expansion of the Gamma function (A.17).
8.1 Consider the data from Hubble’s experiment in Table 8.1. (a) Determine the best-fit values of the fit to a linear model for $(m, \log v)$ assuming that the dependent variables have a common value for the error. (b) Using the best-fit model determined above, estimate the error from the data and the
8.2 Consider the following two-dimensional data, in which X is the independent variable, and Y is the dependent variable assumed to be derived from a photon-counting experiment: $(x_i, y_i)$ = (0.0, 25), (1.0, 36), (2.0, 47), (3.0, 64), (4.0, 81). (a) Determine the errors associated with the dependent variables $Y_i$. (b) Find the
8.3 Consider the following Gaussian dataset in which the dependent variables are assumed to have the same unknown standard deviation $\sigma$: $(x_i, y_i)$ = (0.0, 0.0), (1.0, 1.5), (2.0, 1.5), (3.0, 2.5), (4.0, 4.5), (5.0, 5.0). The data are to be fit to a linear model. (a) Using the maximum likelihood method, find the analytic
8.4 In the case of a maximum likelihood fit to a 2-dimensional dataset with equal errors in the dependent variable, show that the conditions for having best-fit parameters $a = 0$ and $b = 1$ are
8.5 Show that the best-fit parameter b of a linear fit to a Gaussian dataset is insensitive to a change of all datapoints by the same amount $\Delta x$, or by the same amount $\Delta y$. You can show that this property applies in the case of equal errors in the dependent variable, although the same result applies
8.6 The background rate in a measuring apparatus is assumed to be constant with time. N measurements of the background are taken, of which N/2 result in a value of $y + \Delta$, and N/2 in a value $y - \Delta$. Determine the sample variance of the background rate.
8.7 Find an analytic solution for the best-fit parameters of a linear model to the following Poisson dataset: $(x, y)$ = (−2, 1), (−1, 0), (0, 1), (1, 0), (2, 2).
8.8 Use the data provided in Table 6.1 to calculate the best-fit parameters a and b for the fit to the radius vs. pressure ratio data, and the minimum $\chi^2$. For the fit, you can assume that the radius is known exactly, and that the standard deviation of the pressure ratio is obtained as a linear
8.9 Show that, when all measurement errors are identical, the least squares estimators of the linear parameters a and b are given by $b = \mathrm{Cov}(X, Y)/\mathrm{Var}(X)$ and $a = E(Y) - b\,E(X)$.
9.1 Calculate the best-fit parameters and uncertainties for the multi-variable regression of the Iris setosa data of Fig. 9.1.
9.2 Use an F test to determine whether the multi-variable regression of the Iris setosa data is justified or not.
9.3 Prove that (9.5) and (9.10) are equivalent. Take into consideration that in (9.5) the vectors a and $\beta$ are row vectors. You may re-write (9.5) using column vectors.
9.4 Prove that the coefficient of determination $R^2$ for the simple linear regression $y = a + bx$ is equivalent to the sample correlation coefficient of (2.20).
9.5 Fit the Iris setosa data using the function $y = a + bx$, where Y is the Sepal Length and X the Sepal Width. For this fit, you will ignore the data associated with the petal. Determine the best-fit parameters of the linear model and their errors.
9.6 Using the results of Problem 9.5, determine whether there is sufficient evidence for the use of the simple $y = a + bx$ model for the data. Use a confidence level of 99% to draw your conclusions.
9.7 Prove (9.17).
10.1 Use the same data as in Problem 8.2 to answer the following questions. (a) Plot the 2-dimensional confidence contours at 68 and 90% significance, by sampling the (a, b) parameter space in a suitable interval around the best-fit values. (b) Using a suitable 2-dimensional confidence contour,
10.2 Find the minimum $\chi^2$ of the linear fit to the radius vs. ratio data of Table 6.1 and the number of degrees of freedom of the fit. Determine if the null hypothesis can be rejected at the 99% confidence level.
10.3 Consider a simple dataset with the following measurements, assumed to be derived from a counting process: $(x, y)$ = (0, 1), (1, 1), (2, 1). Show that the best-fit value of the parameter a for the model $y = e^{ax}$ is $a = 0$ and derive its 68% confidence interval.
10.4 Consider the same dataset as in Problem 10.3 but assume that the y measurements are Gaussian, with variances equal to the measurements. Show that the confidence interval of the best-fit parameter $a = 0$ is given by $\sigma_a = \sqrt{1/5}$.
10.5 Consider the same dataset as in Problem 10.3 but assume a constant fit function, $y = a$. Show that the best-fit is given by $a = 1$ and that the 68% confidence interval corresponds to a standard deviation of $\sqrt{1/3}$.
10.6 Consider the biometric data in Pearson’s experiment (page 30). Calculate the average father height (X variable) for each value of the mother’s height (Y variable), and the average mother height for each value of the father’s height. Using these two averaged datasets, perform a linear
10.7 Calculate the linear correlation coefficient for the data of Hubble’s experiment (logarithm of velocity, and magnitude m), page 157. Determine whether the hypothesis of uncorrelation between the two quantities can be rejected at the 99% confidence level.
10.8 Use the data from Table 6.1 for the radius vs. ratio, assuming that the radius is the independent variable with no error. Draw the 68 and 90% confidence contours on the two fit parameters a and b, and calculate the 68% confidence interval on the b parameter.
11.1 Fit the data from Table 6.1 for the radius vs. ratio using a linear model and calculate the intrinsic scatter using the best-fit linear model.
11.2 Using the same data as in Problem 11.1, provide an additional estimate of the intrinsic scatter using the $\chi^2_{\mathrm{red}} \simeq 1$ method.
11.3 Justify the $1/(N - m)$ and $1/(N - 1)$ coefficients in (11.3) and (11.4).
11.4 Using the data for the Hubble measurements of page 157, assume that each measurement of log v has an uncertainty of $\sigma = 0.01$. Estimate the intrinsic scatter in the linear regression of log v vs. m.
11.5 Using the data of Problem 8.2, estimate the intrinsic scatter in the linear fit of the X, Y data.
12.1 Use the bivariate error data of Energy 1 and Energy 2 from Table 6.1. Calculate the best-fit parameters and errors of the linear model Y/X, where X is Energy 1 and Y is Energy 2.
12.2 Use the bivariate error data of Energy 1 and Energy 2 from Table 6.1. Calculate the best-fit parameters and errors of the linear model X/Y, where X is Energy 1 and Y is Energy 2.
12.3 For the Energy 1 and Energy 2 data of Table 6.1, use the results of Problems 12.1 and 12.2 to calculate the bisector model to the Energy 1 vs. Energy 2 data.
12.4 Repeat Problem 12.1 for the Ratio vs. Radius data of Table 6.1.
12.5 Repeat Problem 12.2 for the Ratio vs. Radius data of Table 6.1.
12.6 Repeat Problem 12.3 for the Ratio vs. Radius data of Table 6.1.
13.1 Using the data from Thomson’s experiment at page 23, determine the values of the Kolmogorov–Smirnov statistic $D_N$ for the measurement of Tube #1 and Tube #2, when compared with a Gaussian model for the measurement with $\mu = 5.7$ and $\sigma^2 = 1$. Determine at what confidence level you can reject the
13.2 Using the data from Thomson’s experiment at page 23, determine the values of the two-sample Kolmogorov–Smirnov statistic $D_{NM}$ for comparison between the two measurements. Determine at what confidence level you can reject the hypothesis that the two measurements are consistent with one another.
13.3 Using the data of Table 10.1, determine whether the hypothesis that the last three measurements are described by a simple constant model can be rejected at the 99% confidence level.
13.4 A given dataset with $N = 5$ points is fit to a linear model, for a fit statistic of $\chi^2_{\min}$. When adding an additional nested parameter to the fit, $p = 1$, determine by how much the $\chi^2_{\min}$ should be reduced for the additional parameter to be significant at the 90% confidence level.
13.5 A dataset is fit to model 1, with minimum $\chi^2$ fit statistic of $\chi^2_1 = 10$ for 5 degrees of freedom; the same dataset is also fit to another model, with $\chi^2_2 = 5$ for 4 degrees of freedom. Determine which model is acceptable at the 90% confidence, and whether the F test can be used to choose one
13.6 A dataset of size N is successfully fit with a model, to give a fit statistic $\chi^2_{\min}$. A model with a nested component with 1 additional independent parameter for a total of m parameters is then fit to $\chi^2_{\min}$, providing a reduction in the fit statistic of $\Delta\chi^2$. Determine what is the minimum $\Delta\chi^2$ that,
14.1 Calculate how many synthetic bootstrap datasets can be generated at random from a dataset Z with N unique datapoints. Notice that the order in which the datapoints appear in the dataset is irrelevant.
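One way to approach Problem 14.1 is to note that, when order is irrelevant, a bootstrap dataset is a multiset of size N drawn with replacement from N distinct points, which suggests the count $\binom{2N-1}{N}$. The sketch below checks that counting argument by brute-force enumeration for small N. The function names are illustrative and not taken from the textbook.

```python
from itertools import combinations_with_replacement
from math import comb


def count_bootstrap_datasets(n):
    """Closed-form count of size-n multisets drawn from n distinct points."""
    return comb(2 * n - 1, n)


def enumerate_bootstrap_datasets(n):
    """Brute-force count: enumerate every unordered draw with replacement."""
    return sum(1 for _ in combinations_with_replacement(range(n), n))


# Compare the closed form against direct enumeration for small datasets
for n in range(1, 7):
    assert count_bootstrap_datasets(n) == enumerate_bootstrap_datasets(n)

print([count_bootstrap_datasets(n) for n in range(1, 7)])
```

Enumeration is only practical for small N because the count grows very quickly, so the closed form is the useful result for datasets of realistic size.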
14.2 For a bootstrap dataset $Z_j$ constructed from a set Z of N independent measurements of a variable X, show that the covariance between the numbers of occurrences $n_{ji}$ and $n_{jk}$ is given by (14.17), $\sigma^2_{ik} = -\frac{1}{N}$.
14.3 Perform a numerical simulation of the number $\pi$, and determine how many samples are sufficient to achieve a precision of 0.1%. The first six significant digits of the number are $\pi = 3.14159$.
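A minimal sketch of one common approach to Problem 14.3: draw points uniformly in the unit square and count the fraction that falls inside the quarter circle of radius 1, which estimates $\pi/4$. Two assumptions are made here that the problem does not state: "precision" is read as a one-standard-deviation relative error, and the sample-size formula uses the known value of $\pi$ to evaluate the binomial variance. Function names and the seed are illustrative.

```python
import math
import random


def estimate_pi(n_samples, seed=0):
    """Hit-or-miss Monte Carlo estimate of pi from the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


def samples_for_relative_precision(rel_precision):
    """Samples needed for a 1-sigma relative error of rel_precision.

    The hit count is binomial with p = pi/4, so the relative standard
    deviation of the estimate is sqrt((1 - p) / (p * n)).
    """
    p = math.pi / 4.0
    return math.ceil((1.0 - p) / (p * rel_precision**2))


n_needed = samples_for_relative_precision(1e-3)
pi_estimate = estimate_pi(n_needed)
print(n_needed, pi_estimate)
```

Under these assumptions the required sample size scales as the inverse square of the target precision, so each extra significant digit costs roughly a factor of 100 in samples.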
14.4 Perform a bootstrap simulation on the Hubble data presented in Fig. 14.3, and find the 68% central confidence ranges on the parameters a and b.
14.5 Using the data of Problem 8.2, run a bootstrap simulation with $N = 1000$ iterations for the fit to a linear model. After completion of the simulation, plot the sample probability distribution function of the parameters a and b, and find the median and 68% confidence intervals on the fit
14.6 Use the data of Problem 8.2, but assuming that the errors in the dependent variable y are unknown. Run a bootstrap simulation with $N = 1000$ iterations, and determine the median and 68% confidence intervals on the parameters a and b to the fit to a linear model.
14.7 Using the data of Problem 8.2, assuming that the errors in the dependent variable y are unknown, estimate the values of a and b to the fit to a linear model using a jackknife method.
14.8 Given two uniform random variables $U_1$ and $U_2$ between $-R$ and $+R$, as often available in common programming software, provide an analytic expression to simulate a Gaussian variable of mean $\mu$ and variance $\sigma^2$.
15.1 Consider the Markov chain for the Ehrenfest chain described in Example 15.4. Show that the stationary distribution is the binomial with $p = q = 1/2$.
15.2 Show that the random walk with $p = q = 1/2$ (15.10) returns to the origin infinitely often, and therefore the origin is a recurrent state of the chain.
15.3 For the random walk with $p \neq q$, show that the origin is a transient state.
15.4 Assume that the diffusion model of Example 15.2 is modified in such a way that at each time step one has the option to choose one box at random from which to replace a ball to the other box. (a) Determine the transition probabilities $p_{ij}$ for this process. (b) Determine whether this process is a
15.5 Using the model of diffusion of Problem 15.4, determine if the binomial distribution with $p = q = 1/2$ is the stationary distribution.
16.1 Prove that, in the presence of positive correlation among MCMC samples, the variance of the sample mean is larger than that of an independent chain.
16.2 Using the data of log m and velocity from Table 8.1 of Hubble’s experiment, construct a Monte Carlo Markov chain for the fit to a linear model with 10,000 iterations. Use uniform distributions for the prior and proposal distributions of the two model parameters a and b, the latter with widths
16.3 A one-parameter chain is constructed such that in two intervals A and B the following values are accepted into the chain: A: 10, 11, 13, 11, 10; B: 7, 8, 1, 11, 10, 8; where A is an initial interval, and B an interval at the end of the chain. Not knowing how the chain was constructed, use the
16.4 Using the data of Table 10.1, construct a Monte Carlo Markov chain for the parameters of the linear model, with 10,000 iterations. Use uniform distributions for the prior and proposal distributions, the latter with a width of 10 for both parameters. Start the chain at $a = 12$ and $b = 6$. After
16.5 Consider the following portions of two one-parameter chains, run in parallel and starting from different initial positions: 7, 8, 1, 11, 10, 8 and 11, 11, 8, 10, 9, 12. Using two segments of length $b = 3$, calculate the Gelman–Rubin statistic $\sqrt{\hat{R}}$ for both segments under the hypothesis of
16.6 Consider the step-function model described in Example 16.2, and a dataset consisting of n measurements. Assuming that the priors on the parameters $\lambda$, $\mu$ and m are uniform, show that the full conditional distributions are given by
16.7 Consider the step-function model described in Example 16.2, and a dataset consisting of the following five measurements: 0, 1, 3, 4, 2. Start a Metropolis–Hastings MCMC at $\lambda = 0$, $\mu = 2$ and $m = 1$, and use uniform priors on all three parameters. Assume for simplicity that all parameters can only
16.8 Consider a Monte Carlo Markov chain constructed with a Metropolis–Hastings algorithm, using uniform prior and proposal distribution. At a given iteration, the chain is at the point of maximum likelihood or, equivalently, minimum $\chi^2$. Calculate the probability of acceptance of a candidate that