Probability Theory and Statistical Inference, 2nd Edition, Aris Spanos - Solutions
Discuss the following syllogism by a gambler betting on red or black in roulette: “For the last 6 times in a row the ball stopped in a red; if the WLLN is valid, it means that the probability that the next one will be black must be greater than 1/2.”
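A useful starting point (a standard observation, not part of the exercise): under the IID Bernoulli assumption that the WLLN presupposes, the trials are independent, so
\[ \Pr(X_7 = \text{black} \mid X_1 = \cdots = X_6 = \text{red}) = \Pr(X_7 = \text{black}), \]
i.e. the run of reds leaves the probability of black on the next spin unchanged.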
“The law of large numbers and the central limit theorem hold for stochastic processes for which we need to postulate restrictions of three types: (a) distribution, (b) dependence, and (c) homogeneity.” Discuss.
“Poisson’s WLLN postulates complete heterogeneity for the Bernoulli random variables involved but implicitly assumes asymptotic homogeneity.” Discuss.
Explain Borel’s strong law of large numbers and discuss why: (a) it gives rise to “potential” learning from data; (b) the SLLN does not imply that lim_{n→∞} x̄ₙ = p; (c) it does not imply that for a very large n, say n = 10⁶, x̄ₙ ≈ p. (d) Explain why X̄ₙ = (1/n)∑_{k=1}^{n} Xₖ, although a consistent estimator
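For reference, Borel's SLLN for an IID Bernoulli(p) sequence {Xₖ, k ∈ N} asserts convergence of the sample mean to p with probability one:
\[ \Pr\Bigl(\lim_{n\to\infty} \bar{X}_n = p\Bigr) = 1, \qquad \bar{X}_n = \frac{1}{n}\sum_{k=1}^{n} X_k . \]
The qualifier "with probability one" is exactly what parts (b) and (c) turn on: the SLLN is a statement about the probability of the limit event, not a deterministic guarantee about any particular realization.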
How does the law of large numbers relate to the central limit theorem?
Explain the conclusion of Bernstein’s WLLN and discuss which assumptions are crucial for the validity of the conclusion.
Compare and contrast Bernoulli’s WLLN with that of Bernstein.
Explain how the conditions underlying Lyapunov’s CLT ensure that no one random variable in the sequence dominates the summation.
(a) Consider the IID sequence {Xₖ, k ∈ N} with E(Xₖ) = 0, Var(Xₖ) = σ². Define the sequence {Yₖ, k ∈ N}, where Yₖ = a + bXₖ, k = 1, 2, . . . Do the SLLN, WLLN, and CLT hold for the sequence {Yₖ, k ∈ N}? Explain your answer. (b) Consider the stochastic process {Xₖ, k ∈ N} that satisfies
Discuss the relationship between the Lindeberg and Feller conditions and their connection with the CLT.
Discuss the relationship between the Lindeberg and uniform asymptotic negligibility conditions.
Convergence in probability implies convergence in distribution but more stringent conditions are needed for the CLT than those for the LLN. Explain why.
Explain how the CLT can be extended beyond the scaled summations.
Explain how the FCLT improves upon the classical CLT.
Compare and contrast the classical CLT and FCLT in the case of second-order stationary processes.
Explain intuitively why “converges in probability” is a stronger mode of convergence than “converges in distribution.”
Explain intuitively why “converges almost surely” is a stronger mode of convergence than “converges in probability.”
Compare and contrast convergence almost surely and rth-order convergence.
“For modeling purposes specific distribution assumptions are indispensable as suggested by the Berry–Esseen result.” Discuss.
(a) Compare and contrast Lindeberg’s CLT (Table 9.19) with that for second-order martingale difference processes (Table 9.22). (b) Compare and contrast Chebyshev’s “near” CLT, which was invalid, with the valid CLT in Table 9.21. (c) Explain why the CLT results for second-order stationary
Explain why the interpretation of mathematical probability plays a crucial role in determining the type and nature of statistical modeling and inference.
(a) Compare and contrast the degrees of belief and model-based frequentist interpretations of probability. (b) Explain why the criticism that the model-based frequentist interpretation of probability is of very limited scope because it is only applicable in the case of IID data is misplaced. (c)
(a) Explain briefly the difference between the model-based and von Mises frequentist interpretations of probability. (b) Using your answer in (a), explain why (i) the circularity charge, (ii) the long-run metaphor, and (iii) the single event probability charge are misplaced when leveled against the
(a) Discuss the common features of the model-based frequentist interpretation of probability and Kolmogorov’s complexity interpretation of probability. (b) Discuss the relationship between the model-based frequentist interpretation of probability and the propensity interpretation of probability.
Explain why the following assignment of subjective probabilities to two independent events A and B is not coherent: Pr(A) = .5, Pr(B) = .7, Pr(A ∩ B) = .2.
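The incoherence follows from simple arithmetic: independence requires the product rule to hold,
\[ \Pr(A \cap B) = \Pr(A)\,\Pr(B) = .5 \times .7 = .35 \neq .2, \]
so the three assignments cannot jointly satisfy the probability axioms.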
(a) Compare and contrast the subjective and logical (objective) “degrees of belief” interpretations of probability. (b) Briefly compare and contrast the frequentist and Bayesian approaches to statistical inference on the basis of the delimiting features in Table 10.6.
In the case of the simple Bernoulli model, explain the difference between frequentist inference based on the sampling distribution of an estimator of θ, say f(θ̂(x); θ), ∀x ∈ Rⁿ, and Bayesian inference based on the posterior distribution π(θ|x₀), ∀θ ∈ Θ.
(a) “. . . likelihoods are just as subjective as priors” (Kadane, 2011, p. 445). Discuss. (b) Discuss the following claims by Koop et al. (2007, p. 2): frequentists argue that situations not admitting repetition under essentially identical conditions are not within the realm of statistical
Compare and contrast Karl Pearson’s approach to statistics with that of R. A. Fisher, and explain why the former implicitly assumes that the data constitute a realization of an IID sample.
(a) Compare and contrast the following: (i) sample vs. sample realization, (ii) estimator vs. estimate, (iii) distribution of the sample vs. likelihood function. (b) Explain briefly why frequentist estimation (point and interval), testing, and prediction are primarily based on mappings between the
“The various limit theorems relating to the asymptotic behavior of the empirical cumulative distribution function, in conjunction with the validity of the probabilistic assumptions it invokes, bestow empirical content to the mathematical concept of a cdf F(x).” Explain and discuss.
For the random variable X, where E(X) = 0 and Var(X) = 1/3, derive an upper bound on the probability of the event {|X − .6| > .1}. How does this probability change if one knows that X ∼ U(−1, 1)?
For the random variable X, where E(X) = 0 and Var(X) = 1, derive an upper bound on the probability of the event {|X − .6| > .1}. How does this probability change if one knows that X ∼ N(0, 1)? How accurate is the following inequality: Pr(|X| ≥ ε) ≤ √(2/π) · (e^{−ε²/2}/ε), for ε > 0?
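Both of the preceding exercises turn on Chebyshev's inequality, which bounds tail probabilities using only the first two moments:
\[ \Pr\bigl(|X - E(X)| \geq \varepsilon\bigr) \leq \frac{\operatorname{Var}(X)}{\varepsilon^{2}}, \quad \text{for any } \varepsilon > 0 . \]
Knowing the actual distribution (uniform or Normal) replaces this crude, distribution-free bound with an exact tail probability, which is the comparison the questions ask for.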
(a) (i) In Example 10.25 with ε = .1, evaluate the required sample size n to ensure that (ii) Calculate the increase in n needed for the same upper bound (.02) with ε = .05. (iii) Repeat (i) and (ii) using the Normal approximation to the finite sample distribution and compare the results. (b)
(a) “For modeling purposes specific distribution assumptions are indispensable if we need precise and sharp results. Results based on bounded moment conditions are invariably imprecise and blunt.” Discuss. (b) “For a large enough sample size n, one does not need to worry about distributional
Explain briefly what we do when we construct an estimator. Why is an estimator a random variable?
“Defining the sampling distribution of an estimator is in theory trivial but technically very difficult.” Discuss.
Explain what the primary aim of an estimator is, and why its optimality can only be assessed via its sampling distribution.
For the Bernoulli statistical model (Table 11.1): (a) Discuss whether the following functions constitute possible estimators of θ; (b) for those that constitute estimators, derive their sampling distributions: (i) θ̂₁ = Xₙ, (ii) θ̂₂ = (X₁ − X₂), (iii) θ̂₃ = (X₁ − X₂ + Xₙ), (iv) θ̂₄ = X̄ₙ, (v) θ̂₅ = (n/(n+1)) X̄ₙ −
Explain briefly the properties of unbiasedness and efficiency of estimators.
“In assessing the optimality of an estimator we need to look at the first two moments of its sampling distribution only.” Discuss.
Explain briefly what a consistent estimator is. What is the easiest way to prove consistency for estimators with bounded second moments?
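A minimal sketch of that standard argument: if the estimator has bounded second moments and MSE(θ̂ₙ) → 0, Chebyshev's inequality delivers weak consistency directly,
\[ \Pr\bigl(|\hat{\theta}_n - \theta| \geq \varepsilon\bigr) \leq \frac{E(\hat{\theta}_n - \theta)^{2}}{\varepsilon^{2}} \xrightarrow[n \to \infty]{} 0 \quad \text{for every } \varepsilon > 0 . \]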
Explain briefly the difference between weak and strong consistency of estimators.
“Asymptotic Normality of an estimator is an extension of the central limit theorem for functions of the sample beyond the sample mean.” Discuss.
(a) Compare and contrast full efficiency with relative efficiency. (b) Explain the difference between full efficiency and asymptotic efficiency.
Discuss the key differences between finite sample and asymptotic properties of estimators, and why these differences matter for the reliability of inference with x0.
Explain the difference between the Cramer–Rao and Bhattacharya lower bounds.
(a) Explain the notion of sufficiency. (b) Explain the notion of a minimal sufficient statistic and how it relates to the best unbiased estimator.
(a) Discuss the difference between the following two definitions of the notions of bias and MSE: (i) Bias(θ̂) = E(θ̂) − θ, MSE(θ̂) = E(θ̂ − θ)², for all θ ∈ Θ; (ii) Bias(θ̂; θ*) = E(θ̂) − θ*, MSE(θ̂; θ*) = E(θ̂ − θ*)², where θ* is the true θ in Θ. Explain which one makes sense in the context of frequentist estimation and why the other one does not.
Consider the Normal (two-parameter) statistical model. (a) Derive (not guess!) the sampling distributions of the following estimators: (i) μ̂₁ = Xₙ, (ii) μ̂₂ = (1/3)(X₁ + X₂ + X₃), (iii) μ̂₃ = (X₁ − X₂), (iv) μ̂ₙ = (1/n)∑_{i=1}^{n} Xᵢ. (HINT: State explicitly any properties of E(·) or any lemmas you use.) (b) Compare these estimators in terms
Consider the simple Poisson model based on f(x; θ) = θˣe^{−θ}/x!, θ > 0, x = 0, 1, 2, . . . (a) Derive the Cramer–Rao lower bound for unbiased estimators of θ. (b) Explain why X̄ = (1/n)∑_{i=1}^{n} Xᵢ is an unbiased and fully efficient estimator of θ. (c) Derive the posterior distribution when π(θ) is
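For orientation on (a)-(b), the standard calculation: with ln f(x; θ) = x ln θ − θ − ln x!, the score is (x/θ) − 1, so the Fisher information and the Cramer–Rao bound are
\[ I_n(\theta) = n\,E\!\left(\frac{X}{\theta} - 1\right)^{\!2} = \frac{n}{\theta}, \qquad CR(\theta) = \frac{\theta}{n} = \operatorname{Var}(\bar{X}), \]
so the sample mean attains the bound and is fully efficient.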
(a) Explain what the gold standard for the optimality of estimators amounts to in terms of a combination of properties; give examples if it helps. (b) Explain why the property of admissibility is unduly dependent on the quantifier ∀θ ∈ Θ, which is at odds with the frequentist definition of the MSE
(a) Explain the concept of the likelihood function as it relates to the distribution of the sample. (b) Explain why the likelihood function does not assign probabilities to the unknown parameter(s) θ. (c) Explain how one can derive the maximum likelihood estimators in the case where the likelihood
(a) Explain how the likelihood function ensures learning from data as n → ∞. (b) Explain why the identity below is mathematically correct but probabilistically questionable: explain how one can remedy that. (c) Explain the difference between (i) Fisher’s sample and individual observation
(a) Define the concept of the score function and explain its connection to Fisher’s information. (b) Explain how the score function relates to both sufficiency and the Cramer–Rao lower bound.
(a) State and explain the finite sample properties of MLEs under the usual regularity conditions. (b) State and explain the asymptotic properties of MLEs under the usual regularity conditions. (c) Explain why an asymptotically Normal MLE is also asymptotically unbiased.
Consider the simple Normal statistical model (Table 12.9). (a) Derive the MLEs of (μ, σ²) and state their sampling distributions. (b) Derive the least-squares estimators of (μ, σ²) without the Normality assumption [1] and their sampling distributions. (c) Compare these estimators in terms of the
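For reference, the textbook-standard answers behind (a): under Normality the MLEs and their sampling distributions are
\[ \hat{\mu} = \bar{X}_n \sim N\!\left(\mu, \tfrac{\sigma^{2}}{n}\right), \qquad \hat{\sigma}^{2} = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X}_n)^{2}, \quad \frac{n\hat{\sigma}^{2}}{\sigma^{2}} \sim \chi^{2}(n-1), \]
with μ̂ and σ̂² independent.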
(a) The maximum likelihood method is often criticized for the fact that for a very small sample size, say n = 5, MLEs are not very reliable. Discuss. (b) Explain why the fact that the ML method gives rise to inconsistent estimators in the case of the Neyman–Scott (1948) model constitutes a
(a) Explain why least squares as a mathematical approximation method provides no basis for statistical inference. (b) Explain how Gauss transformed the least-squares method into a statistical estimation procedure.
(a) State and explain the Gauss–Markov theorem, emphasizing its scope. (b) Discuss how one can use the Gauss–Markov theorem to test the significance of the slope coefficient. (c) Explain why the least-squares method relates to the Normal distribution and why in the case where the error term has a
(a) Explain the moment matching principle as an estimation method. (b) Use the MM principle to derive the MM estimator of θ in the case of the simple Laplace statistical model with density function f(x; θ) = (1/(2θ)) e^{−|x|/θ}, θ > 0, x ∈ R.
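A sketch of (b), assuming one matches the first absolute moment (a natural choice for this symmetric density, since E(X) = 0 carries no information about θ): direct integration gives
\[ E|X| = \int_{-\infty}^{\infty} |x|\,\frac{1}{2\theta}\, e^{-|x|/\theta}\,dx = \theta \;\;\Longrightarrow\;\; \hat{\theta}_{MM} = \frac{1}{n}\sum_{i=1}^{n} |X_i| . \]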
(a) Explain how the sample raw moments provide consistent estimators for the distribution moments, but their finite sample properties invoke the existence of much higher moments that (i) are often difficult to justify and (ii) would introduce major imprecision in the resulting inference from their
(a) Explain why it is anachronistic to compare the maximum likelihood method to the parametric method of moments. (b) Compare and contrast Pearson’s method of moments with the parametric method of moments. (c) Consider the simple Pareto statistical model with density f(x; θ) = θx₀^θ/x^{θ+1}, x ≥ x₀, θ > 0, where x₀ is a known lower bound
(a) Explain why the notion of nuisance vs. parameters of interest is problematic in empirical modeling. (b) Using your answer in (a), explain why eliminating nuisance parameters distorts the original statistical model in ways that might be very difficult to test for statistical misspecification.
What are the similarities and differences between Karl Pearson’s chi-square test and Fisher’s t-test for μ = μ₀?
In the case of the simple Normal model, Gosset showed that for any n > 1: √n(X̄ₙ − μ)/s ∼ St(n − 1) (13.78). (a) Explain why the result in (13.78), as it stands, is meaningless unless it is supplemented with the reasoning underlying it. (b) Explain how interpreting (13.78) using factual reasoning yields a pivotal quantity, and
(a) Define and explain the key components introduced by Fisher in specifying his significance testing. (b) Apply a Fisher significance test for H₀: θ = .5, in the context of a simple Bernoulli model, using the new births data for Cyprus below, where θ = P(X = 1), with {X = 1} = {male}. (c) “The
(a) Define and explain the notion of the p-value. (b) Using your answer in (a), explain why each of the following interpretations of the p-value is erroneous: (i) The p-value is the probability that H₀ is true. (ii) 1 − p(x₀) is the probability that H₁ is true. (iii) The p-value is the
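For reference, the definition required in (a): for a test statistic τ(X) whose large values indicate departures from H₀, the p-value is
\[ p(\mathbf{x}_0) = \Pr\bigl(\tau(\mathbf{X}) \geq \tau(\mathbf{x}_0);\ H_0\ \text{true}\bigr), \]
a probability statement about the data under H₀, not about the truth of H₀ or H₁; this is what renders interpretations (i)-(iii) erroneous.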
(a) Explain the new features the N-P framing introduced into Fisher’s testing and for what purpose. In what respects did it modify Fisher’s testing? (b) Compare and contrast the N-P significance level and Fisher’s p-value. (c) Explain why the archetypal N-P formulation of the null and alternative
(a) Explain the notions of a type I and a type II error. Why does one increase when the other decreases? (b) How does the Neyman–Pearson procedure solve the problem of the trade-off between the type I and type II errors? (c) Compare and contrast the power of a test at a point and the power function. (d)
(a) In the case of the simple (one-parameter) Normal model, explain how the sampling distribution of the test statistic √n(X̄ₙ − μ₀)/σ changes when evaluated under the null and under the alternative: H₀: μ = μ₀ vs. H₁: μ > μ₀. (b) In the context of the simple (one-parameter) Normal model (σ =
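The contrast in (a) in one line: the same statistic has different sampling distributions under the two hypotheses,
\[ \tau(\mathbf{X}) = \frac{\sqrt{n}(\bar{X}_n - \mu_0)}{\sigma} \sim \begin{cases} N(0, 1), & \text{under } H_0\colon \mu = \mu_0, \\[4pt] N\!\left(\frac{\sqrt{n}(\mu_1 - \mu_0)}{\sigma},\, 1\right), & \text{under } \mu = \mu_1 > \mu_0, \end{cases} \]
and the mean shift under the alternative is what gives the test its power.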
(a) State and explain the N-P lemma, paying particular attention to the framing of the null and alternative hypotheses. (b) State and explain the two conditions needed to extend the N-P lemma to more realistic cases for the existence of these α-UMP tests. (c) In the case of the simple Normal model
(a) In the context of the simple Bernoulli model, explain how you would reformulate the null and alternative hypotheses when the substantive hypotheses of interest are: H₀: θ = .5 vs. H₁: θ = 18/35. (b) Using your answer in (a), apply the test for α = .01 using the following data: Data on
(a) State and explain the fallacies of acceptance and rejection and relate your answer to the distinction between statistical and substantive significance. (b) Explain how the post-data severity assessment of the N-P accept/reject results can be used to address both the fallacies of acceptance and
(a) Explain the concept of a post-data severity assessment and use it – in the context of a simple Bernoulli model – to evaluate the severity of the N-P decision for the hypotheses: H₀: θ ≤ .5 vs. H₁: θ > .5, using the data in Table 13.2, by evaluating the different discrepancies γ = .01,
(a) Explain the notions of testing within and testing without (outside) the boundaries of a statistical model in relation to Neyman–Pearson (N-P) and misspecification (M-S) testing. (b) Specify the generic form of the null and alternative hypotheses in M-S testing for a statistical model M_θ(z) and
(a) Explain the likelihood ratio test procedure and comment on its relationship to the Neyman–Pearson lemma. (b) Explain why, when the postulated statistical model is misspecified, all Neyman–Pearson type tests will be invalid.
(a) “Confidence intervals are more reliable than p-values and are not vulnerable to the large n problem.” Discuss. (b) “The middle of an observed CI is more probable than any other part of the interval.” Discuss.
Explain why an equation expressed in terms of variables and parameters with a stochastic error term attached does not necessarily constitute a proper statistical model.
Compare and contrast the specification of the LR model in Tables 14.1 and 14.2 from the statistical modeling perspective that includes specification, estimation, misspecification testing and respecification.
Explain how the distinction between reduction and model assumptions (Table 14.3) can be useful for statistical modeling purposes.
Compare and contrast the sampling distributions of the ML estimators of the LR model and the ML estimators of the simple Normal model: M_θ(x): Xₖ ∼ NIID(μ, σ²), x ∈ R, μ ∈ R, σ² > 0, k ∈ N.
(a) Explain why the R² as a goodness-of-fit measure for the LR model is highly vulnerable to any mean-heterogeneity in the data Z₀ := (y, X). (b) Explain the relationship between the R² and the F-test for the joint significance of the coefficients β₁ in an LR model.
Plot the residuals from the four estimated LR models in Example 14.3, and discuss the modeling strategy to avoid such problems.
(a) Test the hypotheses H₀: σ² = σ₀² vs. H₁: σ² ≠ σ₀² for σ₀² = .2 at α = .05 using the estimated LR model in Example 14.2. (b) Using the estimated LR model in Example 14.2, derive the .95 two-sided CIs for the regression coefficients. (c) Use the estimated regression in Example 14.2 to predict the
(a) Using the data in Table 1, Appendix 5.A, repeat Example 14.8 by replacing the Intel log-returns with the CITI log-returns and compare your results with those in the example. (b) Explain intuitively the use of auxiliary regressions, based on the residuals ûₜ and ûₜ², as the basis of misspecification testing for
(a) Using the data in Table 1, Appendix 5.A, estimate the following LR model for the Intel log-returns (yₜ): yₜ = β₀ + β₁x₁ₜ + β₂x₂ₜ + uₜ, (14.91) where x₁ₜ = log-returns of the market (SP500) and x₂ₜ = log-returns of 3-month Treasury bills. (b) Explain the relationship between the statistical model
Explain why adding an additional explanatory variable in the auction estimated LR model in Example 14.9 has nothing to do with statistical misspecification; it is a case of substantive misspecification. Discuss the differences between the two types of misspecification.
Compare and contrast the LR with the Gauss linear (GL) model in terms of their specifications in Tables 14.1 and 14.5, respectively.
Discuss the assumptions and the conclusions of the Gauss–Markov theorem and explain why it provides a very poor basis for statistical inference purposes.
Explain why the error term distributions (ii)–(iv) in Table 14.8 raise questions about the appropriateness of least-squares as the relevant estimation method.
Explain the connection between the Normality of the joint distribution f (x, y) and the LR assumptions [1]–[5].
Compare and contrast the asymptotic properties of the OLS estimators of the LR parameters with those of the ML estimators under the Normality assumption.
Explain why the specification of the LR model in Table 14.10 can be misleading for modeling purposes.
Explain the relationship between the statistical GM: Yₜ = β₀ + β₁xₜ + uₜ, t ∈ N, and the estimated orthogonal decomposition: Yₜ = β̂₀ + β̂₁xₜ + ûₜ.
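The key fact linking the two (a standard property of OLS, stated here for reference): the estimated residuals are orthogonal by construction to a constant and to the regressor, via the normal equations
\[ \hat{u}_t = Y_t - \hat{\beta}_0 - \hat{\beta}_1 x_t, \qquad \sum_{t=1}^{n} \hat{u}_t = 0, \qquad \sum_{t=1}^{n} x_t \hat{u}_t = 0, \]
so the estimated decomposition holds as an arithmetic identity in the data, whereas the statistical GM embodies probabilistic assumptions about the underlying process.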
Explain the variance decomposition in Table 14.11 and its importance for testing purposes.
Explain why relying exclusively on the asymptotic sampling distributions of the OLS estimators of the LR model parameters can cause problems for the reliability and precision of inference.
(a) Explain the problem of near-collinearity of the (X′X) matrix and how it might affect the ML estimators of the LR model parameters. (b) Explain why near-collinearity of the (X′X) matrix is neither necessary nor sufficient for sign reversals in the estimated regression coefficients.
(a) “The norm condition number κ₂(X′X) provides a reliable measure of the ill-conditioning of the (X′X) matrix, but det(X′X) ≈ 0 does not.” Discuss. (b) Explain why κ₂(X′X) invokes no probabilistic assumptions about the underlying data, but the correlation matrix among the regressors does!
(a) Explain why the concept of a variance inflation factor makes no sense for trend terms (1, t, t², . . . , tᵐ). (b) How should one handle the terms (1, t, t², . . . , tᵐ) in the context of an LR model?