Mathematical Statistics With Applications Wackerly Mendenhall Scheaffer Pdf

Mathematical Statistics with Applications

Wackerly/Mendenhall/Scheaffer

Continuous Distributions

Uniform
  Probability function: $f(y) = \dfrac{1}{\theta_2 - \theta_1}$; $\theta_1 \le y \le \theta_2$
  Mean: $\dfrac{\theta_1 + \theta_2}{2}$
  Variance: $\dfrac{(\theta_2 - \theta_1)^2}{12}$
  Moment-generating function: $\dfrac{e^{t\theta_2} - e^{t\theta_1}}{t(\theta_2 - \theta_1)}$

Normal
  Probability function: $f(y) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2\sigma^2}(y - \mu)^2\right]$; $-\infty < y < +\infty$
  Mean: $\mu$
  Variance: $\sigma^2$
  Moment-generating function: $\exp\left(\mu t + \dfrac{t^2\sigma^2}{2}\right)$

Exponential
  Probability function: $f(y) = \dfrac{1}{\beta} e^{-y/\beta}$; $\beta > 0$, $0 < y < \infty$
  Mean: $\beta$
  Variance: $\beta^2$
  Moment-generating function: $(1 - \beta t)^{-1}$

Gamma
  Probability function: $f(y) = \left[\dfrac{1}{\Gamma(\alpha)\beta^{\alpha}}\right] y^{\alpha-1} e^{-y/\beta}$; $0 < y < \infty$
  Mean: $\alpha\beta$
  Variance: $\alpha\beta^2$
  Moment-generating function: $(1 - \beta t)^{-\alpha}$

Chi-square
  Probability function: $f(y) = \dfrac{y^{(v/2)-1} e^{-y/2}}{2^{v/2}\,\Gamma(v/2)}$; $y > 0$
  Mean: $v$
  Variance: $2v$
  Moment-generating function: $(1 - 2t)^{-v/2}$

Beta
  Probability function: $f(y) = \left[\dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\right] y^{\alpha-1}(1 - y)^{\beta-1}$; $0 < y < 1$
  Mean: $\dfrac{\alpha}{\alpha + \beta}$
  Variance: $\dfrac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$
  Moment-generating function: does not exist in closed form

Discrete Distributions

Binomial
  Probability function: $p(y) = \dbinom{n}{y} p^{y}(1 - p)^{n-y}$; $y = 0, 1, \ldots, n$
  Mean: $np$
  Variance: $np(1 - p)$
  Moment-generating function: $[pe^{t} + (1 - p)]^{n}$

Geometric
  Probability function: $p(y) = p(1 - p)^{y-1}$; $y = 1, 2, \ldots$
  Mean: $\dfrac{1}{p}$
  Variance: $\dfrac{1 - p}{p^2}$
  Moment-generating function: $\dfrac{pe^{t}}{1 - (1 - p)e^{t}}$

Hypergeometric
  Probability function: $p(y) = \dfrac{\dbinom{r}{y}\dbinom{N-r}{n-y}}{\dbinom{N}{n}}$; $y = 0, 1, \ldots, n$ if $n \le r$, $y = 0, 1, \ldots, r$ if $n > r$
  Mean: $\dfrac{nr}{N}$
  Variance: $n\left(\dfrac{r}{N}\right)\left(\dfrac{N - r}{N}\right)\left(\dfrac{N - n}{N - 1}\right)$

Poisson
  Probability function: $p(y) = \dfrac{\lambda^{y} e^{-\lambda}}{y!}$; $y = 0, 1, 2, \ldots$
  Mean: $\lambda$
  Variance: $\lambda$
  Moment-generating function: $\exp[\lambda(e^{t} - 1)]$

Negative binomial
  Probability function: $p(y) = \dbinom{y-1}{r-1} p^{r}(1 - p)^{y-r}$; $y = r, r + 1, \ldots$
  Mean: $\dfrac{r}{p}$
  Variance: $\dfrac{r(1 - p)}{p^2}$
  Moment-generating function: $\left[\dfrac{pe^{t}}{1 - (1 - p)e^{t}}\right]^{r}$

MATHEMATICAL STATISTICS WITH APPLICATIONS
SEVENTH EDITION

Dennis D. Wackerly, University of Florida
William Mendenhall III, University of Florida, Emeritus
Richard L. Scheaffer, University of Florida, Emeritus

Australia • Brazil • Canada • Mexico • Singapore • Spain • United Kingdom • United States

Mathematical Statistics with Applications, Seventh Edition
Dennis D. Wackerly, William Mendenhall III, Richard L. Scheaffer

Statistics Editor: Carolyn Crockett
Assistant Editors: Beth Gershman, Catie Ronquillo
Editorial Assistant: Ashley Summers
Technology Project Manager: Jennifer Liang
Marketing Manager: Mandy Jellerichs
Marketing Assistant: Ashley Pickering
Marketing Communications Manager: Darlene Amidon-Brent
Project Manager, Editorial Production: Hal Humphrey
Art Director: Vernon Boes
Print Buyer: Karen Hunt
Production Service: Matrix Productions Inc.
Copy Editor: Betty Duncan
Cover Designer: Erik Adigard, Patricia McShane
Cover Image: Erik Adigard
Cover Printer: TK
Compositor: International Typesetting and Composition
Printer: TK

© 2008, 2002 Duxbury, an imprint of Thomson Brooks/Cole, a part of The Thomson Corporation. Thomson, the Star logo, and Brooks/Cole are trademarks used herein under license.

ALL RIGHTS RESERVED. No part of this work covered by the copyright hereon may be reproduced or used in any form or by any means (graphic, electronic, or mechanical, including photocopying, recording, taping, web distribution, information storage and retrieval systems, or in any other manner) without the written permission of the publisher.

Printed in the United States of America
1 2 3 4 5 6 7 14 13 12 11 10 09 08 07

ExamView® and ExamView Pro® are registered trademarks of FSCreations, Inc. Windows is a registered trademark of the Microsoft Corporation used herein under license. Macintosh and Power Macintosh are registered trademarks of Apple Computer, Inc. Used herein under license.

© 2008 Thomson Learning, Inc. All Rights Reserved. Thomson Learning WebTutor™ is a trademark of Thomson Learning, Inc.

International Student Edition
ISBN-13: 978-0-495-38508-0
ISBN-10: 0-495-38508-5

Thomson Higher Education
10 Davis Drive
Belmont, CA 94002-3098
USA

For more information about our products, contact us at: Thomson Learning Academic Resource Center, 1-800-423-0563

For permission to use material from this text or product, submit a request online at http://www.thomsonrights.com.
Any additional questions about permissions can be submitted by e-mail to thomsonrights@thomson.com.

CONTENTS

Preface
Note to the Student

1 What Is Statistics?
1.1 Introduction
1.2 Characterizing a Set of Measurements: Graphical Methods
1.3 Characterizing a Set of Measurements: Numerical Methods
1.4 How Inferences Are Made
1.5 Theory and Reality
1.6 Summary

2 Probability
2.1 Introduction
2.2 Probability and Inference
2.3 A Review of Set Notation
2.4 A Probabilistic Model for an Experiment: The Discrete Case
2.5 Calculating the Probability of an Event: The Sample-Point Method
2.6 Tools for Counting Sample Points
2.7 Conditional Probability and the Independence of Events
2.8 Two Laws of Probability
2.9 Calculating the Probability of an Event: The Event-Composition Method
2.10 The Law of Total Probability and Bayes' Rule
2.11 Numerical Events and Random Variables
2.12 Random Sampling
2.13 Summary

3 Discrete Random Variables and Their Probability Distributions
3.1 Basic Definition
3.2 The Probability Distribution for a Discrete Random Variable
3.3 The Expected Value of a Random Variable or a Function of a Random Variable
3.4 The Binomial Probability Distribution
3.5 The Geometric Probability Distribution
3.6 The Negative Binomial Probability Distribution (Optional)
3.7 The Hypergeometric Probability Distribution
3.8 The Poisson Probability Distribution
3.9 Moments and Moment-Generating Functions
3.10 Probability-Generating Functions (Optional)
3.11 Tchebysheff's Theorem
3.12 Summary

4 Continuous Variables and Their Probability Distributions
4.1 Introduction
4.2 The Probability Distribution for a Continuous Random Variable
4.3 Expected Values for Continuous Random Variables
4.4 The Uniform Probability Distribution
4.5 The Normal Probability Distribution
4.6 The Gamma Probability Distribution
4.7 The Beta Probability Distribution
4.8 Some General Comments
4.9 Other Expected Values
4.10 Tchebysheff's Theorem
4.11 Expectations of Discontinuous Functions and Mixed Probability Distributions (Optional)
4.12 Summary

5 Multivariate Probability Distributions
5.1 Introduction
5.2 Bivariate and Multivariate Probability Distributions
5.3 Marginal and Conditional Probability Distributions
5.4 Independent Random Variables
5.5 The Expected Value of a Function of Random Variables
5.6 Special Theorems
5.7 The Covariance of Two Random Variables
5.8 The Expected Value and Variance of Linear Functions of Random Variables
5.9 The Multinomial Probability Distribution
5.10 The Bivariate Normal Distribution (Optional)
5.11 Conditional Expectations
5.12 Summary

6 Functions of Random Variables
6.1 Introduction
6.2 Finding the Probability Distribution of a Function of Random Variables
6.3 The Method of Distribution Functions
6.4 The Method of Transformations
6.5 The Method of Moment-Generating Functions
6.6 Multivariable Transformations Using Jacobians (Optional)
6.7 Order Statistics
6.8 Summary

7 Sampling Distributions and the Central Limit Theorem
7.1 Introduction
7.2 Sampling Distributions Related to the Normal Distribution
7.3 The Central Limit Theorem
7.4 A Proof of the Central Limit Theorem (Optional)
7.5 The Normal Approximation to the Binomial Distribution
7.6 Summary

8 Estimation
8.1 Introduction
8.2 The Bias and Mean Square Error of Point Estimators
8.3 Some Common Unbiased Point Estimators
8.4 Evaluating the Goodness of a Point Estimator
8.5 Confidence Intervals
8.6 Large-Sample Confidence Intervals
8.7 Selecting the Sample Size
8.8 Small-Sample Confidence Intervals for $\mu$ and $\mu_1 - \mu_2$
8.9 Confidence Intervals for $\sigma^2$
8.10 Summary

9 Properties of Point Estimators and Methods of Estimation
9.1 Introduction
9.2 Relative Efficiency
9.3 Consistency
9.4 Sufficiency
9.5 The Rao–Blackwell Theorem and Minimum-Variance Unbiased Estimation
9.6 The Method of Moments
9.7 The Method of Maximum Likelihood
9.8 Some Large-Sample Properties of Maximum-Likelihood Estimators (Optional)
9.9 Summary

10 Hypothesis Testing
10.1 Introduction
10.2 Elements of a Statistical Test
10.3 Common Large-Sample Tests
10.4 Calculating Type II Error Probabilities and Finding the Sample Size for Z Tests
10.5 Relationships Between Hypothesis-Testing Procedures and Confidence Intervals
10.6 Another Way to Report the Results of a Statistical Test: Attained Significance Levels, or p-Values
10.7 Some Comments on the Theory of Hypothesis Testing
10.8 Small-Sample Hypothesis Testing for $\mu$ and $\mu_1 - \mu_2$
10.9 Testing Hypotheses Concerning Variances
10.10 Power of Tests and the Neyman–Pearson Lemma
10.11 Likelihood Ratio Tests
10.12 Summary

11 Linear Models and Estimation by Least Squares
11.1 Introduction
11.2 Linear Statistical Models
11.3 The Method of Least Squares
11.4 Properties of the Least-Squares Estimators: Simple Linear Regression
11.5 Inferences Concerning the Parameters $\beta_i$
11.6 Inferences Concerning Linear Functions of the Model Parameters: Simple Linear Regression
11.7 Predicting a Particular Value of Y by Using Simple Linear Regression
11.8 Correlation
11.9 Some Practical Examples
11.10 Fitting the Linear Model by Using Matrices
11.11 Linear Functions of the Model Parameters: Multiple Linear Regression
11.12 Inferences Concerning Linear Functions of the Model Parameters: Multiple Linear Regression
11.13 Predicting a Particular Value of Y by Using Multiple Regression
11.14 A Test for $H_0: \beta_{g+1} = \beta_{g+2} = \cdots = \beta_k = 0$
11.15 Summary and Concluding Remarks

12 Considerations in Designing Experiments
12.1 The Elements Affecting the Information in a Sample
12.2 Designing Experiments to Increase Accuracy
12.3 The Matched-Pairs Experiment
12.4 Some Elementary Experimental Designs
12.5 Summary

13 The Analysis of Variance
13.1 Introduction
13.2 The Analysis of Variance Procedure
13.3 Comparison of More Than Two Means: Analysis of Variance for a One-Way Layout
13.4 An Analysis of Variance Table for a One-Way Layout
13.5 A Statistical Model for the One-Way Layout
13.6 Proof of Additivity of the Sums of Squares and E(MST) for a One-Way Layout (Optional)
13.7 Estimation in the One-Way Layout
13.8 A Statistical Model for the Randomized Block Design
13.9 The Analysis of Variance for a Randomized Block Design
13.10 Estimation in the Randomized Block Design
13.11 Selecting the Sample Size
13.12 Simultaneous Confidence Intervals for More Than One Parameter
13.13 Analysis of Variance Using Linear Models
13.14 Summary

14 Analysis of Categorical Data
14.1 A Description of the Experiment
14.2 The Chi-Square Test
14.3 A Test of a Hypothesis Concerning Specified Cell Probabilities: A Goodness-of-Fit Test
14.4 Contingency Tables
14.5 r × c Tables with Fixed Row or Column Totals
14.6 Other Applications
14.7 Summary and Concluding Remarks

15 Nonparametric Statistics
15.1 Introduction
15.2 A General Two-Sample Shift Model
15.3 The Sign Test for a Matched-Pairs Experiment
15.4 The Wilcoxon Signed-Rank Test for a Matched-Pairs Experiment
15.5 Using Ranks for Comparing Two Population Distributions: Independent Random Samples
15.6 The Mann–Whitney U Test: Independent Random Samples
15.7 The Kruskal–Wallis Test for the One-Way Layout
15.8 The Friedman Test for Randomized Block Designs
15.9 The Runs Test: A Test for Randomness
15.10 Rank Correlation Coefficient
15.11 Some General Comments on Nonparametric Statistical Tests

16 Introduction to Bayesian Methods for Inference
16.1 Introduction
16.2 Bayesian Priors, Posteriors, and Estimators
16.3 Bayesian Credible Intervals
16.4 Bayesian Tests of Hypotheses
16.5 Summary and Additional Comments

Appendix 1 Matrices and Other Useful Mathematical Results
A1.1 Matrices and Matrix Algebra
A1.2 Addition of Matrices
A1.3 Multiplication of a Matrix by a Real Number
A1.4 Matrix Multiplication
A1.5 Identity Elements
A1.6 The Inverse of a Matrix
A1.7 The Transpose of a Matrix
A1.8 A Matrix Expression for a System of Simultaneous Linear Equations
A1.9 Inverting a Matrix
A1.10 Solving a System of Simultaneous Linear Equations
A1.11 Other Useful Mathematical Results

Appendix 2 Common Probability Distributions, Means, Variances, and Moment-Generating Functions
Table 1 Discrete Distributions
Table 2 Continuous Distributions

Appendix 3 Tables
Table 1 Binomial Probabilities
Table 2 Table of $e^{-x}$
Table 3 Poisson Probabilities
Table 4 Normal Curve Areas
Table 5 Percentage Points of the t Distributions
Table 6 Percentage Points of the $\chi^2$ Distributions
Table 7 Percentage Points of the F Distributions
Table 8 Distribution Function of U
Table 9 Critical Values of T in the Wilcoxon Matched-Pairs, Signed-Ranks Test; n = 5(1)50
Table 10 Distribution of the Total Number of Runs R in Samples of Size $(n_1, n_2)$; $P(R \le a)$
Table 11 Critical Values of Spearman's Rank Correlation Coefficient
Table 12 Random Numbers

Answers to Exercises
Index

PREFACE

The Purpose and Prerequisites of this Book

Mathematical Statistics with Applications was written for use with an undergraduate 1-year sequence of courses (9 quarter- or 6 semester-hours) on mathematical statistics.
The intent of the text is to present a solid undergraduate foundation in statistical theory while providing an indication of the relevance and importance of the theory in solving practical problems in the real world. We think a course of this type is suitable for most undergraduate disciplines, including mathematics, where contact with applications may provide a refreshing and motivating experience. The only mathematical prerequisite is a thorough knowledge of first-year college calculus, including sums of infinite series, differentiation, and single and double integration.

Our Approach

Talking with students taking or having completed a beginning course in mathematical statistics reveals a major flaw in many courses. Students can take the course and leave it without a clear understanding of the nature of statistics. Many see the theory as a collection of topics, weakly or strongly related, but fail to see that statistics is a theory of information with inference as its goal. Further, they may leave the course without an understanding of the important role played by statistics in scientific investigations. These considerations led us to develop a text that differs from others in three ways:

• First, the presentation of probability is preceded by a clear statement of the objective of statistics (statistical inference) and its role in scientific research. As students proceed through the theory of probability (Chapters 2 through 7), they are reminded frequently of the role that major topics play in statistical inference. The cumulative effect is that statistical inference is the dominating theme of the course.
• The second feature of the text is connectivity. We explain not only how major topics play a role in statistical inference, but also how the topics are related to one another. These integrating discussions appear most frequently in chapter introductions and conclusions.
• Finally, the text is unique in its practical emphasis, both in exercises throughout the text and in the useful statistical methodological topics contained in Chapters 11–15, whose goal is to reinforce the elementary but sound theoretical foundation developed in the initial chapters.

The book can be used in a variety of ways and adapted to the tastes of students and instructors. The difficulty of the material can be increased or decreased by controlling the assignment of exercises, by eliminating some topics, and by varying the amount of time devoted to each topic. A stronger applied flavor can be added by the elimination of some topics (for example, some sections of Chapters 6 and 7) and by devoting more time to the applied chapters at the end.

Changes in the Seventh Edition

Many students are visual learners who can profit from visual reinforcement of concepts and results. New to this edition is the inclusion of computer applets, all available for online use at the Thomson website, www.thomsonedu.com/statistics/wackerly. Some of these applets are used to demonstrate statistical concepts, other applets permit users to assess the impact of parameter choices on the shapes of density functions, and the remainder of the applets can be used to find exact probabilities and quantiles associated with gamma-, beta-, normal-, $\chi^2$-, t-, and F-distributed random variables, information of importance when constructing confidence intervals or performing tests of hypotheses.

Some of the applets provide information available via the use of other software. Notably, the R language and environment for statistical computation and graphics (available free at http://www.r-project.org/) can be used to provide the quantiles and probabilities associated with the discrete and continuous distributions previously mentioned. The appropriate R commands are given in the respective sections of Chapters 3 and 4.
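As a rough illustration of the kind of probability and quantile computation mentioned above, here is a minimal Python sketch for the binomial distribution. It is not the book's R code; the function names (`binom_pmf`, `binom_cdf`, `binom_quantile`) are hypothetical and chosen only for this example. The quantile is taken to be the smallest y with P(Y ≤ y) ≥ q, which is an assumption about the convention wanted.

```python
from math import comb


def binom_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ Binomial(n, p), from the formula C(n, y) p^y (1 - p)^(n - y)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def binom_cdf(y: int, n: int, p: float) -> float:
    """P(Y <= y), obtained by summing the probability function from 0 through y."""
    return sum(binom_pmf(k, n, p) for k in range(0, y + 1))


def binom_quantile(q: float, n: int, p: float) -> int:
    """Smallest y such that P(Y <= y) >= q (assumed quantile convention)."""
    total = 0.0
    for y in range(n + 1):
        total += binom_pmf(y, n, p)
        if total >= q:
            return y
    return n  # guards against rounding when q is extremely close to 1


if __name__ == "__main__":
    n, p = 10, 0.5
    print("P(Y = 5)  =", binom_pmf(5, n, p))
    print("P(Y <= 4) =", binom_cdf(4, n, p))
    print("median    =", binom_quantile(0.5, n, p))
```

Summing the probability function directly is adequate for small n; a statistical package would normally be preferred for large n or extreme tail areas.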
The advantage of the applets is that they are "point and shoot," provide accompanying graphics, and are considerably easier to use. However, R is vastly more powerful than the applets and can be used for many other statistical purposes. We leave other applications of R to the interested user or instructor.

Chapter 2 introduces the first applet, Bayes' Rule as a Tree, a demonstration that allows users to see why sometimes surprising results occur when Bayes' rule is applied (see Figure 1). As in the sixth edition, maximum-likelihood estimates are introduced in Chapter 3 via examples for the estimates of the parameters of the binomial, geometric, and negative binomial distributions based on specific observed numerical values of random variables that possess these distributions. Follow-up problems at the end of the respective sections expand on these examples.

FIGURE 1 Applet illustration of Bayes' rule

In Chapter 4, the applet Normal Probabilities is used to compute the probability that any user-specified, normally distributed random variable falls in any specified interval. It also provides a graph of the selected normal density function and a visual reinforcement of the fact that probabilities associated with any normally distributed random variable are equivalent to probabilities associated with the standard normal distribution. The applet Normal Probabilities (One Tail) provides upper-tail areas associated with any user-specified, normal distribution and can also be used to establish the value that cuts off a user-specified area in the upper tail for any normally distributed random variable. Probabilities and quantiles associated with standard normal random variables are obtained by selecting the parameter values mean = 0 and standard deviation = 1. The beta and gamma distributions are more thoroughly explored in this chapter.
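The normal-probability calculations described above can be sketched in a few lines of Python using the standard library's `statistics.NormalDist`. This is only an illustration of the arithmetic, not a reproduction of the applets; the helper names and the numerical values in the usage lines are assumptions made for the example.

```python
from statistics import NormalDist


def interval_probability(mu: float, sigma: float, a: float, b: float) -> float:
    """P(a < Y < b) for Y normal with mean mu and standard deviation sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)


def interval_probability_standardized(mu: float, sigma: float, a: float, b: float) -> float:
    """Same probability, computed from the standard normal after Z = (Y - mu) / sigma."""
    z = NormalDist(0.0, 1.0)
    return z.cdf((b - mu) / sigma) - z.cdf((a - mu) / sigma)


def upper_tail_cutoff(mu: float, sigma: float, area: float) -> float:
    """Value y0 such that P(Y > y0) = area."""
    return NormalDist(mu, sigma).inv_cdf(1.0 - area)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(interval_probability(100, 15, 85, 115))
    print(interval_probability_standardized(100, 15, 85, 115))
    print(upper_tail_cutoff(0, 1, 0.05))
```

The first two functions should agree up to rounding, which is the equivalence between an arbitrary normal distribution and the standard normal that the text refers to.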
Users can simultaneously graph three gamma (or beta) densities (all with user-selected parameter values) and assess the impact that the parameter values have on the shapes of gamma (or beta) density functions (see Figure 2). This is accomplished using the applets Comparison of Gamma Density Functions and Comparison of Beta Density Functions, respectively. Probabilities and quantiles associated with gamma- and beta-distributed random variables are obtained using the applets Gamma Probabilities and Quantiles or Beta Probabilities and Quantiles. Sets of Applet Exercises are provided to guide the user to discover interesting and informative results associated with normal-, beta-, and gamma- (including exponential and $\chi^2$) distributed random variables. We maintain emphasis on the $\chi^2$ distribution, including some theoretical results that are useful in the subsequent development of the t and F distributions.

FIGURE 2 Applet comparison of three beta densities

In Chapter 5, it is made clear that conditional densities are undefined for values of the conditioning variable where the marginal density is zero. We have also retained the discussion of the "conditional variance" and its use in finding the variance of a random variable. Hierarchical models are briefly discussed.

As in the previous edition, Chapter 6 introduces the concept of the support of a density and emphasizes that a transformation method can be used when the transformation is monotone on the region of support. The Jacobian method is included for implementation of a bivariate transformation.
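To make the remark about monotone transformations on the support concrete, here is a short worked sketch. The choice of example (an exponential random variable $Y$ with mean $\beta$ and the transformation $U = Y^2$) is an assumption made for illustration, as are the symbols $h$, $f_Y$, and $f_U$; the general formula is the standard univariate transformation result.

```latex
% Transformation method: U = h(Y), with h monotone on the support of f_Y.
\[
  f_U(u) \;=\; f_Y\!\bigl(h^{-1}(u)\bigr)\,
  \left|\frac{d\,h^{-1}(u)}{du}\right| .
\]
% Example: Y exponential with mean \beta, so f_Y(y) = (1/\beta) e^{-y/\beta} for y > 0.
% h(y) = y^2 is not monotone on the whole real line, but it is increasing on the
% support (0, \infty), so the method applies with h^{-1}(u) = \sqrt{u}:
\[
  \frac{d\,h^{-1}(u)}{du} = \frac{1}{2\sqrt{u}},
  \qquad
  f_U(u) = \frac{1}{\beta}\, e^{-\sqrt{u}/\beta}\cdot\frac{1}{2\sqrt{u}},
  \quad u > 0 .
\]
```

The point of the example is that monotonicity is needed only where the density is positive, which is why the support has to be identified first.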
In Chapter 7, the applet Comparison of Student's t and Normal Distributions permits visualization of similarities and differences in t and standard normal density functions, and the applets Chi-Square Probabilities and Quantiles, Student's t Probabilities and Quantiles, and F-Ratio Probabilities and Quantiles provide probabilities and quantiles associated with the respective distributions, all with user-specified degrees of freedom. The applet DiceSample uses the familiar die-tossing example to introduce the concept of a sampling distribution. The results for different sample sizes permit the user to assess the impact of sample size on the sampling distribution of the sample mean. The applet also permits visualization of how the sampling distribution is affected if the die is not balanced. Under the general heading of "Sampling Distributions and the Central Limit Theorem," four different applets illustrate different concepts:

• Basic illustrates that, when sampling from a normally distributed population, the sample mean is itself normally distributed.
• SampleSize exhibits the effect of the sample size on the sampling distribution of the sample mean. The sampling distributions for two (user-selected) sample sizes are simultaneously generated and displayed side by side. Similarities and differences of the sampling distributions become apparent. Samples can be generated from populations with "normal," uniform, U-shaped, and skewed distributions. The associated approximating normal sampling distributions can be overlaid on the resulting simulated distributions, permitting immediate visual assessment of the quality of the normal approximation (see Figure 3).
• Variance simulates the sampling distribution of the sample variance when sampling from a population with a "normal" distribution. The theoretical (proportional to that of a $\chi^2$ random variable) distribution can be overlaid with the click of a button, again providing visual confirmation that theory really works.
• VarianceSize allows a comparison of the effect of the sample size on the distribution of the sample variance (again, sampling from a normal population). The associated theoretical density can be overlaid to see that the theory actually works. In addition, it is seen that for large sample sizes the sample variance has an approximate normal distribution.

FIGURE 3 Applet illustration of the central limit theorem.

The applet Normal Approximation to the Binomial permits the user to assess the quality of the (continuous) normal approximation for (discrete) binomial probabilities. As in previous chapters, a sequence of Applet Exercises leads the user to discover important and interesting answers and concepts. From a more theoretical perspective, we establish the independence of the sample mean and sample variance for a sample of size 2 from a normal distribution. As before, the proof of this result for general n is contained in an optional exercise. Exercises provide step-by-step derivations of the mean and variance for random variables with t and F distributions.

Throughout Chapter 8, we have stressed the assumptions associated with confidence intervals based on the t distributions. We have also included a brief discussion of the robustness of the t procedures and the lack of such for the intervals based on the $\chi^2$ and F distributions. The applet ConfidenceIntervalP illustrates properties of large-sample confidence intervals for a population proportion. In Chapter 9, the applets PointSingle, PointbyPoint, and PointEstimation ultimately lead to a very nice illustration of convergence in probability. In Chapter 10, the applet Hypothesis Testing (for Proportions) illustrates important concepts associated with tests of hypotheses, including the following:

• What does α really mean?
• Tests based on larger sample sizes typically have smaller probabilities of type II errors if the level of the tests stays fixed.
• For a fixed sample size, the power function increases as the value of the parameter moves further from the values specified by the null hypothesis.

Once users visualize these concepts, the subsequent theoretical developments are more relevant and meaningful. Applets for the $\chi^2$, t, and F distributions are used to obtain exact p-values for associated tests of hypotheses. We also illustrate explicitly that the power of a uniformly most powerful test can be smaller (although the largest possible) than desired.

In Chapter 11, the simple linear regression model is thoroughly discussed (including confidence intervals, prediction intervals, and correlation) before the matrix approach to the multiple linear regression model is introduced. The applets Fitting a Line Using Least Squares and Removing Points from Regression illustrate what the least-squares criterion accomplishes and that a few unusual data points can have considerable impact on the fitted regression line. The coefficients of determination and multiple determination are introduced, discussed, and related to the relevant t and F statistics. Exercises demonstrate that high (low) coefficients of (multiple) determination values do not necessarily correspond to statistically significant (insignificant) results.

Chapter 12 includes a separate section on the matched-pairs experiment. Although many possible sets of dummy variables can be used to cast the analysis of variance into a regression context, in Chapter 13 we focus on the dummy variables typically used by SAS and other statistical analysis computing packages. The text still focuses primarily on the randomized block design with fixed (nonrandom) block effects. If an instructor wishes, a series of supplemental exercises dealing with the randomized block design with random block effects can be used to illustrate the similarities and differences of these two versions of the randomized block design.
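Returning to the Chapter 10 points about α, type II errors, and power listed earlier in this preface, the following Python sketch shows one way to compute an approximate power function. It assumes a one-sided large-sample Z test of H0: p = p0 against Ha: p > p0 based on the normal approximation; the function names and the numerical values in the usage lines are hypothetical, and this is not the applet's implementation.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def rejection_cutoff(p0: float, n: int, alpha: float) -> float:
    """Reject H0: p = p0 (against Ha: p > p0) when the sample proportion exceeds this value."""
    return p0 + Z.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)


def power(p: float, p0: float, n: int, alpha: float) -> float:
    """Approximate probability of rejecting H0 when the true proportion is p."""
    cutoff = rejection_cutoff(p0, n, alpha)
    se_true = sqrt(p * (1 - p) / n)  # standard error under the true p
    return 1 - Z.cdf((cutoff - p) / se_true)


if __name__ == "__main__":
    p0, alpha = 0.5, 0.05
    print("power at p0 (equals alpha):", power(p0, p0, 100, alpha))
    for n in (50, 100):
        for p in (0.6, 0.7):
            beta = 1 - power(p, p0, n, alpha)  # type II error probability
            print(f"n={n}, p={p}: power={1 - beta:.4f}, beta={beta:.4f}")
```

Under these assumptions the power at p = p0 equals α, the type II error probability shrinks as n grows with α fixed, and power rises as p moves away from p0, which matches the three listed concepts.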
The new Chapter 16 provides a brief introduction to Bayesian methods of statistical inference. The chapter focuses on using the data and the prior distribution to obtain the posterior and using the posterior to produce estimates, credible intervals, and hypothesis tests for parameters. The applet Binomial Revision facilitates understanding of the process by which data are used to update the prior and obtain the posterior. Many of the posterior distributions are beta or gamma distributions, and previously discussed applets are instrumental in obtaining credible intervals or computing the probability of various hypotheses.

The Exercises

This edition contains more than 350 new exercises. Many of the new exercises use the applets previously mentioned to guide the user through a series of steps that lead to more thorough understanding of important concepts. Others use the applets to provide confidence intervals or p-values that could only be approximated by using tables in the appendix. As in previous editions, some of the new exercises are theoretical whereas others contain data from documented sources that deal with research in a variety of fields. We continue to believe that exercises based on real data or actual experimental scenarios permit students to see the practical uses of the various statistical and probabilistic methods presented in the text. As they work through these exercises, students gain insight into the real-life applications of the theoretical results developed in the text. This insight makes learning the necessary theory more enjoyable and produces a deeper understanding of the theoretical methods. As in previous editions, the more challenging exercises are marked with an asterisk (*). Answers to the odd-numbered exercises are provided in the back of the book.

Tables and Appendices

We have maintained the use of the upper-tail normal tables because the users of the text find them to be more convenient.
We have also maintained the format of the table of the F distributions that we introduced in previous editions. This table of the F distributions provides critical values corresponding to upper-tail areas of .100, .050, .025, .010, and .005 in a single table. Because tests based on statistics possessing the F distribution occur quite often, this table facilitates the computation of attained significance levels, or p-values, associated with observed values of these statistics.

We have also maintained our practice of providing easy access to often-used information. Because the normal and t tables are the most frequently used statistical tables in the text, copies of these tables are given in Appendix 3 and inside the front cover of the text. Users of previous editions have often remarked favorably about the utility of tables of the common probability distributions, means, variances, and moment-generating functions provided in Appendix 2 and inside the back cover of the text. In addition, we have included some frequently used mathematical results in a supplement to Appendix 1. These results include the binomial expansion of $(x + y)^n$, the series expansion of $e^x$, sums of geometric series, definitions of the gamma and beta functions, and so on.

As before, each chapter begins with an outline containing the titles of the major sections in that chapter.

Acknowledgments

The authors wish to thank the many colleagues, friends, and students who have made helpful suggestions concerning the revisions of this text. In particular, we are indebted to P. V. Rao, J. G. Saw, Malay Ghosh, Andrew Rosalsky, and Brett Presnell for their technical comments. Gary McClelland, University of Colorado, did an outstanding job of developing the applets used in the text. Jason Owen, University of Richmond, wrote the solutions manual. Mary Mortlock, Cal Poly, San Luis Obispo, checked accuracy. We wish to thank E. S. Pearson, W. H. Beyer, I. Olkin, R. A. Wilcox, C. W. Dunnett, and A. Hald.
We profited substantially from the suggestions of the reviewers of the current and previous editions of the text: Roger Abernathy, Arkansas State University; Elizabeth S. Allman, University of Southern Maine; Robert Berk, Rutgers xx Preface University; Albert Bronstein, Purdue University; Subha Chakraborti, University of Alabama; Rita Chattopadhyay, Eastern Michigan University; Eric Chicken, Florida State University; Charles Dunn, Linfield College; Eric Eide, Brigham Young University; Nelson Fong, Creighton University; Dr. Gail P. Greene, Indiana Wesleyan University; Barbara Hewitt, University of Texas, San Antonio; Richard Iltis, Willamette University; K. G. Janardan, Eastern Michigan University; Mark Janeba, Willamette University; Rick Jenison, Univeristy of Wisconsin, Madison; Jim Johnston, Concord University; Bessie H. Kirkwood, Sweet Briar College; Marc L. Komrosky, San Jose State University; Dr. Olga Korosteleva, California State University, Long Beach; Teck Ky, Evegreen Valley College; Matthew Lebo, Stony Brook University; Phillip Lestmann, Bryan College; Tamar London, Pennsylvania State University; Lisa Madsen, Oregon State University; Martin Magid, Wellesley College; Hosam M. Mahmoud, George Washington University; Kim Maier, Michigan State University; David W. Matolak, Ohio University; James Edward Mays, Virginia Commonwealth University; Katherine McGivney, Shippensburg Univesity; Sanjog Misra, University of Rochester; Donald F. Morrison, University of Pennsylvania, Wharton; Mir A. Mortazavi, Eastern New Mexico University; Abdel-Razzaq Mugdadi, Southern Illinois University; Ollie Nanyes, Bradley University; Joshua Naranjo, Western Michigan University; Sharon Navard, The College of New Jersey; Roger B. Nelsen, Lewis & Clark College; David K. 
Park, Washington University; Cheng Peng, University of Southern Maine; Selwyn Piramuthu, University of Florida, Gainesville; Robert Martin Price, Jr., East Tennessee State University; Daniel Rabinowitz, Columbia University; Julianne Rainbolt, Saint Louis University; Timothy A. Riggle, Baldwin-Wallace College; Mark Rizzardi, Humboldt State University; Jesse Rothstein, Princeton University; Katherine Schindler, Eastern Michigan University; Michael E. Schuckers, St. Lawrence University; Jean T. Sells, Sacred Heart University; Qin Shao, The University of Toledo; Alan Shuchat, Wellesley College; Laura J. Simon, Pennsylvania State University; Satyanand Singh, New York City College of Technology; Randall J. Swift, California State Polytechnic University, Pomona; David Sze, Monmouth University; Bruce E. Trumbo, California State University, East Bay; Harold Dean Victory, Jr., Texas Tech University; Thomas O. Vinson, Washington & Lee University; Vasant Waikar, Miami University, Ohio; Bette Warren, Eastern Michigan University; Steve White, Jacksonville State University; Shirley A. Wilson, North Central College; Lan Xue, Oregon State University; and Elaine Zanutto, The Wharton School, University of Pennsylvania.

We also wish to acknowledge the contributions of Carolyn Crockett, our editor; Catie Ronquillo, assistant editor; Ashley Summers, editorial assistant; Jennifer Liang, technology project manager; Mandy Jellerichs, marketing manager; Ashley Pickering, marketing assistant; and of those involved in the production of the text: Hal Humphrey, production project manager; Betty Duncan, copyeditor; and Merrill Peterson and Sara Planck, production coordinators. Finally, we appreciate the support of our families during the writing of the various editions of this text.

DENNIS D. WACKERLY
WILLIAM MENDENHALL III
RICHARD L.
SCHEAFFER

NOTE TO THE STUDENT

As the title Mathematical Statistics with Applications implies, this text is concerned with statistics, in both theory and application, and only deals with mathematics as a necessary tool to give you a firm understanding of statistical techniques. The following suggestions for using the text will increase your learning and save your time.

The connectivity of the book is provided by the introductions and summaries in each chapter. These sections explain how each chapter fits into the overall picture of statistical inference and how each chapter relates to the preceding ones.

Within the chapters, important concepts are set off as definitions. These should be read and reread until they are clearly understood because they form the framework on which everything else is built. The main theoretical results are set off as theorems. Although it is not necessary to understand the proof of each theorem, a clear understanding of the meaning and implications of the theorems is essential. It is also essential that you work many of the exercises, for at least four reasons:

• You can be certain that you understand what you have read only by putting your knowledge to the test of working problems.
• Many of the exercises are of a practical nature and shed light on the applications of probability and statistics.
• Some of the exercises present new concepts and thus extend the material covered in the chapter.
• Many of the applet exercises help build intuition, facilitate understanding of concepts, and provide answers that cannot (practically) be obtained using tables in the appendices (see Figure 4).

FIGURE 4 Applet calculation of the probability that a gamma-distributed random variable exceeds its mean

D. D. W.
W. M.
R. L. S.

CHAPTER 1
What Is Statistics?
1.1 Introduction
1.2 Characterizing a Set of Measurements: Graphical Methods
1.3 Characterizing a Set of Measurements: Numerical Methods
1.4 How Inferences Are Made
1.5 Theory and Reality
1.6 Summary
References and Further Readings

1.1 Introduction

Statistical techniques are employed in almost every phase of life. Surveys are designed to collect early returns on election day and forecast the outcome of an election. Consumers are sampled to provide information for predicting product preferences. Research physicians conduct experiments to determine the effect of various drugs and controlled environmental conditions on humans in order to infer the appropriate treatment for various illnesses. Engineers sample a product quality characteristic and various controllable process variables to identify key variables related to product quality. Newly manufactured electronic devices are sampled before shipping to decide whether to ship or hold individual lots. Economists observe various indices of economic health over a period of time and use the information to forecast the condition of the economy in the future. Statistical techniques play an important role in achieving the objective of each of these practical situations. The development of the theory underlying these techniques is the focus of this text.

A prerequisite to a discussion of the theory of statistics is a definition of statistics and a statement of its objectives. Webster's New Collegiate Dictionary defines statistics as "a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data." Stuart and Ord (1991) state: "Statistics is the branch of the scientific method which deals with the data obtained by counting or measuring the properties of populations."
Rice (1995), commenting on experimentation and statistical applications, states that statistics is "essentially concerned with procedures for analyzing data, especially data that in some vague sense have a random character." Freund and Walpole (1987), among others, view statistics as encompassing "the science of basing inferences on observed data and the entire problem of making decisions in the face of uncertainty." And Mood, Graybill, and Boes (1974) define statistics as "the technology of the scientific method" and add that statistics is concerned with "(1) the design of experiments and investigations, (2) statistical inference." A superficial examination of these definitions suggests a substantial lack of agreement, but all possess common elements. Each description implies that data are collected, with inference as the objective. Each requires selecting a subset of a large collection of data, either existent or conceptual, in order to infer the characteristics of the complete set. All the authors imply that statistics is a theory of information, with inference making as its objective.

The large body of data that is the target of our interest is called the population, and the subset selected from it is a sample. The preferences of voters for a gubernatorial candidate, Jones, expressed in quantitative form (1 for "prefer" and 0 for "do not prefer") provide a real, finite, and existing population of great interest to Jones. To determine the true fraction who favor his election, Jones would need to interview all eligible voters, a task that is practically impossible. The voltage at a particular point in the guidance system for a spacecraft may be tested in the only three systems that have been built. The resulting data could be used to estimate the voltage characteristics for other systems that might be manufactured some time in the future. In this case, the population is conceptual.
We think of the sample of three as being representative of a large population of guidance systems that could be built using the same method. Presumably, this population would possess characteristics similar to the three systems in the sample. Analogously, measurements on patients in a medical experiment represent a sample from a conceptual population consisting of all patients similarly afflicted today, as well as those who will be afflicted in the near future. You will find it useful to clearly define the populations of interest for each of the scenarios described earlier in this section and to clarify the inferential objective for each. It is interesting to note that billions of dollars are spent each year by U.S. industry and government for data from experimentation, sample surveys, and other data collection procedures. This money is expended solely to obtain information about phenomena susceptible to measurement in areas of business, science, or the arts. The implications of this statement provide keys to the nature of the very valuable contribution that the discipline of statistics makes to research and development in all areas of society. Information useful in inferring some characteristic of a population (either existing or conceptual) is purchased in a specified quantity and results in an inference (estimation or decision) with an associated degree of goodness. For example, if Jones arranges for a sample of voters to be interviewed, the information in the sample can be used to estimate the true fraction of all voters who favor Jones's election. In addition to the estimate itself, Jones should also be concerned with the likelihood (chance) that the estimate provided is close to the true fraction of eligible voters who favor his election. Intuitively, the larger the number of eligible voters in the sample, the higher will be the likelihood of an accurate estimate. 
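The intuition that a larger sample makes an accurate estimate more likely can be explored with a small simulation. The sketch below is illustrative only: the true fraction 0.55, the tolerance 0.05, the sample sizes, and the function name `chance_estimate_is_close` are assumptions chosen for this example and do not come from the text. It also assumes each sampled voter independently favors Jones with the same probability.

```python
import random

def chance_estimate_is_close(true_fraction, sample_size, tolerance=0.05,
                             repetitions=500, seed=1):
    """Simulate many samples and return the fraction of them whose
    sample proportion lies within `tolerance` of `true_fraction`."""
    rng = random.Random(seed)
    close = 0
    for _ in range(repetitions):
        # Each sampled voter favors Jones with probability true_fraction.
        favor = sum(1 for _ in range(sample_size)
                    if rng.random() < true_fraction)
        estimate = favor / sample_size
        # Small slack guards against floating-point rounding at the boundary.
        if abs(estimate - true_fraction) <= tolerance + 1e-9:
            close += 1
    return close / repetitions

# Hypothetical true fraction of 0.55 and three hypothetical sample sizes.
results = {n: chance_estimate_is_close(0.55, n) for n in (20, 200, 2000)}
for n, chance in results.items():
    print(f"n = {n:4d}: estimate within 0.05 of the truth "
          f"in about {chance:.0%} of simulated samples")
```

Under these assumptions, the simulated chance of a close estimate rises as the sample size grows, which is the pattern the paragraph describes.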
Similarly, if a decision is made regarding the relative merits of two manufacturing processes based on examination of samples of products from both processes, we should be interested in the decision regarding which is better and the likelihood that the decision is correct. In general, the study of statistics is concerned with the design of experiments or sample surveys to obtain a specified quantity of information at minimum cost and the optimum use of this information in making an inference about a population. The objective of statistics is to make an inference about a population based on information contained in a sample from that population and to provide an associated measure of goodness for the inference.

Exercises

1.1 For each of the following situations, identify the population of interest, the inferential objective, and how you might go about collecting a sample.
a The National Highway Safety Council wants to estimate the proportion of automobile tires with unsafe tread among all tires manufactured by a specific company during the current production year.
b A political scientist wants to determine whether a majority of adult residents of a state favor a unicameral legislature.
c A medical scientist wants to estimate the average length of time until the recurrence of a certain disease.
d An electrical engineer wants to determine whether the average length of life of transistors of a certain type is greater than 500 hours.
e A university researcher wants to estimate the proportion of U.S. citizens from "Generation X" who are interested in starting their own businesses.
f For more than a century, normal body temperature for humans has been accepted to be 98.6° Fahrenheit. Is it really? Researchers want to estimate the average temperature of healthy adults in the United States.
g A city engineer wants to estimate the average weekly water consumption for single-family dwelling units in the city.
1.2 Characterizing a Set of Measurements: Graphical Methods

In the broadest sense, making an inference implies partially or completely describing a phenomenon or physical object. Little difficulty is encountered when appropriate and meaningful descriptive measures are available, but this is not always the case. For example, we might characterize a person by using height, weight, color of hair and eyes, and other descriptive measures of the person's physiognomy. Identifying a set of descriptive measures to characterize an oil painting would be a comparatively more difficult task. Characterizing a population that consists of a set of measurements is equally challenging. Consequently, a necessary prelude to a discussion of inference making is the acquisition of a method for characterizing a set of numbers. The characterizations must be meaningful so that knowledge of the descriptive measures enables us to clearly visualize the set of numbers. In addition, we require that the characterizations possess practical significance so that knowledge of the descriptive measures for a population can be used to solve a practical, nonstatistical problem. We will develop our ideas on this subject by examining a process that generates a population.

Consider a study to determine important variables affecting profit in a business that manufactures custom-made machined devices. Some of these variables might be the dollar size of the contract, the type of industry with which the contract is negotiated, the degree of competition in acquiring contracts, the salesperson who estimates the contract, fixed dollar costs, and the supervisor who is assigned the task of organizing and conducting the manufacturing operation. The statistician will wish to measure the response or dependent variable, profit per contract, for several jobs (the sample).
Along with recording the profit, the statistician will obtain measurements on the variables that might be related to profit—the independent variables. His or her objective is to use information in the sample to infer the approximate relationship of the independent variables just described to the dependent variable, profit, and to measure the strength of this relationship. The manufacturer's objective is to determine optimum conditions for maximizing profit. The population of interest in the manufacturing problem is conceptual and consists of all measurements of profit (per unit of capital and labor invested) that might be made on contracts, now and in the future, for fixed values of the independent variables (size of the contract, measure of competition, etc.). The profit measurements will vary from contract to contract in a seemingly random manner as a result of variations in materials, time needed to complete individual segments of the work, and other uncontrollable variables affecting the job. Consequently, we view the population as being represented by a distribution of profit measurements, with the form of the distribution depending on specific values of the independent variables. Our wish to determine the relationship between the dependent variable, profit, and a set of independent variables is therefore translated into a desire to determine the effect of the independent variables on the conceptual distribution of population measurements. An individual population (or any set of measurements) can be characterized by a relative frequency distribution, which can be represented by a relative frequency histogram. A graph is constructed by subdividing the axis of measurement into intervals of equal width. A rectangle is constructed over each interval, such that the height of the rectangle is proportional to the fraction of the total number of measurements falling in each cell. 
For example, to characterize the ten measurements 2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, and 2.9, we could divide the axis of measurement into intervals of equal width (say, .2 unit), commencing with 2.05. The relative frequencies (fraction of total number of measurements), calculated for each interval, are shown in Figure 1.1. Notice that the figure gives a clear pictorial description of the entire set of ten measurements.

FIGURE 1.1 Relative frequency histogram (vertical axis: Relative Frequency, 0 to .3; horizontal axis: Axis of Measurement, with interval boundaries 2.05, 2.25, 2.45, 2.65, 2.85, 3.05)

Observe that we have not given precise rules for selecting the number, widths, or locations of the intervals used in constructing a histogram. This is because the selection of these items is somewhat at the discretion of the person who is involved in the construction. Although these choices are arbitrary, a few guidelines can be very helpful in selecting the intervals. Points of subdivision of the axis of measurement should be chosen so that it is impossible for a measurement to fall on a point of division. This eliminates a source of confusion and is easily accomplished, as indicated in Figure 1.1. The second guideline involves the width of each interval and, consequently, the minimum number of intervals needed to describe the data. Generally speaking, we wish to obtain information on the form of the distribution of the data. Many times the form will be mound-shaped, as illustrated in Figure 1.2. (Others prefer to refer to distributions such as these as bell-shaped, or normal.) Using many intervals with a small amount of data results in little summarization and presents a picture very similar to the data in their original form. The larger the amount of data, the greater the number of included intervals can be while still presenting a satisfactory picture of the data.
We suggest spanning the range of the data with from 5 to 20 intervals and using the larger number of intervals for larger quantities of data. In most real-life applications, computer software (Minitab, SAS, R, S+, JMP, etc.) is used to obtain any desired histograms. These computer packages all produce histograms satisfying widely agreed-upon constraints on scaling, number of intervals used, widths of intervals, and the like. Some people feel that the description of data is an end in itself. Histograms are often used for this purpose, but there are many other graphical methods that provide meaningful summaries of the information contained in a set of data. Some excellent references for the general topic of graphical descriptive methods are given in the references at the end of this chapter. Keep in mind, however, that the usual objective of statistics is to make inferences. The relative frequency distribution associated with a data set and the accompanying histogram are sufficient for our objectives in developing the material in this text. This is primarily due to the probabilistic interpretation that can be derived from the frequency histogram, Figure 1.1. We have already stated that the area of a rectangle over a given interval is proportional to the fraction of the total number of measurements falling in that interval. Let's extend this idea one step further. If a measurement is selected at random from the original data set, the probability that it will fall in a given interval is proportional to the area under the histogram lying over that interval. (At this point, we rely on the layperson's concept of probability. This term is discussed in greater detail in Chapter 2.) For example, for the data used to construct Figure 1.1, the probability that a randomly selected measurement falls in the interval from 2.05 to 2.45 is .5 because half the measurements fall in this interval. 
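The relative frequencies behind Figure 1.1, and the probability just quoted, can be reproduced with a few lines of code. This is a minimal sketch that assumes the intervals described earlier (width .2, starting at 2.05); the variable names are illustrative only.

```python
import math

# The ten measurements used to construct Figure 1.1.
measurements = [2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, 2.9]
start, width, n_intervals = 2.05, 0.2, 5  # intervals 2.05-2.25, ..., 2.85-3.05

counts = [0] * n_intervals
for y in measurements:
    # Index of the interval containing y; by construction no measurement
    # can fall exactly on a point of division.
    counts[math.floor((y - start) / width)] += 1

relative_frequencies = [c / len(measurements) for c in counts]

for k, rf in enumerate(relative_frequencies):
    lower = start + k * width
    print(f"{lower:.2f} to {lower + width:.2f}: relative frequency {rf:.1f}")

# Fraction of the measurements between 2.05 and 2.45 (first two intervals).
prob_205_to_245 = sum(relative_frequencies[:2])
print(f"P(2.05 to 2.45) = {prob_205_to_245:.1f}")
```

The first two intervals hold two and three of the ten measurements, so the fraction between 2.05 and 2.45 comes out to .5, in agreement with the text.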
Correspondingly, the area under the histogram in Figure 1.1 over the interval from 2.05 to 2.45 is half of the total area under the histogram. It is clear that this interpretation applies to the distribution of any set of measurements, whether a population or a sample.

Suppose that Figure 1.2 gives the relative frequency distribution of profit (in millions of dollars) for a conceptual population of profit responses for contracts at specified settings of the independent variables (size of contract, measure of competition, etc.). The probability that the next contract (at the same settings of the independent variables) yields a profit that falls in the interval from 2.05 to 2.45 million is given by the proportion of the area under the distribution curve that is shaded in Figure 1.2.

FIGURE 1.2 Relative frequency distribution (vertical axis: Relative Frequency; horizontal axis marked at 2.05, 2.25, 2.45, 2.65, 2.85, 3.05)

Exercises

1.2 Are some cities more windy than others? Does Chicago deserve to be nicknamed "The Windy City"? Given below are the average wind speeds (in miles per hour) for 45 selected U.S. cities:

8.9 7.1 9.1 8.8 10.2 12.4 11.8 10.9 12.7
10.3 8.6 10.7 10.3 8.4 7.7 11.3 7.6 9.6
7.8 10.6 9.2 9.1 7.8 5.7 8.3 8.8 9.2
11.5 10.5 8.8 35.1 8.2 9.3 10.5 9.5 6.2
9.0 7.9 9.6 8.8 7.0 8.7 8.8 8.9 9.4

Source: The World Almanac and Book of Facts, 2004.
a Construct a relative frequency histogram for these data. (Choose the class boundaries without including the value 35.1 in the range of values.)
b The value 35.1 was recorded at Mt. Washington, New Hampshire. Does the geography of that city explain the magnitude of its average wind speed?
c The average wind speed for Chicago is 10.3 miles per hour. What percentage of the cities have average wind speeds in excess of Chicago's?
d Do you think that Chicago is unusually windy?

1.3 Of great importance to residents of central Florida is the amount of radioactive material present in the soil of reclaimed phosphate mining areas.
Measurements of the amount of $^{238}$U in 25 soil samples were as follows (measurements in picocuries per gram):

.74 .32 1.66 3.59 4.55
6.47 9.99 .70 .37 .76
1.90 1.77 2.42 1.09 2.03
2.69 2.41 .54 8.32 5.70
.75 1.96 3.36 4.06 12.48

Construct a relative frequency histogram for these data.

1.4 The top 40 stocks on the over-the-counter (OTC) market, ranked by percentage of outstanding shares traded on one day last year, are as follows:

11.88 7.99 7.15 7.13 6.27 6.07 5.98 5.91 5.49 5.26
5.07 4.94 4.81 4.79 4.55 4.43 4.40 4.05 3.94 3.93
3.78 3.69 3.62 3.48 3.44 3.36 3.26 3.20 3.11 3.03
2.99 2.89 2.88 2.74 2.74 2.69 2.68 2.63 2.62 2.61

a Construct a relative frequency histogram to describe these data.
b What proportion of these top 40 stocks traded more than 4% of the outstanding shares?
c If one of the stocks is selected at random from the 40 for which the preceding data were taken, what is the probability that it will have traded fewer than 5% of its outstanding shares?

1.5 Given here is the relative frequency histogram associated with grade point averages (GPAs) of a sample of 30 students:

[Histogram: vertical axis Relative Frequency, marked at 0, 3/30, and 6/30; horizontal axis Grade Point Average, with interval boundaries 1.85, 2.05, 2.25, 2.45, 2.65, 2.85, 3.05, 3.25, 3.45]

a Which of the GPA categories identified on the horizontal axis are associated with the largest proportion of students?
b What proportion of students had GPAs in each of the categories that you identified?
c What proportion of the students had GPAs less than 2.65?

1.6 The relative frequency histogram given next was constructed from data obtained from a random sample of 25 families. Each was asked the number of quarts of milk that had been purchased the previous week.

[Histogram: vertical axis Relative Frequency, marked at 0, .1, .2, .3, and .4; horizontal axis Quarts, with categories 0, 1, 2, 3, 4, 5]

a Use this relative frequency histogram to determine the number of quarts of milk purchased by the largest proportion of the 25 families. The category associated with the largest relative frequency is called the modal category.
b What proportion of the 25 families purchased more than 2 quarts of milk?
c What proportion purchased more than 0 but fewer than 5 quarts?

1.7 The self-reported heights of 105 students in a biostatistics class were used to construct the histogram given below.

[Histogram: vertical axis Relative Frequency, marked at 0, 5/105, and 10/105; horizontal axis Heights, marked at 60, 63, 66, 69, 72, 75]

a Describe the shape of the histogram.
b Does this histogram have an unusual feature?
c Can you think of an explanation for the two peaks in the histogram? Is there some consideration other than height that results in the two separate peaks? What is it?

1.8 An article in Archaeometry presented an analysis of 26 samples of Romano–British pottery, found at four different kiln sites in the United Kingdom. The percentage of aluminum oxide in each of the 26 samples is given below:

Llanederyn: 14.4 11.6 13.8 11.1 14.6 13.4 11.5 12.4 13.8 13.1 10.9 12.7 10.1 12.5
Caldicot: 11.8 11.6
Island Thorns: 18.3 15.8 18.0 18.0 20.8
Ashley Rails: 17.7 18.3 16.7 14.8 19.1

Source: A. Tubb, A. J. Parker, and G. Nickless, "The Analysis of Romano–British Pottery by Atomic Absorption Spectrophotometry," Archaeometry 22 (1980): 153.
a Construct a relative frequency histogram to describe the aluminum oxide content of all 26 pottery samples.
b What unusual feature do you see in this histogram? Looking at the data, can you think of an explanation for this unusual feature?

1.3 Characterizing a Set of Measurements: Numerical Methods

The relative frequency histograms presented in Section 1.2 provide useful information regarding the distribution of sets of measurements, but histograms are usually not adequate for the purpose of making inferences. Indeed, many similar histograms could be formed from the same set of measurements.
To make inferences about a population based on information contained in a sample and to measure the goodness of the inferences, we need rigorously defined quantities for summarizing the information contained in a sample. These sample quantities typically have mathematical properties, to be developed in the following chapters, that allow us to make probability statements regarding the goodness of our inferences.

The quantities we define are numerical descriptive measures of a set of data. We seek some numbers that have meaningful interpretations and that can be used to describe the frequency distribution for any set of measurements. We will confine our attention to two types of descriptive numbers: measures of central tendency and measures of dispersion or variation.

Probably the most common measure of central tendency used in statistics is the arithmetic mean. (Because this is the only type of mean discussed in this text, we will omit the word arithmetic.)

DEFINITION 1.1
The mean of a sample of n measured responses $y_1, y_2, \ldots, y_n$ is given by

$$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i.$$

The corresponding population mean is denoted µ.

The symbol $\bar{y}$, read "y bar," refers to a sample mean. We usually cannot measure the value of the population mean, µ; rather, µ is an unknown constant that we may want to estimate using sample information.

The mean of a set of measurements only locates the center of the distribution of data; by itself, it does not provide an adequate description of a set of measurements. Two sets of measurements could have widely different frequency distributions but equal means, as pictured in Figure 1.3. The difference between distributions I and II in the figure lies in the variation or dispersion of measurements on either side of the mean. To describe data adequately, we must also define measures of data variability.
The most common measure of variability used in statistics is the variance, which is a function of the deviations (or distances) of the sample measurements from their mean.

FIGURE 1.3 Frequency distributions with equal means but different amounts of variation

DEFINITION 1.2
The variance of a sample of measurements $y_1, y_2, \ldots, y_n$ is the sum of the square of the differences between the measurements and their mean, divided by n − 1. Symbolically, the sample variance is

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2.$$

The corresponding population variance is denoted by the symbol $\sigma^2$.

Notice that we divided by n − 1 instead of by n in our definition of $s^2$. The theoretical reason for this choice of divisor is provided in Chapter 8, where we will show that $s^2$ defined this way provides a "better" estimator for the true population variance, $\sigma^2$. Nevertheless, it is useful to think of $s^2$ as "almost" the average of the squared deviations of the observed values from their mean. The larger the variance of a set of measurements, the greater will be the amount of variation within the set. The variance is of value in comparing the relative variation of two sets of measurements, but it gives information about the variation in a single set only when interpreted in terms of the standard deviation.

DEFINITION 1.3
The standard deviation of a sample of measurements is the positive square root of the variance; that is,

$$s = \sqrt{s^2}.$$

The corresponding population standard deviation is denoted by $\sigma = \sqrt{\sigma^2}$.

Although it is closely related to the variance, the standard deviation can be used to give a fairly accurate picture of data variation for a single set of measurements. It can be interpreted using Tchebysheff's theorem (which is discussed in Exercise 1.32 and will be presented formally in Chapter 3) and by the empirical rule (which we now explain).
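As an illustration of Definitions 1.1 through 1.3, the following sketch computes the sample mean, variance, and standard deviation directly from the formulas, using the ten measurements from Section 1.2 as example data. The function names are illustrative, and the comparison with Python's standard `statistics` module is included only as a cross-check.

```python
import math
import statistics

def sample_mean(ys):
    """Definition 1.1: the sum of the measurements divided by n."""
    return sum(ys) / len(ys)

def sample_variance(ys):
    """Definition 1.2: sum of squared deviations from the mean,
    divided by n - 1 (not n)."""
    y_bar = sample_mean(ys)
    return sum((y - y_bar) ** 2 for y in ys) / (len(ys) - 1)

def sample_std(ys):
    """Definition 1.3: positive square root of the sample variance."""
    return math.sqrt(sample_variance(ys))

# The ten measurements from Section 1.2, used here purely as example data.
data = [2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, 2.9]

y_bar = sample_mean(data)
s2 = sample_variance(data)
s = sample_std(data)
print(f"mean = {y_bar:.3f}, variance = {s2:.4f}, standard deviation = {s:.4f}")

# Cross-check: statistics.variance also uses the n - 1 divisor.
print(f"statistics.variance gives {statistics.variance(data):.4f}")
```

For these data the mean is 2.47 and the sum of squared deviations is .521, so $s^2 = .521/9$, roughly .058, and s is roughly .24.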
Many distributions of data in real life are mound-shaped; that is, they can be approximated by a bell-shaped frequency distribution known as a normal curve. Data possessing mound-shaped distributions have definite characteristics of variation, as expressed in the following statement.

Empirical Rule
For a distribution of measurements that is approximately normal (bell shaped), it follows that the interval with end points
µ ± σ contains approximately 68% of the measurements.
µ ± 2σ contains approximately 95% of the measurements.
µ ± 3σ contains almost all of the measurements.

FIGURE 1.4 Normal curve (central area labeled 68%)

As was mentioned in Section 1.2, once the frequency distribution of a set of measurements is known, probability statements regarding the measurements can be made. These probabilities were shown as areas under a frequency histogram. Analogously, the probabilities specified in the empirical rule are areas under the normal curve shown in Figure 1.4.

Use of the empirical rule is illustrated by the following example. Suppose that the scores on an achievement test given to all high school seniors in a state are known to have, approximately, a normal distribution with mean µ = 64 and standard deviation σ = 10. It can then be deduced that approximately 68% of the scores are between 54 and 74, that approximately 95% of the scores are between 44 and 84, and that almost all of the scores are between 34 and 94. Thus, knowledge of the mean and the standard deviation gives us a fairly good picture of the frequency distribution of scores.

Suppose that a single high school student is randomly selected from those who took the test. What is the probability that his score will be between 54 and 74? Based on the empirical rule, we find that 0.68 is a reasonable answer to this probability question.
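The percentages in the empirical rule can be compared with the areas under an exactly normal curve. The sketch below does this for the achievement-test example (µ = 64, σ = 10) using `NormalDist` from Python's standard `statistics` module. It assumes the scores are exactly normal, which the text claims only approximately, so the output is a reference point rather than a statement about real scores.

```python
from statistics import NormalDist

mu, sigma = 64, 10                 # achievement-test example from the text
scores = NormalDist(mu, sigma)     # assumption: exactly normal scores

coverage = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    # Area under the normal curve between mu - k*sigma and mu + k*sigma.
    coverage[k] = scores.cdf(high) - scores.cdf(low)
    print(f"{low} to {high}: area = {coverage[k]:.4f}")
```

For an exactly normal distribution these areas are about .683, .954, and .997, which is where the rule's "approximately 68%," "approximately 95%," and "almost all" come from.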
The utility and value of the empirical rule are due to the common occurrence of approximately normal distributions of data in nature, and all the more so because the rule applies to distributions that are not exactly normal but just mound-shaped. You will find that approximately 95% of a set of measurements will be within 2σ of µ for a variety of distributions.

Exercises

1.9 Resting breathing rates for college-age students are approximately normally distributed with mean 12 and standard deviation 2.3 breaths per minute. What fraction of all college-age students have breathing rates in the following intervals?
a 9.7 to 14.3 breaths per minute
b 7.4 to 16.6 breaths per minute
c 9.7 to 16.6 breaths per minute
d Less than 5.1 or more than 18.9 breaths per minute

1.10 It has been projected that the average and standard deviation of the amount of time spent online using the Internet are, respectively, 14 and 17 hours per person per year (many do not use the Internet at all!).
a What value is exactly 1 standard deviation below the mean?
b If the amount of time spent online using the Internet is approximately normally distributed, what proportion of the users spend an amount of time online that is less than the value you found in part (a)?
c Is the amount of time spent online using the Internet approximately normally distributed? Why?

1.11 The following results on summations will help us in calculating the sample variance $s^2$. For any constant c,
a $\sum_{i=1}^{n} c = nc.$
b $\sum_{i=1}^{n} c y_i = c \sum_{i=1}^{n} y_i.$
c $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i.$
Use (a), (b), and (c) to show that

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} y_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} y_i \right)^2 \right].$$

1.12 Use the result of Exercise 1.11 to calculate s for the n = 6 sample measurements 1, 4, 2, 1, 3, and 3.

1.13 Refer to Exercise 1.2.
a Calculate $\bar{y}$ and s for the data given.
b Calculate the interval $\bar{y} \pm ks$ for k = 1, 2, and 3.
Count the number of measurements that fall within each interval and compare this result with the number that you would expect according to the empirical rule.

1.14 Refer to Exercise 1.3 and repeat parts (a) and (b) of Exercise 1.13.

1.15 Refer to Exercise 1.4 and repeat parts (a) and (b) of Exercise 1.13.

1.16 In Exercise 1.4, there is one extremely large value (11.88). Eliminate this value and calculate ȳ and s for the remaining 39 observations. Also, calculate the intervals ȳ ± ks for k = 1, 2, and 3; count the number of measurements in each; then compare these results with those predicted by the empirical rule. Compare the answers here to those found in Exercise 1.15. Note the effect of a single large observation on ȳ and s.

1.17 The range of a set of measurements is the difference between the largest and the smallest values. The empirical rule suggests that the standard deviation of a set of measurements may be roughly approximated by one-fourth of the range (that is, range/4). Calculate this approximation to s for the data sets in Exercises 1.2, 1.3, and 1.4. Compare the result in each case to the actual, calculated value of s.

1.18 The College Board's verbal and mathematics Scholastic Aptitude Tests are scored on a scale of 200 to 800. It seems reasonable to assume that the test scores are approximately normally distributed for both tests. Use the result from Exercise 1.17 to approximate the standard deviation for scores on the verbal test.

1.19 According to the Environmental Protection Agency, chloroform, which in its gaseous form is suspected to be a cancer-causing agent, is present in small quantities in all the country's 240,000 public water sources. If the mean and standard deviation of the amounts of chloroform present in water sources are 34 and 53 micrograms per liter (µg/L), respectively, explain why chloroform amounts do not have a normal distribution.
1.20 Weekly maintenance costs for a factory, recorded over a long period of time and adjusted for inflation, tend to have an approximately normal distribution with an average of $420 and a standard deviation of $30. If $450 is budgeted for next week, what is an approximate probability that this budgeted figure will be exceeded?

1.21 The manufacturer of a new food additive for beef cattle claims that 80% of the animals fed a diet including this additive should have monthly weight gains in excess of 20 pounds. A large sample of measurements on weight gains for cattle fed this diet exhibits an approximately normal distribution with mean 22 pounds and standard deviation 2 pounds. Do you think the sample information contradicts the manufacturer's claim? (Calculate the probability of a weight gain exceeding 20 pounds.)

1.4 How Inferences Are Made

The mechanism instrumental in making inferences can be well illustrated by analyzing our own intuitive inference-making procedures.

Suppose that two candidates are running for a public office in our community and that we wish to determine whether our candidate, Jones, is favored to win. The population of interest is the set of responses from all eligible voters who will vote on election day, and we wish to determine whether the fraction favoring Jones exceeds .5. For the sake of simplicity, suppose that all eligible voters will go to the polls and that we randomly select a sample of 20 from the courthouse roster of voters. All 20 are contacted and all favor Jones. What do you conclude about Jones's prospects for winning the election?

There is little doubt that most of us would immediately infer that Jones will win. This is an easy inference to make, but this inference itself is not our immediate goal. Rather, we wish to examine the mental processes that were employed in reaching this conclusion about the prospective behavior of a large voting population based on a sample of only 20 people.
Winning means acquiring more than 50% of the votes. Did we conclude that Jones would win because we thought that the fraction favoring Jones in the sample was identical to the fraction favoring Jones in the population? We know that this is probably not true. A simple experiment will verify that the fraction in the sample favoring Jones need not be the same as the fraction of the population who favor him. If a balanced coin is tossed, it is intuitively obvious that the true proportion of times it will turn up heads is .5. Yet if we sample the outcomes for our coin by tossing it 20 times, the proportion of heads will vary from sample to sample; that is, on one occasion we might observe 12 heads out of 20 flips, for a sample proportion of 12/20 = .6. On another occasion, we might observe 8 heads out of 20 flips, for a sample proportion of 8/20 = .4. In fact, the sample proportion of heads could be 0, .05, .10, . . . , 1.0.

Did we conclude that Jones would win because it would be impossible for 20 out of 20 sample voters to favor him if in fact less than 50% of the electorate intended to vote for him? The answer to this question is certainly no, but it provides the key to our hidden line of logic. It is not impossible to draw 20 out of 20 favoring Jones when less than 50% of the electorate favor him, but it is highly improbable. As a result, we concluded that he would win.

This example illustrates the potent role played by probability in making inferences. Probabilists assume that they know the structure of the population of interest and use the theory of probability to compute the probability of obtaining a particular sample. Assuming that they know the structure of a population generated by random drawings of five cards from a standard deck, probabilists compute the probability that the draw will yield three aces and two kings. Statisticians use probability to make the trip in reverse, from the sample to the population.
Observing five aces in a sample of five cards, they immediately infer that the deck (which generates the population) is loaded and not standard. The probability of drawing five aces from a standard deck is zero! This is an exaggerated case, but it makes the point. Basic to inference making is the problem of calculating the probability of an observed sample. As a result, probability is the mechanism used in making statistical inferences.

One final comment is in order. If you did not think that the sample justified an inference that Jones would win, do not feel too chagrined. One can easily be misled when making intuitive evaluations of the probabilities of events. If you decided that the probability was very low that 20 voters out of 20 would favor Jones, assuming that Jones would lose, you were correct. However, it is not difficult to concoct an example in which an intuitive assessment of probability would be in error. Intuitive assessments of probabilities are unsatisfactory, and we need a rigorous theory of probability in order to develop methods of inference.

1.5 Theory and Reality

Theories are conjectures proposed to explain phenomena in the real world. As such, theories are approximations or models for reality. These models or explanations of reality are presented in verbal form in some less quantitative fields and as mathematical relationships in others. Whereas a theory of social change might be expressed verbally in sociology, a description of the motion of a vibrating string is presented in a precise mathematical manner in physics. When we choose a mathematical model for a physical process, we hope that the model reflects faithfully, in mathematical terms, the attributes of the physical process. If so, the mathematical model can be used to arrive at conclusions about the process itself.
If we could develop an equation to predict the position of a vibrating string, the quality of the prediction would depend on how well the equation fit the motion of the string. The process of finding a good equation is not necessarily simple and usually requires several simplifying assumptions (uniform string mass, no air resistance, etc.). The final criterion for deciding whether a model is "good" is whether it yields good and useful information. The motivation for using mathematical models lies primarily in their utility.

This text is concerned with the theory of statistics and hence with models of reality. We will postulate theoretical frequency distributions for populations and will develop a theory of probability and inference in a precise mathematical manner. The net result will be a theoretical or mathematical model for acquiring and utilizing information in real life. The model will not be an exact representation of nature, but this should not disturb us. Its utility, like that of other theories, will be measured by its ability to assist us in understanding nature and in solving problems in the real world.

1.6 Summary

The objective of statistics is to make an inference about a population based on information contained in a sample taken from that population. The theory of statistics is a theory of information concerned with quantifying information, designing experiments or procedures for data collection, and analyzing data. Our goal is to minimize the cost of a specified quantity of information and to use this information to make inferences. Most important, we have viewed making an inference about the unknown population as a two-step procedure. First, we enlist a suitable inferential procedure for the given situation. Second, we seek a measure of the goodness of the resulting inference.
For example, every estimate of a population characteristic based on information contained in the sample might have associated with it a probabilistic bound on the error of estimation. A necessary prelude to making inferences about a population is the ability to describe a set of numbers. Frequency distributions provide a graphic and useful method for characterizing conceptual or real populations of numbers. Numerical descriptive measures are often more useful when we wish to make an inference and measure the goodness of that inference. The mechanism for making inferences is provided by the theory of probability. The probabilist reasons from a known population to the outcome of a single experiment, the sample. In contrast, the statistician utilizes the theory of probability to calculate the probability of an observed sample and to infer from this the characteristics of an unknown population. Thus, probability is the foundation of the theory of statistics. Finally, we have noted the difference between theory and reality. In this text, we will study the mathematical theory of statistics, which is an idealization of nature. It is rigorous, mathematical, and subject to study in a vacuum completely isolated from the real world. Or it can be tied very closely to reality and can be useful in making inferences from data in all fields of science. In this text, we will be utilitarian. We will not regard statistics as a branch of mathematics but as an area of science concerned with developing a practical theory of information. We will consider statistics as a separate field, analogous to physics—not as a branch of mathematics but as a theory of information that utilizes mathematics heavily. Subsequent chapters will expand on the topics that we have encountered in this introduction. We will begin with a study of the mechanism employed in making inferences, the theory of probability. 
This theory provides theoretical models for generating experimental data and thereby provides the basis for our study of statistical inference.

References and Further Readings

Cleveland, W. S. 1994. The Elements of Graphing Data. Murray Hill, N.J.: AT&T Bell Laboratories.
Cleveland, W. S. 1993. Visualizing Data. Summit, N.J.: Hobart Press.
Fraser, D. A. S. 1958. Statistics, an Introduction. New York: Wiley.
Freund, J. E., and R. E. Walpole. 1987. Mathematical Statistics, 4th ed. Englewood Cliffs, N.J.: Prentice Hall.
Iman, R. L. 1994. A Data-Based Approach to Statistics. Belmont, Calif.: Duxbury Press.
Mendenhall, W., R. J. Beaver, and B. M. Beaver. 2006. Introduction to Probability and Statistics, 12th ed. Belmont, Calif.: Duxbury Press.
Mood, A. M., F. A. Graybill, and D. Boes. 1974. Introduction to the Theory of Statistics, 3rd ed. New York: McGraw-Hill.
Moore, D. S., and G. P. McCabe. 2002. Introduction to the Practice of Statistics, 4th ed. New York: Freeman.
Rice, J. A. 1995. Mathematical Statistics and Data Analysis, 2nd ed. Belmont, Calif.: Duxbury Press.
Stuart, A., and J. K. Ord. 1991. Kendall's Theory of Statistics, 5th ed., vol. 1. London: Edward Arnold.

Supplementary Exercises

1.22 Prove that the sum of the deviations of a set of measurements about their mean is equal to zero; that is,
$$\sum_{i=1}^{n} (y_i - \bar{y}) = 0.$$

1.23 The mean duration of television commercials is 75 seconds with standard deviation 20 seconds. Assume that the durations are approximately normally distributed to answer the following.
a What percentage of commercials last longer than 95 seconds?
b What percentage of the commercials last between 35 and 115 seconds?
c Would you expect a commercial to last longer than 2 minutes? Why or why not?

1.24 Aqua running has been suggested as a method of cardiovascular conditioning for injured athletes and others who desire a low-impact aerobics program.
In a study to investigate the relationship between exercise cadence and heart rate,¹ the heart rates of 20 healthy volunteers were measured at a cadence of 48 cycles per minute (a cycle consisted of two steps). The data are as follows:

87 101 109 91 79 78 80 112 96 94
95 98 90 94 92 107 96 81 98 96

a Use the range of the measurements to obtain an estimate of the standard deviation.
b Construct a frequency histogram for the data. Use the histogram to obtain a visual approximation to ȳ and s.
c Calculate ȳ and s. Compare these results with the calculation checks provided by parts (a) and (b).
d Construct the intervals ȳ ± ks, k = 1, 2, and 3, and count the number of measurements falling in each interval. Compare the fractions falling in the intervals with the fractions that you would expect according to the empirical rule.

1. R. P. Wilder, D. Breenan, and D. E. Schotte, "A Standard Measure for Exercise Prescription for Aqua Running," American Journal of Sports Medicine 21(1) (1993): 45.

1.25 The following data give the lengths of time to failure for n = 88 radio transmitter-receivers:

16 392 358 304 108 156 438 60 360 56 168
224 576 384 16 194 216 120 208 232 72 168
16 128 256 72 136 168 308 340 40 64 114
80 56 246 8 224 184 32 104 112 40 280
96 656 328 80 80 552 272 72 112 184 152
536 224 464 72 16 72 152 168 288 264 208
400 40 448 56 424 184 328 40 168 96 160
80 32 716 608 264 240 480 152 352 224 176

a Use the range to approximate s for the n = 88 lengths of time to failure.
b Construct a frequency histogram for the data. [Notice the tendency of the distribution to tail outward (skew) to the right.]
c Use a calculator (or computer) to calculate ȳ and s. (Hand calculation is much too tedious for this exercise.)
d Calculate the intervals ȳ ± ks, k = 1, 2, and 3, and count the number of measurements falling in each interval. Compare your results with the empirical rule results.
Note that the empirical rule provides a rather good description of these data, even though the distribution is highly skewed.

1.26 Compare the ratio of the range to s for the three sample sizes (n = 6, 20, and 88) for Exercises 1.12, 1.24, and 1.25. Note that the ratio tends to increase as the amount of data increases. The greater the amount of data, the greater will be their tendency to contain a few extreme values that will inflate the range and have relatively little effect on s. We ignored this phenomenon and suggested that you use 4 as the ratio for finding a guessed value of s in checking calculations.

1.27 A set of 340 examination scores exhibiting a bell-shaped relative frequency distribution has a mean of ȳ = 72 and a standard deviation of s = 8. Approximately how many of the scores would you expect to fall in the interval from 64 to 80? The interval from 56 to 88?

1.28 The discharge of suspended solids from a phosphate mine is normally distributed with mean daily discharge 27 milligrams per liter (mg/L) and standard deviation 14 mg/L. In what proportion of the days will the daily discharge be less than 13 mg/L?

1.29 A machine produces bearings with mean diameter 3.00 inches and standard deviation 0.01 inch. Bearings with diameters in excess of 3.02 inches or less than 2.98 inches will fail to meet quality specifications.
a Approximately what fraction of this machine's production will fail to meet specifications?
b What assumptions did you make concerning the distribution of bearing diameters in order to answer this question?

1.30 Compared to their stay-at-home peers, women employed outside the home have higher levels of high-density lipoproteins (HDL), the "good" cholesterol associated with lower risk for heart attacks. A study of cholesterol levels in 2000 women, aged 25–64, living in Augsburg, Germany, was conducted by Ursula Haertel, Ulrigh Keil, and colleagues² at the GSF-Medis Institut in Munich. Of these 2000 women, the 48% who worked outside the home had HDL levels that were between 2.5 and 3.6 milligrams per deciliter (mg/dL) higher than the HDL levels of their stay-at-home counterparts. Suppose that the difference in HDL levels is normally distributed, with mean 0 (indicating no difference between the two groups of women) and standard deviation 1.2 mg/dL. If you were to select an employed woman and a stay-at-home counterpart at random, what is the probability that the difference in their HDL levels would be between 1.2 and 2.4?

1.31 Over the past year, a fertilizer production process has shown an average daily yield of 60 tons with a variance in daily yields of 100. If the yield should fall to less than 40 tons tomorrow, should this result cause you to suspect an abnormality in the process? (Calculate the probability of obtaining less than 40 tons.) What assumptions did you make concerning the distribution of yields?

*1.32 Let k ≥ 1. Show that, for any set of n measurements, the fraction included in the interval ȳ − ks to ȳ + ks is at least (1 − 1/k²). [Hint:
$$s^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} (y_i - \bar{y})^2 \right].$$
In this expression, replace all deviations for which |yᵢ − ȳ| ≥ ks with ks. Simplify.] This result is known as Tchebysheff's theorem.³

1.33 A personnel manager for a certain industry has records of the number of employees absent per day. The average number absent is 5.5, and the standard deviation is 2.5. Because there are many days with zero, one, or two absent and only a few with more than ten absent, the frequency distribution is highly skewed. The manager wants to publish an interval in which at least 75% of these values lie. Use the result in Exercise 1.32 to find such an interval.

1.34 For the data discussed in Exercise 1.33, give an upper bound to the fraction of days when there are more than 13 absentees.

2. Science News 135 (June 1989): 389.
3. Exercises preceded by an asterisk are optional.

1.35 A pharmaceutical company wants to know whether an experimental drug has an effect on systolic blood pressure. Fifteen randomly selected subjects were given the drug and, after sufficient time for the drug to have an impact, their systolic blood pressures were recorded. The data appear below:

172 148 123 140 108 152 123 129 133 130 137 128 115 161 142

a Approximate the value of s using the range approximation.
b Calculate the values of ȳ and s for the 15 blood pressure readings.
c Use Tchebysheff's theorem (Exercise 1.32) to find values a and b such that at least 75% of the blood pressure measurements lie between a and b.
d Did Tchebysheff's theorem work? That is, use the data to find the actual percent of blood pressure readings that are between the values a and b you found in part (c). Is this actual percentage greater than 75%?

1.36 A random sample of 100 foxes was examined by a team of veterinarians to determine the prevalence of a specific parasite. Counting the number of parasites of this specific type, the veterinarians found that 69 foxes had no parasites of the type of interest, 17 had one parasite of the type under study, and so on. A summary of their results is given in the following table:

Number of Parasites   0   1   2   3   4   5   6   7   8
Number of Foxes      69  17   6   3   1   2   1   0   1

a Construct the relative frequency histogram for the number of parasites per fox.
b Calculate ȳ and s for the data given.
c What fraction of the parasite counts falls within 2 standard deviations of the mean? Within 3 standard deviations? Do your results agree with Tchebysheff's theorem (Exercise 1.32) and/or the empirical rule?

1.37 Studies indicate that drinking water supplied by some old lead-lined city piping systems may contain harmful levels of lead.
Based on data presented by Karalekas and colleagues,⁴ it appears that the distribution of lead content readings for individual water specimens has mean .033 mg/L and standard deviation .10 mg/L. Explain why it is obvious that the lead content readings are not normally distributed.

1.38 In Exercise 1.19, the mean and standard deviation of the amount of chloroform present in water sources were given to be 34 and 53, respectively. You argued that the amounts of chloroform could therefore not be normally distributed. Use Tchebysheff's theorem (Exercise 1.32) to describe the distribution of chloroform amounts in water sources.

4. P. C. Karalekas, Jr., C. R. Ryan, and F. B. Taylor, "Control of Lead, Copper and Iron Pipe Corrosion in Boston," American Water Works Journal (February 1983): 92.

CHAPTER 2 Probability

2.1 Introduction
2.2 Probability and Inference
2.3 A Review of Set Notation
2.4 A Probabilistic Model for an Experiment: The Discrete Case
2.5 Calculating the Probability of an Event: The Sample-Point Method
2.6 Tools for Counting Sample Points
2.7 Conditional Probability and the Independence of Events
2.8 Two Laws of Probability
2.9 Calculating the Probability of an Event: The Event-Composition Method
2.10 The Law of Total Probability and Bayes' Rule
2.11 Numerical Events and Random Variables
2.12 Random Sampling
2.13 Summary
References and Further Readings

2.1 Introduction

In everyday conversation, the term probability is a measure of one's belief in the occurrence of a future event. We accept this as a meaningful and practical interpretation of probability but seek a clearer understanding of its context, how it is measured, and how it assists in making inferences.

The concept of probability is necessary in work with physical, biological, or social mechanisms that generate observations that cannot be predicted with certainty.
For example, the blood pressure of a person at a given point in time cannot be predicted with certainty, and we never know the exact load that a bridge will endure before collapsing into a river. Such random events cannot be predicted with certainty, but the relative frequency with which they occur in a long series of trials is often remarkably stable. Events possessing this property are called random, or stochastic, events. This stable long-term relative frequency provides an intuitively meaningful measure of our belief in the occurrence of a random event if a future observation is to be made. It is impossible, for example, to predict with certainty the occurrence of heads on a single toss of a balanced coin, but we would be willing to state with a fair measure of confidence that the fraction of heads in a long series of trials would be very near .5. That this relative frequency is commonly used as a measure of belief in the outcome for a single toss is evident when we consider chance from a gambler's perspective. He risks money on the single toss of a coin, not a long series of tosses. The relative frequency of a head in a long series of tosses, which a gambler calls the probability of a head, gives him a measure of the chance of winning on a single toss. If the coin were unbalanced and gave 90% heads in a long series of tosses, the gambler would say that the probability of a head is .9, and he would be fairly confident in the occurrence of a head on a single toss of the coin.

The preceding example possesses some realistic and practical analogies. In many respects all people are gamblers. The research physician gambles time and money on a research project, and she is concerned with her success on a single flip of this symbolic coin. Similarly, the investment of capital in a new manufacturing plant is a gamble that represents a single flip of a coin on which the entrepreneur has high hopes for success.
The fraction of similar investments that are successful in a long series of trials is of interest to the entrepreneur only insofar as it provides a measure of belief in the successful outcome of a single individual investment.

The relative frequency concept of probability, although intuitively meaningful, does not provide a rigorous definition of probability. Many other concepts of probability have been proposed, including that of subjective probability, which allows the probability of an event to vary depending upon the person performing the evaluation. Nevertheless, for our purposes we accept an interpretation based on relative frequency as a meaningful measure of our belief in the occurrence of an event. Next, we will examine the link that probability provides between observation and inference.

2.2 Probability and Inference

The role that probability plays in making inferences will be discussed in detail after an adequate foundation has been laid for the theory of probability. At this point we will present an elementary treatment of this theory through an example and an appeal to your intuition.

The example selected is similar to that presented in Section 1.4 but simpler and less practical. It was chosen because of the ease with which we can visualize the population and sample and because it provides an observation-producing mechanism for which a probabilistic model will be constructed in Section 2.3. Consider a gambler who wishes to make an inference concerning the balance of a die. The conceptual population of interest is the set of numbers that would be generated if the die were rolled over and over again, ad infinitum. If the die were perfectly balanced, one-sixth of the measurements in this population would be 1s, one-sixth, 2s, one-sixth, 3s, and so on. The corresponding frequency distribution is shown in Figure 2.1.
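The conceptual population generated by a balanced die, and the stability of long-run relative frequencies described in Section 2.1, can be explored by simulation. The sketch below is only an illustration: it assumes that Python's pseudorandom generator is an adequate stand-in for a balanced die, and the function name `relative_frequencies` and the seed value are arbitrary choices made here.

```python
import random
from collections import Counter


def relative_frequencies(n_rolls: int, seed: int = 0) -> dict[int, float]:
    """Simulate n_rolls tosses of a balanced die; return each face's relative frequency."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same rolls
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}


# As the number of tosses grows, each relative frequency should settle near 1/6.
for n in (60, 600, 60000):
    freqs = relative_frequencies(n)
    print(n, {face: round(f, 3) for face, f in freqs.items()})
```

With only 60 tosses the relative frequencies can stray noticeably from 1/6; with many more tosses they typically cluster close to it, which is the behavior the relative frequency interpretation relies on.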
Using the scientific method, the gambler proposes the hypothesis that the die is balanced, and he seeks observations from nature to contradict the theory, if false.

FIGURE 2.1 Frequency distribution for the population generated by a balanced die (vertical axis: relative frequency, 1/6 for each face; horizontal axis: number on upper face of the die, 1 through 6)

A sample of ten tosses is selected from the population by rolling the die ten times. All ten tosses result in 1s. The gambler looks upon this output of nature with a jaundiced eye and concludes that his hypothesis is not in agreement with nature and hence that the die is not balanced.

The reasoning employed by the gambler identifies the role that probability plays in making inferences. The gambler rejected his hypothesis (and concluded that the die is unbalanced) not because it is impossible to throw ten 1s in ten tosses of a balanced die but because it is highly improbable. His evaluation of the probability was most likely subjective. That is, the gambler may not have known how to calculate the probability of ten 1s in ten tosses, but he had an intuitive feeling that this event was highly unlikely if the die were balanced. The point to note is that his decision was based on the probability of the observed sample.

The need for a theory of probability that will provide a rigorous method for finding a number (a probability) that will agree with the actual relative frequency of occurrence of an event in a long series of trials is apparent if we imagine a different result for the gambler's sample. Suppose, for example, that instead of ten 1s, he observed five 1s along with two 2s, one 3, one 4, and one 6. Is this result so improbable that we should reject our hypothesis that the die is balanced and conclude that the die is loaded in favor of 1s? If we must rely solely on experience and intuition to make our evaluation, it is not so easy to decide whether the probability of five 1s in ten tosses is large or small.
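For readers who want numbers at this point, the following Python sketch evaluates such probabilities with the binomial probability function listed in the table of discrete distributions. It runs ahead of the theory developed in this chapter and rests on assumptions stated here rather than in the passage above: the ten tosses are independent, the die is balanced (p = 1/6), and only the number of 1s is counted, not the faces shown by the other tosses. The function names are illustrative.

```python
from math import comb


def prob_exactly(k: int, n: int = 10, p: float = 1 / 6) -> float:
    """Binomial probability of exactly k ones in n independent tosses."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k: int, n: int = 10, p: float = 1 / 6) -> float:
    """Probability of k or more ones in n independent tosses."""
    return sum(prob_exactly(j, n, p) for j in range(k, n + 1))


print(f"P(ten 1s in ten tosses)       = {prob_exactly(10):.3e}")
print(f"P(exactly five 1s)            = {prob_exactly(5):.4f}")
print(f"P(five or more 1s)            = {prob_at_least(5):.4f}")
print(f"P(exactly four 1s)            = {prob_exactly(4):.4f}")
```

Under these assumptions, ten 1s in ten tosses has probability (1/6)^10, which is on the order of one in sixty million, while exactly five 1s has probability of roughly .013. Whether a value like .013 is small enough to reject the hypothesis of balance is the kind of question the later chapters on inference address.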
The probability of throwing four 1s in ten tosses would be even more difficult to guess.

We will not deny that experimental results often are obviously inconsistent with a given hypothesis and lead to its rejection. However, many experimental outcomes fall in a gray area where we require a rigorous assessment of the probability of their occurrence. Indeed, it is not difficult to show that intuitive evaluations of probabilities often lead to answers that are substantially in error and result in incorrect inferences about the target population. For example, if there are 20 people in a room, most people would guess that it is very unlikely that there would be two or more persons with the same birthday. Yet, under certain reasonable assumptions, in Example 2.18 we will show that the probability of such an occurrence is larger than .4, a number that is surprisingly large to many.

We need a theory of probability that will permit us to calculate the probability (or a quantity proportional to the probability) of observing specified outcomes, assuming that our hypothesized model is correct. This topic will be developed in detail in subsequent chapters. Our immediate goal is to present an introduction to the theory of probability, which provides the foundation for modern statistical inference. We will begin by reviewing some set notation that will be used in constructing probabilistic models for experiments.

2.3 A Review of Set Notation

To proceed with an orderly development of probability theory, we need some basic concepts of set theory. We will use capital letters, A, B, C, . . . , to denote sets of points. If the elements in the set A are a₁, a₂, and a₃, we will write A = {a₁, a₂, a₃}.

Let S denote the set of all elements under consideration; that is, S is the universal set. For any two sets A and B, we will say that A is a subset of B, or A is contained in B (denoted A ⊂ B), if every point in A is also in B.
The null, or empty, set, denoted by ∅, is the set consisting of no points. Thus, ∅ is a subset of every set.

Sets and relationships between sets can be conveniently portrayed by using Venn diagrams. The Venn diagram in Figure 2.2 shows two sets, A and B, in the universal set S. Set A is the set of all points inside the triangle; set B is the set of all points inside the circle. Note that in Figure 2.2, A ⊂ B.

FIGURE 2.2 Venn diagram for A ⊂ B

Consider now two arbitrary sets of points. The union of A and B, denoted by A ∪ B, is the set of all points in A or B or both. That is, the union of A and B contains all points that are in at least one of the sets. The Venn diagram in Figure 2.3 shows two sets A and B, where A is the set of points in the left-hand circle and B is the set of points in the right-hand circle. The set A ∪ B is the shaded region consisting of all points inside either circle (or both). The key word for expressing the union of two sets is or (meaning A or B or both).

FIGURE 2.3 Venn diagram for A ∪ B

The intersection of A and B, denoted by A ∩ B or by AB, is the set of all points in both A and B. The Venn diagram of Figure 2.4 shows two sets A and B, with A ∩ B consisting of the points in the shaded region where the two sets overlap. The key word for expressing intersections is and (meaning A and B simultaneously).

FIGURE 2.4 Venn diagram for AB

If A is a subset of S, then the complement of A, denoted by Ā, is the set of points that are in S but not in A. Figure 2.5 is a Venn diagram illustrating that the shaded area in S but not in A is Ā. Note that A ∪ Ā = S.

Two sets, A and B, are said to be disjoint, or mutually exclusive, if A ∩ B = ∅. That is, mutually exclusive sets have no points in common. The Venn diagram in Figure 2.6 illustrates two sets A and B that are mutually exclusive. Referring to Figure 2.5, it is easy to see that, for any set A, A and Ā are mutually exclusive.
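The set operations just defined map directly onto Python's built-in `set` type, which offers one quick way to experiment with them. In the sketch below the universal set and the sets A, B, and C are arbitrary illustrative choices, not taken from the text, and `complement` is a small helper defined here because Python sets have no built-in notion of a universal set.

```python
# Illustrative universal set and subsets (arbitrary choices for this sketch).
S = set(range(10))          # universal set {0, 1, ..., 9}
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}
C = {7, 8}


def complement(X: set, universe: set = S) -> set:
    """Points of the universal set that are not in X."""
    return universe - X


print("A union B        :", sorted(A | B))
print("A intersect B    :", sorted(A & B))
print("complement of A  :", sorted(complement(A)))
print("A subset of S?   :", A <= S)
print("empty subset of A:", set() <= A)
print("A, C disjoint?   :", A.isdisjoint(C))
print("A, comp(A) disjoint and exhaustive:",
      A.isdisjoint(complement(A)) and (A | complement(A)) == S)
```

The operators `|`, `&`, and `<=` correspond to union, intersection, and the subset relation, while `isdisjoint` checks whether two sets are mutually exclusive.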
Consider the die-tossing problem of Section 2.2 and let S denote the set of all possible numerical observations for a single toss of a die. That is, S = {1, 2, 3, 4, 5, 6}. Let A = {1, 2}, B = {1, 3}, and C = {2, 4, 6}. Then A ∪ B = {1, 2, 3}, A ∩ B = {1}, and $\overline{A} = \{3, 4, 5, 6\}$. Also, note that B and C are mutually exclusive, whereas A and C are not.

[Figure 2.5: Venn diagram for $\overline{A}$]

[Figure 2.6: Venn diagram for mutually exclusive sets A and B]

We will not attempt a thorough review of set algebra, but we mention four equalities of considerable importance. These are the distributive laws, given by

$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C), \qquad A \cup (B \cap C) = (A \cup B) \cap (A \cup C),$$

and DeMorgan's laws:

$$\overline{A \cap B} = \overline{A} \cup \overline{B} \quad \text{and} \quad \overline{A \cup B} = \overline{A} \cap \overline{B}.$$

In the next section we will proceed with an elementary discussion of probability theory.

Exercises

2.1 Suppose a family contains two children of different ages, and we are interested in the gender of these children. Let F denote that a child is female and M that the child is male and let a pair such as FM denote that the older child is female and the younger is male. There are four points in the set S of possible observations: S = {FF, FM, MF, MM}. Let A denote the subset of possibilities containing no males; B, the subset containing two males; and C, the subset containing at least one male. List the elements of A, B, C, A ∩ B, A ∪ B, A ∩ C, A ∪ C, B ∩ C, B ∪ C, and $C \cap \overline{B}$.

2.2 Suppose that A and B are two events. Write expressions involving unions, intersections, and complements that describe the following:
  a Both events occur.
  b At least one occurs.
  c Neither occurs.
  d Exactly one occurs.

2.3 Draw Venn diagrams to verify DeMorgan's laws. That is, for any two sets A and B, $\overline{A \cup B} = \overline{A} \cap \overline{B}$ and $\overline{A \cap B} = \overline{A} \cup \overline{B}$.

2.4 If A and B are two sets, draw Venn diagrams to verify the following:
  a $A = (A \cap B) \cup (A \cap \overline{B})$.
  b If B ⊂ A then $A = B \cup (A \cap \overline{B})$.

2.5 Refer to Exercise 2.4. Use the identities A = A ∩ S and $S = B \cup \overline{B}$ and a distributive law to prove that
  a $A = (A \cap B) \cup (A \cap \overline{B})$.
  b If B ⊂ A then $A = B \cup (A \cap \overline{B})$.
  c Further, show that (A ∩ B) and $(A \cap \overline{B})$ are mutually exclusive and therefore that A is the union of two mutually exclusive sets, (A ∩ B) and $(A \cap \overline{B})$.
  d Also show that B and $(A \cap \overline{B})$ are mutually exclusive and if B ⊂ A, A is the union of two mutually exclusive sets, B and $(A \cap \overline{B})$.

2.6 From a survey of 60 students attending a university, it was found that 9 were living off campus, 36 were undergraduates, and 3 were undergraduates living off campus. Find the number of these students who were
  a undergraduates, were living off campus, or both.
  b undergraduates living on campus.
  c graduate students living on campus.

2.7 A group of five applicants for a pair of identical jobs consists of three men and two women. The employer is to select two of the five applicants for the jobs. Let S denote the set of all possible outcomes for the employer's selection. Let A denote the subset of outcomes corresponding to the selection of two men and B the subset corresponding to the selection of at least one woman. List the outcomes in A, $\overline{B}$, A ∪ B, A ∩ B, and $A \cap \overline{B}$. (Denote the different men and women by M₁, M₂, M₃ and W₁, W₂, respectively.)

2.8 Suppose two dice are tossed and the numbers on the upper faces are observed. Let S denote the set of all possible pairs that can be observed. [These pairs can be listed, for example, by letting (2, 3) denote that a 2 was observed on the first die and a 3 on the second.]
  a Define the following subsets of S:
    A: The number on the second die is even.
    B: The sum of the two numbers is even.
    C: At least one number in the pair is odd.
  b List the points in A, $\overline{C}$, A ∩ B, $A \cap \overline{B}$, $\overline{A} \cup B$, and $\overline{A} \cap C$.

2.4 A Probabilistic Model for an Experiment: The Discrete Case

In Section 2.2 we referred to the die-tossing experiment when we observed the number appearing on the upper face.
We will use the term experiment to include observations obtained from completely uncontrollable situations (such as observations on the daily price of a particular stock) as well as tho