Chapter 6 Statistical Testing

Modified: 2006-03-23


Chapter 6 focuses on the logic of null hypothesis statistical testing (NHST). Six statistical tests that are based on NHST logic are explained. The chapter also discusses statistical power and introduces non-NHST approaches to data analysis. NOTE: we are not covering Chapter 5. You may wish to review that chapter in order to refresh your basic statistical knowledge.

Objectives for Chapter 6

After studying this chapter and working the problems, you should be able to:


Consider the following data, first presented in Chapter 5 (Jurica, Alanis, & Ogletree, 2002). The question is whether males prefer violent video games more than females do. We'll come back to this research question soon.

Gender    Nonviolent    Violent    Total
Female        24            7        31
Male          87           65       152
Total        111           72       183

Table 6.1 Observed frequencies from study of gender and video games. Total frequencies are in the margins.

SAMPLING FLUCTUATION

sampling fluctuation--The chance differences between samples and the population the samples are from.

chi square test--An NHST test that is appropriate for category data.

A NEGATIVE INFERENCE LOGIC PROBLEM

CHI SQUARE LOGIC

Gender    Nonviolent    Violent    Total
Female        21           10        31
Male          90           62       152
Total        111           72       183

  • The probability of Table 6.6 Array B occurring is .036.

Gender    Nonviolent    Violent    Total
Female        29            2        31
Male          82           70       152
Total        111           72       183

  • If a low-probability data array actually occurs, then we must reject the idea that the variables that produced the data were statistically independent.
  • Can you see how this is a case of negative inference?
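
The chi square logic above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the textbook's own procedure: it takes the observed frequencies from Table 6.1, computes the frequencies expected if gender and game preference were statistically independent, and converts the resulting chi square value to a probability. The helper name chi_square_2x2 is hypothetical, no continuity correction is applied, and the erfc shortcut used for the probability is valid only for 1 degree of freedom.

```python
import math

def chi_square_2x2(observed):
    """Chi square test of independence for a 2x2 table of observed counts.

    Hypothetical helper for illustration; no continuity correction is applied.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # With 1 df, P(chi square >= x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value

# Observed frequencies from Table 6.1: rows are Female, Male;
# columns are Nonviolent, Violent.
table_6_1 = [[24, 7], [87, 65]]
expected, chi2, p_value = chi_square_2x2(table_6_1)
print(f"chi square = {chi2:.2f}, p = {p_value:.3f}")
```

If the printed probability is smaller than the chosen alpha, the data are improbable under independence, and the negative inference is to reject independence.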

NULL HYPOTHESIS STATISTICAL TESTING (NHST) LOGIC

Rejection regions, two-tailed

Rejection regions, one-tailed (can also be at -1.64)
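
The boundaries of these rejection regions can be computed rather than looked up. The sketch below assumes the sampling distribution is the standard normal distribution and that alpha is .05, which is what the 1.64 value in the caption suggests; the function name z_critical is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha=0.05, tails=2):
    """Return the positive critical z value for a one- or two-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha between the two tails.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed = z_critical(0.05, tails=2)  # reject H0 if |z| exceeds this value
one_tailed = z_critical(0.05, tails=1)  # reject H0 if z exceeds this value (or falls below its negative)
print(f"two-tailed: +/-{two_tailed:.2f}, one-tailed: {one_tailed:.2f}")
```

The one-tailed boundary is closer to the center because all of alpha sits in a single tail.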

Sampling distribution for t-test with 1, 10, 20, and 30 df

Above figure from: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3664.htm

Decision based on sample data    True situation: H0 true           True situation: H0 false
Reject H0                        Type I error (False rejection)    Correct decision
Retain H0                        Correct decision                  Type II error (False acceptance)

Table 6.7 Illustration of the types of errors that can occur in NHST testing
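
A small simulation can make the two error types in Table 6.7 concrete. The sketch below is illustrative only: it assumes a two-tailed one-sample z test with a known population standard deviation, and the population values (mean 100, standard deviation 15, a true mean of 106 when H0 is false), the sample size, and the function names are all hypothetical choices, not values from the chapter.

```python
import random
from statistics import NormalDist, mean

def rejects_h0(sample, mu0, sigma, alpha=0.05):
    """Two-tailed one-sample z test with known sigma; True means reject H0."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

def rejection_rate(true_mean, mu0=100.0, sigma=15.0, n=25, trials=4000, seed=1):
    """Proportion of simulated samples in which H0 (mean = mu0) is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        rejections += rejects_h0(sample, mu0, sigma)
    return rejections / trials

# H0 true: every rejection is a Type I error, so the rate should be near alpha.
type_i_rate = rejection_rate(true_mean=100.0)
# H0 false: every retention is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mean=106.0)
print(f"Type I rate = {type_i_rate:.3f}, Type II rate = {type_ii_rate:.3f}")
```

The Type I rate is set by the researcher through alpha, while the Type II rate depends on how false H0 is and on the sample size.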

 

NHST (Null Hypothesis Statistical Testing) TESTS

TESTS FOR RANKED DATA

CONFIDENCE INTERVALS

A STUDENT'S GUIDE TO ANALYZING DATA

After you finish gathering data for your experiment, the next task is to analyze it. For most researchers, this is the most exciting part of the project. Here is our suggested order for analyzing the data.

POWER
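
As a rough illustration of power, defined in the glossary as the probability of correctly rejecting a false null hypothesis, the sketch below computes power analytically. It assumes a two-tailed one-sample z test with known standard deviation; the function name z_test_power and the effect size of 0.5 standard deviations are hypothetical choices for the example.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-tailed one-sample z test.

    effect_size is the true mean difference in standard deviation units.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5  # center of the z statistic when H0 is false
    # Probability that z lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

for n in (10, 32, 64):
    print(n, round(z_test_power(0.5, n), 3))
```

Holding the effect size and alpha constant, power rises as the sample size grows; when the effect size is zero the function returns alpha, the Type I error rate.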

META-ANALYSIS

In The Know--Chi square was invented in 1900 by Karl Pearson (of Pearson product-moment correlation coefficient fame). It is used by researchers in almost every discipline that uses quantitative data. The importance of chi square was recognized when it was listed as one of the 20 greatest discoveries of the 20th century by the editors of a popular science magazine (Hacking, 1984).

In The Know--Ronald A. Fisher introduced the null hypothesis concept in 1925 in an influential book written for practical research workers. His goal was to give researchers guidelines and the .05 cutoff was proposed as a guideline. Unfortunately, later writers and researchers treated it as a rule. The exclusive reliance on NHST techniques for analyzing quantitative data is under attack these days. In the 1990s there was a move to ban its use. The American Psychological Association (APA) assembled a task force that recommended that NHST not be banned, but that researchers not rely exclusively on NHST. The task force mentioned exploratory data analysis and confidence intervals as examples of alternatives to NHST. The APA report can be viewed at http://www.apa.org/science/bsaweb-tfsi.html. Other accessible explanations of the controversy and its outcome are in Dillon (1999) and Spatz (2000).

In The Know--It is important to recognize that NHST is just an aid to decision making. The NHST technique results in one of two decisions:

  • 1) Reject H0, accept H1, and write a strong conclusion that eliminates sampling fluctuation as an explanation.
  • 2) Fail to reject H0, which leads to retaining both H0 and H1.

In recent years a number of prominent researchers have questioned the value of the NHST technique. They argue that other approaches provide a more extensive analysis and will result in faster progress in our effort to understand behavior and cognitive processes. The issues raised are not simple and are beyond the scope of this introduction to research methods. References that help explain this controversy are Dillon (1999), Spatz (2000), and Nickerson (2000). Many of the researchers who raised the questions about NHST contributed to a book whose title captures the problem, What If There Were No Significance Tests? (Harlow, Mulaik, & Steiger, 1997).

GLOSSARY

alpha (α)--The probability that is the criterion for rejecting the null hypothesis.

alternative hypothesis--A hypothesis that two variables are related or that two population means are not equal.

chi square test--An NHST test that is appropriate for category data.

confidence interval--A range of scores that is expected with a specified degree of confidence to capture a parameter.

critical value--The number from a sampling distribution that determines whether H0 is rejected.

degrees of freedom--A concept used by mathematical statisticians to determine the sampling distribution that is appropriate for a set of data.

meta-analysis--A quantitative technique that summarizes the results of many studies of a single topic.

null hypothesis--Usually a hypothesis that there is no relationship or that population means are equal.

null hypothesis statistical testing--An inferential statistics technique that measures the uncertainty that surrounds samples.

one-tailed test--A statistical test to detect a difference in population means, either positive or negative but not both.

power--The probability of correctly rejecting a false null hypothesis.

power analysis--A statistical analysis that solves for one of the factors that is involved in rejecting a false H0 with an NHST test.

rejection region--The portion of a sampling distribution that includes sample data that is less probable than alpha (α).

research hypothesis--The researcher's expectation of what the data will show.

robust--A statistical test that produces reasonably accurate probabilities even when the assumptions the test is based on are not fulfilled.

sampling distribution--A theoretical distribution based on random sampling that shows probabilities of actual sample outcomes.

sampling fluctuation--The chance differences between samples and the population the samples are from.

statistical independence--Two variables that, as their own levels change, do not produce changes in the other variable.

statistically significant--Sample data with a probability less than alpha (conventionally .05) when the null hypothesis is true.

t distribution--Sampling distribution used to determine probabilities for t tests.

two-tailed test--A statistical test to detect a difference in population means, regardless of direction.

