Chapter 8 Basic Designs II

Modified: 2006-02-08

ONE-GROUP DESIGNS

One-Group Two-Treatment Design

Table 8.7 Diagram for one-group experiments in which participants receive two different treatments

T1        O1        T2        O2

Time -------------------------------------------->

Where:

T = treatment

O = measure on the dependent variable

 

One-Group Pretest-posttest Design

Table 8.8 Diagram for a One-group pretest-posttest design

O1        T        O2

Time -------------------------------------------->

Where:

T = treatment

O = measure on the dependent variable

Three New Extraneous Variables in Chapter 8

Testing--The name that psychologists use for the extraneous variable of order is testing. Testing is an extraneous variable that must always be controlled or accounted for whenever participants take the same or a similar test more than once.

Instrument Change--The "instrument" in instrument change refers to the method of measuring behavior or cognitive processes in an experiment. This extraneous variable is also known as instrument decay or instrumentation. An instrument might be a human judge, a survey or test, or a machine; if the instrument changes during the course of the experiment, scores change for reasons that have nothing to do with the IV.

History--Whenever events in two or more conditions of a study differ, the events may be an extraneous variable. If events differ, psychologists refer to this extraneous variable as history. Sometimes an extraneous variable of history is built into the experiment, as it was with the cognitive processing experiment. Sometimes, however, an unexpected event occurs during a study that affects one group but not the other. For example, a loud noise or a nauseated participant during testing creates a distracting event. If the distraction occurs for one group but not for the other, the extraneous variable of history, and not the IV, may cause differences in the DV.

Review of Three Extraneous Variables from Chapter 7

Selection--Selection is an uncontrolled extraneous variable if the participants who contribute scores to one condition of the IV are different from the participants who contribute to other conditions (other than the IV difference). One of the best features of a within-subjects design is that the same participants contribute to all conditions. Thus, for within-subjects designs that occur in a short time frame, selection is a controlled extraneous variable.

Differential Attrition--Differential attrition occurs if you lose more scores from one level of the IV than from others AND the loss has a differential effect on the DV. Like selection, differential attrition is usually not a problem for experiments with short time frames. In some within-subjects experiments attrition does occur. Participants complete one part of the experiment but not another part. The problem of differential attrition is easily solved - the participant is dropped from the analysis and this action noted in the Method section of the write-up.

Diffusion of Treatment--Diffusion of treatment occurs when the participants or conditions of one level of the IV influence in some way the participants or conditions at other levels of the IV. Again, a logical analysis of the procedures of the cognitive processing experiment shows that diffusion of treatment was not a problem. Although it is conceivable that counting letters has an effect on writing associates and that this effect influences recollection of the second list, such "diffusion" does not seem very plausible.

Summary

In summary, the one-group two-treatment design is a poor design. The extraneous variable of testing is always present. Instrument change and history may be present. In addition, the design may also be subject to two additional extraneous variables that we discuss in the next section.

ADDITIONAL EXTRANEOUS VARIABLES

Thus far in this chapter and chapter 7 we have discussed six extraneous variables:

  1. Selection
  2. Differential Attrition
  3. Diffusion of Treatments
  4. Testing
  5. Instrument Change
  6. History

Maturation--Maturation is a concept that refers to biological and psychological changes that occur during development. When used to describe an extraneous variable, maturation refers to changes due to development and to temporary, short-term changes as well.

Normal aging can be a maturational extraneous variable in longitudinal studies of children who are assessed over a long period.

As for the temporary, short-term form of maturation, there are many experiments in which participants work for 30 minutes on a task. Tiredness, boredom, or inattention are examples of short-term maturational changes. The usual solution is to provide rest periods.

Regression--Regression (or regression to the mean) is a phenomenon that can occur when participants are tested a second time. Regression is most pronounced for participants who made the highest scores or the lowest scores on the first test. Specifically, the phenomenon of regression is that high-scoring participants generally score lower when tested a second time and low-scoring participants generally score higher on a second test. Thus, when all participants are retested, both extreme groups regress toward the overall mean.

Sometimes the first test is only implied rather than actually administered. For example, clients who seek therapy and sports figures who appear on the cover of the magazine Sports Illustrated represent extreme cases. People who seek therapy for depression are at a low point in their lives. Those chosen to be on the cover of a magazine such as Sports Illustrated are usually selected for outstanding athletic performance. In either case, we can expect regression toward the mean. That is, depressed clients become less depressed and outstanding athletes fail to repeat their best performances. A common observation about Nobel Prize winners is that they seldom continue to create additional ideas of Nobel Prize quality. Regression seems to be at work in these cases as well.
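The arithmetic behind regression to the mean can be demonstrated with a short simulation. The sketch below (Python with NumPy; all numbers are hypothetical and not taken from any study discussed here) models each test score as a stable true score plus random error and then retests the highest-scoring and lowest-scoring participants. Both extreme groups move back toward the overall mean.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000
true_score = rng.normal(100, 10, n)         # each participant's stable level
test1 = true_score + rng.normal(0, 10, n)   # first test = true score + random error
test2 = true_score + rng.normal(0, 10, n)   # second test = same true score, new error

top = test1 >= np.percentile(test1, 90)     # highest scorers on the first test
bottom = test1 <= np.percentile(test1, 10)  # lowest scorers on the first test

print(f"Top scorers:    test 1 mean = {test1[top].mean():.1f},  test 2 mean = {test2[top].mean():.1f}")
print(f"Bottom scorers: test 1 mean = {test1[bottom].mean():.1f}, test 2 mean = {test2[bottom].mean():.1f}")
# Both extreme groups move back toward the overall mean of 100 on the second test.
```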

Table 8.9 List of extraneous variables and their definitions. The words treatments and conditions are used interchangeably.

Extraneous variable -- Definition

Selection -- Participants in the different conditions are not equivalent before the treatment is administered.

Differential Attrition -- A differential loss of participants from the conditions that biases the DV scores.

Diffusion of Treatment -- The conditions intended for only one group of participants are experienced by other groups.

Testing -- Change in a participant's score is the result of having taken the same or a similar test earlier.

Instrument Change -- A change in a measuring instrument (human judge, survey, or machine) during the course of an experiment.

History -- Events that affect the DV that occur during one treatment but not other treatments.

Maturation -- Either long-term or short-term changes in participants that affect the DV.

Regression -- Extreme scores upon retesting produce a mean that is closer to the population mean.

ASSESSING ONE-GROUP DESIGNS

One-group designs are poor designs. Except for selection, differential attrition, and diffusion of treatment, these designs might harbor any (or all) of the other five extraneous variables - testing, instrument change, history, maturation, and regression.

TWO-GROUP DESIGNS

Nonrandom Two-Group Pretest-posttest Design

The nonrandom two-group pretest-posttest design has elements of both the one-group pretest-posttest design that is diagrammed in Table 8.8 and the nonrandom assignment design that you studied in chapter 7 (Table 7.9). Table 8.10 shows a diagram of a generic nonrandom two-group pretest-posttest design. As you can see, there are two groups. Each group is pretested, receives a treatment, and then takes a posttest. The effect of the treatment is expected to show up in the difference between the pretest and posttest.

Table 8.10 Generic notation for the Nonrandom Two-Group Pretest-posttest Design for Two Treatments

Method of Assignment    Dependent Variable    Independent Variable    Dependent Variable

NR                      O1                    T1                      O2

NR                      O1                    T2                      O2

Time -------------------------------------------->

Where:

NR - assignment to treatments is nonrandom

T1 and T2 - two treatment levels of the independent variable

O - observations that constitute the dependent variable

To illustrate the nonrandom two-group pretest-posttest design, let's return to our example of the experiment that attempted to assess the value of cognitive therapy for clients with depression (page 8-23). In that one-group pretest-posttest design experiment, clients completed the Beck Depression Inventory (BDI), participated in 10 weeks of cognitive therapy, and took the BDI a second time. If improvement were evident, it could be attributed to

1. Cognitive therapy, the IV in the experiment

2. Maturation, the improvement that often occurs naturally over time when people are ill

3. Regression, the movement of scores toward the mean when a low-scoring group is tested a second time

4. History, the effect of events outside the experiment in the lives of the participants

The first task is to answer the question of who might serve as a comparison group. One solution that is often available to researchers is to use clients who have been put on a waiting list because no therapist at the clinic has an opening when they first seek help. In this case the two groups are best identified as therapy and waiting list. Examine Table 8.11, which converts the generic Table 8.10 into a specific illustration of the cognitive therapy/depression study.

Table 8.11 Specific notation for the Nonrandom Two-Group Pretest-posttest Design for two treatments for the cognitive therapy/depression experiment

Method of Assignment      Dependent Variable    Independent Variable    Dependent Variable

Therapist available       BDI-1                 Cognitive Therapy       BDI-2

No therapist available    BDI-1                 No therapy at clinic    BDI-2

Time -------------------------------------------->

Selection--Selection appears to be fairly well controlled. Although the two groups might not be equal in depression to begin with, the effect of the treatment is found in the difference between the pretest and the posttest. The fact that the two groups might not be equal in the beginning is not really a problem.

Differential Attrition--Differential attrition could be a problem in this study. Of course, if the drop-out rates are comparable, differential attrition is unlikely. However, if the drop-out rate is higher for one group, further analysis is in order. Suppose the group with the higher drop-out rate was the therapy group and that the drop-outs were those with the greatest depression. If so, the therapy group would appear to improve. In this case, the improvement in the therapy group's performance on the BDI should be attributed to differential attrition and not to therapy.

Diffusion of Treatment--Diffusion of treatment also could be a problem in this study. Clients on the waiting list might seek other sources of therapy. They might find another therapist, or they might seek therapy in less formal ways (pastor, friend, or family member). Also, they might obtain prescription medicine from a physician.

Testing--Testing and order effects are nicely controlled by the nonrandom two-group pretest-posttest design. Whatever the effects of taking the BDI the second time, these effects will show up in both groups, leaving the difference unaffected.

Instrument Change--Instrument change, like testing, is well controlled by this design and for the same reasons.

History--History might be a problem for this particular experiment. Although large public events, such as a war or economic change, will be the same for all participants, their individual histories will surely differ. If the sample size is large, then we would expect that the ups and downs of individual histories will balance out, but if the sample size is small, we don't have that assurance.

Maturation--Maturation may be well controlled, because whatever maturation occurs in the therapy group can be expected to occur in the wait-listed group. However, if the mean ages of the two groups differ, maturation may be a problem because younger clients exhibit maturational change faster than older clients. To resolve this question, means and standard deviations of the ages of the two groups are needed.

Regression--Although regression to the mean can be expected for clients who seek therapy, there is no reason to expect that regression should occur more in one group than in the other. Thus, the nonrandom two-group pretest-posttest design controls for regression.

Statistical Analysis of the Nonrandom Two-Group Pretest-posttest Design

Data from a nonrandom two-group pretest-posttest design can be analyzed in one of two ways. One is fairly simple and easy to understand; the other is more complicated. For the simpler analysis, subtract the pretest score from the posttest score (or vice versa) for each participant. This produces a set of difference scores for the therapy group and a set of difference scores for the wait-listed group. The two sets of difference scores are then analyzed as two independent groups. From this point on, the analyses are the same as those we used for the exercise/tiredness experiment in chapter 7. Unfortunately, the simpler approach has a drawback: difference scores are typically less reliable than the pretest and posttest scores from which they are computed.
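As a concrete sketch of the simpler analysis, the Python code below (the BDI numbers are invented for illustration) computes a difference score for each participant and then compares the two sets of difference scores with an independent-samples t test, one reasonable choice for two independent groups.

```python
import numpy as np
from scipy import stats

# Hypothetical BDI scores (higher = more depressed); invented for illustration.
therapy_pre   = np.array([28, 31, 25, 30, 27, 33, 29, 26])
therapy_post  = np.array([18, 22, 17, 21, 19, 24, 20, 18])
waitlist_pre  = np.array([29, 27, 32, 26, 30, 28, 31, 25])
waitlist_post = np.array([27, 25, 30, 25, 28, 27, 29, 24])

# Step 1: a difference score (posttest minus pretest) for each participant.
therapy_diff  = therapy_post - therapy_pre
waitlist_diff = waitlist_post - waitlist_pre

# Step 2: analyze the two sets of difference scores as two independent groups.
t, p = stats.ttest_ind(therapy_diff, waitlist_diff)
print(f"t = {t:.2f}, p = {p:.4f}")
```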

The more complicated way to analyze data from a nonrandom two-group pretest-posttest design is to use a statistical technique called the analysis of covariance (ANCOVA). An analysis of covariance statistically adjusts the posttest scores to reflect the information in the pretest scores, and it does this without creating the reliability problem that goes with difference scores. The ANCOVA is usually covered in advanced courses. One other two-group design deserves attention. It is the random two-group pretest-posttest design.
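Before turning to that design, here is a minimal sketch of the ANCOVA approach just described, written with the statsmodels library and the same kind of invented BDI numbers. The pretest enters the model as a covariate, and the group term estimates the adjusted treatment effect.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "group": ["therapy"] * 8 + ["waitlist"] * 8,
    "pre":  [28, 31, 25, 30, 27, 33, 29, 26, 29, 27, 32, 26, 30, 28, 31, 25],
    "post": [18, 22, 17, 21, 19, 24, 20, 18, 27, 25, 30, 25, 28, 27, 29, 24],
})

# ANCOVA as a linear model: posttest scores adjusted for the pretest covariate.
model = smf.ols("post ~ pre + C(group)", data=df).fit()
print(model.summary())  # the C(group) coefficient estimates the adjusted treatment effect
```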

Random Two-Group Pretest-posttest Design

The random two-group pretest-posttest design is diagrammed in Table 8.12. As you can see, the only difference between it and the design that you just studied is that the participants are randomly assigned to the two conditions. As you know from chapter 7, random assignment is better if it is feasible. If random assignment were used for the depression/cognitive therapy experiment, the concerns raised earlier about maturation would be eliminated. With random assignment, the average age of the participants in the two groups would be approximately equal. Any maturational changes would be expected to occur equally in the two groups.

Table 8.12 Generic notation for the Random Two-Group Pretest-posttest Design for Two Treatments

Method of Assignment    Dependent Variable    Independent Variable    Dependent Variable

R                       O1                    T1                      O2

R                       O1                    T2                      O2

Time -------------------------------------------->

Where:

R - assignment to treatments is random

T1 and T2 - two treatment levels of the independent variable

O - observations that constitute the dependent variable

We are now ready for a final summary of extraneous variable information. Table 8.13 shows a design/extraneous variable matrix for the six designs and the eight extraneous variables that we've covered in these two chapters. This table is a modification of one that appeared in Campbell and Stanley (1966).

In The Know--In 1957, Donald T. Campbell published an article in the Psychological Bulletin that provided a catalog of experimental designs and their ability to control various extraneous variables. Research methods courses since then have typically included modified versions of his catalog. Campbell (1916-1996) was a social psychologist who advocated testing the outcome of governmental social programs designed to help those in need. To assess outcomes, Campbell argued that extraneous variables must be identified and eliminated (or at least reduced).

Table 8.13 Control of extraneous variables by different research designs.

(?? means the procedures must be studied to determine whether the extraneous variable is present)

Nonrandom Assignment
  Selection: Not controlled
  Differential Attrition: Not controlled
  Diffusion of Treatments: Not controlled
  Testing: Controlled
  Instrument Change: ??
  History: ??
  Maturation: ??
  Regression: Not controlled

Random Assignment
  Selection: Controlled
  Differential Attrition: Not controlled
  Diffusion of Treatments: ??
  Testing: Controlled
  Instrument Change: Controlled
  History: Not controlled
  Maturation: Controlled
  Regression: Controlled

One-Group Two-Treatment
  Selection: Controlled
  Differential Attrition: Controlled
  Diffusion of Treatments: Not controlled
  Testing: Not controlled
  Instrument Change: ??
  History: Not controlled
  Maturation: Not controlled
  Regression: Controlled

One-Group Pretest-posttest
  Selection: Controlled
  Differential Attrition: Controlled
  Diffusion of Treatments: Controlled
  Testing: Not controlled
  Instrument Change: Not controlled
  History: Not controlled
  Maturation: Not controlled
  Regression: Not controlled

Nonrandom Two-Group Pretest-posttest
  Selection: Controlled
  Differential Attrition: Not controlled
  Diffusion of Treatments: Not controlled
  Testing: Controlled
  Instrument Change: ??
  History: ??
  Maturation: ??
  Regression: ??

Random Two-Group Pretest-posttest
  Selection: Controlled
  Differential Attrition: Not controlled
  Diffusion of Treatments: Not controlled
  Testing: Controlled
  Instrument Change: Controlled
  History: ??
  Maturation: Controlled
  Regression: Controlled

CONTROLLING THE EFFECTS OF EXTRANEOUS VARIABLES

We describe three different techniques that researchers use to control extraneous variables. The first one is a technique you are already familiar with.

Random Assignment--In chapter 7, we described and explained random assignment, which is a very important technique for controlling extraneous variables. Our discussion and the problems you worked all focused on the random assignment of participants, but random assignment can be used to ensure equality on other extraneous variables as well. For example, testing rooms, times of day, experimenters, or stimulus lists can be assigned to conditions at random so that their effects are spread about evenly across the levels of the IV.
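As a minimal sketch (the participant labels and group sizes are hypothetical), random assignment amounts to shuffling the list of participants and dealing them into conditions; the same shuffled-list idea can be reused for rooms, times of day, or stimulus lists.

```python
import random

random.seed(1)  # fixed seed so the example is reproducible

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical participants
random.shuffle(participants)                        # put them in a random order

# Deal the shuffled list into two equal-sized conditions.
condition_1 = participants[:10]
condition_2 = participants[10:]

# The same idea works for other extraneous variables, e.g., testing rooms:
rooms = ["Room A", "Room B"]
room_assignment = {p: rooms[i % 2] for i, p in enumerate(participants)}
# Balanced across rooms, and random because the list was already shuffled.
```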

Counterbalancing--Like most other methods of controlling extraneous variables, counterbalancing does not eliminate a variable but balances it out.

Simple Counterbalancing--Simple counterbalancing is used when participants are tested more than once for every level of the independent variable. Such repeated testing is common in experiments that investigate sensation and perception. Our vigilance experiment required repeated testing. A counterbalanced order of treatments is:

a b b a

For three levels of the IV, the sequence is:

a b c c b a

The technique of simple counterbalancing also can be used when testing a participant more than two times on each level of the IV. The following sequences illustrate four tests for two conditions and four tests for three conditions:

a b b a a b b a

a b c c b a a b c c b a
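These sequences follow a mechanical rule: list the conditions in order, then in reverse order, and repeat that forward-and-back block as often as needed. The short sketch below (Python; the function name is ours, for illustration only) makes the rule explicit and reproduces the sequences shown above.

```python
def simple_counterbalance(conditions, tests_per_condition):
    """Forward-then-reverse (ABBA-style) order, repeated until each
    condition appears tests_per_condition times (must be even)."""
    block = list(conditions) + list(reversed(conditions))  # e.g., a b b a
    return block * (tests_per_condition // 2)              # each block tests every condition twice

print(simple_counterbalance(["a", "b"], 2))       # ['a', 'b', 'b', 'a']
print(simple_counterbalance(["a", "b", "c"], 2))  # ['a', 'b', 'c', 'c', 'b', 'a']
print(simple_counterbalance(["a", "b"], 4))       # ['a', 'b', 'b', 'a', 'a', 'b', 'b', 'a']
```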

Block Counterbalancing--Block counterbalancing is another versatile counterbalancing technique. It is useful when participants are tested only once on each level of the IV rather than several times, as in an a b b a a b b a sequence. A diagram of this solution, using the a b terminology, looks like this:

Order 1    a    b

Order 2    b    a

The solution above is an example of block counterbalancing. Order of testing is nicely balanced. Shallow processing occurs first for half of the participants and deep processing occurs first for the other half. Of course, shallow and deep both occur second for half of the participants as well.

A block counterbalanced design is a "square" design. The number of orders is equal to the number of treatments. As with simple counterbalancing, block counterbalancing distributes changes that occur over time in such a way that differences among levels of the IV are not affected.

Table 8.14 An illustration of block counterbalancing for an IV with three levels.

Order 1    a    b    c

Order 2    b    c    a

Order 3    c    a    b

Time -------------------------------------------------------------->
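The rotation pattern in Table 8.14 can also be produced mechanically: each order is the previous order shifted one position to the left. The sketch below (our own helper function, shown for illustration) generates the three orders in Table 8.14; with two conditions it returns the a b and b a orders diagrammed earlier.

```python
def block_counterbalance(conditions):
    """One order per condition, each a cyclic rotation of the last (a "square" design)."""
    n = len(conditions)
    return [conditions[i:] + conditions[:i] for i in range(n)]

for order in block_counterbalance(["a", "b", "c"]):
    print(order)
# ['a', 'b', 'c']
# ['b', 'c', 'a']
# ['c', 'a', 'b']
```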

In the cognitive processing experiment, we identified three extraneous variables that were confounded with the IV (levels of processing). The three were order of testing, different lists, and different processing times. As you saw above, block counterbalancing solves the order of testing problem. Can it also solve the different lists problem? What about the problem of different processing times? Can it solve that one too? After all, we did describe counterbalancing as a versatile technique.

At first glance, it appears that block counterbalancing is a simple solution to the different lists problem. If we designate the two lists L1 and L2, then half the deep processing trials use list L1 and half use list L2. Similarly, for shallow processing, half the trials use list L2 and half use list L1. The next question is whether we can counterbalance both the order of testing and the different lists at the same time. Unfortunately, this is impossible using only two groups.

Stop & Think--Use the designations deepL1, deepL2, shallowL1, and shallowL2 to indicate the two levels of the IV and the two lists. Prove to yourself that you cannot achieve block counterbalancing of both the levels of the IV and the lists with just two groups of participants.

Fortunately, a modification of block counterbalancing solves the problem above. The solution is to divide the participants into four groups. Here's our answer to the problem of counterbalancing both order of treatments and two different lists.

Group 1    deep L1       shallow L2

Group 2    shallow L1    deep L2

Group 3    deep L2       shallow L1

Group 4    shallow L2    deep L1

There was one other extraneous variable in the cognitive processing experiment - processing time. Participants studied the deep processing list longer than they studied the shallow processing list. Can counterbalancing solve this problem? No, it cannot. The solution to providing equal times for the two conditions is to pace the participants so that they are forced to spend equal time on each word.

One characteristic of block counterbalancing deserves additional attention. Although block counterbalancing balances out overall order effects, it does not control for carryover effects. More complicated versions of block counterbalancing such as complete counterbalancing, Latin Squares, and Greco-Latin Squares are used to control carryover effects. (See Kirk, 1995.)

Matching--Matching is a third technique that researchers use to balance out extraneous variables. Like counterbalancing, matching does not remove the effects of the extraneous variable, but distributes its effects equally across the levels of the IV. The basic idea of matching is simple. The researcher assembles a group of participants and then divides them into sets. Each member of a set is like the other members of that set on some extraneous variable that is related to the DV. Each member of a set supplies data for one level of the IV. Thus, the number of participants in a set is equal to the number of levels of the IV.
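Here is a minimal sketch of the matching procedure just described (Python; the matching scores are hypothetical): rank the participants on the matching variable, slice the ranked list into sets whose size equals the number of IV levels, and randomly assign the members of each set to conditions.

```python
import random

random.seed(2)

# Hypothetical scores on a matching variable known to be related to the DV.
matching_score = {"P01": 34, "P02": 12, "P03": 28, "P04": 30, "P05": 15,
                  "P06": 22, "P07": 27, "P08": 14, "P09": 21, "P10": 33}
levels = ["treatment 1", "treatment 2"]

# Rank participants on the matching variable, then form sets of size len(levels).
ranked = sorted(matching_score, key=matching_score.get)
matched_sets = [ranked[i:i + len(levels)] for i in range(0, len(ranked), len(levels))]

assignment = {}
for matched_set in matched_sets:
    random.shuffle(matched_set)                 # random assignment within each matched set
    for participant, level in zip(matched_set, levels):
        assignment[participant] = level

print(assignment)
```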

"Why not use matching rather than random assignment all the time?" The answer is that although matching guarantees equality on the matched variable, it carries with it no assurance about other variables. Random assignment, however, assures us of approximately equal samples on all variables. In addition, researchers do not often have a list of their participants before the study, much less a list that includes scores on a variable highly correlated with the DV.

BETWEEN-SUBJECTS AND WITHIN-SUBJECTS DESIGNS COMPARED

Number of Participants--A within-subjects design requires fewer participants than a between-subjects design that gathers the same number of scores. Fewer participants may mean less paperwork, fewer debriefings, and quicker data gathering. Some of this efficiency for the researcher comes at a cost to the participants. For their part, they spend more time participating in a within-subjects experiment than they do in a between-subjects experiment. If the time they are required to spend is too great, attrition might become a problem for the researcher.

Statistical Sensitivity--In a between-subjects design, differences among the participants add noise to the scores. The statistical tests for within-subjects design data, however, remove the effects of the individual differences from the scores. The result is that the effect of the IV has less noise to compete with in order to be detected. Because there is less noise, within-subjects designs are more likely to detect differences among the levels of the IV. You encountered the issue of sensitivity under a different name in chapter 6. Thus, another way to express the increased sensitivity of within-subjects designs is to say that within-subjects designs are more powerful than between-subjects designs.
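The gain in sensitivity is easy to see in a simulation. In the sketch below (all numbers invented), the same true effect of the IV is buried in large individual differences. A paired, within-subjects analysis subtracts those differences out; an independent-groups, between-subjects analysis must treat them as noise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, effect = 20, 3                        # 20 participants, true IV effect of 3 points

# Within-subjects: each participant contributes a score in both conditions.
person = rng.normal(0, 10, n)            # large, stable individual differences
cond_a = person + rng.normal(0, 2, n)
cond_b = person + effect + rng.normal(0, 2, n)
t_within, p_within = stats.ttest_rel(cond_a, cond_b)

# Between-subjects: different participants in each condition.
group_a = rng.normal(0, 10, n) + rng.normal(0, 2, n)
group_b = rng.normal(0, 10, n) + effect + rng.normal(0, 2, n)
t_between, p_between = stats.ttest_ind(group_a, group_b)

print(f"within-subjects analysis:  p = {p_within:.4f}")   # individual differences removed
print(f"between-subjects analysis: p = {p_between:.4f}")  # individual differences remain as noise
```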

Carry-Over Effects--A carry-over effect occurs if the administration of a level of the IV has an effect on participants that influences their response to the next level of the IV. Carry-over effects can be a problem for within-subjects designs but not for between-subjects designs. The usual solution when carry-over effects are recognized as a problem is to use a between-subjects design.

INTERNAL VALIDITY AND EXTERNAL VALIDITY

Definitions

In an experiment, the researcher typically makes two claims about what is being measured. One claim is about the IV and the DV. Researchers claim to be measuring the effect of changes in the IV on the DV. If the claim is true, the experiment has internal validity. If an experiment has internal validity, it is the IV, and not an extraneous variable, that is responsible for changes in the DV. An internally valid experiment has no uncontrolled extraneous variables.

A second claim of researchers, either expressed or implied, involves generalization. After examining a sample, researchers generalize to a population. Generalization is a goal that characterizes all research. This second claim, that the results of the experiment can be generalized to some larger, untested, unmeasured group, is a claim of external validity. Thus, the results of an externally valid experiment can be generalized to some larger group, some additional situations, or some other time.

Threats to External Validity and Meeting Those Threats

If an experiment is externally valid, the results can be generalized to some larger population, most of whom were not in the experiment. Reasons why the results from the sample should not be generalized are called threats to external validity. We discuss three of these threats.

Biased Sampling--Every research sample is a sample of some larger population of participants (or subjects), situations, and times. The population might be sophomores, schizophrenics, or Spaniards. It might be another interesting group such as cities, cats, or archival records. The situation might be a college campus, a factory, or a deserted island. The time might be a day of the week, the past, or a particular time in the future. Of course, every particular experiment is limited and specific. Only certain sophomores, cities or Sundays are in the experiment. Thus, asking, "How representative is the sample?" is always a fair question. Sample can refer to the participants, situations, or times. If the researcher's goal is to generalize to a population and the sample is not representative, then the generalization will not be accurate.

The elegant response to the threat of biased sampling is to use a random sample. Unfortunately, obtaining a random sample is not easy, and in many cases, it is a practical impossibility. In fact, random samples are rare in published research.

Probably the most convincing evidence of the generality of a finding is a replication. When an independent researcher repeats the procedures and gets essentially the same results, everyone's confidence in the external validity of the original finding goes up.

Reactivity--When psychologists manipulate or designate an IV and measure a DV, they do so in a particular setting. Sometimes the setting itself has an effect of its own on the DV, over and above the effect of the IV. If other settings produce different results, the experiment lacks external validity. This is the problem of reactivity, which is sometimes referred to as participant reactivity. The problem of reactivity is sometimes described as being due to the demand characteristics of the situation. That is, the situation "demands" a particular response from the participant.

Perhaps the most famous case of reactivity occurred in the early days of I/O psychology, when the Western Electric Company conducted studies on a group of workers at its Hawthorne Works, a manufacturing facility near Chicago. The Hawthorne studies were conducted in a special setting. In one sense, however, any time participants know they are in a psychological study, they may act differently.

Pretest sensitization--Pretest sensitization occurs when participants respond to a treatment differently if they experienced a pretest before the treatment.

If pretest sensitization is known to be a problem or reasonably appears to be a problem, researchers may turn to a fairly expensive solution, the Solomon four-group design.

Table 8.17 Notation for the Solomon Four-Group Design for Two Treatments

Group    Method of Assignment    Dependent Variable    Independent Variable    Dependent Variable

A        R/NR                    O1                    T1                      O2

B        R/NR                                          T1                      O1

C        R/NR                                          T2                      O2

Time ---------------------------------------------------------------------------------->

(Also see Chapter 7 to see another representation of the Solomon Four-group Design)

R/NR - assignment to treatments - either random or nonrandom

T1 and T2 - two treatment levels of the independent variable

O - observations that constitute the dependent variable

The Solomon four-group design is not popular because it requires nearly twice the resources just to be able to detect pretest sensitization. A search for "Solomon four group" in PsycINFO found only 113 studies since 1950. In the early stages of a research program, researchers usually choose a design that devotes its resources to controlling extraneous variables. Once internal validity is assured, researchers turn to establishing external validity. If there is a question about pretest sensitization, the Solomon four-group design can answer the question.
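One common way to use the four groups is a 2 x 2 analysis of the posttest scores, with treatment and pretested (yes or no) as factors; a reliable interaction between the two points to pretest sensitization. The sketch below (statsmodels, with invented scores) shows that analysis.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical posttest scores from the four groups in Table 8.17 (5 per group, invented).
df = pd.DataFrame({
    "treatment": ["T1"] * 10 + ["T2"] * 10,
    "pretested": (["yes"] * 5 + ["no"] * 5) * 2,
    "post": [14, 15, 13, 16, 15,   11, 12, 10, 12, 11,    # T1: pretested, then not pretested
              8,  9,  9,  7,  8,    8,  9,  7,  8,  9],   # T2: pretested, then not pretested
})

# 2 x 2 factorial analysis of the posttest scores.
model = smf.ols("post ~ C(treatment) * C(pretested)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
# A reliable treatment x pretested interaction suggests pretest sensitization.
```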

Pretests are not the only things that occur before a treatment is administered. For within-subjects designs with multiple treatments, treatments occur before other treatments are administered. Thus, when there are multiple treatments, there may be a question of how well the results apply to those who have not experienced previous treatments. Counterbalancing assures you that the effects of order (experience) are distributed evenly, but it does not assure you that the results would be the same if there were no previous treatments.

Which is more important, internal or external validity?

Most researchers reason that internal validity is more important than external validity. If you have internal validity, you know that the IV (and not some extraneous variable) caused the changes in the DV. If you have external validity, you not only know that the IV caused changes in the DV, but that this relationship is true in some context wider than that of the specific experiment. Thus, according to most researchers, you cannot have external validity without first having internal validity. Once internal validity is secured, the question of external validity is appropriate. The relative importance of internal and external validity is a topic on which not all researchers agree (Mook, 2003).

BASIC DESIGNS: AN OVERVIEW

In addition to internal validity, researchers strive for external validity. Experiments with external validity produce conclusions that are true for other populations, situations, and times. As you study methods of research, analyze the research of others, and perhaps conduct studies of your own, we are sure that your efforts will prepare you not only for future courses and research, but also for decision-making in your world outside of academics.

