Threats to Construct Validity
Before we launch into a discussion of the most common threats to construct validity, let’s recall what a threat to validity is. In a research study you are likely to reach a conclusion that your program was a good operationalization of what you wanted and that your measures reflected what you wanted them to reflect. Would you be correct? How will you be criticized if you make these types of claims? How might you strengthen your claims? The kinds of questions and issues your critics will raise are what I mean by threats to construct validity.
I take the list of threats from the discussion in Cook and Campbell (Cook, T.D. and Campbell, D.T. Quasi-Experimentation: Design and Analysis Issues for Field Settings. Houghton Mifflin, Boston, 1979). While I love their discussion, I do find some of their terminology less than straightforward – a lot of what I’ll do here is try to explain this stuff in terms that the rest of us might hope to understand.
Inadequate Preoperational Explication of Constructs
This one isn’t nearly as ponderous as it sounds. Here, preoperational means before translating constructs into measures or treatments, and explication means explanation – in other words, you didn’t do a good enough job of defining (operationally) what you mean by the construct. How is this a threat? Imagine that your program consisted of a new type of approach to rehabilitation. Your critic comes along and claims that, in fact, your program is neither new nor a true rehabilitation program. You are being accused of doing a poor job of thinking through your constructs. Some possible solutions:
- think through your concepts better
- use methods (e.g., concept mapping) to articulate your concepts
- get experts to critique your operationalizations
Mono-Operation Bias
Mono-operation bias pertains to the independent variable, cause, program or treatment in your study – it does not pertain to measures or outcomes (see Mono-Method Bias below). If you only use a single version of a program in a single place at a single point in time, you may not be capturing the full breadth of the concept of the program. Every operationalization is flawed relative to the construct on which it is based. If you conclude that your program reflects the construct of the program, your critics are likely to argue that the results of your study only reflect the peculiar version of the program that you implemented, and not the actual construct you had in mind. Solution: try to implement multiple versions of your program.
Mono-Method Bias
Mono-method bias refers to your measures or observations, not to your programs or causes. Otherwise, it’s essentially the same issue as mono-operation bias. With only a single version of a self-esteem measure, you can’t provide much evidence that you’re really measuring self-esteem. Your critics will suggest that you aren’t measuring self-esteem – that you’re only measuring part of it, for instance. Solution: try to implement multiple measures of key constructs and try to demonstrate (perhaps through a pilot or side study) that the measures you use behave as you theoretically expect them to.
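One common way to check that multiple measures “behave as you theoretically expect” is a convergent-validity check: if several instruments really tap the same construct, their scores should correlate strongly. Here is a minimal sketch in Python – the measure names and all of the scores are invented purely for illustration, not real data:

```python
import numpy as np

# Hypothetical scores from three different self-esteem measures
# administered to the same six participants (all values invented).
measure_a = np.array([22, 30, 18, 25, 27, 20])  # multi-item scale A
measure_b = np.array([40, 55, 33, 46, 50, 36])  # multi-item scale B
measure_c = np.array([3, 5, 2, 4, 4, 3])        # single-item rating

scores = np.vstack([measure_a, measure_b, measure_c])

# Pairwise correlations among the measures. High off-diagonal
# correlations are evidence that the instruments converge on the
# same underlying construct.
r = np.corrcoef(scores)
print(np.round(r, 2))
```

High off-diagonal correlations support the claim that you are measuring the intended construct; near-zero correlations suggest the instruments are measuring different things, which is exactly the criticism the mono-method threat anticipates.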
Interaction of Different Treatments
You give a new program designed to encourage high-risk teenage girls to go to school and not become pregnant. The results of your study show that the girls in your treatment group have higher school attendance and lower birth rates. You’re feeling pretty good about your program until your critics point out that the targeted at-risk treatment group in your study is also likely to be involved simultaneously in several other programs designed to have similar effects. Can you really label the program effect as a consequence of your program? The “real” program that the girls received may actually be the combination of the separate programs they participated in.
Interaction of Testing and Treatment
Does testing or measurement itself make the groups more sensitive or receptive to the treatment? If it does, then the testing is in effect a part of the treatment; it’s inseparable from the effect of the treatment. This is a labeling issue (and, hence, a concern of construct validity) because you want to use the label “program” to refer to the program alone, but in fact it includes the testing.
Restricted Generalizability Across Constructs
This is what I like to refer to as the “unintended consequences” threat to construct validity. You do a study and conclude that Treatment X is effective. In fact, Treatment X does cause a reduction in symptoms, but what you failed to anticipate was the drastic negative consequences of the side effects of the treatment. When you say that Treatment X is effective, you have defined “effective” as only the directly targeted symptom. This threat reminds us that we have to be careful about whether our observed effects (Treatment X is effective) would generalize to other potential outcomes.
Confounding Constructs and Levels of Constructs
Imagine a study to test the effect of a new drug treatment for cancer. A fixed dose of the drug is given to a randomly assigned treatment group and a placebo to the other group. No treatment effects are detected. Perhaps the result that’s observed is only true for that dosage level. Slight increases or decreases of the dosage may radically change the results. In this context, it is not “fair” for you to use the label for the drug as a description for your treatment because you only looked at a narrow range of doses. Like the other construct validity threats, this is essentially a labeling issue – your label is not a good description for what you implemented.
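One way to see why a single dosage level can mislead is to simulate a dose-response curve and sample it at several levels. The sketch below assumes an invented inverted-U curve (the drug helps at moderate doses but not at low or high ones); the numbers carry no empirical meaning:

```python
import numpy as np

rng = np.random.default_rng(0)

def symptom_reduction(dose, n=200):
    """Simulated mean symptom reduction for n patients at a given dose.

    The (invented) true dose-response curve is an inverted U centered
    at dose 50, so very low and very high doses show almost no effect.
    """
    true_effect = 4.0 * np.exp(-((dose - 50.0) ** 2) / (2 * 15.0 ** 2))
    outcomes = true_effect + rng.normal(0.0, 1.0, size=n)
    return outcomes.mean()

# A study run at a single low dose detects essentially nothing,
# while sampling several levels reveals the mid-range effect.
for dose in (5, 25, 50, 75, 100):
    print(dose, round(symptom_reduction(dose), 2))
```

A study run only at dose 5 would conclude the drug does nothing, while sampling several levels reveals a strong effect near the middle of the range – exactly the labeling problem described above.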
The “Social” Threats to Construct Validity
I’ve set aside the other major threats to construct validity because they all stem from the social and human nature of the research endeavor.
Hypothesis Guessing
Most people don’t just participate passively in a research project. They are trying to figure out what the study is about. They are “guessing” at what the real purpose of the study is. And, they are likely to base their behavior on what they guess, not just on your treatment. In an educational study conducted in a classroom, students might guess that the key dependent variable has to do with class participation levels. If they increase their participation not because of your program but because they think that’s what you’re studying, then you cannot label the outcome as an effect of the program. It is this labeling issue that makes this a construct validity threat.
Evaluation Apprehension
Many people are anxious about being evaluated. Some are even phobic about testing and measurement situations. If their apprehension (and not your program conditions) makes them perform poorly, then you certainly can’t label that as a treatment effect. Another form of evaluation apprehension concerns the human tendency to want to “look good” or “look smart” and so on. If, in their desire to look good, participants perform better (and not as a result of your program!), then you would be wrong to label this as a treatment effect. In both cases, the apprehension becomes confounded with the treatment itself and you have to be careful about how you label the outcomes.
Experimenter Expectancies
These days, when we engage in lots of non-laboratory applied social research, we generally don’t use the term “experimenter” to describe the person in charge of the research. So, let’s relabel this threat “researcher expectancies.” The researcher can bias the results of a study in countless ways, both consciously and unconsciously. Sometimes the researcher can communicate what the desired outcome for a study might be (and participant desire to “look good” leads them to react that way). For instance, the researcher might look pleased when participants give a desired answer. If this is what causes the response, it would be wrong to label the response as a treatment effect.