Wednesday, October 12, 2022

Looking Back at Meta-Analysis: Letter to Dr. Carrano

2022

In the summer of 2022, I received an email from Dr. Francesco Carrano. Dr. Carrano is an MD-PhD and professor of gastroenterology at the University of Rome Tor Vergata. He happened to be teaching a course that involved meta-analysis, and he quite sensibly asked if I had any thoughts that might help orient his students to this technique, which has come to play such a prominent role in medical research.

          ________________________________________________
    

Dear Professor Carrano,

I am honored by your request. There were a few things I might have added to the discussion of meta-analysis some years ago had I not been distracted by other interests. Perhaps the most significant observations begin with the often repeated criticism of meta-analysis, namely, “You can’t combine the results of two studies unless they are the same.”

I always had a sense that this criticism was unfounded, but it wasn’t until I read the work of the late Harvard philosopher Robert Nozick that I understood why it was so.

To my amazement, Nozick spent the first one hundred pages of his book Philosophical Explanations on the problem of "identity," i.e., what does it mean to say that two things are the same? Starting with the puzzle of how two things that are alike in every respect would not be one thing, Nozick unraveled the problem of identity and discovered its fundamental nature underlying a host of philosophical questions ranging from "How do we think?" to "How do I know that I am I?" Here, I thought at last, might be the answer to the "apples and oranges" criticism of meta-analysis. And indeed, it was there.

Consider an even more troubling example that stems from the problem of the persistence of personal identity. How do we know that the Professor Carrano who is before us today is the same Professor Carrano we saw last week, or last year, or, more troubling to his family perhaps, 25 years ago? Probably no cells are in common between this organism and the organism that responded to the name "Francesco Maria Carrano" forty years ago. Nozick argued that the only sense in which personal identity survives across time is in the sense of what he called "the closest related continuer." Francesco Maria Carrano is still recognized as the same person across time because he is the closest related continuation of the person referred to as Francesco Maria Carrano in the past. Now notice that implied in this concept of the "closest related continuer" are notions of distance and relationship. Nozick was quite clear that these concepts had to be given concrete definition to understand how, in particular instances, people use the concept of identity. In fact, to Nozick's way of thinking, things are compared by means of weighted functions of constituent factors, and their "distance" from each other is "calculated" in many instances in a Euclidean way.

So here was Nozick saying that the fundamental riddle of how two things could be the same ultimately resolves itself into an empirical question: observing the relevant factors and weighing them in various combinations to determine the closest related continuer. The question of "sameness" is not an a priori question at all; strict sameness being a logical impossibility, what remains is an empirical question. For us, no two "studies" are the same. All studies differ, and the only interesting questions to ask about them concern how they vary across the factors we conceive of as important.
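The idea of comparing things by a weighted, Euclidean-style distance over constituent factors can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the factor names, the weights, and the numbers are invented, and the sketch leaves out other parts of Nozick's account, such as the requirement that a continuer be close enough to count at all.

```python
import math

# All factor names, weights, and values below are hypothetical.
def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two things described by the same factors."""
    return math.sqrt(sum(w * (a[k] - b[k]) ** 2 for k, w in weights.items()))

def closest_continuer(earlier, candidates, weights):
    """Return the name of the candidate with the smallest distance to the earlier thing."""
    return min(
        candidates,
        key=lambda name: weighted_distance(earlier, candidates[name], weights),
    )

# How much each factor counts is itself a judgment call (invented weights).
weights = {"memory_overlap": 2.0, "bodily_continuity": 1.0}
earlier = {"memory_overlap": 1.0, "bodily_continuity": 1.0}
candidates = {
    "candidate_x": {"memory_overlap": 0.9, "bodily_continuity": 0.6},
    "candidate_y": {"memory_overlap": 0.2, "bodily_continuity": 0.9},
}

chosen = closest_continuer(earlier, candidates, weights)
print(chosen)
```

Changing the weights can change which candidate is closest, which is the sense in which "sameness" depends on which factors are chosen and how they are weighed.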

Perhaps it can be put simply this way: No two studies are ever the same; it is important to examine the relationship between what they show – their “result” or “finding” – and how they are different.

So the most common “error” made in modern meta-analysis, in my opinion, is the combining of studies in such a way as to obscure the relationship of outcomes – like “cure rate,” or “days hospitalized,” or “months symptom free” – to conditions or characteristics of the studies. A simple example would be a meta-analysis that averaged together the variable “days hospitalized” for males and females and never averaged that outcome separately for males and females.
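A minimal sketch of that simple example follows. The records are invented purely for illustration and are not taken from any study, and the averages are unweighted, whereas a real meta-analysis would typically weight each study by its precision.

```python
from statistics import mean

# Hypothetical study-level records (invented for illustration only).
records = [
    {"study": "A", "sex": "male", "days_hospitalized": 4.0},
    {"study": "A", "sex": "female", "days_hospitalized": 8.0},
    {"study": "B", "sex": "male", "days_hospitalized": 5.0},
    {"study": "B", "sex": "female", "days_hospitalized": 9.0},
]

def pooled_mean(rows):
    """Average the outcome over every record, ignoring study characteristics."""
    return mean(r["days_hospitalized"] for r in rows)

def mean_by(rows, key):
    """Average the outcome separately for each level of one characteristic."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r["days_hospitalized"])
    return {level: mean(values) for level, values in groups.items()}

overall = pooled_mean(records)
by_sex = mean_by(records, "sex")
print(overall)
print(by_sex)
```

With these invented numbers the pooled average sits between the two subgroup averages and by itself gives no hint that the subgroups differ.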

Here is a concrete example.

Long ago, my colleagues and I studied the efficacy of psychotherapy. Many years later, for no reason that I can recall, I revisited the data to examine some issues about the differences between “talking therapies” and “behavioral therapies.” (Sorry, I don’t have the time to explicate these two categories, but your common understanding of what they might be is undoubtedly sufficient.) I was able to identify nine studies in which a talking and a behavioral therapy were compared in the same experiment. I was curious as to how the benefits of the therapies might persist across time – “on follow-up,” essentially. When I graphed the difference in outcomes of the two types of therapy as a function of when after therapy ended the benefits were measured, I saw the graph below:

I’ve never published these results because by the time I did the analysis I had gone on to other things. But I love the finding. Behavioral therapies initially look superior but that superiority fades across time. Verbal therapies actually appear to gain in benefits as time passes. I suspect we all might have some ideas about why this relationship exists.

The bottom line is this. If I had merely averaged the difference in benefits between behavioral and verbal psychotherapies across all these 12 follow-up times, an important fact would have been obscured. But by examining the way in which study outcomes differ as a function of study conditions, something interesting might be found.
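The contrast between averaging and examining a relationship can be sketched in a few lines. The numbers below are invented for illustration and are not the data from the nine studies; the sign convention (behavioral minus verbal) and the use of an ordinary least-squares line are likewise assumptions of this sketch, not a description of the original analysis.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical follow-up times (months after therapy ended) and hypothetical
# differences in benefit (behavioral minus verbal). Invented values only.
months = [0, 2, 4, 6, 8]
difference = [0.4, 0.2, 0.0, -0.2, -0.4]

average_difference = sum(difference) / len(difference)
slope, intercept = fit_line(months, difference)

print(average_difference)  # close to zero: the average suggests "no difference"
print(slope, intercept)    # a negative slope: the difference changes with time
```

In this invented case the average difference is essentially zero, while the fitted line shows a difference that starts positive and declines steadily with follow-up time.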

No two studies are the same – or else they would not be distinguishable as two studies. The important thing to do is to study the relationship of what studies show to the circumstances under which the study was conducted.

I hope this helps.

Gene V Glass
June 14, 2022
Boulder, Colorado
