
Pearl's Wisdom on Causality

December 14, 2009

I've had no regrets since I left mathematics almost 19 years ago. My last memories of studying it are of sitting in a library at the University of Cambridge, staring at lecture notes full of lifeless equations, struggling and failing to care. But occasionally I read something that reminds me of the beauty that can be found in math, and of the remarkable power of formal analysis. Then I see again what once made her attractive to me, if you'll forgive the metaphor.

That happened a few days ago as I read Judea Pearl's "Causal inference in statistics: An overview" (hat tip to Holden Karnofsky). Everything I know about Pearl I learned from his web site. He works in the computer science department at UCLA and is older than he looks in the picture. He researched advanced electronic devices in the 1960s. He has apparently written the bible on Causality, which is to say, the formal study of what we mean when we talk about variables such as diet and health influencing each other. He released that overview paper just this September; it gives a "gentle introduction" to a subject he appears to have done much to develop.

The analysis is beautiful, but not merely that. It is insightful enough about what statisticians in economics, medicine, and other fields do every day that I think the paper should be required reading for graduate students in all fields that use statistics to study causality. Reading the paper may be a particular thrill for me because I entered econometrics untutored, running my first regressions in 2002 for a project with Bill Easterly. I have learned econometrics on the fly, only gradually grasping what I was really doing. From my point of view, Pearl has systematized a few things I had come to understand while offering a much greater vista.

The core idea is this: the bulk of the vast field of statistics is about distributions. You can, for example, graph the probability of rolling a 2 or 3 or 4, etc., in a throw of Monopoly dice; the graph is bell shaped, and is a distribution. Distributions can also be multidimensional, letting us speak precisely about how different variables vary in tandem or independently. From such associations and non-associations we can attempt to infer causal relationships, but that takes a leap of logic, and at least until recently the nature of that leap had not been carefully analyzed. Statisticians have used precise mathematical symbols to speak of distributions, but sloppy words to speak of causality:

The aim of standard statistical analysis, typified by regression, estimation, and hypothesis testing techniques, is to assess parameters of a distribution from samples drawn of that distribution. With the help of such parameters, one can infer associations among variables, estimate beliefs or probabilities of past and future events, as well as update those probabilities in light of new evidence or new measurements. These tasks are managed well by standard statistical analysis so long as experimental conditions remain the same. Causal analysis goes one step further; its aim is to infer not only beliefs or probabilities under static conditions, but also the dynamics of beliefs under changing conditions, for example, changes induced by treatments or external interventions.

This distinction implies that causal and associational concepts do not mix. There is nothing in the joint distribution of symptoms and diseases to tell us that curing the former would or would not cure the latter. More generally, there is nothing in a distribution function to tell us how that distribution would differ if external conditions were to change---say from observational to experimental setup---because the laws of probability theory do not dictate how one property of a distribution ought to change when another property is modified. This information must be provided by causal assumptions which identify relationships that remain invariant when external conditions change.

These considerations imply that the slogan "correlation does not imply causation" can be translated into a useful principle: one cannot substantiate causal claims from associations alone, even at the population level---behind every causal conclusion there must lie some causal assumption that is not testable in observational studies.
As I did in a post on the study of microcredit's impacts, if you have ever tried to talk about how certain variables relate to each other causally, you probably drew a picture with nodes (variables) and arrows to link them, something like this diagram from Pearl's paper:

[Figure 3 from Pearl's paper: a causal diagram]

Here, Z1 affects only X and Z3, Z2 affects only Y and Z3, and so on. Pearl's project has been to hang a grand theory of causality on pictures like this. Such graphs (as they are called) might seem too flimsy to carry the weight, but mathematicians can speak of them with precision. The nodes can be numbered and a table drawn up, like a times table, using 0's and 1's to indicate whether, say, the variable at node 1 affects the variable at node 2 (which would put a 1 in row 1, column 2, of the table). That formal tabular expression supplies the essential structure, but the nice thing about the graphical representation, as in the picture above, is that it is intuitive to work with directly. You don't need to think about 0's and 1's. To get the flavor of the discourse, read this definition from the paper (no need to fully understand it):

[Definition 1 from Pearl's paper]

Naturally, I noticed the kinship between my diagram:
landholdings => microcredit borrowing => household well-being
and Pearl's:

[Figure 2(a) from Pearl's paper]

In mine, landholding, corresponding to Pearl's Z, is assumed to affect microcredit borrowing, X (because Bangladeshis owning more than half an acre were formally ineligible for microcredit), and borrowing is assumed to affect household well-being, Y. Crucially (and simplifying for the sake of exposition), landholdings are assumed not to affect well-being directly, so no single arrow links those two variables. The U variables along the top of Pearl's diagram make explicit what I left implicit: that various unknown factors affect landholdings, and that various factors also affect microcredit borrowing and household well-being in addition to the ones I highlighted.

Being explicit about causal assumptions forces an analyst to confront them and gauge their credibility---more generally, to appreciate, as I wrote, that "you have to assume something to conclude something." In the microcredit example, if you assume that landholdings are related to well-being only through microcredit (an arguable assumption), and if landholdings and well-being are nevertheless associated, that proves that both arrows in my diagram are at work, in particular that microcredit affects well-being. My impression is that only recently, thanks in part to Angus Deaton, have the majority of development economists come to fully appreciate the implications of this kind of analysis. (I learned it from Brock and Durlauf 2001.)
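The 0's-and-1's table mentioned above is easy to make concrete. Here is a minimal sketch of my own (not from Pearl's paper), encoding the three-node microcredit chain as an adjacency table; the variable names are just labels I chose for the diagram's nodes:

```python
# A directed causal graph stored as a 0/1 adjacency table:
# entry [i][j] = 1 means the variable at node i has an arrow into node j.
nodes = ["landholdings", "borrowing", "well_being"]  # Z, X, Y

# Z -> X -> Y, with no direct Z -> Y arrow (the exclusion restriction)
adjacency = [
    [0, 1, 0],  # landholdings affects only borrowing
    [0, 0, 1],  # borrowing affects only well-being
    [0, 0, 0],  # well-being affects nothing in this model
]

def parents(j):
    """Return the nodes with an arrow pointing into node j."""
    return [nodes[i] for i in range(len(nodes)) if adjacency[i][j] == 1]

print(parents(nodes.index("well_being")))  # ['borrowing']
```

Reading off a node's parents from the table is exactly the operation Pearl's definitions build on; the graph picture and the table carry the same information.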
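To see how much work the no-direct-arrow assumption does, here is a toy simulation of my own (invented numbers, not from the post or the paper): when Z reaches Y only through X, the Z-Y association, scaled by the Z-X association, recovers the effect of X on Y.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical structural model matching the diagram Z -> X -> Y:
z = rng.binomial(1, 0.5, n)            # eligibility (small landholding)
x = 2.0 * z + rng.normal(size=n)       # borrowing, driven partly by Z
beta = 0.7                             # true effect of borrowing on well-being
y = beta * x + rng.normal(size=n)      # well-being; note: no direct Z term

# Z and Y are associated, and because Z reaches Y only through X,
# the ratio of covariances recovers beta (the Wald / instrumental-
# variables estimator):
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
print(beta_iv)  # close to 0.7
```

If a direct Z -> Y arrow were added to the model, the same ratio would no longer equal beta, which is the sense in which the causal conclusion rests on an untestable causal assumption.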
The assumptions econometricians make are often not much more obviously true than the hypotheses they test.

Pearl goes on to wrestle with bigger ideas than I convey here, ones I probably do not fully understand and appreciate: formal statements of when quantities of interest in a causal model can be estimated ("identification") and of what constitutes a complete set of controls; a confrontation with the "potential outcomes framework," an analytical set-up often used to discuss methods of impact evaluation; and the limits on what even randomized trials can tell us about the world.

He writes:
Associational assumptions, even untested, are testable in principle, given sufficiently large sample and sufficiently fine measurements. Causal assumptions, in contrast, cannot be verified even in principle, unless one resorts to experimental control….

This makes it doubly important that the notation we use for expressing causal assumptions be meaningful and unambiguous so that one can clearly judge the plausibility or inevitability of the assumptions articulated. Statisticians can no longer ignore the mental representation in which scientists store experiential knowledge, since it is this representation, and the language used to access it, that determine the reliability of the judgments upon which the analysis so crucially depends.
This paper should, in both the normative and predictive sense, speed the day when statisticians pay adequate attention to the causal assumptions they make.

Postscript: I was stunned to spot this button on Pearl's page:

[Image: Daniel Pearl Foundation button]

Judea begat Daniel. (On Daniel Pearl's role in microfinance history, see this.) Evidently the family has poured some of its grief into this foundation, which seeks to build understanding across faiths and cultures. Quite a thing to think about as you admire the transcendent, detached, ethereal beauty of the mathematical work.

Disclaimer

CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.
