Book Reviews

The Effect: An Introduction to Research Design and Causality

by Nick Huntington-Klein, New York, Chapman and Hall/CRC, 2021, 646 pp., $23.99 (pbk), ISBN 978-1-032-12578-7.

Statistics is central to research: it is the means by which meaning is drawn out of data. It plays a major role in decision-making (Tufféry 2011), language (Mala 2022), and the life sciences (Stewart and Day 2015). Research, in turn, is indispensable to human progress. It is therefore essential to understand what constitutes research and how it is carried out, including research design, research ethics, and much more. It is with this aim in mind that this book has been written.

This book, divided into two parts and comprising twenty-two chapters, attempts to explain the various aspects of the research process.

The first part of the book, The Design of Research, is the shorter of the two parts. It consists of eleven chapters and spans 172 pages.

Chapter 1, Designing Research, is about research design and empirical research. It is the shortest chapter of the first part and could be considered introductory. It also attempts to answer why designing research is important. A research question is defined as one that a researcher tries to answer through his or her research. Empirical research is any research that attempts to answer research questions by making use of data from the real world.

Chapter 2, Research Questions, is about what constitutes a good research question. It also discusses the central question: Where do research questions come from? A good research question takes us from theory to hypothesis. Research questions can come from many places. Mostly, it is our curiosity about how the world works that naturally leads to questions.

Chapter 3, Description of Variables, is about variables and how to describe them. It is the longest chapter of the first part. It summarizes what variables are and talks a good deal about theoretical distributions. A variable, in the context of empirical research, is a bunch of observations of the same measurement. The chapter covers the different types of variables: continuous, discrete, count, ordinal, and categorical. The distribution of a variable, a key concept in research, describes how often certain values occur. The chapter is rich in content and also discusses theoretical distributions, including the normal and the log-normal. There is also a detailed account of hypothesis testing.

Chapter 4, Describing Relationships, is about the relationship between two variables, conditional distribution, and the concept of fitting the regression line. The relationship between two variables shows us what learning about one variable tells us about another variable. Conditional distribution is the distribution of one variable given the value of another variable.
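The idea of a conditional distribution summarized above can be made concrete with a small sketch. The data, the category labels, and the function name `conditional_distribution` are invented here for illustration and are not taken from the book.

```python
from collections import Counter, defaultdict

# Hypothetical (education, employed) observations, invented for illustration.
observations = [
    ("degree", "yes"), ("degree", "yes"), ("degree", "yes"), ("degree", "no"),
    ("no degree", "yes"), ("no degree", "no"), ("no degree", "no"), ("no degree", "no"),
]

def conditional_distribution(pairs):
    """Return the distribution of the second variable given each value of the first."""
    counts = defaultdict(Counter)
    for given, outcome in pairs:
        counts[given][outcome] += 1
    return {
        given: {outcome: n / sum(c.values()) for outcome, n in c.items()}
        for given, c in counts.items()
    }

dist = conditional_distribution(observations)
print(dist["degree"])     # shares of "yes" and "no" among degree holders
print(dist["no degree"])  # shares of "yes" and "no" among the rest
```

Comparing the two printed distributions is one simple way of seeing what learning about one variable tells us about the other.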

Chapter 5, Identification, is about the data-generating process, variation, identification, and related topics. It is built around two ideas: the first is looking for variation, and the second is identification. The discussion of context and omniscience makes the point that a researcher who does not understand where the data came from will not be able to identify the answers to research questions.

Chapter 6, Causal Diagrams, is about causal diagrams, research questions in causal diagrams, and moderators. A causal diagram is a graphical representation of a data-generating process. These diagrams are used to identify the answers to our research questions. A causal diagram contains only two things: the variables in the data-generating process and the causal relationships between them. Placing the research question on the causal diagram helps us figure out how to identify its answer. Moderators in causal diagrams modify the effect of one variable on another.

Chapter 7, Drawing Causal Diagrams, focuses on how to design and put together a causal diagram.

Chapter 8, Causal Paths and Closing Back Doors, focuses on the paths from one variable to another on a diagram. A path between two variables on a causal diagram is a description of the set of arrows and nodes you visit when walking from one variable to the other. The chapter also introduces good paths and bad paths, front doors and back doors, and open and closed paths.

Chapter 9, Finding Front Doors, discusses what to do when we cannot find a set of variables that closes all the back doors. If we can estimate the front doors directly, we don’t need to worry about closing the back doors. The author presents the randomized controlled experiment as the cleanest application of this approach.

Chapter 10, Treatment Effects, is about treatment effects, average treatment effect, and treatment effect distribution.

Chapter 11, Causality with Less Modeling, is about confidence and wide open spaces. We have to draw causal diagrams to map out our idea of the data-generating process, use that diagram to list out all the paths from treatment to outcome, and close pathways so that only the good paths we want remain.

The second part of the book, The Toolbox, is the longer of the two parts. It consists of eleven chapters and spans 474 pages.

Chapter 12, Opening the Toolbox, is about the toolbox commonly used by researchers. Statistics makes extensive use of tools, so being adept with them is pivotal. There was a time when statisticians relied heavily on printed tables. Now, the world of statistics is flooded with software tools. The work of a statistician is made far easier by software packages such as Excel and SPSS, and by user-friendly programming languages such as Python.

Chapter 13, Regression, is about the basics of regression, regression tables and model fit statistics, turning a causal diagram into a regression, and subscripts in regression equations. When it comes to identifying causal effects, regression is the most common way of estimating the relationship between two variables while controlling for others, which allows you to close back doors with these controls.
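A minimal simulated sketch may help show what closing a back door with a control means in practice. Everything in it is an assumption made for illustration, not material from the book: the variable names, the true effect of 2, and the choice to control for the confounder by partialling it out of both treatment and outcome (which gives the same coefficient as including it in the regression).

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(v, control):
    """Part of v left over after a linear fit on the control variable."""
    b = slope(control, v)
    mv, mc = mean(v), mean(control)
    return [(vi - mv) - b * (ci - mc) for vi, ci in zip(v, control)]

random.seed(0)
n = 5000
w = [random.gauss(0, 1) for _ in range(n)]   # confounder: causes both x and y
x = [wi + random.gauss(0, 1) for wi in w]    # treatment, partly driven by w
y = [2 * xi + 3 * wi + random.gauss(0, 1) for xi, wi in zip(x, w)]  # true effect of x is 2

naive = slope(x, y)                                       # back door through w left open
controlled = slope(residualize(x, w), residualize(y, w))  # w controlled for

print(f"naive slope:      {naive:.2f}")
print(f"controlled slope: {controlled:.2f}")
```

In this setup the naive slope mixes the effect of the treatment with the influence of the confounder, while the controlled slope should land close to the assumed true effect.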

Chapter 14, Matching, is about matching on a single variable and matching on multiple variables. Matching is the process of closing the back doors between a treatment and an outcome by constructing comparison groups that are similar according to a set of matching variables.
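As a rough illustration of matching on a single variable, the sketch below pairs each treated unit with the untreated unit whose matching variable is closest. The simulated data, the true effect of 2, and nearest-neighbor matching with replacement are all assumptions chosen for this example; the book covers a wider range of matching procedures.

```python
import random

def mean(v):
    return sum(v) / len(v)

random.seed(1)
n = 1000
w = [random.random() for _ in range(n)]        # matching variable
treated = [random.random() < wi for wi in w]   # higher w makes treatment more likely
y = [(2.0 if t else 0.0) + 3 * wi + random.gauss(0, 0.5)
     for t, wi in zip(treated, w)]             # true treatment effect is 2

treated_units = [(wi, yi) for wi, yi, t in zip(w, y, treated) if t]
control_units = [(wi, yi) for wi, yi, t in zip(w, y, treated) if not t]

# Raw comparison of group means: the back door through w is still open.
naive = mean([yi for _, yi in treated_units]) - mean([yi for _, yi in control_units])

def nearest_control_outcome(w_treated):
    """Outcome of the control unit whose w is closest to the treated unit's w."""
    return min(control_units, key=lambda unit: abs(unit[0] - w_treated))[1]

# Average treated-minus-matched-control difference (effect on the treated).
matched = mean([yi - nearest_control_outcome(wi) for wi, yi in treated_units])

print(f"naive difference:   {naive:.2f}")
print(f"matched difference: {matched:.2f}")
```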

Chapter 15, Simulation, is about simulation and power analysis with simulation. In the context of this chapter, simulation refers to the process of using a random process that we control to produce data that we can evaluate with a given method. Simulation can be used to rule out bad estimators and prove the value of good estimators to ourselves.
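The following sketch shows, under invented settings, what such a simulation might look like: a random process we control (a randomized experiment with a known effect) is run many times, and the estimator is judged by how its results behave. The sample size, effect size, number of repetitions, and the 1.96 threshold are assumptions for illustration.

```python
import random
import statistics

def simulate_once(n, true_effect, rng):
    """One simulated randomized experiment; returns (estimate, standard error)."""
    treated = [true_effect + rng.gauss(0, 1) for _ in range(n // 2)]
    control = [rng.gauss(0, 1) for _ in range(n // 2)]
    estimate = statistics.fmean(treated) - statistics.fmean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return estimate, se

rng = random.Random(42)
true_effect = 0.5
results = [simulate_once(200, true_effect, rng) for _ in range(500)]

# Does the estimator recover the effect we built in, on average?
average_estimate = statistics.fmean([est for est, _ in results])

# Power: share of simulated experiments in which |estimate / se| exceeds 1.96.
power = statistics.fmean([1.0 if abs(est / se) > 1.96 else 0.0 for est, se in results])

print(f"average estimate: {average_estimate:.3f}")
print(f"simulated power:  {power:.2f}")
```

Changing the sample size or the assumed effect and rerunning is the basic move behind a simulation-based power analysis.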

Chapter 16, Fixed Effects, is about fixed effects, random effects, fixed effects in nonlinear models, and regression estimators. Fixed effects is a method of controlling for all variables, whether they are observed or not, as long as they stay constant within some larger category.
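One common way to carry this out is the within transformation: subtract each group's mean from its observations, then run the regression on what is left. The sketch below assumes simulated data with an unobserved group-level variable `alpha` and a true effect of 1; these names and numbers are illustrative only.

```python
import random
from collections import defaultdict

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(2)
rows = []  # (group, x, y)
for g in range(50):
    alpha = random.gauss(0, 1)  # unobserved, constant within the group
    for _ in range(20):
        x = alpha + random.gauss(0, 1)
        y = 1.0 * x + 2 * alpha + random.gauss(0, 1)  # true effect of x is 1
        rows.append((g, x, y))

def demean_by_group(data):
    """Subtract each group's mean of x and y from its observations."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for g, x, y in data:
        totals[g][0] += x
        totals[g][1] += y
        totals[g][2] += 1
    return [(x - totals[g][0] / totals[g][2], y - totals[g][1] / totals[g][2])
            for g, x, y in data]

pooled = slope([x for _, x, _ in rows], [y for _, _, y in rows])
within_x, within_y = zip(*demean_by_group(rows))
fixed_effects = slope(within_x, within_y)

print(f"pooled slope:        {pooled:.2f}")
print(f"fixed-effects slope: {fixed_effects:.2f}")
```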

Chapter 17, Event Studies, is about event studies and how they work. The event study is probably the oldest and simplest causal inference research design. The chapter covers event studies as performed in the stock market, event studies with regression, and event studies with multiple affected groups.

Chapter 18, Difference-in-Differences, is about a quasi-experimental method that compares the changes in outcomes over time between a population enrolled in a program (the treatment group) and a population that is not (the comparison group). Its usefulness in data analysis has already been felt and appreciated.
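In its simplest two-group, two-period form, the comparison described above reduces to a subtraction of two changes. The group averages below are hypothetical numbers invented for illustration.

```python
# Hypothetical average outcomes by group and period, invented for illustration.
means = {
    ("treatment", "before"): 10.0,
    ("treatment", "after"): 16.0,
    ("comparison", "before"): 8.0,
    ("comparison", "after"): 11.0,
}

def difference_in_differences(m):
    """Change in the treatment group minus change in the comparison group."""
    change_treatment = m[("treatment", "after")] - m[("treatment", "before")]
    change_comparison = m[("comparison", "after")] - m[("comparison", "before")]
    return change_treatment - change_comparison

did = difference_in_differences(means)
print(did)  # 3.0
```

The treatment group improved by 6 and the comparison group by 3, so under the usual assumption that both groups would otherwise have followed the same trend, the estimated effect is 3.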

Chapter 19, Instrumental Variables, is about how instrumental variables work and how they isolate variation. Instrumental variables designs seize directly on the concept of a randomized controlled experiment. For instrumental variables to work, two assumptions must be satisfied: relevance of the instrument and validity of the instrument.
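The two assumptions can be seen at work in a small simulated sketch. Here the instrument `z` is relevant because it moves the treatment `x`, and valid because it affects the outcome `y` only through `x`. The data-generating process, the true effect of 1.5, and the simple ratio-of-covariances estimator are assumptions chosen for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

random.seed(3)
n = 5000
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
z = [random.gauss(0, 1) for _ in range(n)]  # instrument, independent of u
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]            # relevance: z moves x
y = [1.5 * xi + 2 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]  # validity: z enters only via x

ols = cov(x, y) / cov(x, x)  # biased, because u is left uncontrolled
iv = cov(z, y) / cov(z, x)   # uses only the variation in x that comes from z

print(f"OLS estimate: {ols:.2f}")
print(f"IV estimate:  {iv:.2f}")
```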

Chapter 20, Regression Discontinuity, is about regression discontinuity. Regression discontinuity focuses on treatment that is assigned at a cutoff. It also focuses on the concept of running variable or forcing variable, cutoff, bandwidth, regression discontinuity with ordinary least squares, and the density discontinuity test.
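To tie the vocabulary together, the sketch below simulates a running variable, assigns treatment at a cutoff, keeps only observations inside a bandwidth, and fits a separate ordinary least squares line on each side. The cutoff of 0, bandwidth of 0.5, true jump of 2, and all names are assumptions for illustration; the sketch omits refinements such as kernel weighting and the density discontinuity test.

```python
import random

def fit_line(x, y):
    """Return (intercept, slope) of an OLS line of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b = sxy / sxx
    return my - b * mx, b

random.seed(4)
cutoff, bandwidth = 0.0, 0.5
n = 4000
running = [random.uniform(-1, 1) for _ in range(n)]
outcome = [1 + 0.5 * r + (2.0 if r >= cutoff else 0.0) + random.gauss(0, 0.5)
           for r in running]  # treatment switches on at the cutoff; true jump is 2

# Keep observations within the bandwidth and center the running variable at the cutoff,
# so that each fitted intercept is the predicted outcome exactly at the cutoff.
left = [(r - cutoff, v) for r, v in zip(running, outcome)
        if cutoff - bandwidth <= r < cutoff]
right = [(r - cutoff, v) for r, v in zip(running, outcome)
         if cutoff <= r <= cutoff + bandwidth]

left_intercept, _ = fit_line(*zip(*left))
right_intercept, _ = fit_line(*zip(*right))
jump = right_intercept - left_intercept

print(f"estimated jump at the cutoff: {jump:.2f}")
```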

Chapter 21, A Gallery of Rogues: Other Methods, shows that the world of research design is too wide to be accommodated in a single book. There are methods and designs, both old and new, both tested and untested. This chapter demonstrates that there is a world of other methods that are new and still developing. Among such exotic methods, the chapter talks about synthetic control, matrix completion, causal discovery, double machine learning, modeling of heterogeneous effects, causal forests, sorted effects, and structural estimation. This chapter could be inaccessible at first reading, as much of its content is either too advanced or too new to be grasped easily.

Chapter 22, Under the Rug, is all about the assumptions and concerns that are a part of pretty much any causal inference research study, but which often get ignored or at least brushed aside.

Overall, this book, though voluminous, is an excellent addition to the literature. It contains a good number of examples and well-drawn diagrams that facilitate a clearer understanding of the concepts. It is a thorough presentation of the essentials of research design and causality.

Nisar Ahmad Khan
GDC Sopore, Jammu and Kashmir, India

References

  • Mala, F. A. (2022), Statistical Universals of Language: Mathematical Chance vs. Human Choice: By Kumiko Tanaka-Ishii, Cham: Springer, 236 pp.
  • Stewart, J., and Day, T. (2015), Biocalculus: Calculus, Probability, and Statistics for the Life Sciences, Boston, MA: Cengage Learning.
  • Tufféry, S. (2011), Data Mining and Statistics for Decision Making, Chichester: Wiley.
