Book Reviews

Pages 408-414 | Published online: 24 Aug 2018
 

David E. Booth

Kent State University

Fundamentals of Statistical Experimental Design and Analysis, by Robert G. Easterling. Chichester, West Sussex: Wiley, 2015, xxi + 272 pp., $90.00 (HB), ISBN: 978–1–118–95463–8; ISBN: 978–1–118–95464–5 (ebook).

As stated in an interview posted by the publisher, the author, Professor Rob Easterling, wrote Fundamentals to provide an “awareness of situations in which a well-designed experiment could resolve or clarify important issues they would encounter in their subsequent careers.” He tries to achieve this by putting “more emphasis on the context of an experiment (why it needs to be done and what should be done based on its results) and less on the technical details of the analysis.” However, in our opinion, the book goes a bit beyond the level of readers with no statistical training. From our perspective, engineers with some quantitative training, perhaps an introductory statistics course, would be an ideal audience for this text.

Easterling starts off well in his quest to bring DOE to the masses. The artwork on the hard cover of Fundamentals is delightful, including a bit of intrigue: the word “information” in a scramble (it is explained early in the text). Cartoon illustrations lighten things up all along the way. Also, readers will be drawn in by the many engaging stories that make the fun of “fun”damentals first and foremost. Easterling draws liberally from Box, Hunter, and Hunter's (2005) book, for example, their delightful case on boys' shoes, a clever experiment to test two alternative materials for the soles. The color graphics, generated with Minitab software, provide helpful visualization of the results from this and the other studies. A minor quibble is the lack of least-significant-difference (LSD) bars on effects plots, thus making it hard to see what really stands out for factor impacts.

Easterling does far better at making DOE easy than Funkenbusch did in the book we reviewed in 2005 (Anderson and Whitcomb 2005). Furthermore, Fundamentals is more rigorous statistically. However, we would have liked to see less on some topics, for example, blocking, Latin squares, and one-factor designs in general, and more on two-level designs (treated as a special case of factorials), response surface methods, split plots, optimal designs, and mixture experiments (overlooked entirely).

We like Fundamentals for its practicality. For example, the discussion of experimental units in Chapter 2 stands out for its usefulness to beginners at DOE. Also, in Chapter 4, Easterling lays out a great list of questions (p. 94)—again doing well to provoke thought about the importance of proper planning before starting in on an experiment. In addition, his focus on plotting is a plus. However, due to Fundamentals falling into the gap between too little and too much statistics, we only recommend the book for a small audience: naturally inquisitive engineers and scientists with some quantitative training, who do not mind a healthy dose of theoretical depth at the expense of content breadth.

Mark Anderson, Patrick Whitcomb, and Martin Bezener

Stat-Ease, Inc.

Expect the Unexpected (2nd ed.), by Raluca Balan and Gilles Lamothe. Singapore: World Scientific, 2017, xi + 302 pp., $33.95 (eBook), ISBN: 978–981–3209–05–3.

This book is a beginning biostatistics text. The chapters are:

  • Probability
    1. Introduction to Probability
    2. Axioms of Probability
    3. Conditional Probability
    4. Discrete Random Variables
    5. Continuous Random Variables
    6. Supplementary Problems (Probability)
  • Statistics
    7. Introduction to Statistics
    8. Confidence Intervals
    9. Hypothesis Testing
    10. Comparison of Two Independent Samples
    11. Paired Samples
    12. Categorical Data
    13. Regression and Correlation
    14. Supplementary Problems (Statistics)
  • Additional Topics
    15. Sample Size and Power
    16. Nonparametric Methods
    17. Answers to Odd-Numbered Problems
    18. Tables

Bibliography

Index

The text of this book reminds me most of a 1960s-style introduction-to-statistics book for noncalculus students, as can be seen from the chapter titles. The only things that relate specifically to biostatistics are the problems and examples, which are drawn from the natural sciences. Datasets are “hypothetical,” based on “real-life” situations. A particularly good thing about the book is that the authors introduce computing using the R language. While this is not extensive, it goes along with the topics covered and is of use for any student going on in biostatistics or statistics.

The student who completes this course will, in some cases, be “statistically literate” but not likely able to perform any but the most elementary analyses. For example, let us consider the material on my favorite topic, regression. One of the good things about the book is that for each regression example (limited to straight-line models), a scatterplot is given. The normal equations are given, and the slope and intercept coefficient formulas are written out. Then the computations are discussed both by hand calculation and via the R functions lm and plot. No mention of R² or residual plots is made, which severely limits the use of this material. No information on confidence intervals or hypothesis tests is included either. I would strongly suggest that the authors include this additional material for the regression line in the next edition.
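To make concrete what is at stake, here is a minimal sketch (in Python rather than the book's R, with made-up data) of a straight-line fit from the normal equations, followed by the R² and residuals that the review finds missing:

```python
# Hypothetical data, in the spirit of the book's "real-life" examples
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
mx = sum(x) / n
my = sum(y) / n

# Slope and intercept from the normal equations
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sxy / sxx
intercept = my - slope * mx

# R^2 and residuals: the diagnostics the review finds absent
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((yi - my) ** 2 for yi in y)
r_squared = 1.0 - ss_res / ss_tot
```

In R, the same diagnostics are readily available from summary(lm(y ~ x)) and plot(lm(y ~ x)), so adding them to the book's examples would cost little.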

All in all, this seems to be a useful book for a one-semester “statistical literacy” course or as a very basic introduction to statistics for someone who is going to take a more advanced follow-up course (e.g., a standard introduction to biostatistics). It will not make the student a competent data analyst but will give him/her the flavor of statistics and may encourage a few students to try a more usual introductory course. Such a course should certainly be doable by a student who has done well in this “literacy” course. If such is your goal, this is a reasonable book to consider for your class.

David E. Booth

Kent State University

Frontiers of Biostatistical Methods and Applications in Clinical Oncology, by Shigeyuki Matsui and John Crowley (Eds). Singapore: Springer, 2017, vi + 438 pp., $103.20, ISBN: 978–981–10–0124–6.

This book is a collection of 25 outstanding articles produced by leading biostatisticians in the field of clinical oncology. They include articles on:

  • Cancer clinical trials

  • Machine Learning Techniques in Cancer Prognostic Modeling

  • Evaluation of Cancer Risk

and much more.

I strongly recommend that those statisticians working in the clinical oncology field check out these articles. Several were of use to me and I know that others will find them to be of use as well.

David E. Booth

Kent State University

Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood (2nd ed.), by Youngjo Lee, John A. Nelder, and Yudi Pawitan. Boca Raton, FL: CRC Press, Taylor & Francis Group, 2017, xviii + 446 pp., $95.96 (HB), ISBN-13:978-1-4987-2061-8.

This is a well-written book dealing with the generalized linear model (GLM) and its extensions. The book offers a new method for GLMs with an additional random part. This second edition builds on the authors' previous work; several topics have been added since the first edition (mostly in the later chapters). The book consists of 14 chapters. There are no end-of-chapter exercises; however, each topic is generally followed by some examples.

The book begins with classical likelihood theory and presents quantities derived from the likelihood, the distribution of the MLE and the Wald statistic, model selection (including the authors' views on model parsimony), and marginal and conditional likelihood, followed by a brief discussion of similarities and differences between the Bayesian and likelihood methods.

In the next chapter, the authors fit generalized linear models using the iteratively reweighted least-squares (IRLS) method, along with additional material on model checking, with some examples.
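For readers unfamiliar with the method, the IRLS idea can be sketched for a simple logistic regression with one covariate: at each step, regress a "working response" on the covariates with weights derived from the current fit. This is a generic textbook illustration with hypothetical data, not the authors' code:

```python
import math

# Toy binary data: y ~ Bernoulli(p), logit(p) = b0 + b1 * x (illustrative only)
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]

b0, b1 = 0.0, 0.0
for _ in range(25):
    # Linear predictor, fitted means, and IRLS working weights
    eta = [b0 + b1 * xi for xi in x]
    mu = [1.0 / (1.0 + math.exp(-e)) for e in eta]
    w = [m * (1.0 - m) for m in mu]
    # Working response for the weighted least-squares step
    z = [e + (yi - m) / wi for e, yi, m, wi in zip(eta, y, mu, w)]

    # Solve the 2x2 weighted normal equations for (b0, b1)
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swz = sum(wi * zi for wi, zi in zip(w, z))
    swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
    det = sw * swxx - swx * swx
    b0 = (swxx * swz - swx * swxz) / det
    b1 = (sw * swxz - swx * swz) / det
```

At convergence the score equations are satisfied, i.e., the residuals y - mu are orthogonal to the intercept and covariate columns, which is the sense in which IRLS computes the maximum likelihood fit.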

In Chapter 3, the authors introduce the extension of GLMs to quasi-likelihood (QL) and extended quasi-likelihood (EQL). Other topics covered in this chapter include dispersion models, joint GLM of the mean and dispersion, and also joint GLMs for quality improvement.

The extension of the likelihood appears in the next chapter, where the authors use this extension as a tool for making inferences about the unobserved random variables. They also establish the asymptotic normality of the random-parameter estimators and its finite-sample adjustments. Examples like the wallet game show the need for the extended likelihood covered in this chapter. It is shown that the h-likelihood leads to adequate estimation of both fixed and random effects with efficient fitting algorithms.

In Chapter 5, linear models with fixed effects are extended to models with additional random components, followed by the development of normal mixed linear models. Likelihood estimation of the fixed parameters, including estimation of the variance components, appears next. The next topic is a discussion of the random portion of the model and the authors' views on when random effects should be used. The authors then show how restricted maximum likelihood (REML) can be described as maximizing an adjusted profile likelihood. They also compare the marginal likelihood with the h-likelihood for fitting the fixed effects.

In the following chapter, the extension of GLMs to HGLMs by adding random effects to the linear predictor is presented. Special cases of HGLMs include GLMMs, where the random effects are assumed to be normal, and conjugate HGLMs, where the random effects are assumed to follow the conjugate distribution to that of the response variable.

The topic of HGLM continues in the next chapter where this model is extended by allowing the dispersion parameter of the response to have a structure defined by its own set of covariates. The authors deploy IRLS to fit the extended model.

In Chapter 8, the analysis of correlated data via HGLMs is presented, and it is shown that previously developed models for correlated data are instances of HGLMs. Topics covered here include HGLMs with correlated random effects and random effects described by covariance and precision matrices. Random-effects models are then applied to animal breeding and genetic epidemiology.

In Chapter 9, the authors discuss smoothing, or nonparametric function estimation. They then present spline models, followed by a long discussion of the mixed-model framework along with an example, and by another detailed discussion of non-Gaussian smoothing, which ends the chapter.

Chapter 10 continues by extending HGLMs to include additional random effects in their various components, yielding what are called double HGLMs (DHGLMs), where random effects can appear in both the mean and the residual variance. Heteroscedasticity is also a subject of discussion in this class of models. Additional topics include applications to the analysis of financial data. The authors also show that HGLMs and DHGLMs provide robust analyses under various misspecifications.

In the next chapter, the authors discuss the very important issue of variable selection using DHGLMs. The key issue to consider when selecting variables is the possibility of overfitting, especially when the sample size is small relative to the number of covariates: it is easy to produce models that fit well on the training data but not on the validation data. The authors also show how group and bi-level selections can be done using the random-effects model approach.

In the following chapter, it is shown that multivariate models are easily developed by assuming correlations among random effects in DHGLMs for different responses. Furthermore, the multivariate approach allows us to handle missing data by providing an additional response for missingness.

Advances in technology have given us the ability to perform multiple testing (several hypotheses tested at the same time). In Chapter 13, the authors present multiple testing, beginning with single testing and then extending it to multiple testing, including two- and three-state formulations.

In the last chapter, it is demonstrated that the class of GLM models can be applied to the analysis of data where the response variable is the lifetime of a component or the survival time of a patient. Alternative models, namely, frailty models and normal–normal HGLMs for censored data are also presented in this chapter.

In sum, this is a well-written book suitable for researchers in the field or for a senior-level undergraduate course in a statistics discipline. An introductory course in probability and statistics should give sufficient background to comprehend the material covered in the book. The book is based on the authors' previous work covering a variety of topics. Although there are no end-of-chapter exercises, the book generally includes a large number of examples following each topic. I found these examples very helpful in comprehending the topics covered in the book.

Morteza Marzjarani

Saginaw Valley State University (Retired)

Group-Sequential Clinical Trials with Multiple Co-Objectives, by Toshimitsu Hamasaki, Koko Asakura, Scott R. Evans, and Toshimitsu Ochiai. New York: Springer, 2016, vii + 113 pp., $54.99, ISBN: 978–4–431–55898–9.

Randomized clinical trials have now long stood as the gold-standard approach to assessing the safety and efficacy of experimental treatment regimens. Classically, most clinical trials have selected a single primary outcome variable upon which to assess treatment performance. This paradigm has led to the identification of many efficacious interventions. In recent years, however, the escalating cost of clinical trials has contributed to a stagnation in the identification of novel effective treatments. Consequently, there has been increased interest in the advancement of new procedures that can improve the efficiency of the drug development process, in terms of both time and money.

In Group-Sequential Clinical Trials with Multiple Co-Objectives, the authors focus on two statistical challenges that have arisen as part of this pursuit of increased efficiency in clinical research, namely, the use of multiple endpoints and the conduct of noninferiority trials. As the book's title suggests, the discussions are set within the context of group-sequential trial design.

Impressively, the authors manage to write throughout in a style that is detailed yet concise, which contributes to their ability to cover an extensive range of topics in only 113 pages. However, as noted in Chapter 1, to truly appreciate the discussions within, a solid understanding of conventional group-sequential design methodology for a single parameter of interest is required. In my opinion, Chapters 1–3 of Jennison and Turnbull (2000) should suffice.

In all, the book is split into seven chapters. The style is consistent in each, taking on the form of essentially separate, but interrelated, research articles. Indeed, each effectively summarizes the work from at least one of the relevant articles the authors have published in recent years. However, the technical details are pushed into four appendices, which permits a nice level of detail to remain in the core content.

In Chapter 1, the authors set the scene for the remainder of the book, describing in more detail the discussions outlined above on the changing landscape of drug development. Importantly, the difference between a trial with co-primary and with multiple primary endpoints is detailed: the former means we wish to demonstrate that a treatment is effective on all of the endpoints, the latter on at least one endpoint, a distinction with strong implications for the methods required to control the trial's error rates. In what follows, the focus is then predominantly on design for co-primary endpoints.

Chapter 2 introduces methodology for conducting group-sequential trials in studies with two co-primary endpoints. As the authors note, group-sequential designs are of particular relevance to trials with multiple endpoints, as the sample size associated with a fixed-sample design can be prohibitively large. Two important testing frameworks, which arise repeatedly throughout the book, are then presented: one that requires the null hypothesis to be rejected for each endpoint simultaneously, and one that relaxes this to allow success when efficacy boundaries are crossed at different analyses. The aforementioned techniques are then extended to binary outcomes via a normal approximation approach, one that should work well provided the sample size is large and the event probabilities are not small. Several examples are provided, which illustrate the performance of the described techniques, along with practical advice on how to choose a testing framework and how to choose a correlation upon which to design the trial, an important issue given extended consideration.
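The co-primary requirement (rejecting the null for each endpoint) can be illustrated with a small Monte Carlo sketch of single-stage power. The effect sizes, correlation, and sample size below are hypothetical, and the group-sequential boundaries of the book are omitted for brevity:

```python
import math
import random

random.seed(1)

# Hypothetical design: two co-primary endpoints with standardized effect
# sizes delta1, delta2, correlation rho between test statistics, n per arm
delta1, delta2, rho, n = 0.4, 0.35, 0.5, 150
crit = 1.96  # two-sided 5% critical value applied to each endpoint

hits = 0
trials = 200_000
for _ in range(trials):
    # Correlated standard-normal noise for the two z statistics
    e1 = random.gauss(0.0, 1.0)
    e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * random.gauss(0.0, 1.0)
    z1 = delta1 * math.sqrt(n / 2.0) + e1
    z2 = delta2 * math.sqrt(n / 2.0) + e2
    # Co-primary success: BOTH statistics must cross the boundary
    if z1 > crit and z2 > crit:
        hits += 1

power = hits / trials
```

The joint power is always below the smaller of the two marginal powers, which is why fixed-sample designs for co-primary endpoints can demand such large sample sizes, and why the assumed correlation matters so much at the design stage.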

To my pleasant surprise, based on my expectations from the book's title, Chapter 3 sees the authors diverge from group-sequential trials to consider sample size reestimation techniques devised to help deal with uncertainty at the design stage. The focus is on procedures for reestimating the required sample size based on the conditional power, with methodology for controlling the Type I error rate to the nominal level described. This, though, is the last foray away from group-sequential design methodology, so those looking for information on designs for, say, adaptively randomizing treatments in trials with multiple interventions will need to look elsewhere.

In Chapter 4, the methods presented in Chapter 2 are extended to allow the incorporation of group-sequential trials with early stopping for both efficacy and futility. Of particular note is that the presented framework allows the timing of the futility and efficacy analyses to be specified differentially, an interesting idea that is rarely allowed for in group-sequential design procedures. Significantly, the authors demonstrate how this allows for the delay of efficacy analyses to improve power, as they once more explore the influence of the various design parameters through a motivating example based on an Alzheimer's disease trial.

Chapter 5 differs from the rest of the book and considers design for two primary endpoints. This, the authors note, opens up many opportunities for methods for controlling the familywise error rate. A simple weighted Bonferroni procedure is introduced, along with an extension to a more powerful graphical-based technique. Once more, the important issue of endpoint correlation is deliberated, with a conservative approach now provided through an assumption of a correlation of one.

In Chapter 6, the focus shifts once more, this time to group-sequential design for noninferiority trials. The utility of such studies is first outlined through arguments appealing to scenarios in which a two-arm placebo-controlled superiority trial would be considered unethical because of the availability of an effective treatment. As in the previous chapters, the presented sequential procedures are demonstrated to confer valuable efficiency advantages in terms of requisite sample sizes compared to a single-stage approach.

The final chapter then considers a host of possible extensions to the methods discussed throughout the book. This is particularly interesting, though it raises the question of how soon a new book could well be required in this fast-changing research domain. Indeed, with, for example, methods for reestimating important correlations in a blinded fashion having recently been proposed (Kunz et al. 2017), there is certainly scope for a vastly expanded text in the future.

In the appendices, the authors provide the extended technical details necessary for implementing the methods of the previous chapters. Their derivations are extensive enough that anyone should be able to implement the methods in their preferred software language, using standard multivariate normal integration techniques. This should not be too complex a task; however, one minor criticism is that it would have been nice for some accompanying code to have been made available.

Nonetheless, this book is a highly stimulating read for anyone who is interested in group-sequential trials, or anybody who wishes to learn about how studies with multiple endpoints can be made more efficient. As a standalone text, it is a highly effective way to learn the core principles of this emergent class of design methodology.

Michael J. Grayling

University of Cambridge

Wavelets in Functional Data Analysis, by Pedro A. Morettin, Aluísio Pinheiro, and Brani Vidakovic. New York: Springer, 2017, viii + 105 pp., $54.99, ISBN: 9783319596228.

This short monograph presents a collection of important ideas relating to the application of wavelets in functional data analysis across several domains. Wavelets have received serious attention as techniques useful in statistical learning methods, data mining, and many other applications in the sciences and engineering. Functional data models have also become very relevant in recent years with the explosion of sensor data and improvements in computational capabilities. Even though wavelets are a common tool among researchers in functional data, a comprehensive reference was needed, so this work is very relevant in the present context.

Chapter 1: Examples of Functional Data

The authors point out four ways in which functional data analysis differs from, and is more challenging than, other types of data analysis: the high dimensionality of the data, strong time dependence, and the needs for localization and regularization. Wavelets offer solutions to all four of these concerns: they are very successful at dimension reduction and decorrelation, and they are also local and can be used to regularize. These properties make them suitable for use in functional data analysis. The authors also describe five datasets that they will use to demonstrate the application of wavelets in statistical learning methods for functional data.

Chapter 2: Wavelets

In Chapter 2, the authors introduce the concept of wavelets and discuss their important properties in the context of functional data analysis. In particular, they focus on how wavelets address the four concerns raised in Chapter 1. They also provide easy-to-follow explanations of how wavelets work, including example implementations in Matlab; the examples cover the construction of a wavelet matrix, the discrete wavelet transform, and the inverse wavelet transform. They also consider the theoretical details of a few other related topics, including the Daubechies-Lagarias algorithm and the covariance wavelet transform.

Chapter 3: Wavelet Shrinkage

Chapter 3 focuses on the shrinkage properties of wavelets. Wavelet thresholding, the most important shrinkage strategy, is discussed in detail. The authors draw an interesting contrast between the shrinkage properties of wavelet methods and those of trigonometric-basis methods such as Fourier expansions. The two standard choices of thresholding rule, soft and hard thresholding, as well as the choice of threshold, are considered. The authors also provide a Matlab implementation of function estimation using wavelets.
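The two standard rules are easy to state; here is a minimal sketch (in Python rather than the book's Matlab) applied to arbitrary coefficients:

```python
def hard_threshold(d, lam):
    # Keep coefficients whose magnitude exceeds the threshold; zero the rest
    return d if abs(d) > lam else 0.0

def soft_threshold(d, lam):
    # Zero small coefficients AND shrink the survivors toward zero by lam
    if abs(d) <= lam:
        return 0.0
    return d - lam if d > 0 else d + lam

coeffs = [3.2, -0.4, 1.1, -2.5, 0.2]
lam = 1.0
hard = [hard_threshold(d, lam) for d in coeffs]
soft = [soft_threshold(d, lam) for d in coeffs]
```

Hard thresholding leaves surviving coefficients unchanged, while soft thresholding shrinks them as well; this bias-variance trade-off is at the heart of the chapter's comparison of the two rules.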

Chapter 4: Wavelet-Based Andrews' Plots

Andrews' plots are a popular visualization tool for multivariate data. They work by projecting data points onto one-dimensional curves, often described by the Fourier basis. In this chapter, however, the authors consider wavelet-based Andrews' plots. They also demonstrate these plots in several examples, in both the multivariate and the functional data domains.

Chapter 5: Functional ANOVA

In Chapter 5, the authors discuss functional ANOVA models. They explain why a pointwise application of a standard multivariate ANOVA method would not work in this case, by reiterating the challenges with functional data, namely, dependence and high dimensionality. A common approach is to transform the data into the Fourier domain and proceed with high-dimensional ANOVA tests. The authors instead consider using a wavelet basis and explore the properties of the resulting test statistic. The Orthosis data, introduced in the first chapter, are used to demonstrate the functional ANOVA models, and an application to cloud and temperature mapping is also discussed.

Chapter 6: Further Topics

The final chapter discusses applications of wavelets in statistical learning methods for functional data, including classification and regression problems. There are several approaches to applying wavelets in functional data analysis, including multifractal spectra, the wavelet transform with classification expectation maximization, and the use of kernels and basis decompositions. The authors consider the DWT-CEM algorithm in detail, with an illustration on the fMRI dataset introduced in Chapter 1. There are also short discussions of functional regression methods, following the work of Ramsay and Silverman, and of dimension reduction.

Summary

In summary, this book is short and offers a quick reference on common techniques for applying wavelets in functional data analysis, illustrated with some real data examples. The authors provide Matlab code for some of the methods discussed in the book; a more complete compendium of all the code used would have been even more helpful. Overall, this is a useful quick reference for researchers in this field.

Abhirup Mallik

Bosch Center for Artificial Intelligence

Editor Reviews

Statistics for Nonstatisticians (2nd ed.), by Birger Sternholm Madsen. New York, NY: Springer, 2016, xxi + 185 pp., $49.99 (HB), ISBN: 978–3–662–49348–9.

The first edition of this book was released in 2011; I did not find any record of a review of it in past issues of Technometrics. The format and intention of this second edition seem to be the same. As in the previous edition, the author presents some basic and commonly used statistical methods while avoiding technical and mathematical treatments. The revision is seemingly not extensive; the new additions include:

  • Lognormal distribution, control charts, and process capability are added to Chapter 4.

  • A section on ANOVA is included in Chapter 8.

  • References, links, and software have been updated.

I find this a good introductory book on these topics, useful to a broad audience. It is clearly written for practitioners in a host of application areas and is also a good reference book for professionals. An interesting feature of the book is that its approach is explanatory rather than mathematical. The organization and structure are good and logical. The major weakness of the book is that it does not offer exercise and problem sections, an essential tool for adopting any book as a textbook, especially at the undergraduate level.

Like the first edition, this revised edition presents eight chapters in total, plus appendices collected in a ninth chapter.

Below is the list of the chapters.

Chapter 1: Data Collection

Chapter 2: Presentation of Data

Chapter 3: Description of Data

Chapter 4: The Normal Distribution

Chapter 5: Analysis of Qualitative Data

Chapter 6: Error Sources and Planning

Chapter 7: Assessment of Relationship

Chapter 8: Comparing Two Groups

All the chapters are nicely structured and presented without requiring too much mathematical knowledge. In summary, this is a good contribution, providing a good coverage of selected topics in a logical and systematic manner, making the subject of statistics a beautiful and useful tool in many walks of life!

S. Ejaz Ahmed

Brock University

Mixed-Effects Regression Models in Linguistics, by Dirk Speelman, Kris Heylen, and Dirk Geeraerts (Eds.). New York: Springer, 2018, vii + 146 pp., $80.99 (HB), ISBN: 978–3–319–69828–1.

Mixed-effects regression models are widely used by researchers and practitioners alike, in linguistics and in a host of other areas. This edited volume offers a new research agenda for the application of mixed models in linguistics, dealing with complex situations that result in complex data. This refereed edited volume includes an array of interesting topics, including:

  • Use of huge datasets

  • Dealing with nonlinear relations

  • Issues of cross-validation

  • Model selection

  • Random structure and other topics

This edited volume contains seven chapters covering a range of topics in linguistics, with a focus on new research.

The titles of respective chapters in the book are as follows:

  • Introduction

  • Mixed Models with Emphasis on Large Datasets

  • The L2 Impact on Learning L3 Dutch: The L2 Distance Effect

  • Autocorrelated Errors in Experimental Data in the Language Sciences: Some Solutions Offered by Generalized Additive Mixed Models

  • Border Effects Among Catalan Dialects

  • Evaluating Logistic Mixed-Effects Models of Corpus-Linguistic Data in Light of Lexical Diffusion

  • (Non)metonymic Expressions for GOVERNMENT in Chinese: A Mixed-Effects Logistic Regression Analysis

All the articles in this text are very well organized and consistent in style and presentation. Each article begins with an introduction and concludes with a list of references. The chapters provide examples from related subfields of linguistics and provide R code for analyzing the data at hand.

As one can imagine with this type of text, the topics are quite diverse in nature, but interesting and informative. It would have been nice if the editors had grouped the chapters into a couple of subgroups to give readers a smoother path through the book. That said, the editors have done a good job in the introductory chapter of introducing the remaining six chapters of the volume. I would recommend that interested readers read Chapter 1 first for a smooth reading of the book and to get the maximum benefit from it.

I assume that the intended primary audience for this book is scientists working in the linguistics domain. I would safely conclude that the book is also useful for those who are interested in collecting and analyzing such data in other fields of application.

S. Ejaz Ahmed

Brock University
