INTERVIEWS WITH STATISTICS EDUCATORS

Interview With Richard De Veaux

This article is part of the following collections:
Interviews with Statistics and Data Science Educators (1993-2018)

Richard De Veaux is the C. Carlisle and Margaret Tippit Professor of Statistics at Williams College. He is a Fellow of the American Statistical Association and has served on the ASA Board of Directors.

Beginnings

AR:

Thanks very much, Dick, for agreeing to this interview for the Journal of Statistics Education. I'll start by asking about when you were 18 years old. Where were you then? What were you thinking in terms of career plans and goals at that point?

RD:

The summer before going to Princeton for college, two amazing events took place—Woodstock and the moon landing. I lived about 3 hours from Woodstock in the hills of Western Pennsylvania, but it looked like rain so I thought, “Maybe I'll go next year.” So I never got to Woodstock and I hadn't gone to the moon either, but I decided that I'd like to go to Mars. My Dad told me that I should (read must) major in Engineering, so I signed up for Aerospace Engineering freshman year, thinking that I'd be among the first scientists going to Mars. I turned 18 that December of freshman fall—and that's about as far as my career plans had gotten by then. And that's what I love about advising first-year students. I try to gently point out that their ideas about what they are going to study and what they are going to be might change just a bit in the next few years.

AR:

Wow, traveling to Mars is a pretty specific (and cool) career goal! How quickly did you deviate from that path? Did you continue to study engineering at Princeton, or did you pursue something else?

RD:

Well, NASA pulled the plug on Mars pretty quickly after the lunar landing. I had no idea, but my Aerospace professors were honest enough to get all of us freshmen into a room in the spring to tell us that space travel was essentially on hold and that we weren't going to Mars. If we wanted to stay, we were welcome and we could probably get jobs designing planes when we graduated. Definitely not my thing. Now, my Dad was still serious about the Engineering degree, so I started looking at other engineering. Civil Engineering had the fewest requirements and so I started down that path. The next fall I took a required course: CIV245 Engineering Statistics, taught by this guy named Stu Hunter. That course wound up changing my life. Stu was charismatic and looking back it's amazing how exciting he made an Intro Stats course with essentially no graphics, very small datasets, slide rules for calculating and chalk! This was still a couple of years before the scientific calculator came out. We have it so much easier today!!

AR:

Before we leave Mars completely, I have to ask: Have you read the book The Martian by Andy Weir, or seen the movie with Matt Damon? If so, what did you think?

RD:

At first I thought… “I can't finish this. It's way too nerdy.” This is exactly why I started having doubts about Aerospace Engineering even before NASA pulled the plug. But, then, of course I got totally engrossed by it. It seems hopeless at every turn and somehow he “sciences” his way out of it. As much as I love Matt Damon, I found that tension completely missing in the movie.

AR:

Once Stu Hunter changed your life, what came next? Did you immediately start taking more Statistics classes? Did you complete the Civil Engineering major?

RD:

The fall of my freshman year I took Honors Linear Algebra from Harold Kuhn (of Kuhn-Tucker). Kuhn was an imposing figure. He wore expensive suits, consulted with the Feds, had an eye-patch, and taught an incredibly abstract Linear Algebra course. I understood almost nothing. So I switched “down” from Honors to Engineering Math (Multivariable Calculus the next semester). It was taught by Norm Steenrod, a famous topologist, but also a devoted educator and very good teacher. Then, the semester I took Stu's course I took Differential Equations and did very well. So I thought “wait, this math stuff isn't that hard. Linear Algebra must have been a fluke.” Let's take Abstract Algebra. Which I did—in sophomore spring. I barely survived. But, I declared a double major (and double degree–AB in Math and BSE in Civil Engineering), which, miraculously, I finished. There were two of us the year we graduated who graduated with two degrees and they banned it after that. But yes, I finished both the math and civil majors. I took three more courses and an independent work with Stu and did a thesis (required) in ergodic theory with Gil Hunt (of Hunt-Stein processes) senior year. I also took Geoff Watson's Math Stat graduate course senior year, but sadly, never took any of Tukey's courses. I got to know him later. When I applied to Stanford, and got in, Tukey and Watson were very sweet and invited me to lunch at the Faculty Club Senior year. They both wanted me to go to Stanford and in fact, I was accepted to Princeton for graduate school in Statistics too, but they gave me only half support so I'd go west. It was a lunch I'll never forget.

AR:

Tell us a bit about your Stanford experience. What did you end up specializing in, and what was the topic of your dissertation?

RD:

Ok. Well, I give less and less advice, but I would warn students about going right to graduate school unless they're really sure that's what they want. I graduated from Princeton in June 1973. I was 21 years old and I wanted to go to California right away just to go to California for the summer. I mean, who wouldn't? I told this to Ingram Olkin, who said, “That's great! Why don't you TA a course or two and we can support you for the summer Quarter.” I thought “fantastic.” I can hang out and get paid for grading. Then he said, “Well, as long as we're paying you as a TA, you get 9 tuition units too. Why don't you take a couple of courses?” Big mistake!!! I needed a break and didn't realize it until too late. I took a very, very abstract 300-level course on the basis of my thesis in ergodic theory “Convergence of Probability Measures.” I survived, but it was torture and I understood almost nothing. Not a good way to start.

I made it through quals at the end of the first year and then took the summer off (!) and went home to Pittsburgh, where my parents had moved, for a summer job. In the fall, a young guy named Persi Diaconis was the new Assistant Professor. Rumors were flying about his previous career as a magician and I rushed to be his TA. After passing orals the next summer (now I was 23), I asked Persi to be my advisor. He agreed (I was his first student … to start with him anyway). The topic was a big one–robust correlation.

But that summer, to help me deal with the rigors of studying on my own for orals, I took a Modern Dance class (actually two) that I'd always wanted to take. I had tried swimming, yoga, biking, everything, but I couldn't stop thinking about my orals' subjects. Dance was so hard I couldn't think about anything else for 2 hr except which foot to move so I wouldn't fall down! It was fantastic!

So, in the fall I took a couple of courses and started on my dissertation. I was completely not ready to do independent work at this level. Plus, I started taking more and more dance classes. By the second year, I was taking four modern dance, 2–3 ballet classes a day and an occasional folk, tap, or jazz class. I wasn't getting much done on the thesis. Plus, my modern dance teacher, Inga Weiss, was an amazing inspiration (see pictures). She was a large part of the reason I continued in dance, and eventually got my Master's degree in it.

Persi stopped funding me after my fourth year (he should have cut the money off sooner, but he was a nice guy), so I got a part time job as statistical analyst at the Center for Advanced Studies at Stanford. It was great. I worked mornings and took dance the rest of the day. That winter, I auditioned for a dance company that did yearly winter tours of the West Coast. To my pleasant surprise I was accepted into Tandy Beal and Dancers (www.tandybeal.com). They were unique because they rehearsed intensely for about a month, then toured for 2 months and then disbanded until the next year. My boss gave me a leave of absence and off I went. Persi was incredibly understanding. Since he had toured as a magician for 12 years, he said “I hope it works out for you, but if the dance career ends and you want to come back, I'll be here.” I couldn't have hoped for a better response.

To make a long story somewhat shorter, my future wife, whom I'd met in the dance studio (see above in white and below bottom left), graduated in June 1978, after my tour with Tandy ended. We decided to head to Europe to see if we could get dance jobs, which we did. We worked for a season with Tanzprojekt in Munich–a group that is alive and well 40 years later (http://www.tanzprojekt-muenchen.de/), and then we moved to Utah to help found Wimmer, Wimmer and Dancers (http://www.nytimes.com/1982/12/19/arts/dance-wimmer-troupe-marital-reminiscences.html), which moved to Philadelphia in 1980.

Persi told me of a guy at Penn I should talk to if I wanted to do some part-time statistical work. Both my money and my back were wearing out, so I went right over to the Wharton School. After a couple of meetings, Larry Mayer hired me as a full-time research associate doing energy analysis. I retired from dance within the year and went back to Stanford in the summers to try to see if I could resurrect my Stats career.

About the third summer, Persi found a project better suited to my talents (or lack thereof). A graduate student had done an interesting experiment with musical perception whose data looked like two crossing regression lines. My dissertation was on “Mixtures of Linear Regressions.” I finally finished it in 1985 and got my PhD in 1986, by which time the Analysis Center had folded under less than ideal circumstances (http://www.library.upenn.edu/docs/kislak/dp/1981/1981_11_05.pdf). I had moved into the tenure track position of Assistant Professor in the Statistics Department at Wharton (U Penn) in 1984 while finishing up my dissertation in the summers.

Embarking on an Academic Career

AR:

My goodness, what an exciting 12 years! That was quite a meandering path toward a Ph.D. in statistics and a position in academia. I think we now have a good sense of where your passions for statistics and for dance originated. Where and when did your interest in teaching arise?

RD:

I've been a teacher since I can remember. Like many of us, I was a little prodigious when I was very young (it quickly wore off). I taught myself to read when I was about three and was pretty good with numbers. I wanted to play more ambitious games with my friends, other than Candyland and Uncle Wiggily, so I tried to teach them poker. Of course, first I had to teach them to recognize numbers and letters, then count, and then add. The parents in the neighborhood had mixed reactions…

The desire to teach resurfaced when I went away to boarding school at 15. There were some Mr. Chips-like teachers that had a big influence on me, and I thought I wanted to be one. Then at Princeton, once I knew that my draft lottery number was high enough, graduate school and teaching seemed like the right path for me. I guess I've never really imagined doing anything else–not very imaginative. Even when I was dancing I got a Master's degree in Dance Ed so that I could teach dance when I stopped performing. Things just didn't work out that particular way, but I've taught modern dance during Williams' Winter Study four times. I love teaching, no matter what the subject. Now, I've replaced dance during Winter Study with a course on the wines of France. Teaching is such a great way to push you to learn more. It's probably my laziness, but if I didn't have to teach a subject I probably wouldn't go as deeply into it. Having to stand in front of others and answer every imaginable (and some unimaginable) questions is such a great way of finding out how deep your knowledge is.

AR:

Well said! Back to your biography: What did you teach at the Wharton School, and how was your overall experience there?

RD:

Ok. That's a loaded question. First, I was a research associate for the first 3 years at the Wharton Analysis Center. That was the group Larry Mayer headed that did sponsored research for the Department of Energy. It was an incredibly vibrant group, at least at first. Larry was charismatic, energetic, and a great motivator of young researchers including Bob Stine (Wharton), Scott Zeger (Johns Hopkins), and Yoav Benjamini (Tel Aviv University) among others. Unfortunately, it all got to be a bit much for Larry and he resigned in 1982 amidst a scandal of sex, drugs, rock and roll, and misuse of funds. Bob Small and I took over the center as director and associate director, but we had neither the experience nor charisma of Larry and the center folded in 1983. It had been a wild ride to say the least. I took the next 6 months to work on my dissertation back at Stanford, and while there I applied for a lecturer position at Wharton starting in January 1984. That first semester I taught three intro classes (back to back to back), Stat 100 for Wharton Sophomores, Stat 530, Mathematical Statistics for juniors and seniors in the management and technology program and an intro stats for MBAs. It was fun, but confusing (!). During the spring I applied for the regular assistant professor position, which I started in the fall of 1984. Most of the interesting upper level courses were already claimed by the senior members of the department, so I taught the year-long Stat 100-101 to sophomores and an intro course to PhD students in marketing. We did some complex models, but I was told not to use any matrix algebra, so there were a lot of subscripts! I think I also taught the year-long Stat 530/531 Probability and Math/Stat sequence. It's a bit of a blur now, but that's where I got my start teaching introductory statistics. The department was great. It was a much sleepier department than the powerhouse that it is now, but it was very friendly.
I really didn't know what I was doing, but I worked with Abba Krieger on marketing problems when I wasn't trying to finish my dissertation. The transition back from touring as a dancer wasn't that easy. I had been going back to Stanford every summer since moving to Philadelphia to try to make progress on it. Persi lost patience with me at some point (who wouldn't?), and I tried to work on a different project with Lincoln Moses for a year or so, but during the fall of 1983 when I went back for 6 months, Persi found the problem on regression mixtures I mentioned before and took me back. In the spring of 1985, I called Persi to make sure that it was ok for me to come to Stanford again, and he told me that it would be fine, but he'd be in Paris (!). I asked if he would mind if I went to Paris to try to finish and he said that would be no problem. So, I headed off to Paris in June, rented an apartment, and worked on my dissertation for as many hours a day as one can stand. I remember visiting Persi after I submitted several chapters. He said, “I think you're getting close,” but all I remember was that there was more red ink on the pages than typing. To relieve stress, I jogged around the Luxembourg Gardens nearly every day, but thanks to my dance injuries, it went right into my back and I herniated a disc. My fiancée Sylvia joined me in Paris in July, and we were married in Essex, Connecticut in August.

By the fall, my back pain was so bad that I had to stand to work on my dissertation, and I started walking with a cane. (This was before standing desks became popular, and of course laptops didn't exist. I had to write on a pad of paper on the fireplace mantle.) I was also consulting at a Hewlett-Packard plant about an hour south of Philly every week. This was great experience. I got to see all sorts of problems, and it really honed both my communication and applied statistics skills. But this all culminated with a back spasm in November that had me immobile and bed-ridden for a week. It was a tough year, but I finished the dissertation and officially graduated in March 1986. About the same time Mike Steele invited me to give a talk at Princeton. The Statistics Department had folded the year before, and the group had moved to Civil Engineering (now Civil Engineering and Operations Research). It turns out that the invitation was a job interview. Sylvia, who had been living and dancing in New York, and I bought a small house on the other side of Route 1 from Princeton and I started as an assistant professor there in the fall of 1986.

AR:

My goodness, your story has so many twists and turns! Did you settle into something resembling normalcy at Princeton? Did your interest in teaching blossom there?

RD:

Well, it certainly depends on what you mean by normalcy. It seemed to start off that way. I taught Intro Stats to Engineers, Design of Experiments, and then either Regression or what we would now call Data Mining each year for a three-course teaching load. Of course, I was using Box, Hunter, and Hunter (2005) for Design of Experiments, which I thought was a wonderful book. (More on that later.) I was also trying to get my research program off the ground. Mike Steele was publishing about 12 papers a year in top journals, so the bar was pretty high (to say the least). Mike tried to mentor me by showing me “how it's done.” We found a problem in soil science with some data. The soil scientists were arguing in their literature about the right scientific model for soil water diffusivity, so we tried applying some “modern” regression ideas to their data, like generalized additive models, which were hot off the shelf in 1987. It was a tricky estimation problem because there were known differential equation constraints as well. The way we worked it was Mike would write a section, give it to our secretary (this was clearly in the old days) and she'd put the typed version in my mailbox. Then, I'd work on it and give it back to her and she'd put it in Mike's box and so on, back and forth. Well, Mike is also a good speed chess player and it seemed like we were playing. Whenever I saw the latest draft in my box, I'd start working on it. I'd work for days, put it back in the box and then… it seemed within minutes it was back in my box! This went on for about 3 weeks. We sent it off to Technometrics and it was accepted with minor revisions and later wound up winning the Wilcoxon award for application paper (De Veaux and Steele 1989). Mike said “see how it's done?” That sure seemed easy. But, apparently I didn't really “get it.” I never again wrote a paper that quickly, or one that was that successful.

Meanwhile, although I thought Box, Hunter, and Hunter (BHH) was the “bible” for DOE, I was getting consistent feedback from students that they felt it was hard to read and confusing. For a few years, I ignored the feedback, thinking simply that they were wrong. Then, I started looking at it more from their point of view and I began to see what they meant. There was a formality and a style that wasn't wearing as well in the late 80s as it did in the early 70s when I was taught from the notes.

So, I started writing my own DOE textbook, following BHH's syllabus (starting from Chapter 6). I used those notes and students seemed to like the language better. But I also realized that part of the problem was that they weren't remembering the right things from their Intro Course. So I wrote some chapters to “remind” them of the basics that they should have learned. This part of the book kept getting longer and longer. Meanwhile, I had taught Stat 101 to liberal arts majors at Princeton once and used Freedman, Pisani, and Purves (2007). I loved the book—well, at least until the end when the box model started getting a little too restricting. So I tried to inject the things I liked from FPP into those chapters as well. More on that later as well. And I have to mention two people who had an enormous impact on my writing. I never really liked writing, because I found it too formal and artificial. But Mike Steele showed me that you can write like you talk. If you ever read anything by Mike (and you should!) you can hear his voice as you read it. Once he showed me that, it completely changed both my attitude about writing and my style. And writing a paper with Howard Wainer showed me that one could be elegant as well as clear. Howard is a wonderful writer and working with him was eye opening. I owe a lot to both him and Mike.

It became somewhat obvious that I wasn't going to produce either the number or the quality of papers that Princeton wanted, so when I didn't get tenure I wasn't shocked. I'd also had two kids by then with one more on the way, so life was busy (!). But I was shocked when the dean offered me “another” position in the same meeting that he told me I hadn't received tenure. They wanted me to stay on indefinitely as a Senior Lecturer in Statistics and head of the Committee on Statistics at Princeton. That seemed fine to me—at least for the moment. My teaching load went from three to four a year and I negotiated a raise, so life wasn't bad in that respect. But … I started exploring other possibilities as well.

Arriving at Williams College

AR:

Yes indeed, normalcy is in the eye of the beholder! I know, as many readers of this interview do, that Williams College eventually provided you with one of these other possibilities. Have we now reached that point in your story?

RD:

Yes, I was in my second year as Senior Lecturer. Things were fine at Princeton, but I didn't see any realistic path to tenure there, so I started looking at other opportunities. I interviewed for a few jobs, including the Associate Director at NISS (which Alan Karr deservingly won) and leading a group at the National Center for Atmospheric Research (NCAR), the position that Mark Berliner and later Doug Nychka held. But another intriguing ad was the one from the Williams College Mathematics Department. I had heard rumors that math departments were not the ideal place for a statistician–I might have to teach calculus classes, my papers not in math journals wouldn't count, and I'd generally be treated as a second-class citizen. But this ad was amazing. It sounded not only like they understood what a statistician did, but they wanted one! Moreover, it really sounded (to me) like they wanted me. They talked about the importance of teaching and applied statistics, both things I felt passionate about. So, I applied and noted that the closing date was November 15. For some reason, I really felt that this was my job, so I was disappointed that the phone was quiet on November 16. It seems ridiculous now, but I was really expecting that phone call. I just felt that the job was a perfect fit.

Frank Morgan, the chair of the department, called on November 17. It was a wonderful conversation. The place sounded too good to be true. They had been trying to find the right person for several years. To them I seemed to have just the right amount of experience and interest in teaching–and still be willing to take an untenured position. (Williams almost never tenures from the outside). To me, it sounded ideal. I'd be able to start a statistics curriculum from scratch, teach the courses I wanted to, and get support and respect for doing so. So, after our talk I was ready to come up, interview and start the job! Then Frank said, “How's the first week in January for you?” The first week in January? I was ready to come up next week.

It turns out that January 1994 was the coldest and snowiest January for many years. There were two feet of snow in Princeton when we left. Frank had invited my whole family to come, so we loaded up the Subaru wagon with our 5-, 3-, and 1-year-old and headed up the Taconic. We crossed the Petersburg pass just before sunset and headed down Route 2 to Williamstown. That's a four-mile, 7% downhill ride. Fortunately, like you, I learned to drive in Western Pennsylvania so I managed to maneuver the car down the hill without winding up in a ditch. It turns out that the locals never come down that way when the roads are like that–but no one had told us. Anyway, we had a great visit–we stayed at Frank's house, the snow was taller than my 5-year-old's head, and the high temperature in 3 days was 9 degrees—as we left town! We loved it. I accepted the job a week later and started the next September.

AR:

Williams is an elite liberal arts college, always appearing near or at the top of national rankings. Before I ask specifically about the role of statistics at Williams, please tell us about the overall ethos and culture of the place.

RD:

First, let me admit that, as a teacher, I've been incredibly spoiled. I've taught at the University of Pennsylvania, Princeton, and Williams. The students are, in general, bright, capable, and hard working. They are also, typically, overcommitted and stretched thin. One of the main challenges in teaching them is pushing them to do their best while at the same time not overloading them with your course. The good students at any large state university are probably the same. The main difference is that I have, by and large, only these students. That's not to say that they are all gifted in math–and Stat 101 is still very challenging for that reason, but for our main intro course, Stat 201 with a multivariable calculus prerequisite, the students are really quite capable.

Now, each of the places I've taught has its own personality. Of course, I taught at the Wharton School at Penn (the business school) so that gave it a certain flavor (they're not all Donald Trumps, but there is a certain business aggressiveness about the place). And at Princeton I taught mostly engineers (talk about an easy group to teach!). On paper, students at all three look the same, but Williams' students are definitely the most laid back. They are much less outwardly competitive than the other two groups, and, in general, they are really nice people. In the last 20 years, the student body has become more and more diverse and that has been really nice too. I remember a conversation I had with the former registrar. I said, “we have diversity in everything here—race, country of origin, family income, etc., but all the students are nice. We really need more (I'll put “jerks” here, but it wasn't the expression I used).” Without skipping a beat, Charlie said “Why? We have the faculty.”

Charlie was kidding. The main difference between Penn, Princeton, and Williams is the faculty. Williams' faculty really care about their teaching and they are generally truly collegial. If we have differences, we talk them through–at least most of the time. In fact, the math department makes almost every important decision by consensus. We rarely resort to voting. We'd rather spend the time to get everyone on board with an idea than leave a minority unhappy.

As far as teaching, I knew how to give entertaining lectures at the first two places. I'm still learning how to teach after 20 years at Williams. My colleagues are a constant inspiration.

AR:

Now please tell us about the statistics courses at Williams at the time. What did you teach in the first few years? Did you develop new statistics courses for math majors, or service courses for students in other programs, or both?

RD:

At Princeton, we had a basic series of courses that our Engineering Management System students had to take: Intro Stats (calc-based) followed by Experimental Design and then Regression with a 400 level elective in what became Data Mining. Thinking about Williams, I knew I'd have to add a non-calc-based Intro course, but for the first few years, I taught four courses: the two Intro flavors, Experimental Design, and Regression (theory and practice with a Linear Algebra prerequisite). That last course was taken mostly by Math majors (or our Math/Econ double majors). So, I started developing a small group of Math/Econ majors who took their electives in Stat.

The service course, which started as Math 143 and was eventually renamed Stat 101, got huge. One year I taught a section of 140 students. That isn't a huge number at a lot of places, but I had only undergraduate TAs who could grade only up to 15% of the total grade, so I had to grade all the midterms and finals and projects. The other intro course had about 90–100 students. Add to that a small Design course with 8–10 and the regression course with 15 and in a typical year I was teaching about 200 to 250 students. (The college-wide average load is about 80).

Of course, I could have capped the courses and made hundreds of students really unhappy. And even though Williams says they don't add faculty lines due to enrollment, I suspected that eventually these numbers would get some attention. I got the green light to hire a second statistician. We hired a visitor (Steve Wang!) in 1997 and Jerry Reiter (!) joined in 1998. Jerry later left for greener pastures. A year or two later, I hired Bernhard Klingenberg, and he's just recently been promoted to full professor. He's a wonderful colleague. Three years ago, we hired Brianna Heggeseth and next year Laurie Tupper of Cornell and Daniel Turek from Berkeley will be joining us.

At first, we taught mostly service courses (Stat 101 and 201) with a small group who would take Stat 346 (Regression) and maybe Experimental Design. But now we're in our second year of our Statistics major. We still teach the service courses, but our emphasis has shifted. The required 10 courses are Math 150 (Multivariable Calculus), Math 250 (Linear Algebra), Math/Stat 341 (Probability), an introductory Computer Science course, Stat 201 (Intro Stats) or Stat 202 (Modeling), Stat 346 (Regression Theory), Stat 360 (Inference), two more 300 level stat courses, and a 400 level course. This year we taught 355 (Multivariate Statistics) and 442 (Data Mining). Next year it's 365 (Bayesian Statistics), 440 (Categorical Data Analysis), 351 (Time Series), and 442 (Data Mining).

We'll teach three sections of Stat 101 and five sections of Stat 201. Because of our location, it's virtually impossible to hire adjuncts, so the five of us have to teach all the sections. We'd like to get the size of the Stat 201 classes down to 25, but we're still a ways from that. A couple of years ago we started offering Stat 202 as an alternative entry to statistics. We found that so many of our first years had a four or five on the AP Stat exam, and even though we teach R and multiple regression in Stat 201, there was too much overlap, so 202 starts with a review of intro stats, teaches them to use both R and JMP, and then does modeling and design.

AR:

Please say more about the statistics major at Williams. How many students are pursuing the major at this point, and how does that compare to the number of math majors? Do you think students are selecting statistics who might otherwise have chosen to be math majors, or do you think they're coming from other interests? How big do you hope the program becomes, and what do you envision as typical career paths for those students after graduation?

RD:

We got approval for the major in the spring of 2014 and had two rising seniors declare that year, so our first class of two graduated in 2015. We are graduating 12 majors in 2016 and right now we have 33 rising seniors. This is too many! We bit our nails during registration this spring fearing we were going to 50, but fortunately it looks like we'll end up at around 20, which I think is a great size. We decided to add probability theory as a requirement, making it a 10-course major, which I think may have kept a few away. And no, they're not really coming from the math majors. Math has 85 (!) majors in the class of 2018, so together we'll have 105 majors out of 500 seniors at Williams. Of course some are double majors. In fact among our 12 seniors, 11 are double majors with Econ, but the more recent majors are more diverse. In the class of 2017, 12 are only Stat majors, 10 are Econ/Stat, and 4 are Comp Sci/Stat. We even have German/Stat, English/Stat, and in the class of 2018 Japanese/Stat, Arabic/Stat and, my personal favorite, Theater/Stat.

I had always hoped that we'd have about 18 majors. I'm hopeful that we stay around that number, maybe a few more. Like math majors at Williams, our graduates go on to do … everything. Of our 12 seniors, most are going to consulting or banking, two traditional Williams careers. But several of them plan to go on to graduate school after a year or two of “making money.” In the classes of 2017 and 2018, I think their plans are more diverse. I know some are planning to work in environmental studies, data journalism, and public health. Combined with a liberal arts background I think it's a great major to pursue whatever career you might want.

Textbook Writing

AR:

Many readers of this interview will know you as one of the co-authors of a series of introductory textbooks including Intro Stats (De Veaux, Velleman, and Bock 2016a), Stats: Data and Models (De Veaux, Velleman, and Bock 2016b), Stats: Modeling the World (Bock, Velleman, and De Veaux 2015), and Business Statistics (Sharpe, De Veaux, and Velleman 2015). What motivated you to take on such an ambitious project, and how did you come to team up with Paul Velleman and Dave Bock and Norean Sharpe?

RD:

Now that's a great story. So, you remember that I was writing the Experimental Design book? Well, about 2 years after I got to Williams and had gotten tenure, Deirdre Lynch of Addison-Wesley called me up and said she'd heard about my book and wondered if she could visit a class I was teaching. After the class, she said that she wanted to sign the book. It was very exciting. I went to Boston to meet the team and started in, now really in earnest, writing more chapters. But about 6 months later, she called me up and said she had a different project. She wanted me to expand the intro part of Experimental Design and write an Introductory Statistics book with Paul Velleman. Now, I didn't know Paul, but I had certainly heard of him, and had just seen a demonstration of ActivStats that I thought was pretty amazing.

So, we met at a conference at Babson and agreed to give it a try. Deirdre was thrilled. She was his editor for ActivStats and had wanted Paul to write a book, but he needed a co-author. After signing me for Experimental Design, she told him “I've found you one.” Now, from what Deirdre says, she knew we were both doing innovative things in the classroom and that although there were a lot of similarities in our approach, she suspected that we wouldn't agree on everything. She also knew us well enough to know that each of us had a lot of confidence in the way we were doing things and might not want to give in to the other's ideas. She says that she couldn't wait to watch the fun.

Since Paul had done ActivStats already, and I had all those chapters from the other book, I agreed (as Tukey would say) to do the “odd” drafts—including the first draft. So I started setting down chapters in 2000, and I had a sabbatical 2001–2002 to finish them up. Well, I'd work for about a month on a chapter, then send it to Paul. In about the time it takes an e-mail to go from France to Ithaca and back, Paul would rip it to shreds and write his own version. I have to say that I liked his version about as much as he liked mine, so by about Christmas 2001, Deirdre was starting to pull her hair out. She had been right about the “wouldn't give in” part, but suddenly it seemed less fun.

I'm hazy as to exactly why now, but in January I was scheduled to come back to the U.S., and Paul and I agreed to meet in Ithaca to see if we could hammer things out. The day before coming back I was giving a talk in Paris, and during the middle of it I heard the disk drive in my laptop emit the strangest sound I'd ever heard coming from a computer. I made it through the talk, but my laptop was dying. Amazingly, it gave me enough time to download everything off it. That was the good news. The bad news was that once I got to the States, Sony told me it would take 2 weeks to replace and that they couldn't mail it out of the U.S. So I was stuck—in Ithaca. It turns out it was the best thing that could have happened.

Paul and I had hacked up each other's drafts so much that by now we really couldn't tell which parts belonged to whom. But it also read like a cut-and-pasted mess. So, Paul got a projector that we set up in his dining room, and we started from Chapter 1, reading it out loud (!) together and talking through it until both of us were happy. In 2 weeks, we'd worked our way through most of the book. I think to this day we both feel that it's a much better book than either of us could have done alone. I may never want to work that way again—well at least for 2 weeks straight—but, whenever we get stuck, we go back to that method and 15 years later it's still working.

Deirdre also wanted someone to help us on an AP Statistics version. Paul's son Dave had had a fantastic, award-winning teacher at Ithaca High School named Dave Bock, who Paul convinced to join the team—another incredibly lucky random event!! Dave's been a fantastic co-author and amazing mentor for so many AP teachers.

Oh–one more story about Intro Stats. We had to insist on that title. Most other introductory statistics books had titles—like Introductory Statistics: A Data Analytic Approach, or some other formal title with long words and a colon. But we insisted—I think we just wore our publishers down.

As for the business book, I taught the introductory business statistics course at Wharton for 3 years in the 1980s, but it had been a while, so we wanted to find someone whose contact was current. I had known Norean from early Isolated Statistician meetings and knew her writing from her book on living in Holland (Sharpe 2005). I thought she'd be great, so we convinced her to help us finish it up. It's now in two versions, both in third editions.

AR:

You mentioned that being student-friendly was your primary goal for the book. How did you pull that off? What do you think makes the book student-friendly?

RD:

For a long time, I didn't like writing. As a math and engineering double major in college, I avoided writing courses–unless they were pass/fail. I didn't like the process because I always thought that you had to write formally. All the articles and textbooks that I'd read seemed awfully formal—and artificial. When I started writing the paper with Mike Steele, he said “why don't you just write it like you talk?” I thought that was a crazy suggestion, but then I read some of his books and papers and was amazed at how much of his voice came through. He's a great storyteller, and I could hear him as I read.

So, Paul and I tried to do the same thing with Intro Stats. We tried to write it as if we were giving a lecture and talking to the students instead of writing a textbook. I always try to imagine an 18-year-old with a baseball cap backward reluctantly opening up this textbook as I write. They don't necessarily want to hear about statistics–not the things we're interested in anyway. They want to hear about things they're interested in. So instead of telling them what statistical technique they were going to learn at the beginning of each chapter, we opened with a real-world example. And we tried to keep the language informal. We even kept our corny (i.e., “Dad”) jokes as if we were lecturing—sometimes putting them in the footnotes. Our first footnote claims that no one reads footnotes. David Hildebrand had warned me that you can't put jokes in a textbook. But then again, people told us we couldn't call it Intro Stats. We even got mentioned on Reddit (twice!)—once for our footnotes and once for our “Dad” jokes, so I guess it's working, at least to some extent.

It's pretty well known that people learn better when they laugh from time to time, and it's certainly true that students won't understand a textbook if they don't read it. Now, we know that fewer and fewer students read as a general rule, but we're encouraged to hear from teachers at both the high school and college level that students tell them they're reading it. How do they know? Because they usually mention a footnote or corny joke they remembered. And I've gotten a number of e-mails from students telling me “No! I really did read the footnotes!” I get several e-mails a month from students thanking us for the books. Those comments from teachers and students are what keep me going.

I give talks on data mining all over the world to scientists, engineers, analysts, and statisticians, and I try to use the same philosophy of keeping things informal and sometimes irreverent while at the same time trying to get some pretty technical material across. I've very rarely (although it has happened) heard complaints that I wasn't serious enough or that someone couldn't learn because my style was too informal. If someone wants a more serious textbook, or lecture, there are plenty available.

Training Industry Professionals

AR:

How did you know that I was planning to ask next about your work with data mining and with industry professionals? But wait, that's not actually my question. My question is: How did you get involved with data mining and training professionals to use those tools?

RD:

I forgot to mention an event that, looking back, had a profound influence on my career. In the fall of 1973, I took a field trip with some other stat grad students to the Stanford Linear Accelerator (SLAC) to visit a young physicist named Jerry Friedman who had been working with John Tukey on visualizing high-dimensional elementary particle data. Now, you have to remember what computing was like in 1973. Atari was founded in 1972 and in 1973, you could find an arcade version of the game Pong (the home version didn't come out until 1975). So, while the rest of the world was looking at a monochrome screen on which two line segments moved vertically up and down to hit a few pixels, Friedman and Tukey were visualizing nine-dimensional data in color (!), rotating them in interesting ways until they found an interesting projection (projection pursuit) and then highlighting them with a light pen. Our jaws literally dropped as Jerry showed us what he could do.

After the Wharton Analysis Center folded in 1983, I went back to Stanford to work on my thesis, and after 6 months I decided to apply for both academic and industrial jobs. Wharton offered me the instructorship and Hewlett-Packard offered me a position in Palo Alto. It was a tough choice. By the way, I also talked with sabermetrician Bill James, who had been looking for an editorial assistant for his baseball blog. We had a wonderful conversation late into the night, but he had just hired an English major and was sad that he didn't have room for a statistician on his editorial team. (That would have been an interesting detour.) Anyway I was torn between the two possibilities, academics and industry, but finally decided that I needed to give teaching a real try. The icing on the cake was that there was a scientific equipment plant of Hewlett-Packard about an hour south of Philly that had no statistician. Deb Shenk of HP asked if I'd “mind” going out there one day a week to help. What a great opportunity! I got paid to listen to chemists and engineers and to try to help them with data and experimental design issues. It was fantastic. The problems were varied and the tools I needed ranged from basic data analysis to modern regression.

When I moved to Princeton in 1986, I found that my fellow Stanford grad student, Trevor Hastie, was nearby at Bell Labs. We talked frequently and as we both had young kids, we got the families together as well. Trevor was working on putting statistical modeling into S, and I was one of his beta test sites. About the same time, the CART book came out and Jerry Friedman was already working on his MARS program. I played around with all of these algorithms and used MARS to model the effect of sea bottom topography on sea ice around Antarctica using a huge (at least at the time) dataset of images from NASA. I found that these “modern regression” algorithms were also useful in my consulting problems. For one, I had started consulting at First USA Bank in Wilmington, a fast-growing bank that had started warehousing all their transaction data. The CTO of the company had some futuristic ideas about clustering customers and even thought of providing purchase recommendations based on past purchases (this was 1986!). At Princeton we had a lab of Silicon Graphics (SGI) workstations that were probably the coolest computers I had ever seen (and maybe ever will see). They had incredible graphics and came with a “Data Mining” package with trees, neural networks, and other algorithms. I added the models I had been working with to the mix and found that it was a really useful set of tools for solving both scientific and industrial questions. I was finally doing the cool stuff that I'd seen Jerry do 20 years earlier!

I was lucky to have access to both the machines themselves and the software, and I realized that most of the statistics community didn't really know the tsunami of ideas that was coming out of Stanford and other places. So I gave some overview lectures that introduced the basics of all these methods. I gave a trial one at the Gordon Research Conference in the mid-1980s. Afterward, Brian Ripley took me aside and said, “I see what you were trying to do, but I don't think it really worked.” Brutal, but he was right. It took several iterations, but when I gave a similar talk at the Fall Technical Conference in 1996, I won the Shewell Award. The next year I gave one of the first Introductory Overview Lectures of the Joint Statistical Meetings in Anaheim on Neural Networks. Around the same time I gave a similar talk at NC State, where John Sall of SAS and JMP was in the audience. He came up to me afterward and asked if I'd consider giving some talks on data mining using JMP. I had used JMP since coming to Williams in 1994 and loved the software. This eventually became part of JMP's Explorers' Series, which I've been doing for the past 10 years. With JMP I've given talks on data mining about 50 times in the U.S., a dozen or so in Europe, and last year in Japan. It's been a fantastic experience. I've given the data mining short course at the ASA as well about a half dozen times. It's a practical introduction based on my consulting experience.

Presentations on Statistics Education

AR:

In addition to your presentations on data mining, you've also given many conference presentations and workshops about teaching introductory statistics. I want to ask about two of my favorites, starting with your presentation titled “Math is Music, Statistics is Literature,” which I first heard at the 2007 U.S. Conference on Teaching Statistics. For the benefit of those who have not heard or read about this, could you summarize your thesis of this presentation?

RD:

Sure. I think many of us have seen that, although math skills are important, it's not always the lack of them that makes an Intro Stats course challenging for students. If you list the concepts we try to cover in one semester (as opposed to say the first semester of calculus), it becomes pretty overwhelming. We touch on exploring and summarizing data, some concepts of probability, sampling, experimental design, the scientific method, statistical reasoning, inference, and modeling. And that's not exhaustive. I remember a “blue sheet” comment from the end of one semester, “This course should be more like a math course, with everything laid out beforehand, and concepts following in a logical ordered sequence.” Ha. Students may expect it to be “like a math course,” but then we throw in things from the real world. Does the slope make sense? Are there patterns in the residuals? Did you check the assumptions? They never asked their calculus professor if the cone containing the water that flowed at a constant rate was really a cone! Suddenly we want them not only to learn the material, but question it, becoming skeptics and critics of the methods—all in the same 10 to 15 weeks!

People always talk about the relationship between math and music, and why it makes sense that so many mathematicians are interested and even occasionally gifted in music. I think the reason is that both are abstract and have their own internal rules and order. And this is why you can have prodigies in both. If you learn the rules, you can master them. Mozart didn't know a lot about life at age 6 when he started composing. Of course, his music became much more profound later, but you don't need to know a lot about the world to start in math or music. But how untrue that is of statistics—and literature. There are no great pre-teen novelists! And successful statistics students need to bring their life experiences to the course to make sense of the interplay between models and the world. It's certainly not just math.

AR:

I also remember another comment you made at that 2007 conference in comparing mathematics with statistics. You said that you envy calculus professors, because they get one full term to teach derivatives, another full term to teach integrals, and another full term to generalize those ideas to more than one variable. But in statistics we feel obligated to teach our entire subject in a single introductory course. You mentioned this in your previous answer, but would you care to elaborate?

RD:

I'm not trying to say that Calculus doesn't have big ideas or that it's easy to teach. Limits, continuity, and the idea of a derivative are substantial mathematical ideas. But those ideas probably won't fundamentally change the way someone looks at the world, as we're trying to do in an introductory Stats course. And, as my daughter just pointed out, AP calculus is a yearlong high school course that covers two semesters in college. AP statistics is a yearlong high-school course that covers one semester in college. So… something's wrong.

AR:

This brings me to another conference presentation that has provided me and others with considerable food for thought. At the 2014 International Conference on Teaching Statistics (ICOTS) in Flagstaff, and then again at the 2015 USCOTS and JSM, you argued that what's wrong with introductory statistics courses is that we teach the wrong things in the wrong way and in the wrong order. Please summarize your argument for those who might not have seen one of those talks.

RD:

Our intro course is in danger of becoming obsolete. It hasn't really changed since the mid-1980s when David Moore and others made it more data-centric. The AP course still uses a hand-held calculator to produce most graphics! I've heard from high-school students who are taking data science or computer science courses that deal with visualizing data and manipulating and analyzing big interesting datasets. If we don't change, we'll lose our students to others who, while claiming to teach data analysis, don't think statistically. One prospective student asked if he could take my data mining course in his first year. He didn't want to bother with the boring intro courses.

So, to keep relevant, we need to get to the punch line of statistics sooner. In the next few years, more and more students will have seen the basics of univariate data display and analysis before getting to college, even if they don't take a statistics course. It's in the Common Core. One of the most dangerous things I think we can do is to end an intro stats course with the topic of testing a single proportion (or even two means!). The world is highly multivariable and getting more so all the time. So let's start with real, complex problems to motivate students and extract what we need to display and analyze the data from there, rather than the other way around.

Not only do we need to think more multivariately, but we need to streamline inference as well. We spend too much time on the mechanics of tests of proportions, tests of two means, more than two means, more than two proportions, etc., etc. We know it's all basically the same idea. Why not teach it more conceptually?

Brianna Heggeseth and I have been doing this at Williams for a few years now (she gave a presentation at e-cots this spring) and it's a much more exciting way to go. Of course there are challenges and issues to consider, and we all need to think how to do this better. I'd love to get the statistics education community united to figure out how we make this intro course a truly exciting experience. We need to change soon or someone else will wind up teaching our subject.

AR:

Can you describe an example to illustrate what you and Brianna teach in your intro course, and how?

RD:

Some of my favorites are listed in the “Stat 101” materials on the ASA website: http://community.amstat.org/stats101/home. These are case studies designed to help teachers of Intro Stats (whether AP, or 2- or 4-year college) with real examples that show the complex nature of questions that statistics can answer. The first of these, which I use often because I've found it to be very successful, is: “How much is a fireplace worth?” I want them to think about more than two variables right from the beginning. So, we first look at a collection of homes near us and discover that houses with fireplaces are worth about $65,000 more than houses without fireplaces, on average. That already brings up lots of questions about sampling, reproducibility, and extrapolation to other areas (Minnesota maybe, but Florida?). Then, we talk about the uncertainty around the $65,000 figure. Our sample size is reasonably large, so our confidence interval is small. I use resampling to see how much that difference varies. It doesn't vary much, so then I ask them whether I should spend the $40,000 it will take to add a fireplace to my house to increase its value. Now we're getting at causation and lurking variables. Typically, many students will be hesitant to spend the extra money, and it's great to explore why. Some are afraid the $65,000 isn't repeatable, but others think there might be something else going on. I find that bringing up a simple example like this early in the course and then adding complexity to it is a great motivator for all the univariate ideas that we typically teach. I used to teach all those univariate ideas before getting to complex questions, but I find it works better for them to have a problem they can't quite think their way through as motivation. Now questions about typical values, spread, outliers, etc. come up even though we want to know about the difference in price. The real advantage is getting them to think about all the statistical techniques in the context of a more complex problem they'd like to solve, rather than building it up from one variable and often running out of time before we get to the good stuff.
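[Editor's illustration] The resampling step described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the interview does not say which resampling method or software is used in class, so a percentile bootstrap is assumed here, the function name `bootstrap_diff_means` is hypothetical, and the house prices are simulated stand-ins (chosen so the gap is roughly $65,000), not the actual case-study data.

```python
import random
import statistics


def bootstrap_diff_means(group_a, group_b, n_resamples=2000, seed=0):
    """Percentile-bootstrap 95% interval for mean(group_a) - mean(group_b).

    Hypothetical helper written for this illustration only.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return lower, upper


# Simulated prices (assumption: NOT the real data from the case study).
data_rng = random.Random(42)
with_fireplace = [data_rng.gauss(240_000, 60_000) for _ in range(400)]
without_fireplace = [data_rng.gauss(175_000, 50_000) for _ in range(300)]

observed_diff = statistics.mean(with_fireplace) - statistics.mean(without_fireplace)
low, high = bootstrap_diff_means(with_fireplace, without_fireplace)

print(f"Observed difference in mean price: ${observed_diff:,.0f}")
print(f"Bootstrap 95% interval: ${low:,.0f} to ${high:,.0f}")
```

With samples of this size the interval comes out fairly narrow, which matches the point made in the answer: the difference is stable under resampling, yet that stability says nothing about whether adding a fireplace would cause the price to rise.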

AR:

That's a great example, thanks. I'm sure you've heard this next question before. One of the obstacles for teachers wanting to revise their introductory course along the lines that you've suggested, with multivariable thinking from the beginning and streamlining the presentation of statistical inference, is the availability of textbooks that adopt this approach. Will you be writing a textbook for a substantially different Stat 101 course, and will you be able to convince a publisher to move forward with such a textbook?

RD:

One of the best things about my job is that I get to experiment. I've probably taught introductory statistics close to 30 times, and I've never done it exactly the same way twice. I keep seeing what works and what doesn't and what kind of technology best helps students master the material. For the past few years, I've been introducing multivariate thinking early and exposing students to resampling fairly early on as well. I think those innovations work well, but that's not to say that everyone should follow me off the cliff. I get to think revolutionarily, but for the book, we're going to move things evolutionarily. We know that most teachers out there are overburdened as it is, trying to include the material they think is most important and having to deal with outside demands to include topics as well. So, we will include new topics only if we think they will decrease the burden and help students understand the basic topics more efficiently. I really think that introducing the idea of several variables early doesn't take much time and is of enormous benefit. So, in the next edition of Intro Stats (out next year), we'll put a chapter on multiple regression right after the chapters on simple linear regression. We'll cut probability back even more, we'll include more simulation, and we'll offer the bootstrap (optionally) as another way to get confidence intervals. We're also working on streamlining the inference chapters so the different tests are more connected. Since this is an evolutionary rather than revolutionary change, our editors are both excited and fully behind it. We think it's the right balance of new ideas without changing the syllabus too much. We're striving for a mix that will be appropriate for most instructors.

Now, I should say that we're also working on a more revolutionary book—stay tuned ☺.

Pop Quiz

AR:

Such a tease! Now I'd like to begin what I call the “pop quiz” portion of this interview, where I'll ask a series of short questions for which I'll ask that you keep your responses brief. First, I can't resist asking: How often, and in how many different ways, has your name been mis-spelled?

RD:

Ha. You didn't say mispronounced! The de Veaux (de Vaux, de Veau, de Vault) family of France fled to neighboring Switzerland, Germany, and Holland in the early 17th century because of the persecution of Protestants, and there they changed the spelling to Devoe and Devos among others, although I think not Devo. Oddly, many people add an r when pronouncing my name, making it Devereaux. Not sure why. I was staying at a B&B in Burgundy during a bike trip with my wife, and the owner's name was Devault (pronounced the same as mine). At some point a teenager came up to me and asked if I was Mr. De Veaux because her name was the same too. I said, “Oh, are you the granddaughter of the owner?” and she said “No, we're just another family staying here.” So we had three families with the same pronunciation at the chateau. (The wonders of French rhyming!) And yes, I am usually on every mailing list at least three times.

The space in my name causes all sorts of confusion. I get a lot of mail addressed to Mr. Veaux and a lot of people have told me that I'm not in the ASA—because some lists put De Veaux ahead of Dean, as if the space is the letter before A! I asked Mr. Devault if they ever had a space in their name and he said “We got rid of it during the revolution.” Not all of my family uses the space. I got it from my grandfather's marriage license, but now we find out he may have made up the name then (he was an orphan in Brooklyn), but we don't know!

We have a land line that we rarely answer these days. I love answering it and waiting for the solicitor on the other end to say, “Hi is this Mr. er, uh… Mr. deev… Mr. deevee…..” at which point I usually hang up. “Dee-vee-ax” is one of my favorites.

AR:

Please tell us about your family.

RD:

You've already met my amazing wife Sylvia. We postponed having children until she decided that she was through with touring as a professional dancer. I essentially retired from dance in 1981, but Sylvia went on to tour with Jennifer Muller and the Works based in New York City until about 1988. In 1986, we had moved to Princeton, so Sylvia was commuting to the city to rehearse and to work both at the Hard Rock Café and at a Diet Center to supplement a dancer's salary. She did several world tours in addition to seasons in New York, but by summer 1988 she thought she'd toured enough. Sylvia's father, Alan, was an accomplished sailor who sailed the Mediterranean and the Atlantic in retirement. We joined him for a trip off the coast of Italy. I usually get seasick the first night, and this trip was no exception. But amazingly, Sylvia was sick too. We later found out that she was pregnant with Nicholas, who was born in May 1989. You can imagine the fun Alan had with that…

Our daughter Scyrine appeared in January 1991, Frederick in February 1993, and Alexandra in April 1995. Persi Diaconis says that I ruined them because they all show some interest in data. Nick is currently a data scientist at the Simons Foundation in New York. Scyrine is an analyst at the advertising firm Sapient Nitro, also in New York. Frederick is currently a data journalist for Dalia in Berlin, Germany, and the drummer for two rock bands there. Alexandra is about to start her senior year at Wesleyan as an Econ major. She tutors at the Quantitative Analysis Center in statistics and R. So I guess maybe Persi was right.

AR:

What are some of your hobbies outside of statistics and education?

RD:

When I gave up dance performing, I started singing seriously. I've taken voice lessons now for more than 30 years and have performed the bass solo in many masses, requiems, and other classical pieces with university choral groups. I sing regularly, when I'm in Paris, with the Choeur Vittoria (http://www.choeur-vittoria.fr/), the semi-professional regional choir of the Ile de France, and I was in the chorus on one of their recent CDs. I've done dozens of concerts with them around Paris and on tour outside France. I'm also a regular swimmer and cyclist. Vacations are often 2-week bike trips across or around various parts of France.

Speaking of France, when I reached 60 and decided to hang up my dance belt and not teach Introductory Modern Dance during Winter study anymore, I proposed a course called “The History, Geography and Economics of the Wines of France: A Travel Course.” It actually got approved 3 years ago, and I took 10 students to France during January for 10 days. We visited an amazing collection of French wine chateaux. I'm going back this January for another tour.

AR:

What are some books that you've read recently?

RD:

Because we live in France ¼ time, I need to keep up my French (it slips backward easily). The last novel I finished was a strange novel by Michel Houellebecq, a recent winner of the Prix Goncourt. I say strange because the author appears in his own book as a rather unsympathetic character who, toward the end of the book, is brutally murdered. The author talks about his funeral and why so few people showed up. Bizarre. I have 50 pages to go. I love the idea of historical biographies, but seem to get bogged down. I'm part way through Hamilton, and The Martian (not exactly a historical biography) was probably the book I finished before that.

AR:

Let me ask about some of your favorite travel destinations. Perhaps you could cite one place that you've visited for work and one just for fun.

RD:

Well, Paris is both for me. I've worked there and escape there as often as I can. It never gets boring for me. We bought an apartment there a couple of years ago and have lots of family and friends there now. And I love going just about anywhere in Europe. I also got to go to Senegal and Morocco on tour with the French choir. Both are great, but Senegal was more surprising for me with the friendliness of the people and the richness of the culture.

AR:

Next I'll ask some questions that I have used to collect data from students. Let's start with some binary variables: Do you use a PC or Mac? Do you consider yourself an early bird or a night owl? Do you prefer window or aisle?

RD:

(1) Mac—although I could go either way. (2) I love getting up early, but I often stay up too late. (3) Window—I hate getting hit on the head with everyone's bags as they board.

AR:

And now a nonbinary categorical variable: On what day of the week were you born? (You can use www.timeanddate.com to produce a calendar for your birth year.)

RD:

Thursday—didn't need to look it up ☺

AR:

Next a discrete quantitative variable: How many Harry Potter books have you read?

RD:

One and a half. Loved the story—hated the writing, so I gave up and watched the movies.

AR:

Here's a continuous quantitative variable: How many miles do you live today from where you were born? (You can use www.distancefromto.net to calculate this distance.)

RD:

Driving distance 169 miles.

AR:

Here's a fanciful question that I have asked of students. Suppose that time travel were possible, and you could take one trip. You can only observe, not change anything, when you get there. Would you travel to a time in the past or in the future? What time would that be? Explain your choice.

RD:

Ok, I'm going to have to cheat. I love Paris, but I feel I missed some golden ages. I'd love to go back and hear Josephine Baker, John Coltrane, and Edith Piaf perform. But really I'm curious as to how things turn out, so the future. The question is only the near future, say 100 years. But no, I think things will be a mess, so the real choice for me is 1000 years or 10,000 years in the future. I'd love to see if we solve the mess we seem to be headed into.

AR:

Here's another question that is completely hypothetical: Suppose that you are offered dinner for four anywhere in the world, with the caveat that you converse about statistics education. Who would you invite to dine with you, and where would you go?

RD:

Are you paying?

AR:

Sure, I'm very generous when it comes to hypotheticals. Where would you like to go, and with whom?

RD:

And more important: are you making reservations? There are two restaurants I've always wanted to go to but I'm not enough of a fanatic to make the reservation at exactly 0 hours 11 months to the day ahead of when I want to eat. That's what you have to do to reserve at El Celler de Can Roca, the Catalonian restaurant in Northern Spain. Noma in Copenhagen would be my next choice. Oh, did I get fixated on food again?! Who? That's harder. I've learned so much from so many people in Stat Ed. So, as a foodie, I'll have to limit it to other people who I know will enjoy it as much as I will. So, Rob Gould is on the list for sure. He makes the cut for both. Chris Franklin, Rob, and I had some pretty good meals in New Zealand together. We need to do that again and talk more about Stat Ed. Finally, I'm going to put Lyle Ungar, my comp sci colleague at Penn, at the table. He always has interesting insights about statistics, computer science, and data science, and we've been to great restaurants together.

AR:

Please tell us something about yourself that is likely to come as a surprise to JSE readers.

RD:

Wow, what's left? I once walked 7 days in Yosemite with full backpack but no shoes (ok, yes, it was the 70s).

AR:

Speaking of surprises, I believe that I once heard through the grapevine that you personally know a princess. Is that true?

RD:

No, that's false. I know a queen (now the queen mother). Lisa Halaby was a classmate of mine at Princeton. I didn't know her well, but she stood out as someone destined for something remarkable even then. She lived off campus (which was already very exotic at Princeton), and I once went to a party at her house. When I saw a full page picture of her in a wedding dress in the San Francisco Chronicle, I wasn't that shocked, until I read who she was marrying—King Hussein. She became Queen Noor and has had a remarkable career advocating rights for women and children in the Middle East.

We visited England when my youngest daughter was 6. As we were leaving Buckingham Palace, after seeing the changing of the guards, Alexandra asked me, “Daddy, aren't we going to stay for lunch?” I had no idea what she was talking about until she said, “You told me that you knew the Queen and if we ever got to her country she might invite us to lunch!” In spite of my protest and explanations, I knew there was no way I was ever going to win back any points or credibility after that one….

AR:

What has been your favorite course to teach?

RD:

Probably data mining. It's a capstone course with seniors, usually in the spring. It's great to see them put it all together and present projects.

AR:

The theme of the U.S. Conference on Teaching Statistics in 2017 is “Show Me the Data.” Let me ask if you have a favorite data visualization that you can show and describe to us. The only rule is that you can't choose the famous Minard graph of Napoleon's invasion of Russia or one of the Nightingale rose graphs, because they were taken in my previous interview.

RD:

I saw an amazing animation at the London Conference on the Future of Statistics last year that Hadley Wickham presented. It was mind boggling. It was a time and space visualization of some data from Netflix (as I remember) that swirled and gyrated beautifully. I couldn't imagine producing such an animation myself. I remember asking Hadley how long it had taken him to code that and when he said about 3 months I felt a little better.

Concluding Thoughts

AR:

As we begin to wrap up, I want to follow up about your conversation with Bill James: What's he like, and what did you talk about? And while we're on sports statistics, I also want to ask about your work on age adjustments in sports.

RD:

Wow. That's a long time ago. I was trying to figure out what direction I wanted my life to go once I'd finished my PhD. I'd always loved baseball and baseball statistics, so when I saw that ad of Bill James I had to call him. He was very sweet on the phone. We must have talked for an hour or so. I was in California and I think he was in Kansas, but we talked late into the evening about baseball and life. As I said before, he had been looking for an editorial assistant, and had found one, but he was really intrigued with the idea of hiring a statistics assistant. I think we just talked about the possibilities of data analyses that one could do—remember, this was about 1983, so there was a lot of “by hand” work to do.

I gave up my interest in baseball statistics not long after that. The baseball strike may have been the final blow. But, I have been keeping personal statistics on my swimming, running, and biking since the 1980s. I'm not really good at any of them. But Howard Wainer is an amazing swimmer who's swum the English Channel 1.9 times (as he says). I actually met Howard in the Princeton pool. He was in the “fast lane,” which I only dared to enter when I felt especially motivated. I remember that after one workout he was asking about my swimming. I told him that I was really more of a runner. That seemed to placate him until a couple of weeks later when he heard me tell a runner that I was really more of a swimmer. He'd caught me. We decided to put my data to the test and see how I stacked up in the three sports. It turns out that I'm just about equally bad in all three, so there's really no advantage for me in different ratios of distance in a triathlon. But that's not true for swimmers. Runners do much better. So we wrote a paper in Chance about “fair” triathlons (Wainer and De Veaux, 1994). Swim magazine also picked it up and they said it was one of the most commented on articles they'd ever had. Triathlon magazine wanted nothing to do with it, because we were recommending lengthening the swim by about a factor of 3! They thought we were crazy.

So I've stopped running (because of knees, or lack thereof), but I'm swimming and cycling regularly and keeping track of my data. I know this will come as a shock, but for some reason I'm not getting any faster, which can be pretty depressing. So I started to wonder: how am I doing relative to everyone else my age? I don't run age group races anymore, so I had no personal comparisons. But I have data. There are about 300,000 records on the Masters swim site for every age group, every year, and every event. And the Dipsea Race in California is wonderful because they handicap it by age and sex. Ray Fair analyzed data about 20 years ago (and it recently got picked up in the New York Times by Gina Kolata), but the cohorts have changed so much in 20 years! In fact the time I swam today for 1650 yards would place me squarely in the top 10 in my age group in 1973 nationally. But in my age group now? I'm nearly 10 minutes behind (out of about 28 minutes!). The participation is so much greater now. In running, there's a new 105–110 men's record for the 100 m dash! So, I've been trying to estimate the age effect for different events and men vs. women. I have preliminary results, but there's a lot of data. It's a fun project because there are so many ways to model it and I can try out all sorts of methods. It's probably a never-ending project, especially because older athletes keep getting faster. I'm just trying to stay ahead of the curve, compared to my 35-year-old self.

AR:

Turning back from your athletic prowess to your even greater accomplishments in statistics education, let me ask if you can identify the professional accomplishment of which you are most proud.

RD:

I'm very proud of the impact that our books have had on both high-school students and undergraduates. I am thrilled to get emails from students all over the world thanking us for writing the books—something I never imagined would happen.

AR:

Finally, what advice do you offer to those who are just beginning their careers in statistics education? To make this a bit more specific, think of someone entering their last year of graduate school who is thinking about making statistics education the focus of their professional life: what do you say to such a person?

RD:

What a great time to be a statistician and a statistics educator. It's really true that 40 years ago, whenever someone asked me what I did and I said statistician, I'd have to hear about the horrible course in statistics that they suffered through. And they'd add that it was the course they did the worst in. But now, people say, “I loved that course!” Of course, everyone in academics has to find their own balance of teaching and research and the institution that will align with their own goals. But teaching is a wonderful and satisfying career. As the slam poet Taylor Mali says in his poem “What do teachers make?” (Mali, 2002): “Teachers make a goddamn difference!” Good luck and enjoy!

References

  • Bock, D., Velleman, P., and De Veaux, R. (2015), Stats: Modeling the World (4th ed.), New York: Pearson.
  • Box, G., Hunter, J. S., and Hunter, W. (2005), Statistics for Experimenters (2nd ed.), Hoboken, NJ: Wiley.
  • De Veaux, R., and Steele, M. (1989), “ACE-guided Transformation Method for Estimation of the Coefficient of Soil-Water Diffusivity,” Technometrics, 31, 91–98.
  • De Veaux, R., Velleman, P., and Bock, D. (2016a), Intro Stats (4th ed.), New York: Pearson.
  • ———. (2016b), Stats: Data and Models (4th ed.), New York: Pearson.
  • Freedman, D., Pisani, R., and Purves, R. (2007), Statistics (4th ed.), New York: W. W. Norton and Company.
  • Mali, T. (2002), “What Teachers Make,” in What Learning Leaves, Newtown, CT: Hanover Press. Available at: www.taylormali.com/poems-online/what-teachers-make/
  • Sharpe, N. (2005), Living with the Dutch, Amsterdam: KIT Publishers.
  • Sharpe, N., De Veaux, R., and Velleman, P. (2015), Business Statistics (3rd ed.), New York: Pearson.
  • Wainer, H., and De Veaux, R. (1994), “Resizing Triathlons for Fairness,” Chance, 7, 20–25.