Statistics Done Wrong: The Woefully Complete Guide, by Alex Reinhart
The problem with this book, such as it is, is that it is by no means a woefully complete guide. To be sure, the state of statistics abuse in contemporary society is rather woeful, and this book demonstrates that a great many people, including those who engage in data analysis as a profession, lack a fundamental understanding of the terminology and meaning of the field they work in. The author, though, does not go in the direction that one would expect, and he deserves a great deal of credit for his restraint. It would be easy to turn the reader's dawning awareness of how poorly statistical principles and practices are understood and applied in scientific fields into a feeling of snobbery towards those who know even less than the reader does, but the author does not want to do that. What he is trying to do is more complicated: to reveal the sad state of statistical knowledge even in many well-respected and well-regarded places, while at the same time reassuring the reader that things are getting better and that one need not treat everything that uses data to make a recommendation as suspect, even if things do often appear that way.
This book is a short one at a bit less than 150 pages. It begins with a preface, acknowledgements, and an introduction. This is followed by an introduction to statistical significance (1), including confidence intervals. The author then discusses statistical power and the frequency of underpowered statistics (2), something that appears not to be well recognized. After this comes a look at pseudoreplication and the importance of choosing one’s data correctly (3). The author discusses the problems of the p-value and the base rate fallacy (4), how people are bad judges of significance (5), how people regularly double-dip in the data and engage in circular reasoning (6), and the problems of continuity errors (7). There are chapters on such matters as model abuse (8), researcher freedom and its pitfalls (9), the fact that everybody makes mistakes with data (10), and the ways in which hidden data hinder our ability to understand what is going on (11). The author closes with a chapter on what can be done about this (12), as well as notes and an index.
I am not sure that I ultimately buy what the author is trying to sell. It is not as if statistical knowledge is too difficult for people to attain. The author, after all, expects the reader to understand what he is saying, at least at a conceptual level. The author also, it appears, wishes to preserve the prestige of certain gatekeepers within the scientific community whose legitimacy would be undermined if one took a position of extreme skepticism towards the use of statistical inference and reasoning. Yet the author’s discussion of the characteristic flaws in how people tend to use statistics is highly damning when it comes to the many areas where people try to argue based on studies. Problems like confirmation bias are something that all of us are prone to, and to the extent that we are aware of our own vulnerabilities when it comes to sound reasoning, we can also be properly skeptical of others' attempts to engage in such reasoning where they have a motive and plenty of opportunity to be less than honest.