The Seven Pillars of Statistical Wisdom

  • Paperback
  • 240 pages
  • The Seven Pillars of Statistical Wisdom
  • Stephen M. Stigler
  • 04 May 2018
  • ISBN-10: 0674088913

About the Author: Stephen M. Stigler

Stephen M. Stigler is a well-known author, and some of his books, such as The Seven Pillars of Statistical Wisdom, are favorites with readers; he is among the most sought-after authors for readers around the world.


What gives statistics its unity as a science? Stephen Stigler sets forth the seven foundational ideas of statistics, a scientific discipline related to but distinct from mathematics and computer science. Even the most basic idea, aggregation (exemplified by averaging), is counterintuitive: it allows one to gain information by discarding information, namely, the individuality of the observations. Stigler's second pillar, information measurement, challenges the importance of big data by noting that observations are not all equally important: the amount of information in a data set is often proportional to only the square root of the number of observations, not the absolute number. The third idea is likelihood, the calibration of inferences with the use of probability. Intercomparison is the principle that statistical comparisons do not need to be made with respect to an external standard. The fifth pillar is regression, both a paradox (tall parents on average produce shorter children; tall children on average have shorter parents) and the basis of inference, including Bayesian inference and causal reasoning. The sixth concept captures the importance of experimental design, for example, by recognizing the gains to be had from a combinatorial approach with rigorous randomization. The seventh idea is the residual: the notion that a complicated phenomenon can be simplified by subtracting the effect of known causes, leaving a residual phenomenon that can be explained more easily. The Seven Pillars of Statistical Wisdom presents an original, unified account of statistical science that will fascinate the interested layperson and engage the professional statistician.


10 thoughts on “The Seven Pillars of Statistical Wisdom”

  1. Lee Richardson says:

    As a PhD student in Statistics, I found this book absolutely fascinating. Seeing the conceptual linkages between statistical topics, and how one piece of research leads to another, was really revealing. It's also written in a clear (although there's a little notation) fashion, so people can take home the stories of statistical wisdom, as opposed to the details of the methodology. I now fully appreciate simply how revolutionary Galton's analysis was, and the same goes for intercomparison. Indeed, seeing the relationship between calculating the standard error and the later development of the bootstrap and cross-validation was something I would never have thought of. It's just fascinating to see how ideas in statistics relate to one another. The design of experiments section was also great, with real insight into Fisher's opportunism and ideas. When he talks about asking nature "a well thought out questionnaire", and how the ideas of design date back thousands of years in the medical literature, it's just an insightful book. I can't even imagine how long it took to read and dig up all of these older documents, but I'm really glad Stigler has done this. Just a great book, packed with insight, and I'm fortunate to have stumbled across it.

  2. Daniel Christensen says:

    Kind of a disappointment for me. I was hoping for statistical inspiration, or at least some ideas to guide my day-to-day practice. For me this was less statistical wisdom, more a thematic history of statistics, even if it has some good illustrations of the impact of the 7 key themes. For me, wisdom is applied knowledge, and some advice on pros and cons. Hardly worthless: well written, and it summarises an incredible amount of information effectively, just not what I was hoping for. Maybe it'll benefit from a re-read later on.

  3. Philipp says:

    This is something else. Stigler looks at 7 sections/areas/pillars/basic foundations of statistics and summarises their history, their pitfalls, their current developments:

      1. Aggregation: taking a mean. It seems quaint now how mind-blowing taking the mean was the first time around, just a mere 400-ish years ago, but it moves all the way to least squares etc.
      2. Information Measurement: when do you have enough measurements? How did the idea of measuring novel information in data come about?
      3. Likelihood: how did p-values come about, and how did Bayes' theorem become popular?
      4. Intercomparison: t-tests all the way to bootstrapping.
      5. Regression: aaalll the way to multivariate analysis.
      6. Design of experiments: how did the idea of trials themselves evolve, and how did Fisher's multifactor trials at Rothamsted revolutionise things?
      7. Residual: how do we explore the stuff that's left unexplained by our models?

    This isn't aimed at the general public, more at statisticians and people interested in statistics; you need at least fresh undergrad-level knowledge of stats to appreciate what's going on here. If you're into stats you'll find a book to love. Somewhere in this book's message is a great way to teach introductory stats: not as in "here are the formulas, please apply them", but as in "here's the problem the original inventor was trying to solve, here's the way they solved it, now apply this solution to a similar problem".

  4. Andrew Louis says:

    Wish I paid more attention in my stats classes to appreciate this.

  5. Duncan McKinnon says:

    A pretty comprehensive coverage of the major advances that led to modern statistics. I felt the seven pillars were well chosen and justified, but the examples drawn seem to have been selected solely for their historical significance and not because they are a good representation of the concepts or easy to understand. It would help to have more contrived examples that would be easier for a modern audience to understand and make sense of, rather than relying only on examples from the preeminent statisticians of the 18th, 19th and early 20th centuries.

  6. J. Boo says:

    Written by the author of The History of Statistics: The Measurement of Uncertainty before 1900, it seems to mostly cover new ground, or at least cover it differently. Most of the stories related are new to me (fine, it's been a while since I read his previous book). Stigler breaks statistics up into seven separate areas and traces the history of each: aggregation, likelihood, experimental design, etc. There was a really interesting story about the development of Bayesian statistics: philosopher David Hume had argued against miracles, basically holding that if something does not happen after a million trials, it is an impossibility, and the Reverend Bayes developed what became known as Bayesian probability to counter him. Definitely not an introductory textbook; to follow what's going on you'd need at least a year of stats under your belt, and some sort of continuing interest in the field. "[T]aking an arithmetic mean may come naturally now in repeated measurements of, say, a star position in astronomy, but in the seventeenth century it might have required ignoring the knowledge that the French observation was made by an observer prone to drink and the Russian observation was made by use of old equipment, but the English observation was made by a good friend who had never let you down."

  7. Rebecca says:

    I thought this book was really good. The author discusses the history behind what he has dubbed the seven foundational concepts that make statistics a science. I really enjoyed reading this book.

  8. Jerzy says:

    Three stars seems harsh, but all I mean is that this isn't the book I was expecting/hoping for. Although Stigler is a great writer and historian, the book didn't hang together all that well for me. It feels more like a nice collection of fun-fact trivia from the history of statistics, loosely organized using seven fundamental concepts (though I'm not convinced they are our seven most fundamental concepts). I was hoping for a bit more depth. On the plus side, it's a fairly quick read.

    Still, it's helpful to see how some of these concepts, which we take as obvious today, used to be shockingly counter-intuitive. Or, at least, we professional statisticians think of them as obvious, forgetting how unintuitive they were the first time we took a stats course, and how difficult they are for new students to understand. For instance:

    Pillar 1, Aggregation: It's surprising that you can often gain information by throwing information away. By summarizing several observations with a simple mean, you discard a ton of info, yet it can give a better answer, as when astronomers average several repeated measurements of a star's position, instead of arguing over which individual measurement should be trusted.

    Pillar 2, Information Measurement: It's surprising that the precision of the data often improves with sqrt(n), not linearly with n itself. There are diminishing returns: if you want to double your precision, you have to quadruple (not double) the sample size.

    Pillar 4, Intercomparison: It's surprising that you can often make valid statistical comparisons interior to the data, with no external standard. Tools like the t-test let you compare two groups using a dataset and assess that comparison using the variation in the same dataset, instead of requiring outside data to help you judge the quality of this internal comparison. This can be a dangerous tool, often misused (just because we can doesn't mean we always should), but it's surprising that this is possible at all.

    Pillar 6, Design: It's surprising that using randomization and carefully planned multi-factor trials can often lead to better, more rigorous, more precise inferences than using judgment samples and single-factor trials. The pillar of Design has my vote for the most under-appreciated, at least the way Statistics classes are taught today.

    Apparently Pillar 5, Regression, was also revolutionary, but I didn't really follow why. It wasn't clear to me why the much older least-squares regression lines were not in this chapter: how did their history lead up to Galton's regression to the mean? Likewise, the examples in Pillar 7, Residual, seemed loosely thrown together and didn't gel into a single solid concept for me. Maybe it'll click on a future reading.

    Finally, Pillar 3, Likelihood ("the calibration of inferences with the use of probability"), is certainly an important technical tool, but wasn't as interesting to me as the others. This is the obvious pillar of statistics: if you've ever taken a stats course beyond 101 or cracked open a statistical journal, we spend SOOOO much time on this one. But I feel we over-focus on it and argue over it ("use this to get the right p-value"; "no, that method is inexact, this is a better way"; "no, be a Bayesian, don't use p-values") to the detriment of all the other important concepts. Likelihood has been treated as a rock-solid foundation, even though it should just be icing on the cake, if what you really care about is doing good science and not just writing mathematical proofs to get a fancy journal publication.

    Overall, this book is worth reading. As I prepare to teach intro stats, I'll try to benefit from Stigler's reminder that these are revolutionary concepts; hopefully I can help the students see it too. But I don't think the book provides the solid unifying structure that its title promises.

    Favorite fun facts:

    p.7: Design of statistical studies is an ideal that can discipline our thinking, no matter what kind of study we're designing, or even if we're just analyzing data that has already been collected.

    p.9: Maybe it's worth using his rephrased list of pillars as a set of learning objectives for intro stats courses: (1) the value of targeted reduction or compression of data; (2) the diminishing value of an increased amount of data; (3) how to put a probability measuring stick to what we do; (4) how to use internal variation in the data to help in that; (5) how asking questions from different perspectives can lead to revealingly different answers; (6) the essential role of the planning of observations; (7) how all these ideas can be used in exploring and comparing competing explanations in science.

    p.26: An example of why this book felt disappointing. Before such-and-such time period, the arithmetic mean was almost never used to combine observations (the midrange was more common); after such-and-such time period, the mean was very popular. But there's nothing about how this transition happened, or why it happened then in particular. This omission is probably not Stigler's fault, just due to the paucity of the historical record. But still, as a reader, it feels disappointing when Stigler sets it up as a big mystery (how and when did the mean become common?), yet doesn't really resolve that mystery.

    p.31: Nice early example of why the mean improves on using an individual measurement. In the early 1500s, the basic unit of land measurement was the rod, defined as sixteen feet long. But whose foot? Instead of picking one person's foot, recruit 16 representative citizens after church to stand in a line, toe to heel, and the sixteen-foot rod would be the length of that line. Easy to reproduce anywhere, hopefully with adequate precision to be useful for land surveying purposes across the country. (Intro Stats project idea: have students actually do this experiment several times, and compare the variation in individuals' feet vs. the sample-to-sample variation in average feet.) And yet, in the historical account Stigler cites, the idea that the individuals were collectively determining the rod was the forceful point: their identity was not discarded; it was the key to the legitimacy of the rod, even as the separate foot marks were a real average. I can't go see the official international prototype metre bar anytime I like, but my community members and I can legitimize this measurement of a 16-foot rod anytime. Sounds a bit like the way juries are used to help the community see the judicial process as legitimate: there might be situations where relying on experts alone would give higher rates of correct decision-making in court, but having the jury there makes the whole thing "look right", and that has its own value.

    p.34: Antoine Augustin Cournot's nice counter to Quetelet's idea of the Average Man. Cournot noted that if one averaged the respective sides of a collection of right triangles, the resulting figure would not be a right triangle in general. That is, no single real person has the average height, weight, age, etc. all at once.

    p.43: Again, like on p.26, disappointing to simply hear that least squares methods became the most popular by such-and-such a date, without knowing how or why. How was this justified and defended in comparison to other methods? There are other loss functions you could optimize, and it's not clear that this one is always the best or most natural. Was it purely computational convenience (of crunching the numbers easily) or analytical convenience (of proving results about least squares), or was there more to it?

    p.48-50: Really nice example of the relevance of the sqrt(n) rule, simpler than confidence intervals or p-values. In England (during the 1200s-1800s), in the Trial of the Pyx, the mint would bring the coins they had minted, and judges would weigh a sample of them, and that sample's weight had to be close to the target weight. If it was too low, you'd suspect the mint of cheating by failing to meet standards (not enough gold in each coin, etc.). Say that one coin's weight had a target of T, and there was a small allowed tolerance of R in one coin's weight (we know that minting coins isn't perfectly precise even if you are honest, so it's OK if some coins weigh as little as T - R). Now if you weigh 100 coins at once, what should be the allowed tolerance? The natural answer, 100R, is pretty bad statistically speaking. If the coins' weights vary independently, then by the sqrt(n) rule we should have something like 10R, not 100R. If you allow 100 coins to weigh as little as 100T - 100R, you can actually aim to make your coins too small (by shaving off gold or whatever) and still have a very good chance of being over 100T - 100R. If the total weight must be at least 100T - 10R, it's still a fair test for honest minters, but much harder to cheat. (There's a small simulation sketch of this at the end of this review.)

    p.57: Charles Peirce, dissing the 1879 equivalent of p-hacking: "It is to be remarked that the theory here given [an example of optimizing your experimental design to get the most info from your limited time and money] rests on the supposition that the object of the investigation is the ascertainment of truth. When an investigation is made for the purpose of attaining personal distinction, the economics of the problem are entirely different. But that seems to be well enough understood by those engaged in that sort of investigation." Oooh, sick burn.

    p.58-59: For certain loss functions, you don't just want to average the data together; you should actually discard observations. Discrete example from John Venn: two spies report on the fort you're about to capture and want to restock so you can defend it from re-capture. One says you should stock 8-inch cannonballs, the other says 9-inch. Obviously it's better to pick one or the other than to bring 8.5-inch balls, which wouldn't work in either case. Francis Edgeworth apparently had some other, non-discrete examples where literally throwing data away is better than averaging. It depends on your data distribution and your loss function: not every problem is about optimizing mean squared error.

    p.77: Nice example of an early, correctly interpreted hypothesis test (more or less) by Laplace, trying to detect the moon's effect on the tide at Paris: "this action is only indicated with a weak likelihood... so that one can regard its perceptible existence at Paris as uncertain". That is, he doesn't wrongly claim he's disproven the effect, but rather that any effect is too weak to detect with the available data. How did we get from here to the modern setting, where failure to reject is so often misinterpreted as "the null is true" rather than simply "not enough data"?

    p.83-84: Fisher's development of MLE really was remarkable. Nothing new in working with likelihoods, but he turned it into a very strong general tool. Besides using the likelihood to get an estimate, you can also take a couple of derivatives to get a standard error for that estimate, and "the estimate so found expressed all the relevant information available in the data and could not possibly be improved upon by any other consistent method of estimation". As such, it would be the answer to all statisticians' prayers: a simple program for finding the theoretically best answer, and a full description of its accuracy came along almost for free. Of course there were edge cases and counterexamples, but it's no surprise this really took off, and we still learn about MLEs as a core technique in every math-stat course.

    p.94: There are many reasons to be cautious about overuse of the t-test, and there might be better topics to cover in intro stats courses. However, Stigler points out how revolutionary this test really was: "the comparison, of the sample mean with the sample standard deviation, was made with no exterior reference": no reference to a true standard deviation, no reference to thresholds that were generally accepted in that area of scientific research. But more to the point, the ratio had a distribution that in no way involved sigma, and so any probability statements involving the ratio t, such as P-values, could also be made interior to the data. If the distribution of that ratio had varied with sigma, the evidential use of t would necessarily also vary according to sigma. "Inference from Student's t was a purely internal to the data analysis." In other words, earlier scientists had to get an estimate of sigma from somewhere else, or simply hope that the sample standard deviation was a good enough estimate of sigma in their particular sample, or ignore all this and use some other threshold (for big effects or precise measurements) that others in the field had agreed on. With the t-test, you were suddenly justified in using the sample standard deviation as is, trusting it directly with the help of a t-table. Of course there are dangers (statistical and practical significance are not the same at all), but it's still a nifty result, and before reading this I didn't appreciate how innovative it was.

    p.101: The t-test was the precursor to more modern intercomparison techniques such as the jackknife, bootstrap, and cross-validation, in which we hope to get an estimate and learn about its precision from the same dataset, without any ill consequences. These are great tools, but it's a shame that the t-test and later tests have had the consequence of de-emphasizing scientifically meaningful thresholds, and over-emphasizing statistical significance just because we know how to compute it.

    p.111-130: I already knew that Galton came up with examples of regression to the mean, and this is why we call regression by that name today. But this section went deep into why Galton was interested in his particular problem and how regression methods solved it: interesting stuff. Galton was Darwin's cousin and wanted to understand why a possible flaw in Darwin's theory didn't seem to occur in practice. There's natural variation from generation to generation, say in the heights of humans (we aren't exactly the same height as our parents, or even as the average of our two parents' heights), and that's all necessary for Darwin's theory of evolution. But then, why do we seem to have roughly the same amount of variability in each generation? As long as we're talking on the short-term scale of the same species, how do we get the same variance in human heights each generation, instead of ever-increasing variance, where slightly-taller-than-average people pair up, begetting ever taller children (and vice versa), until we turn into separate species of giants and dwarves? So, I knew that the fact that this doesn't happen is illustrated by regression to the mean: if you plot children's heights against the mean of their parents' heights, it's not a perfect correlation. The tallest parents tend to have kids who are taller than average but not as tall as they are, and vice versa (the regression line has a slope less than 1). But now, Stigler also explains Galton's perspective on why this happens, not merely that it happens. Basically, consider a model in which we don't inherit our parents' actual height, just the genetic component of it. A person's actual observed height is the combination of their genetically predisposed height plus a random effect due to chance (assuming a setting where environment and diet aren't varying much). So to be very tall, you probably have a little-taller-than-average genetic component and a taller-than-average random component. Your kids will only inherit the genetic piece, so they're all still likely to be a little taller than average, but a typical kid of yours will not be as tall as you, because there's no guarantee their random luck component will be as big as yours was. Since the tallest people's kids' heights don't vary randomly around the parent's observed height, but around the parent's genetic piece of height (which is less tall than the observed height, for most very tall people), that is why we see regression to the mean. "Galton had discovered that regression toward the mean was not the result of biological change, but rather was a simple consequence of the imperfect correlation between parents and offspring." This is then consistent with the population dispersion being in an approximate evolutionary equilibrium, with the movement from the population center toward the extremes being balanced by the movement back, due to the fact that much of the variation carrying toward the extreme is transient excursions from the much more populous middle. I don't know if Galton said so, but I assume that this idea must also allow the genetic piece to have some randomness too, across generations (surely that's needed in order for evolution to work, else where do differences in the genetic component come from, right?). But I guess that part isn't needed to explain the regression-to-the-mean effect itself. (A second sketch at the end of this review simulates this mechanism.)

    p.150: Statisticians today seem to think Fisher invented Design of Experiments out of whole cloth. Of course that's not true (though he did revolutionize it). Stigler points out a clinical trial in the Old Testament Book of Daniel, as well as a list of rules for experimentation written by Avicenna around 1000 CE. Later (around p.159) he mentions more modern work in experimental psychology by Charles Peirce, Gustav Fechner, etc., whose experiments' validity relied on randomization, though Fisher later went further in establishing the link from randomization to inference.

    p.153: A good Fisher quote: "No aphorism is more frequently repeated in connection with field trials, than that we must ask Nature few questions, or, ideally, one question, at a time. The writer is convinced that this view is wholly mistaken." Nature, he suggests, will best respond to a logical and carefully thought out questionnaire; indeed, if we ask her a single question, she will often refuse to answer until some other topic has been discussed. Then on p.156, Stigler's summary of why this is so, in a 2-factor setting: if one approaches the data ignoring one factor, the variation due to the omitted factor could dwarf the variation due to the other factor and uncontrolled factors, thus making detection or estimation of the other factor impossible. But if both were included (in some applications, Fisher would call this blocking), the effect of both would jump out and be clearly identifiable. In even a basic additive-effects example, the result could be striking; in more complex situations, it could be heroic.

    p.161: "a person's perception of chance is approximately linear on a log-odds scale": maybe another fun experiment to run on intro stats students.

    p.197: Speaking of significance tests, "Its growing use over the past century is testimony to the need for a calibrated summary of evidence in favor of or against a proposition. When used poorly the summary can mislead, but that should not blind us to the much greater propensity to mislead with verbal summaries lacking even a nod toward an attempt at calibration with respect to a generally accepted standard." Yes, but on the other hand, it's hard to even talk about significance tests correctly. Is Stigler saying that significance tests evaluate the truth of a proposition, or that they evaluate the quality of the evidence? Is he saying that one or the other is needed? When even the pros can't talk about these things without tripping over their words, it goes to show we really need better words for these legitimately difficult ideas. Meanwhile, maybe skipping the calibrated summary can be better than sincerely believing your wrong interpretation of that summary.
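    A quick numerical check of the Trial of the Pyx point above (p.48-50): the following is a minimal Python sketch under made-up assumptions (the target weight T, per-coin tolerance R, the normal minting noise, and the amount of gold shaved are all illustrative, not figures from the book). It shows why a total tolerance that grows like sqrt(n)*R is fair to an honest mint but hard to cheat, while a tolerance of n*R is easy to game.

        # Minimal sketch with illustrative numbers (not from Stigler's book):
        # each coin's weight is normally distributed; an honest mint centers coins
        # at the target T, a cheating mint shaves a little gold off every coin.
        import numpy as np

        rng = np.random.default_rng(0)
        T, R = 100.0, 3.0        # target weight and allowed shortfall for ONE coin (arbitrary units)
        n = 100                  # coins weighed together at the trial
        sigma = 1.0              # assumed per-coin minting noise for an honest mint
        trials = 20_000          # number of simulated trials of the Pyx

        def pass_rate(per_coin_mean, min_total):
            """Fraction of simulated batches whose total weight clears min_total."""
            totals = rng.normal(per_coin_mean, sigma, size=(trials, n)).sum(axis=1)
            return (totals >= min_total).mean()

        naive_min = n * T - n * R                 # allow a total shortfall of 100 * R
        sqrt_min = n * T - np.sqrt(n) * R         # allow a total shortfall of only 10 * R

        for label, mean in [("honest mint", T), ("shaving mint", T - 0.5)]:
            print(label,
                  "| passes naive n*R test:", round(pass_rate(mean, naive_min), 3),
                  "| passes sqrt(n)*R test:", round(pass_rate(mean, sqrt_min), 3))
        # Typical output: both mints essentially always pass the naive test, while the
        # sqrt(n) test still passes the honest mint (~99.9%) but fails the shaver most of the time.

    The point, as in the note above, is just that independent errors in 100 coins tend to cancel, so the fair tolerance for the total grows like sqrt(100) = 10, not 100.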
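    Similarly, here is a hedged sketch of the Galton mechanism described in the p.111-130 note, again with invented numbers: observed height is a heritable component plus independent luck, and a child keeps the heritable part but re-draws the luck term. The fitted slope of child height on parent height then comes out below 1 even though nothing "pulls" heights toward the middle and the overall spread stays constant across generations.

        # Minimal sketch of the regression-to-the-mean mechanism (illustrative numbers):
        # observed height = genetic component + independent "luck"; the child inherits the
        # genetic component but gets a fresh, independent luck draw of its own.
        import numpy as np

        rng = np.random.default_rng(1)
        n = 200_000
        mean_h, sd_genetic, sd_luck = 170.0, 5.0, 5.0    # assumed values, in cm

        genetic = rng.normal(mean_h, sd_genetic, n)
        parent = genetic + rng.normal(0.0, sd_luck, n)   # parent's observed height
        child = genetic + rng.normal(0.0, sd_luck, n)    # child's observed height

        slope = np.polyfit(parent, child, 1)[0]
        print("slope of child height on parent height:", round(slope, 2))   # about 0.5 here, not 1
        print("sd of parents:", round(parent.std(), 2),
              " sd of children:", round(child.std(), 2))                    # essentially equal
        # Tall parents have tall-ish children, but the children sit closer to the mean on
        # average, while the generation-to-generation variance neither blows up nor shrinks.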

  9. Robert says:

    A mostly enjoyable analysis of the fundamental ideas underlying the science of statistics, along with a heavy dose of history, though too uneven for me in depth and clarity.

  10. Dylan O' says:

    In this brisk read, Stigler articulates an answer to the rarely discussed question, "What defines the field of statistics?" Rather than get stuck in pointless semantics, he outlines seven "pillars of wisdom", which provide a foundation upon which the modern tools of the discipline are built. To accomplish this, the substance of the text is largely historical, tracing the development of this theory with interesting anecdotes about the bumpy road taken to this point. The argument is effective, and the presentation is solid, but it's a book with a narrow target audience, and it's not going to be essential for just about anyone. To be clear, while it is rarely technical, and laymen could probably follow most of it without issue, I don't think any of this would be remotely interesting unless you are already proficient with the underlying theory; it certainly won't teach you any statistics, you need to know the material going in. So it's fairly exclusively geared towards statisticians, but with the weird caveat that it's written in something of a general-audience style. So many statisticians won't be interested in a book that won't teach them anything new about statistics.

    So, it does just those two things: it provides a framework for a unifying theory of the rather scattered field, and it supplements it with interesting historical anecdotes. For statisticians who are interested in giving that theory some thought, this gets the job done. And I wouldn't totally underrate the use of the historical anecdotes: it's easy to take these ideas for granted, and lose track of their high-level connections, and understanding the development of the ideas seems extremely useful when teaching the subject. Overall, it was fascinating to see how statistical ideas were intuitively used long before they were given any rigorous theory.

    Below, I'll record my brief chapter summaries, which are purely for myself; I just want to get in the habit of taking some notes so I retain a bit more from nonfiction (a bit of repetition of the basic ideas helps it stick a bit better, as well as tracking a few anecdotes I could follow up with later if I forget). I didn't think very critically about what could be improved, this is just a simple log, but as said above, the book is quite solid, but not essential. If something is missing, it's that it focuses too heavily on historical anecdotes and not enough on the philosophy behind this work. But maybe any treatment of statistical philosophy is just wasted in such a short book.

    Chapter 1, Aggregation: "The strong temptation is, and has always been, to select one observation thought to be the best, rather than to corrupt it by averaging with others of suspected lesser value." Don't take the sample mean for granted. Early practitioners who needed a summary statistic for a collection of data didn't necessarily choose it; there are countless other options, like the average of the range, the median, or gut instinct. The Pythagoreans knew of the mean (arithmetic, geometric, and harmonic), but the leap to its power in practice is another matter. It takes a real leap of faith to use means to summarize data, because to summarize is to discard information, and it's good to remember why some find the leap so terrifying. Math is rife with references to Borges, but this one is really wonderful: Stigler cites "Funes the Memorious", quoting "To think is to forget details, generalize, make abstractions. In the teeming world of Funes there were only details." Aggregation can yield great gains above the individual components; Funes was big data without Statistics. Will definitely steal that myself.

    Chapter 2, Information, Its Measurement, and Rate of Change: I enjoyed the simplified thought experiment of the Empiricist vs. the Dogmatist. The empiricist claims to follow evidence, the dogmatist follows theory. The dogmatist's criticism is that one case isn't sufficient to draw a conclusion. And yet, if at any point you are uncertain, how can the addition of a single case change your mind, when a single case is unconvincing? When phrased that way, it sounds a bit silly: many things exist on a continuum, and you keep adding drops of water until you have something we call a puddle. But it raises the perfectly valid point that actually constructing a framework for how to interpret each new piece of evidence is enormously difficult, and people are right to be suspicious. More specifically, this chapter covers the diminishing marginal returns to information, and honestly, it basically just boils down to the fact that the standard deviation of the sample mean, for independent, identically distributed observations, is the per-observation standard deviation divided by the square root of n, which implies that doubling your accuracy requires four times the data. But having helped teach intro stats a number of times, you can't really overstate the broad implications of that little insight, so phrasing it in terms of diminishing marginal returns to information is useful.

    Chapter 3, Likelihood: One of the most essential pillars, but not quite a satisfying chapter. Likely because the theory of likelihood is superficially pretty basic, but mechanically quite tricky when you need to make it explicit, and is better suited to a textbook. Would have appreciated a more detailed discussion of the frequentist vs. Bayesian understanding of the likelihood. But I quite like the anecdote that the initial interest in Bayes' original paper was in its use as a rebuttal to Hume, and his claims about the improbability of miracles.

    Chapter 4, Intercomparison: The power of understanding the internal variation of the data, without citing exterior criteria. Not much to say here, except several quotes and examples I enjoyed. "There was some mathematical luck involved: Gosset implicitly assumed that the lack of correlation between the sample mean and the sample standard deviation implied they were independent, which was true in his normal case but is not true in any other case." I'm sure Gosset will be heartened to know that half of all intro stats students make the same mistake. Great anecdote about how Edgeworth came so close to a really complex ANOVA theory, but just missed the mark because he was working numerically, not algebraically. "Exercising the right of occasional suppression and slight modification, it is truly absurd to see how plastic a limited number of observations become, in the hands of men with preconceived ideas." (Galton)

    Chapter 5, Regression: Great chapter. Regression to the mean is not just an interesting bit of trivia, it's a fundamental description of how relationships between quantities manifest in the natural world. We have an intuition for extrapolation, and apparently people used to call it "the Rule of Three": that is, if you expect the ratio a:b to match c:d, and you know any three of the quantities, you can extrapolate the fourth. This makes intuitive sense, but is only reasonable practice if we have perfect correlation, and the natural variation in almost every part of the natural world means this is a very bad idea. That is, we easily identified the power of linear relationships, but not the impact that natural variation has on this extrapolation. It's a simple but powerful concept. I've sometimes struggled to articulate to students why linear models aren't quite so arbitrary, but are at least a little fundamental, and I think working back from how the Rule of Three can lead you wrong is a nice method (there's a tiny numerical sketch of this after the review). The fact that this both identifies and fixes a key part of the theory of evolution is famous, but still neat, and it's a nice reminder of a fairly early time when a purely mathematical fix was applied to a very grounded scientific theory.

    Chapter 6, Design of Experiments: The methods of data analysis have huge implications for the methods of data collection. Great anecdote about the 7 rules for experimental design in the circa-1000 AD Canon of Medicine (testing on lions), and a very good description of ANOVA: how it allows you to discover effects of small variation amidst effects of larger variation.

    Chapter 7, Residuals: Building our model of understanding can be iterative; we add in a piece of the framework, and examine what is left. The weakest chapter in terms of definition; like, I'm not sure I agree with "Even a lowly pie chart, when it has any value beyond decoration, is a way of showing a degree of inequality in the various segments, through the chart's departure from the baseline of a pie with equal pieces" as proof that it's an example of residuals. But even if it's a stretch, it's fine to have a clean-up pillar. Enjoyed the explanation of the power of parametric families, which can easily seem like something of a notational convenience.
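    As flagged in the Chapter 5 note above, here is a tiny numerical sketch (my own illustrative numbers, not the book's) of why Rule-of-Three style proportional extrapolation overshoots when correlation is imperfect: with equal spreads, the least-squares slope equals the correlation, so an extreme x predicts a y that is only a fraction as extreme.

        # Minimal sketch with illustrative numbers: x and y have equal spread and
        # correlation r = 0.5, so regression predicts y of roughly r * x, while proportional
        # ("Rule of Three") extrapolation would implicitly predict y roughly equal to x.
        import numpy as np

        rng = np.random.default_rng(2)
        n, r = 200_000, 0.5
        x = rng.normal(0.0, 1.0, n)
        y = r * x + np.sqrt(1.0 - r**2) * rng.normal(0.0, 1.0, n)   # corr(x, y) close to r

        print("least-squares slope:", round(np.polyfit(x, y, 1)[0], 2))    # about 0.5, not 1

        extreme = x > 2.0
        print("mean x in the extreme group:", round(x[extreme].mean(), 2))   # about 2.4
        print("mean y in the extreme group:", round(y[extreme].mean(), 2))   # about 1.2
        # Proportional extrapolation from the extreme x values would predict y near 2.4;
        # the observed average is roughly half that, i.e. regression toward the mean.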