CLE & Other Errors



The Common Language Effect size (CLE) converts an effect size into a probability, in the same way that Pr(Z < 1) ≈ 84% for a standard normal Z. Hattie then used this statistic to interpret each effect size.
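To see how the conversion works, here is a minimal sketch, assuming the standard McGraw and Wong (1992) definition of the CLE for two equal-variance normal groups separated by Cohen's d:

```python
from scipy.stats import norm

def cle(d):
    """Common Language Effect size: the probability that a randomly
    drawn score from the treated group exceeds a randomly drawn score
    from the control group, given Cohen's d (McGraw & Wong, 1992)."""
    return norm.cdf(d / 2 ** 0.5)

for d in (-0.5, 0.0, 0.5, 1.0):
    print(f"d = {d:+.1f}  ->  CLE = {cle(d):.0%}")
    # d = -0.5 -> 36%, d = 0.0 -> 50%, d = +0.5 -> 64%, d = +1.0 -> 76%
```

Because the normal CDF maps every d into the interval (0, 1), a CLE can never be negative and can never exceed 100%; it equals exactly 50% at d = 0 and falls below 50% only when d is negative.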

However, Hattie has reported CLE values ranging from -49% to 219%. Higgins and Simpson (2011), Topphol (2011) and Bergeron (2017) all identified that such values are impossible: a probability must lie between 0% and 100%.

Professor Pierre-Jérôme Bergeron states:

'To not notice the presence of negative probabilities is an enormous blunder to anyone who has taken at least one statistics course in their lives. Yet, this oversight is but the symptom of a total lack of scientific rigor, and the lesser of reasoning errors in Visible Learning.'

As a result, Hattie has finally admitted that he calculated all of the CLEs incorrectly, although he says the calculation was not important. However, every statistics student is taught that a Z-score and a probability go hand in hand: the probability provides the interpretation of the Z-score. Given Hattie's mantra that interpretation is the most important aspect of a synthesis, this is a significant mistake.

Eivind Solfjell also correctly points out that most of Hattie's explanations of particular CLEs (VL, 2009, p. 6) are incorrect. For example, since Pr(Z < 0) = 50%, a CLE below 50% means that d MUST be negative, as the sketch above shows; Hattie does not seem to understand this.

Professor Arne Kåre Topphol was the first to publish that Hattie had calculated the CLEs incorrectly, in his paper 'Can we trust the use of statistics in education research?', and subsequently had a dialogue with Hattie:

"My criticism of the erroneous use of statistical methods will thus probably not affect Hattie’s scientific conclusions. However, my point is, it undermines the credibility of the calculations and it supports my conclusion and the appeal I give at the end of my article; when using statistics one should be accurate, honest, thorough in quality control and not go beyond one's qualifications.

My main concern in this article is thus to call for care and thoroughness when using statistics. The credibility of educational research relies heavily on the fact that we can trust its use of statistics. In my opinion, Hattie’s book is an example that shows that we unfortunately cannot always have this trust.

Hattie has now given a response to the criticism I made. What he writes in his comment makes me even more worried, rather than reassured."

This prompted one blogger to comment: 'People who think that probability can be negative shouldn’t write books on statistics.'

In a later defence of his work, in 2015, Hattie contradicts the explanation he gave Topphol in 2012 (quoted above), i.e., that he had used a different calculation from Topphol's:

'at the last minute in editing I substituted the wrong column of data into the CLE column and did not pick up this error; I regret this omission.' 

Yet this does not explain the full page of incorrect analysis of the CLE probability statistic in VL (2009, p. 9), mentioned by Solfjell above.

Was this also a cut-and-paste error? Or do both mistakes show that Hattie did not understand the statistic? Topphol certainly thinks so, and more recently so does Professor Pierre-Jérôme Bergeron.

Hattie no longer promotes the CLE as a way of interpreting his effect sizes; he now treats d = 0.40 as equivalent to one year's progress. But, as already stated, this creates other, more significant problems.

Other Errors:


The inclusion of studies that do not measure the influence in question, or do not measure achievement:



There are many examples of this serious error:

In the self-report grades influence: the inclusion of Falchikov (2000), which measured peer assessment, NOT self-reported grades, and of Kuncel (2005), which measured students' memory of their GPA from a year or so earlier, NOT a self-reported prediction of future grades.

In the diet influence: the inclusion of Kavale & Forness (1983). This paper does not measure diet as it relates to improving student achievement, but rather ONLY ONE diet modification, as a treatment for hyperactivity.


Hattie mixes up the X/Y axes:



Hattie uses a funnel plot (p. 21) to show that publication bias does not affect his research. But Higgins and Simpson (2011, p. 198) show that Hattie has swapped the X and Y axes, and that, when drawn correctly, the funnel plot does in fact show evidence of publication bias (see the sketch below).
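For reference, a funnel plot places the effect size on the horizontal axis and a precision measure (standard error or sample size) on the vertical axis. A minimal sketch with simulated data (all numbers below are invented for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
true_effect = 0.4

# Simulate studies of varying size: small studies have large
# standard errors, so their effect estimates scatter more widely.
n_studies = 200
se = rng.uniform(0.02, 0.4, n_studies)
effects = rng.normal(true_effect, se)

fig, ax = plt.subplots()
ax.scatter(effects, se, s=10, alpha=0.6)
ax.axvline(true_effect, linestyle="--")
ax.invert_yaxis()  # convention: the most precise studies at the top
ax.set_xlabel("Effect size (X axis)")
ax.set_ylabel("Standard error (Y axis)")
ax.set_title("Funnel plot: symmetry suggests little publication bias")
plt.show()
```

Read this way, a symmetric funnel around the pooled effect suggests little publication bias, while a missing corner of small, unflattering studies is the classic signature of bias; swapping the axes destroys that reading.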

For more information see - https://en.wikipedia.org/wiki/Funnel_plot

Standard Error:


"In his "Effect Barometers" Hattie also gives a standard error for the determined effect size. However, their calculation is flawed (see also Pant 2014 a, p. 96, note 4, 2014b, p. 143, FN 4). As a rule, the specified value is the arithmetic mean of the specified standard errors of the individual first-stage meta-analyzes. For example, in the case of inquiry-based teaching, the given standard error of 0.092 is the arithmetic mean of the two standard errors of 0.154 and 0.030 given two of the four meta-analyzes. However, the precision of the estimate from both meta-analyzes can not be less than that of the individual meta-analyzes; in fact, the standard error of an effect magnitude estimate from these two overlap-free meta-analyzes in the primary studies is 0.029"...

"The detailed reconstruction of Hattie's approach using examples thus shows that the applicable methodological standards are violated at every level of the analysis. As some of the examples given here show, Hattie's values are sometimes many times too high or too low. To estimate the impact of these deficiencies on the results of the analysis, the full analyses would have to be carried out correctly; but, as already stated, the information necessary to do so is often missing. However, the number and scope of these shortcomings alone give justified cause to doubt the robustness of Hattie's results" (Wecker et al., 2016, p. 30).
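The arithmetic is easy to check. Here is a minimal sketch, using the two standard errors quoted above and assuming the standard inverse-variance (fixed-effect) method for combining two independent estimates:

```python
import math

# Standard errors of the two first-stage meta-analyses of
# inquiry-based teaching quoted by Wecker et al. (2016).
se1, se2 = 0.154, 0.030

# Hattie's apparent method: the arithmetic mean of the two SEs.
hattie_se = (se1 + se2) / 2
print(f"arithmetic mean of SEs: {hattie_se:.3f}")      # 0.092

# Inverse-variance pooling: weight each estimate by 1/SE^2,
# so the combined variance is the reciprocal of the summed weights.
pooled_se = math.sqrt(1 / (1 / se1**2 + 1 / se2**2))
print(f"inverse-variance pooled SE: {pooled_se:.3f}")  # 0.029
```

Averaging standard errors always lands between the two inputs; correctly combining the evidence must yield a standard error smaller than either input, which is why Wecker et al. arrive at 0.029 rather than 0.092.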

Average and Total:


Hattie seems to mix up 'Total' and 'Average' in most of the tables in his book. For example, in the table from p. 61, the values for d, SE and CLE labelled as totals are in fact averages, not totals. Although minor, this reflects Hattie's general disregard for detail.



Gender:

For the gender studies comparing boys with girls, VL arbitrarily assigned a negative effect size when girls outperform boys. Since the sign is a coding choice rather than a finding, this distorts the overall average effect size.
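A minimal sketch of why the sign convention matters (the effect sizes below are invented for illustration):

```python
# Hypothetical gender effect sizes: positive = boys ahead,
# negative = girls ahead (the arbitrary sign convention).
effects = [0.30, -0.25, 0.10, -0.35]

signed_mean = sum(effects) / len(effects)
magnitude_mean = sum(abs(d) for d in effects) / len(effects)

print(f"mean under the sign convention:  {signed_mean:+.3f}")  # -0.050
print(f"mean gap size ignoring direction: {magnitude_mean:.3f}")  # 0.250
```

Because gaps in opposite directions cancel, the averaged d both understates the typical size of the gender gap and takes its sign from an arbitrary coding choice.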