Other Issues

Conflict of Interest:

'It is difficult to get a man to understand something, when his salary depends on his not understanding it.' Upton Sinclair
Professor Ewald Terhardt (2011, p434)
'A part of the criticism on Hattie condemns his close links to the New Zealand Government and is suspicious of his own economic interests in the spread of his assessment and training programme (asTTle). Similarly, he is accused of advertising elements of performance-related pay of teachers and he is being criticised for the use of asTTle as the administrative tool for scaling teacher performance. His neglect of social backgrounds, inequality, racism, etc., and issues of school structure is also held against him. This criticism is part of a negative attitude towards standards-based school reform in general. However, there is also criticism concerning Hattie’s conception of teaching and teachers. Hattie is accused of propagating a teacher-centred, highly directive form of classroom teaching, which is characterised essentially by constant performance assessments directed to the students and to teachers.'
Hattie is once again promoting asTTle in his collaboration with Pearson, 'What Works In Education', but fails to divulge his financial interest in the program (p13).

Professor John O'Neill wrote a timely warning in 2012 about Hattie's influence on education policy and his financial interest in the solutions proposed:
The 'discourse seeks to portray the public sector as ‘ineffective, unresponsive, sloppy, risk-averse and innovation-resistant’ yet at the same time it promotes celebration of public sector 'heroes' of reform and new kinds of public sector 'excellence'. Relatedly, Mintrom (2000) has written persuasively in the American context, of the way in which ‘policy entrepreneurs’ position themselves politically to champion, shape and benefit from school reform discourses' (p2).

Hattie quotes Cohen (1985) 
'New and revolutionary ideas in teaching will tend to be resisted rather than welcomed with open arms, because every successful teacher has a vested intellectual, social, and even financial interest in maintaining the status quo' (p252). This is a disappointing inference directed at teachers, given Hattie's collaboration with Pearson, who paid Hattie for the intellectual rights to Visual Laboratories, and Hattie's financial interest in the solutions provided to schools.
Dr Jonathan Becker similarly criticises Marzano for a lack of independence, due to Marzano's financial arrangement with Promethean in his research.

Nick Rose goes into more detail regarding financial conflicts of interest and research.


Joshua Katz's YouTube presentation on financial conflicts of interest in education went viral.



Contradictions & Inconsistencies:


Hattie defines 'influence' as any effect on student achievement. But this is too vague and leads to many contradictions & inconsistencies. 


Hattie states (preface) 

'The book is not about classroom life, and does not speak of its nuances ...'
However, his influences consist of a large number of classroom nuances: behaviour, feedback, motivation, ability grouping, worked examples, problem-solving, micro-teaching, teacher-student relationships, direct instruction, vocabulary programs, concept mapping, peer tutoring, play programs, time on task, simulations, calculators, computer-assisted instruction, etc.

Also in his preface he states, 

'It is not a book about what cannot be influenced in schools - thus critical discussions about class, poverty, resources in families, health in families, and nutrition are not included.'
Yet he has included these in his rankings:

Home environment, d = 0.57, rank 31
Socioeconomic status, d = 0.57, rank 32
Pre-term birth weight, d = 0.54, rank 38
Parental involvement, d = 0.51, rank 45
Drugs, d = 0.33, rank 81
Positive view of own ethnicity, d = 0.32, rank 82
Family structure, d = 0.17, rank 113
Diet, d = 0.12, rank 123
Welfare policies, d = -0.12, rank 135

Passion


In Hattie's 2012 update to VL he states, 

'Throughout Visible Learning, I constantly came across the importance of ‘passion’; as a measurement person, it bothered me that it was a difficult notion to measure – particularly when it was often so obvious' (preface).
Passion is not included in Hattie's list of influences, yet he raises it as one of the most important influences!

Teacher Training and Experienced Teachers 



Hattie uses his own research on 65 teachers, comparing National Board Certified (NBC) with non-NBC teachers, and reports this in the last chapter of VL. But Hattie uses the research for a very different purpose: to demonstrate the difference between expert and experienced teachers. Hattie makes the arbitrary judgement that NBC-certified teachers are 'Experienced Experts' while non-NBC teachers are merely 'Experienced'. He does not use student achievement but rather the arbitrary criteria displayed in the graph below.

Podgursky (2001), in his critique, describes them as 'nebulous standards'. Podgursky is also rather suspicious of Hattie's rationale for not using student achievement,

'It is not too much of an exaggeration to state that such measures have been cited as a cause of all of the nation’s considerable problems in educating our youth. . . . It is in their uses as measures of individual teacher effectiveness and quality that such measures are particularly inappropriate' (p2).
Hattie concludes that expert teachers (NBC) outperform Non-NBC teachers on almost every criterion (p260).


Harris and Sass (2009) report that the National Board for Professional Teaching Standards (NBPTS), which administers the NBC, generates around $600 million in fees each year (p4). Harris and Sass's much larger study, 'covering the universe of teachers and students in Florida for a four-year span' (p1), contradicts Hattie's conclusion: 'we find relatively little support for NBC as a signal of teacher effectiveness' (p25).

It is interesting that much of Hattie's consulting work for schools involves measuring teachers on the arbitrary categories listed in the graph; a significant omission is teacher subject knowledge.

Yet, using the same type of research comparing NBC with non-NBC teachers, e.g. Hacke (2010), he uses the low effect size to conclude that Teacher Education is a DISASTER. See Hattie's slides from his 2008 Nuthall lecture.





Hattie's Rankings are Not Helpful to Teachers:


Professor Dylan Wiliam explains the problem in 'Inside the Black Box' (2001),
'Teachers will not take up attractive sounding ideas, albeit based on extensive research, if these are presented as general principles which leave entirely to them the task of translating them into everyday practice - their classroom lives are too busy and too fragile for this to be possible for all but an outstanding few. What they need is a variety of living examples of implementation, by teachers with whom they can identify and from whom they can both derive conviction and confidence that they can do better, and see concrete examples of what doing better means in practice' (p10).
Then, commenting on research again, Wiliam says,
'despite the many and varied reports of successful innovations, they fail to give clear accounts on one or other of the important details, for example about the actual classroom methods used, or about the motivation and experience of the teachers, or about the nature of the tests used as measures of success, or about the outlooks and expectations of the pupils involved' (p12).

Averaging:

Hattie's averaging hides much of the complexity. For example, Professor Ivan Snook et al on homework:
'There is also the difficulty which arises amalgamating a large number of disparate studies. When results of many studies are averaged, the complexity of education is ignored: variables such as age, ability, gender, and subject studied are set aside. An example of this problem can be seen in Hattie’s treatment of homework: does homework improve learning or not?

Overall, Hattie finds that the effect size of homework is 0.29. Thus a media commentator, reading a summary might justifiably report: “Hattie finds that homework does not make a difference.” When, however, we turn to the section on homework we find that, for example, the effect sizes for elementary (primary in our terms) and high school students are 0.15 and 0.64 respectively.

Putting it crudely, the figures suggest that homework is very important for high school students but relatively unimportant for primary school students.

There were also significant differences in the effects of homework in mathematics (high effects) and science and social studies (both low effects). Results were high for low ability students and low for high ability students. The nature of the homework set was also influential. (pp 234-236). All these complexities are lost in an average effect size of 0.29' (p4).
Similar detail is lost in the averaging Hattie uses for class size and ability grouping.
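
To make the arithmetic concrete, here is a minimal sketch in Python. The subgroup effect sizes are the homework figures quoted above; the study weights are hypothetical, chosen only to reproduce the reported overall figure.

```python
# A minimal sketch of how a single pooled effect size hides subgroup
# differences, using the homework figures quoted above (d = 0.15 for
# primary, d = 0.64 for high school). The weights are hypothetical
# (e.g. number of studies per subgroup); the point is that very
# different mixes can yield the same overall number.

def pooled_effect(effects, weights):
    """Weighted average of subgroup effect sizes."""
    total = sum(weights)
    return sum(e * w for e, w in zip(effects, weights)) / total

primary, high_school = 0.15, 0.64

# A hypothetical weighting that reproduces roughly the overall
# d = 0.29 Hattie reports for homework:
overall = pooled_effect([primary, high_school], weights=[5, 2])
print(f"pooled d = {overall:.2f}")  # ~0.29

# The single number 0.29 says nothing about the 0.15 vs 0.64 split:
# a reader seeing only the average misses that homework appears to
# matter a lot in high school and very little in primary school.
```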

Hattie's interpretation of the average d=0.40 representing a year's progress is TOTALLY misleading. This is obvious when the table of the USA effect size benchmarks for each year level is viewed.

Many of the researchers Hattie relies on warn about averaging:

Mabe and West (1982) 
'considerable information would be lost by averaging the often widely discrepant correlations within studies' (p291).
Slavin (1990), 

'In pooling findings across studies, medians rather than means were used, principally to avoid giving too much weight to outliers' (p477).

Professor Maureen Hallinan (1990)
'The fact that the studies Slavin examines show no direct effect of ability grouping on student achievement is not surprising. The studies compare mean achievement scores of classes that are ability grouped to those that are not. Since means are averages, they reveal nothing about the distribution of scores in the two kinds of classes. Ability grouping may increase the spread of test scores while leaving the mean unchanged' (p501).
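
A minimal sketch of both warnings, with invented numbers, may help:

```python
# Illustrating the two warnings quoted above. The data are invented.
from statistics import mean, median, stdev

# Slavin's point: one outlier drags the mean; the median resists it.
effects = [0.10, 0.12, 0.14, 0.15, 1.90]   # last study is an outlier
print(mean(effects))    # ~0.48 - inflated by the outlier
print(median(effects))  # 0.14 - closer to the typical study

# Hallinan's point: identical means can hide very different spreads.
grouped   = [0.40, 0.40, 0.40, 0.40]
ungrouped = [-0.60, 0.00, 0.80, 1.40]
print(mean(grouped), mean(ungrouped))    # both 0.40
print(stdev(grouped), stdev(ungrouped))  # 0.0 vs ~0.88
```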

Hattie attacks the easy target - The Teacher:

Hattie summarises his book 
'the devil in this story is not the negative, criminal, and incompetent teacher, but the average, let's get through the curricula… teacher' (p258).
This is an amazing critique and represents Hattie's focus throughout the book. He seems oblivious to systemic and political influences and seems all too eager to focus the blame on the easy target - the teacher.

Yet Hattie says, 
'Educating is more than teaching people to think – it is also teaching people things that are worth learning' (p27).
This is the realm of politicians and senior bureaucrats, who mostly decide what is worth learning by designing and enforcing a curriculum. So if following the curricula is the issue, why not focus on those who decide the curriculum? They are most often not the teachers.

In Hattie's jurisdiction, the state of Victoria, Australia, teachers can be dismissed for not teaching the curriculum - click here for examples.


The whole may be more than the sum of the parts:

Bruce Springsteen, in his tribute to U2 using their song Vertigo, says: 'Uno, dos, tres, catorce' - translated 1, 2, 3, 14! - the correct maths of rock-n-roll, and maybe for classrooms too.

Also, Hattie's rankings distract us from some of the more useful teaching initiatives. For example, Professor Jo Boaler and Charles Lovitt focus on combining a number of influences: problem-based learning, simulations, time on task, inquiry, and visual methods. Yet Hattie rates these individual influences very low and ignores the major effect, and usual classroom dynamic, of combining a number of influences together.


An investigation of the evidence John Hattie presents in Visible Learning


At the 2005 ACER conference (p5) Hattie said,
'We must contest the evidence – as that is the basis of a common understanding of progression.'
Then in Visible Learning [VL] he quotes Karl Popper (p4)
'Those amongst us unwilling to expose their ideas to the hazard of refutation do not take part in the scientific game.'
The Maths curriculum for all Victorian schools (the state in which Hattie lives) details the following criterion for ALL students to achieve by the end of Year 10:
'Evaluate statistical reports in the media and other places by linking claims to displays, statistics and representative data.' Mathematics Statistics and Probability Levels 7-10A.
So we place a high priority on our students being able to evaluate statistical claims. 

Tom Bennett, the founder of researchEd, wrote an influential paper, 'The School Research Lead', where he states (pp9-10),

'There exists a good deal of poor, misleading or simply deceptive research in the ecosystem of school debate.'

'Where research contradicts the prevailing experiential wisdom of the practitioner, that needs to be accounted for, to the detriment of neither but for the ultimate benefit of the student or educator.'
In his excellent analysis 'School Leadership and the cult of the guru: the neo-Taylorism of Hattie', Professor Scott Eacott says (p11),
'The uncritical acceptance of his work as the definitive word on what works in schooling, particularly by large professional associations such as ACEL, is highly problematic.'
Prof Adrian Simpson's detailed analysis of the calculation of effect sizes, 'The misdirection of public policy: comparing and combining standardised effect sizes', states (p451),
"The numerical summaries used to develop the toolkit (or the alternative ‘barometer of influences’: Hattie 2009) are not a measure of educational impact because larger numbers produced from this process are not indicative of larger educational impact. Instead, areas which rank highly in Marzano (1998), Hattie (2009) and Higgins et al. (2013) are those in which researchers can design more sensitive experiments.

As such, using these ranked meta-meta-analyses to drive educational policy is misguided."
Prof Dylan Wiliam writes in 'Getting educational research right', 
'Those ... who focus on ensuring that practice is based on ‘what works’, will find that no educational initiative can be implemented in the same way in every school. Adjustments need to be made, but they need to be made by people who understand the research so that the initiatives do not suffer what Stanford education professor Ed Haertel called “lethal mutations”. Teachers, leaders and policymakers all need to be critical consumers of research.'

Summary:




In Hattie's three published defenses (2010, 2015 & 2017), he never addressed the specific examples of misrepresentation but only generally defended the meta-analysis methodology and the use of his benchmark effect size of 0.4.

Profs Snook, Clark, Harker, Anne-Marie O’Neill and John O’Neill respond to Hattie's 2010 defense in 'Critic and Conscience of Society: A Reply to John Hattie' (p97),
'In our view, John Hattie’s article has not satisfactorily addressed the concerns we raised about the use of meta-analyses to guide educational policy and practice.'
Prof Arne Kare Topphol responds to Hattie's defense,
'Hattie has now given a response to the criticism I made. What he writes in his comment makes me even more worried, rather than reassured.'
Darcy Moore posts,
'Hattie’s [2017] reply to Eacott’s paper does not even remotely grapple with the issues raised ..'
Prof Eacott also responded to Hattie's defense,
'Disappointed that SLAM declined my offer to write a response to Hattie's reply to my paper. Dialogue & debate is not encouraged/supported ...'
Prof Dylan Wiliam casts significant doubt on Hattie's entire model by arguing that the age of the students and the time over which each study runs are important components contributing to the effect size.

Supporting Prof Wiliam's contention is the massive data collected to construct the United States Department of Education effect size benchmarks. These show a huge variation in effect sizes from younger to older students. 

This demonstrates that age is a HUGE confounding variable or moderator since, in order to compare effect sizes, studies need to control for the age of the students and the time over which the study ran. Otherwise, differences in effect size can be due to the age of the students measured!
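
A minimal sketch of the problem, in Python. The per-grade growth norms below are hypothetical stand-ins; the real US Department of Education benchmark tables show annual growth (in effect-size units) falling steeply from the early grades to the late grades.

```python
# A minimal sketch of why age confounds effect-size comparisons.
# The annual_growth values are HYPOTHETICAL stand-ins for the
# per-grade growth norms discussed above.

annual_growth = {            # hypothetical d for one year's normal growth
    "grade 1": 1.00,
    "grade 5": 0.40,
    "grade 10": 0.20,
}

d_observed = 0.40            # the same measured effect size...

for grade, norm in annual_growth.items():
    years = d_observed / norm
    print(f"{grade}: d = {d_observed} is {years:.1f} years of progress")

# grade 1:  0.4 years - well under a year's normal growth
# grade 5:  1.0 years
# grade 10: 2.0 years - double normal growth
# A fixed hinge of d = 0.40 therefore means very different things
# depending on the age of the students in the underlying studies.
```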

Given Hattie's conclusion in his 2015 defense (p8),
'The main message remains, be cautious, interpret in light of the evidence, search for moderators, take care in developing stories, welcome critique, ...'
I'm extremely surprised Hattie has not addressed the massive implication of this evidence for his work. All he says in his summary of VL 2012 (p14) is,
'the effects for each year were greater in younger and lower in older grades ... we may need to expect more from younger grades (d > 0.60) than for older grades (d > 0.30).'
Hattie finally agrees (2015 defense, p3) with Prof Wiliam:
'Yes, the time over which any intervention is conducted can matter (we find that calculations over less than 10-12 weeks can be unstable, the time is too short to engender change, and you end up doing too much assessment relative to teaching). These are critical moderators of the overall effect-sizes and any use of hinge =0.4 should, of course, take these into account.'
Yet Hattie DOES NOT take this into account: there has been no attempt to detail and report the time over which the studies ran or the age group of the students in question, nor to adjust his previous rankings or conclusions.

Professor Dylan Wiliam summarises, 
'... the effect sizes proposed by Hattie are, at least in the context of schooling, just plain wrong. Anyone who thinks they can generate an effect size on student learning in secondary schools above 0.5 is talking nonsense.'
The U.S. Education Dept benchmark effect sizes support Wiliam's contention.

Hattie's Aim:


Hattie uses the REDUCTIONIST approach by attempting to break down the complexity of teaching into simple discrete categories or influences.

However, Nick Rose has alerted me to another form of reductionism defined by Daniel Dennett - 'Greedy Reductionism' - which occurs when,
'in their eagerness for a bargain, in their zeal to explain too much too fast, scientists and philosophers ... underestimate the complexities, trying to skip whole layers or levels of theory in their rush to fasten everything securely and neatly to the foundation.'
I think this latter definition better describes Hattie's methodology.

Additionally, Hattie only uses univariate analysis, but complex systems require multivariate analysis. As Prof Jordan Peterson states,

No social scientist worth their salt uses univariate analysis.


Hattie states: 
'The model I will present ... may well be speculative, but it aims to provide high levels of explanation for the many influences on student achievement as well as offer a platform to compare these influences in a meaningful way... I must emphasise that these ideas are clearly speculative' (p4).
Hattie uses the Effect Size (d) statistic to interpret, compare and rank educational influences.

The effect size is supposed to measure the change in student achievement, a controversial topic in and of itself (there are many totally different concepts of what achievement is - see here). In addition, surprisingly, Hattie includes many studies that did not measure achievement at all, but rather something else, e.g., IQ, hyperactivity, behavior, and engagement.

Also, Hattie claims the studies used were of robust experimental design (p8). However, a number of peer reviews have shown that he used studies with the much poorer design of simple correlation, which he then converts into an effect size (often incorrectly! see Wecker et al (2016, p27)). Hattie then ranks these effect sizes from largest to smallest.
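
For readers who want the formulas, here is a minimal sketch of the two calculations involved. Cohen's d is the standard definition; the r-to-d conversion shown is the common textbook formula used in meta-analysis (an assumption on my part; the source does not specify which conversion Hattie applied).

```python
# A minimal sketch of the effect-size calculations at issue.
import math

def cohens_d(mean_treatment, mean_control, pooled_sd):
    """Effect size: standardised difference between group means."""
    return (mean_treatment - mean_control) / pooled_sd

def r_to_d(r):
    """Convert a correlation coefficient to an equivalent d
    (textbook formula: d = 2r / sqrt(1 - r^2))."""
    return 2 * r / math.sqrt(1 - r ** 2)

# Example: treatment class averages 78, control 72, pooled SD 10.
print(cohens_d(78, 72, 10))   # d = 0.6

# A correlation of r = 0.3 converts to roughly d = 0.63 - but the
# converted value inherits all the causal ambiguity of a correlation:
# it is not evidence that the 'influence' caused the difference.
print(r_to_d(0.3))            # ~0.63
```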

The disparate measures of student achievement lead to the classic problem of comparing apples to oranges and have caused many scholars to question the validity and reliability of Hattie's effect sizes and rankings, e.g., Higgins and Simpson (2011, p199):

'We argue the process by which this number has been derived has rendered it practically meaningless.'
Blatchford et al (2016, p96) state that Hattie's comparing of effect sizes, 
'is not really a fair test'.
Wecker et al (2016, p31)
'The reconstruction of Hattie's approach in detail using examples thus shows that the methodological standards to be applied are violated at all levels of the analysis. As some of the examples given here show, Hattie's values are sometimes many times too high or too low. In order to be able to estimate the impact of these deficiencies on the analysis results, the full analyses would have to be carried out correctly, but the necessary information for this, as already stated, is often missing. However, the amount and scope of these shortcomings alone give cause for justified doubts about the resilience of Hattie's results...'
'the methodological claims arising from Hattie's approach, and the overall appropriateness of this approach suggest a fairly clear conclusion: a large proportion of the findings are subject to reasonable doubt' (p35).
Prof Pierre-Jérôme Bergeron
'When taking the necessary in-depth look at Visible Learning with the eye of an expert, we find not a mighty castle but a fragile house of cards that quickly falls apart.'

'To believe Hattie is to have a blind spot in one’s critical thinking when assessing scientific rigour. To promote his work is to unfortunately fall into the promotion of pseudoscience. Finally, to persist in defending Hattie after becoming aware of the serious critique of his methodology constitutes willful blindness.'
Dr Neil Hooley, in his review of Hattie, talks about the complexity of classrooms and the difficulty of controlling variables,
'Under these circumstances, the measure of effect size is highly dubious' (p44).
Dr Mandy Lupton on Problem Based Learning,
'The studies have different effect sizes for different contexts and different levels of schooling, thus averaging these into one metric is meaningless.'
The Common Language Effect Size (CLE) is a probability statistic usually used to interpret the effect size. However, three peer reviews showed Hattie calculated all the CLEs incorrectly (he calculated probabilities that were negative and greater than 1!). As a result, he now claims the CLE statistic is not important and focuses on the interpretation that an effect size of d = 0.4 is the hinge point, claiming this is equivalent to a year's progress. However, there are significant problems with this interpretation.
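
A minimal sketch of the CLE in its standard form (McGraw and Wong's definition, assuming normal distributions) shows why negative values and values above 1 are impossible:

```python
# The Common Language Effect Size: the probability that a randomly
# chosen treated student outscores a randomly chosen control student,
# for effect size d, assuming normal distributions.
from statistics import NormalDist

def cle(d):
    """P(treatment score > control score) = Phi(d / sqrt(2))."""
    return NormalDist().cdf(d / 2 ** 0.5)

for d in [-0.2, 0.0, 0.4, 1.44]:
    print(f"d = {d:5.2f} -> CLE = {cle(d):.2f}")

# d = -0.20 -> CLE = 0.44
# d =  0.00 -> CLE = 0.50
# d =  0.40 -> CLE = 0.61
# d =  1.44 -> CLE = 0.85
# Being a probability, the CLE always lies between 0 and 1 - which is
# why negative or greater-than-1 values signal a calculation error
# rather than a quirk of the data.
```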

Hattie also offers another highly doubtful interpretation of probability in a recent interview with Hanne Knudsen (2017), 'John Hattie: I’m a statistician, I’m not a theoretician'. Hattie states,
'The research studies in VL offer probability statements – there are higher probabilities of success when implementing the influences nearer the top than bottom of the chart' (p7).

Hattie's Interpretation of the Meta-analyses:

'No methodology of science is considered respectable in the research community when experiment design is unable to control a rampant blooming of subjectivity' Myburgh (2016, p10).
Meta-analysis, as a methodology, has been widely criticised for not representing the original studies faithfully.

Yet Hattie takes this interpretation problem to another level, as his methodology is META-meta-analysis, or MEGA-analysis (Snook et al, 2009).
'the methodology used (by Hattie), neglects the original theory that drives the primary studies it seeks to review' Myburgh (2016, p4).
Wecker et al (2016, p35) are also critical of this META - meta-analysis methodology,
'... Hattie's work makes clear, a single meta-analysis cannot conclusively answer the question of the effectiveness of an influencing factor anyway. Therefore, meta-analyses should be updated when a significant number of additional primary studies have been added, but not in a second-stage meta-analysis, as in Hattie's work, but as a first-stage meta-analysis based on all existing primary studies ... '
But Myburgh (2016, p19) notes,
'Hattie assures the research community that he has arrived at sound conclusions based on his confidence that his mega-analysis of meta-analyses consists of quality studies, that the effect sizes faithfully represent a review of the original data and that he adequately explores moderators'
Yet Hattie uses a wide range of meta-analyses which use TOTALLY different experimental designs, on different groups of people (university students, doctors, nurses, tradesmen, and sometimes high school students!), with vastly different measures of student achievement or often no measure of achievement at all!

As Professor Peter Blatchford points out about Hattie's VL,
'it is odd that so much weight is attached to studies that don't directly address the topic on which the conclusions are made' (p13).
Wecker et al (2016, p28) show that Hattie mistakenly includes studies that do not measure academic performance.


With Evidence Like This, Who Needs Your Opinion:


In spite of these significant errors, Hattie uses trite slogans like 'know thy impact' or 'statements without evidence are just opinions'. This belittles teacher experience and opinion and raises his so-called evidence and rankings above them.

Nick Rose and Susanna Eriksson-Lee in their excellent paper 'Putting evidence to work', quote a more provocative slogan from Kevan Collins, Chief Executive of the Education Endowment Foundation (EEF),

'if you're not using evidence to inform your decisions, you must be using prejudice' (p5).

In his interview with Hanne Knudsen (2017), 'John Hattie: I’m a statistician, I’m not a theoretician', Hattie seems to have retreated from this polemic,
'Evidence can also be related to experience – and the extensive experience of many teachers is legitimate evidence – to be contested, to be examined, and to be evaluated – in terms of the best impact on the learning lives of students. 
When there are differences between the evidence from the research and from experience, then there is a need for examination, for reflection, for seeking more avenues of evidence – and I want this to be via the effects on the students' (p7).

The Problem of Breaking Down the Complexity of Teaching into Simple Categories, Influences or 'Buckets':

'The partitioning of teaching into smallest measurable units, a piecemeal articulation of how to improve student learning, is not too removed from the work of Taylor over 100 years ago. Despite its voluminous and fast expanding literatures, educational administration remains rooted to the same problems of last century' Eacott (2017, p10).
A similar point is made by Professor Robert Sapolsky in his course 'Introduction to Human Behavioral Biology' (see 47:20 - 48:30).


What about the Sum of the Parts?:


Bruce Springsteen inducts U2 in the hall of fame:
"Uno, dos, tres, catorce. That translates as one, two, three, fourteen. That is the correct math for a rock and roll band. For in art and love and rock and roll, the whole had better equal much more than the sum of its parts ... "


I think it highly likely that in teaching, too, the whole is more than the sum of its parts!

Prof John O'Neill agrees in 'Material fallacies of education research evidence and public policy advice' (p5),
"real classrooms are all about interactions among variables, and their effects. The author implicitly recognises this when he states that ‘a review of non-metaanalytic studies could lead to a richer and more nuanced statement of the evidence’ (p. 255). Similarly, he acknowledges that when different teaching methods or strategies are used together their combined effects may be much greater than their comparatively small effect measured in isolation (p. 245)."

Peer Reviews:


Professor John O'Neill has reviewed these influences: micro-teaching, professional development, providing formative evaluation, comprehensive interventions for learning disabled students, feedback, spaced vs. massed practice, problem-solving teaching, metacognition strategies, teaching strategies, co-operative vs. individualistic learning, study skills and mastery learning.

Dr Kristen Dicerbo has analysed self-report grades.

Dr Mandy Lupton has analysed Problem-Based and Inquiry-Based Learning.

Professors Higgins and Simpson have published Hattie's calculation errors.

Professor Arne Kare Topphol also published Hattie's calculation errors (in Norwegian) summary here.

Professor Ivan Snook et al give a general critique of VL focusing on the lack of quality studies and the problems of Hattie's rankings and generalisations. They use class size and homework as examples.

Professor Ewald Terhardt published a general critique of Hattie's methodology and issues of Hattie's conflict of interest.

Hattie's retort to Snook and Terhardt, which is basically a defense of meta-analysis as a methodology.

Snook, Clark, Harker, Anne-Marie O’Neill and John O’Neill, a reply to Hattie's retort.

Dr Myburgh analysed Hattie's retort to Snook, et al and Terhardt above. Myburgh focuses on the critique of meta-analysis as a methodology and not the specific critiques of Hattie's misrepresentations.

Professor Bergeron (2017) published Hattie's calculation errors plus other issues about correlation studies and misinterpretation.
'When taking the necessary in-depth look at Visible Learning with the eye of an expert, we find not a mighty castle but a fragile house of cards that quickly falls apart' (p1).
Pant (2014) critiques Hattie's methodology, showing major flaws in calculations and interpretations.

'an attempt was made to show that the hitherto most comprehensive synthesis of research findings in education, John Hattie's meta-analysis Visible Learning (2009), although in many places professing a commitment to the importance of differentiated consideration of meta-analysis findings, in its own methodological approach treats this as little more than lip service' (p96).

Wecker et al (2016) published major issues of misrepresentation, major calculation & interpretation errors.
'the methodological claims arising from Hattie's approach, and the overall appropriateness of this approach suggest a fairly clear conclusion: a large proportion of the findings are subject to reasonable doubt' (p35).
Professor Timothy Shanahan, in 'Why You Need to Be Careful About Visible Learning' (2017), shows that Hattie often counts the same studies twice, resulting in unreliable effect sizes. He also shows that Hattie gives the same weighting to all meta-analyses even though there are vast differences in their quality, which also leads to unreliable results.

Professor Scott Eacott's (2017) critique of the 'cult of Hattie'; how and why it came to be and its dangers.

Kelvin Smythe e.g., 'John Hattie: your research is now a con.'

Whilst not directly about Hattie's evidence for feedback, David Didau gives an excellent overview of the evidence for feedback here. Also, Gary Davies has an excellent blog - Is Education Research Scientific?

'Garbage in, Gospel out' Dr Gary Smith (2014)

What has often been missed is that Hattie prefaced his book with significant doubt:
'I must emphasise these are clearly speculative' (p4).
Yet his rankings have taken on 'gospel' status due to: the major promotion by politicians, administrators and principals (it's in their interest, e.g. class size); very little contesting by teachers (they don't have the time, and who is going to challenge the principal?); and limited access to scholarly critiques - see Gary Davies' excellent blog on this.

'Materialists and madmen never have doubts' G. K. Chesterton

Interestingly, his reservation has changed to an authority and certainty that is at odds with the caution that ALL of the authors of his studies recommend, e.g., on class size and ability grouping. Caution is due to the lack of quality studies, the inability to control variables, major differences in how achievement is measured, and the many confounding variables. Also, there is significant critique by scholars who identify the many errors that Hattie makes, from major calculation errors and excessive inference to misrepresenting studies, e.g., Higgins and Simpson (2011) and Wecker et al (2016).



Ambiguous, Unclear or Vague?


There are many examples of ambiguity in the detailed analysis of each influence in the right menu. The first striking one is in Hattie's preface to VL:
"It is not a book about classroom life, and does not speak to the nuances and details of what happens within classrooms."
However, many influences such as class size, teacher subject knowledge, teacher training, ability grouping, student control, mentoring, teacher immediacy, problem-based learning, exercise, welfare, and homework are considered to be about classroom life, but Hattie has given them a low ranking.

In Hattie's 2012 update of VL he does an about face and says,

'I could have written a book about school leaders, about society influences, about policies – and all are worthwhile – but my most immediate attention is more related to teachers and students: the daily life of teachers in preparing, starting, conducting, and evaluating lessons, and the daily life of students involved in learning' (preface).
Hattie also promotes Bereiter’s model of learning, 
'Knowledge building includes thinking of alternatives, thinking of criticisms, proposing experimental tests, deriving one object from another, proposing a problem, proposing a solution, and criticising the solution … ' (VL p27).

'There needs to be a major shift, therefore, from an over-reliance on surface information (the first world) and a misplaced assumption that the goal of education is deep understanding or development of thinking skills (the second world), towards a balance of surface and deep learning leading to students more successfully constructing defensible theories of knowing and reality (the third world)' (p28).
Prof Proulx, in his 'Critical essay on the work of John Hattie for teaching mathematics' (translated), explains the contradiction,
'... ironically, Hattie implicitly criticises himself if we rely on the affirmations at the beginning of his book, since he affirms the importance of the three types of learning in education ...'
'So with this comment, Hattie discredits the very work on which he bases his conclusions about what represents good ways to teach. Indeed, since the studies he has synthesised to draw his conclusions do not align with what he himself says represents good teaching, how can he rely on them to draw conclusions about teaching itself?'

Also, in his presentations, he describes many of these low ranked influences as DISASTERS! 

This seems to DEFY widespread teacher experience.



Is Hattie’s Evidence Stronger than That of Other Researchers or Widespread Teacher Experience?


A summary of the major issues scholars have found with Hattie's work (details on the page links on the right):


  • Hattie misrepresents studies e.g. peer evaluation in 'self-report' and studies on emotionally disturbed students are included in 'reducing disruptive behavior'.
  • Hattie often reports the opposite conclusion to that of the actual authors of the studies he reports on, e.g. 'class-size', 'teacher training', 'diet' and 'reducing disruptive behavior'.
  • Hattie jumbled together and averaged the effect sizes of different measurements of student achievement: teacher tests, IQ, standardised tests, and physical tests like rallying a tennis ball against a wall.
  • Hattie jumbled together and averaged effect sizes for studies that do not use achievement but something else, e.g. hyperactivity in the Diet study, i.e., he uses these as proxies for achievement, which he advised us NOT to do in his 2005 ACER presentation.
  • The studies are mostly about non-school or abnormal populations, e.g., doctors, nurses, university students, tradesmen, pre-school children, and 'emotionally/behaviorally' disturbed students.
  • The US Education Dept benchmark effect sizes per year level, indicate another layer of complexity in interpreting effect sizes - studies need to control for age of students as well as the time over which the study runs. Hattie does not do this.
  • Related to the US benchmarks is Hattie's use of d = 0.40 as the hinge point of judgments about what is a 'good' or 'bad' influence. The U.S. benchmarks show this is misleading.
  • Most of the studies Hattie uses are not high quality randomised controlled studies but the much, much poorer quality correlation studies.
  • Most scholars are cautious/doubtful in attributing causation to separate influences in the precise surgical way in which Hattie infers. This is because of the unknown effect of outside influences or confounds.
  • Hattie makes a number of major calculation errors, e.g., negative probabilities.


The Quality of the Research:

'Extraordinary claims require extraordinary evidence.' Carl Sagan
Generally, Hattie dismisses the need for quality and makes the astonishing caveat that there is,
'... no reason to throw out studies automatically because of lower quality' (p11).
Hattie's constant proclamation (VL 2012 summary, p3),
'it is the interpretations that are critical, rather than data itself'
is worrying, as it is the opposite of the scientific method paradigm, as Professor Ivan Snook et al (2009, p2) explain:
'Hattie says that he is not concerned with the quality of the research ..., of course, quality is everything. Any meta-analysis that does not exclude poor or inadequate studies is misleading, and potentially damaging if it leads to ill-advised policy developments. He also needs to be sure that restricting his data base to meta-analyses did not lead to the omission of significant studies of the variables he is interested in.'
Professor John O'Neill writes a significant letter to the NZ Education Minister regarding the poor quality of Hattie's research, in particular the overuse of studies about university, graduate or pre-school students, and the danger of making classroom policy decisions without consulting other forms of evidence, e.g., case and naturalistic studies.
'The method of the synthesis and, consequently, the rank ordering are highly problematic' (p7).
The U.S. Department of Education has set up the National Center for Education Research, whose focus is to investigate the quality of educational research. Their results are published in the What Works Clearinghouse. They also publish a Teacher Practice Guide which differs markedly from Hattie's results - see Other Researchers.

Importantly, they focus on the QUALITY of the research and reserve their highest rating for research that uses randomised division of students into a control and an experimental group. Where students are non-randomly divided into control and experimental groups, for what they term a quasi-experiment, a moderate rating is used; however, the two groups must have some sort of equivalence measure before the intervention. A low rating is used for other research designs, e.g., correlation studies.
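
A minimal sketch of that rating logic as just described (the labels are simplified stand-ins, not the Clearinghouse's official wording):

```python
def quality_rating(randomised: bool, baseline_equivalence: bool) -> str:
    """Simplified sketch of the design-quality triage described above."""
    if randomised:
        # Highest rating: randomised control and experimental groups.
        return "high"
    if baseline_equivalence:
        # Quasi-experiment: non-random groups shown equivalent
        # before the intervention.
        return "moderate"
    # Everything else, e.g. correlation studies.
    return "low"

# Most of the research Hattie draws on is correlation-based:
print(quality_rating(randomised=False, baseline_equivalence=False))  # low
```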

Given most of the research that Hattie uses is correlation based, he has skillfully managed to sidestep the quality debate within school circles (but not within the academic community - see References).


Self-Report Grades - the Highest Ranked Influence??


Hattie concludes the ‘best’ influence is self-reported grades with d = 1.44, which Hattie interprets as advancing student achievement by 3+ years!
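
The '3+ years' figure follows directly from Hattie's own hinge interpretation (d = 0.40 equals one year's progress):

```python
# Arithmetic behind the '3+ years' claim, using Hattie's own
# hinge interpretation (d = 0.40 = one year's progress):
d_self_report = 1.44
one_year = 0.40
print(d_self_report / one_year)   # 3.6 'years of progress'
```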

This is an AMAZING claim if true: that merely predicting your grade, somehow magically improves your achievement to that extent. I hope my beloved “under-achieving” Australian football team – The St Kilda Saints are listening – “boys you can make the finals next year just by predicting you will - you don't need to do all that hard training!"


Whilst it may be simpler and easier to see teaching as a set of discrete influences, the evidence shows that these influences interact in ways which no-one, as yet, can quantify. It is the combining of influences in a complex way that defines the 'art' of teaching.

A Teacher's Lament:


Gabbie Stroud resigned from her teaching position and wrote:
'Teaching – good teaching - is both a science and an art. Yet in Australia today [it]… is considered something purely technical and methodical that can be rationalised and weighed.

But quality teaching isn't borne of tiered 'professional standards'. It cannot be reduced to a formula or discrete parts. It cannot be compartmentalised into boxes and 'checked off'. Good teaching comes from professionals who are valued. It comes from teachers who know their students, who build relationships, who meet learners at their point of need and who recognise that there's nothing standard about the journey of learning. We cannot forget the art of teaching – without it, schools become factories, students become products and teachers: nothing more than machinery.'

John Oliver gives a funny overview of the problems with Scientific Studies:


Another overview of the issues with published studies:
