Why doesn't Insight generate 'auto-grades' from assessed objectives?

We occasionally get asked why Insight does not calculate an overall summative assessment from the objectives that have been assessed in the grids. This is something we have discussed - and it is something that other systems offer - but we have reached the conclusion that it is too problematic and, well, wrong.

Whilst it may seem intuitive and convenient for Insight to generate a grade from the assessments of the objectives, the reality is fraught with issues. For a start, teacher assessment is nuanced; it is not a simple case of adding up how many objectives a pupil has secured and applying a threshold to the result. Yet this is the approach some systems took, assuming that pupils would achieve between 1 and 32% of objectives in the first term, between 33 and 66% in the second term, and so on. The pupil would then automatically be defined as 'emerging' in the first term and 'developing' in the second.
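To make that concrete, here is a minimal sketch of the kind of calculation those systems performed. It is illustrative only: the band names, thresholds and function are hypothetical examples of the approach, not anything Insight does.

```python
# Sketch of the naive 'auto-grade' approach described above.
# Band names and thresholds are illustrative, not Insight's logic.

def auto_grade(objectives_secured: int, total_objectives: int) -> str:
    """Map the percentage of secured objectives to a summative band."""
    pct = 100 * objectives_secured / total_objectives
    if pct < 33:
        return "emerging"    # roughly one term's worth of coverage
    elif pct < 66:
        return "developing"  # roughly two terms' worth
    else:
        return "secure"

# A pupil who has secured 10 of 32 objectives (about 31%) is labelled
# 'emerging'; ticking off just one more tips them into 'developing'.
print(auto_grade(10, 32))  # -> 'emerging'
print(auto_grade(11, 32))  # -> 'developing'
```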

But why were those particular thresholds used? No curriculum is designed to be delivered in nice, neat blocks of coverage. It was done for convenience - dividing 100% of the objectives by 3 terms gives us 33% per term. This is, no doubt, completely out of kilter with the school's actual curriculum, and it risks teachers ticking off just the right number of objectives at a certain point to tip the pupil over into the next band and gain the next progress point. Indeed, teachers who have used certain systems have admitted doing just that. This is a particular problem with steps-style approaches to tracking, where pupils are expected to move up a band each term.

There is also the question of weighting: objectives are not all of equal value. To solve this we would have to provide schools with the facility to weight each objective, but this is easier said than done. The process would be highly subjective and prone to continual revision, resulting in changes to the main assessment every time the weightings of the objectives are adjusted.
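As a rough illustration of that fragility, consider a hypothetical weighted variant of the calculation above (the objectives, weights and 0.66 cut-off are invented for the example). Revising a single weight changes the result even though no assessment has changed:

```python
# Hypothetical weighted variant - the objectives, weights and the
# 0.66 cut-off are invented for illustration.

def weighted_score(assessments: dict[str, bool],
                   weights: dict[str, float]) -> float:
    """Fraction of the total available weight the pupil has secured."""
    total = sum(weights.values())
    secured = sum(weights[obj] for obj, met in assessments.items() if met)
    return secured / total

assessments = {"obj_a": True, "obj_b": True, "obj_c": False}

# Adjusting one weighting moves the pupil across a 0.66 cut-off,
# changing the main assessment with no new teacher input at all.
print(weighted_score(assessments, {"obj_a": 2.0, "obj_b": 1.0, "obj_c": 1.0}))  # 0.75
print(weighted_score(assessments, {"obj_a": 2.0, "obj_b": 1.0, "obj_c": 2.0}))  # 0.6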

We have also considered whether Insight's average depth score could be used as a better way of calculating an 'auto-grade'. Based on what has been assessed so far, perhaps we could say that any pupil with an average depth score of, say, 1.5 or more is automatically defined as 'expected' in the main assessment. This again seems intuitive until we start to unpick the issues. The main problem is that many of the objectives schools have in their grids are intended to be met by the end of the year, not before. This means there can be a mismatch between the assessments at the objective level and the overall main assessment. A pupil could be working towards many of the objectives (and have mostly scores of 1 in the grids) while, based on what has been covered so far, their overall (i.e. main) assessment is 'expected'. In other words, there are no concerns about that particular child and they are expected to meet the objectives by the end of the year. The other issue is the one mentioned above: the risk that teachers might be tempted to tick off more objectives and/or adjust the scores of those objectives in order to tip the pupil into the next - or previous - assessment band to force the system to display the required result. Not an ideal situation.
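To illustrate the first of those issues, here is a small sketch of the average-depth approach; the 1.5 cut-off is the hypothetical threshold mentioned above, not a real Insight setting. A pupil with mostly 1s mid-year falls below the cut-off, even though 'working towards' end-of-year objectives at that point may be exactly where the teacher expects them to be:

```python
# Sketch of an average-depth 'auto-grade'. The 1.5 cut-off is the
# hypothetical threshold discussed above, not a real Insight setting.

def depth_auto_grade(depth_scores: list[int], threshold: float = 1.5) -> str:
    """Label a pupil by comparing their mean depth score to a cut-off."""
    average = sum(depth_scores) / len(depth_scores)
    return "expected" if average >= threshold else "below expected"

# Mid-year, a pupil 'working towards' most objectives has mostly 1s.
# The formula says 'below expected', yet the teacher may judge the
# pupil fully on track to meet the objectives by the end of the year.
print(depth_auto_grade([1, 1, 1, 2, 1, 1]))  # -> 'below expected'
```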

Ultimately, whatever method we came up with - and we looked at several options - we soon realised that teachers would not agree with the outcome and would seek to overwrite it. Teacher assessment is a complex human decision that takes account of all manner of information. It is therefore highly likely that two pupils could have very similar sets of assessments in the objective grids (and identical totals and average depth scores) and yet their overall main assessment might vary because of other factors at play.

Teacher assessment is just that - a teacher's assessment - and we don't want to replace it with a clumsy formula.

