Background
The state of
Colorado has long been at the forefront of attempts to develop effective
methods for coming to terms with the risks posed by people who have sexually
abused. In the days when each county or jurisdiction seemed to have different
approaches, Colorado implemented the Containment Model. When there were no
actuarial measures or other tools for structured professional judgment for
grounding assessments, the Colorado Sex Offender Management Board (SOMB) assembled
a list of 17 factors, which are a focus of the study below. For purposes of the
study, these 17 items are treated as an alternative assessment measure, which
was apparently not the original intention, although many evaluators have
doubtless treated them as such. For its part, the SOMB is well aware that these
17 items are no longer the final word in assessments, even as they still
receive consideration.
It seems
important to note this background context, as Colorado’s efforts have indeed
been pioneering over the years. In retrospect, it can seem easy to criticize
the pioneering developments of groups of professionals. However, it should not
be forgotten that when knowledge was scarce and approaches to sex crime resembled
the Tower of Babel across the US, Colorado was among the first to develop
approaches that numerous other states have emulated. Just the same, there is
much we can learn from the study of these approaches, which is the subject of
this blog.
The Research
An Online-First
study by Katharine McCallum, Marcus Boccaccini, and Claire Bryson in the
journal Criminal Justice and Behavior offers
fresh insight into the practical application of risk assessment research. The
abstract describes their findings succinctly:
In Colorado, evaluators conducting sex
offender risk assessments are required to assess 17 risk factors specified by
the state’s Sex Offender Management Board (SOMB), in addition to scoring
actuarial risk assessment instruments. This study examined the association
between instrument scores, the 17 SOMB risk factors, and evaluator opinions
concerning risk and need for containment in 302 Colorado cases. Evaluators’
ratings of risk indicated by noninstrument factors were often higher than their
ratings of risk indicated by instrument results, but only their ratings of
noninstrument factors were independently predictive of containment
recommendations. Several of the most influential noninstrument factors (e.g.,
denial, treatment motivation) have been described by researchers as potentially
misleading because they are not predictive of future offending. Findings
highlight the need for more studies examining the validity of what risk
assessment evaluators actually do, as opposed to what researchers think they
should do.
This is not the
first study to find that professionals often overestimate risk across a range
of conditions. The authors provide an eye-opening literature review, and Dr.
Boccaccini has elsewhere found that the results of evaluations are often swayed
according to who is paying for the service. In a context in which evaluators must consider
17 items originally developed by the SOMB as a part of their evaluations, over
and above the far more scientifically proven actuarial measures, it is not
surprising that evaluators would give extra weight to the SOMB measure and the
items within it. In reading the study, several points become clear:
First, the
evaluators in Colorado seem to face a difficult assignment, having historically
assessed risk using items shown in research to have no predictive utility. What
is the evidence-based assessor to do? Among the most heavily weighted items in
the SOMB measure are defensiveness, psychopathology, and level of empathy,
which are famously not associated with risk (and therefore with summary risk
ratings), but are very likely strong responsivity factors to consider. This
raises the question of what kind of risk is actually being assessed: risk
for sexual re-offense, or risk for problematic adjustment to the conditions of
community supervision? If it is the latter, perhaps the findings in this study
would be more understandable – even appropriate – if the SOMB tool became more
of a measure of risk, need, and responsivity. In this way, risk for sexual
re-offense would be evaluated as a first hurdle, with treatment needs and the
ability of the examinee to respond to treatment as the second and third hurdles
of a more comprehensive assessment. Whatever
the case, this study suggests that many evaluators were not pursuing
evidence-based approaches in making recommendations related to detention; this
should be of concern to anyone interested in effective policy and human rights.
Adding to the
complexity of the task, many of the SOMB items most considered in evaluations seem
to overlap with items in actuarial scales such as the Static-99R, the VRAG, and
the SORAG. Examples include criminal history, offense history and victim choice, and
the nature of the person’s social support system. All of these lead to
questions about conceptual double-dipping; how many times does one review
criminal history before assessment results become skewed?
At the risk of
appearing to be a Pollyanna, it is at least encouraging to see as much use of
empirically validated measures as there is. It wasn’t that long ago that risk
was assessed with little structure in the process and low accountability for
the examiner, with some assessments even taking the physical attractiveness of the examinee into account.
Although these findings point to much hard work ahead for professionals and
policymakers alike, we can at least take heart that our methods have improved
in many jurisdictions.
Just the same,
the apparent conflation of responsivity and risk factors should cause any professional
or lawmaker to be concerned, as should the persistent overestimation
of risk and the means by which evaluators reach their conclusions. Further, as
Boccaccini’s other research has shown, biases can enter the assessment process
in any number of ways, whether explicitly or beyond the awareness of the
examiner. This study reminds us that, for all of the rich scientific evidence
at our disposal, we are still human beings, subject to being judgmental,
opinionated, and biased.
Extending this last point further, one of
the most interesting findings in this study was also one of the least explored.
In the authors’ words: “In the current study,
evaluator differences accounted for 8% of the variance in SOMB summarized risk
ratings and 21% of the variance in summarized actuarial risk ratings” (p. 13). In
other words, who the evaluator is can be a highly variable part of the
equation. For all of our attempts at impartiality, and our bluster about its
importance, we have yet to reach the goal of remaining objective. In some
cases, this may be an artifact of using relatively vague items. In others,
evaluator bias may even be akin to the famous country song: “Ya gotta dance
with the one that brung ya.”
Final Implications
Reviewing both
the study and the Colorado experience itself brought to mind a number of
important reminders:
- First, although these findings
echo related findings elsewhere, it is still a single study.
- Second, it is always important
to keep in mind that our best measures and best policies are always subject to
bias at the hands of the individuals involved. Our ultimate work should be in
the direction of professional self-development and consistency across groups of
professionals.